class: middle, centered # Three more 🥄🥄🥄 of ggplot2 ### Post Processing, extensions and internals <br /> <br /> #### Geneve R User Meetup — April 26th 2018 #### Xavier Adam (twitter: [@xvrdm](http://twitter.com/xvrdm/)) <img src="./epfl-exts-logo.png" width="30%"> --- class: middle # Menu du jour A look at what you can do **around** a basic ggplot2 chart. -- ### 1. ggplot2 **post-processing** -- ### 2. ggplot2 **extensions** -- ### 3. ggplot2 **surgery** --- class: inverse, center, middle # 1. Post-processing charts without leaving R --- ### Polishing ggplot2 charts [ROpenSci](https://twitter.com/rOpenSci)'s [`magick` package](https://github.com/ropensci/magick) is your little photoshop inside R. It lets you use the underlying [ImageMagick](https://www.imagemagick.org/script/index.php) graphics library. - [Github source for `magick`](https://github.com/ropensci/magick) - [Vignette with lots of examples](https://cran.r-project.org/web/packages/magick/vignettes/intro.html) Install with: ```r install.packages("magick") # If you get some errors check github page for info on # potentially missing dependencies ``` --- We will work on a simple chart with a basic dataset. ```r library(ggplot2) library(magrittr) library(dplyr) data <- readRDS("questions_per_week.Rds") data ``` ``` ## # A tibble: 42 x 3 ## Week Assignee `# of questions` ## <date> <fct> <int> ## 1 2018-01-15 Bob 13 ## 2 2018-01-15 Alice 0 ## 3 2018-01-15 Unassigned 0 ## 4 2018-01-22 Bob 16 ## 5 2018-01-22 Alice 2 ## 6 2018-01-22 Unassigned 0 ## 7 2018-01-29 Bob 13 ## 8 2018-01-29 Alice 0 ## 9 2018-01-29 Unassigned 0 ## 10 2018-02-05 Bob 4 ## # ... with 32 more rows ``` --- Basic line chart skeleton with `ggplot2`. ```r data %>% ggplot(aes(x=Week, y=`# of questions`)) + geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=1.5) + labs(title="Number of questions per week") + theme_minimal() ``` ![](deck_files/figure-html/unnamed-chunk-3-1.svg)<!-- --> --- Basic line chart skeleton with `ggplot2`. `ggplot()` creates your underlying layer and maps columns to axes. ```r data %>% * ggplot(aes(x=Week, y=`# of questions`)) ``` ![](deck_files/figure-html/unnamed-chunk-4-1.svg)<!-- --> --- Basic line chart skeleton with `ggplot2`. `ggplot::geom_...()` creates the charts elements. Here `geom_line` creates lines (one color per Assignee). ```r data %>% ggplot(aes(x=Week, y=`# of questions`)) + * geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=1.5) ``` ![](deck_files/figure-html/unnamed-chunk-5-1.svg)<!-- --> --- Basic line chart skeleton with `ggplot2`. Lastly we play with some cosmetic settings (`ggplot::theme_...()`) and add a title (`ggplot::labs()`). ```r (data %>% ggplot(aes(x=Week, y=`# of questions`)) + geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=1.5) + * labs(title="Number of questions per week") + * theme_minimal() -> sc) ``` ![](deck_files/figure-html/unnamed-chunk-6-1.svg)<!-- --> --- How can we go from this... ![](deck_files/figure-html/unnamed-chunk-7-1.svg)<!-- --> ...to this: <img src="fc.png" width="80%"> --- Adding support for Google Fonts, thanks to [Yixuan Qiu](https://statr.me)'s [`showtext`](https://github.com/yixuan/showtext) package. ```r *library(showtext) *font_add_google("Rubik", "Rubik") *showtext_auto() data %>% ggplot(aes(x=Week, y=`# of questions`)) + geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=1.5) + labs(title="Number of questions per week") + theme_minimal() + * theme(text = element_text(family = "Rubik")) ``` ![](deck_files/figure-html/unnamed-chunk-8-1.svg)<!-- --> --- Save the chart in a `.png` file with `ggplot2::ggsave()`. ```r library(showtext) font_add_google("Rubik", "Rubik") showtext_auto() # Here we assign the chart to a `lchart` variable # at the end of the pipeline instead of printing it # Note that we did not add a title... data %>% ggplot(aes(x=Week, y=`# of questions`)) + geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=2) + theme_minimal() + * theme(text = element_text(family = "Rubik")) -> lchart ``` -- ```r # Then we save it, as a png. *ggsave(lchart, filename="chart.png", width = 8, height = 3) ``` Now we have `chart.png` in our computer, ready to be modified. --- Adding 6-pixels-thick horizontal borders to a chart with `magick::image_border()`. ```r library(magick) # Reopen the image and assign it to variable `chart` chart <- image_read("chart.png") # Save the height in pixel of the image height <- image_data(chart) %>% attributes() %>% .$dim %>% .[[3]] # Adding top and bottom red borders chart %>% image_border("red", "0x6") ``` <img src="deck_files/figure-html/unnamed-chunk-11-1.png" width="3200" /> --- Make a more complex margin: 50px + 6px + 170px. Then crop the bottom margin. I haven't found a way yet to not create the bottom margin, the `magick::image_border()` seems to only work symetrically ```r (chart_w_margins <- chart %>% image_border("white", "0x50") %>% image_border("red", "0x6") %>% image_border("white", "0x170") %>% # Crop the bottom margins image_chop(stringr::str_glue("0x{170}+0+{height+170+6*2+50*2}")) %>% image_border("grey", "2x2")) ``` <img src="deck_files/figure-html/unnamed-chunk-12-1.png" width="3205" /> --- Add a title with `magick::image_annotate()`. ```r (chart_w_title <- chart_w_margins %>% image_annotate("Number of questions asked per Assignee", font="Rubik", size = 70, color = "darkgrey", gravity="northwest", location = "+50+40")) ``` <img src="deck_files/figure-html/unnamed-chunk-13-1.png" width="3205" /> --- Load a `.svg` logo as another `magick` object, then use `magick::image_composite` to layer it on top of the chart. ```r logo <- image_read_svg("epfl-exts-logo.svg", height = 110) (chart_w_logo <- chart_w_title %>% image_composite(logo, offset = "+1870+30")) # Adding logo on top of chart ``` <img src="deck_files/figure-html/unnamed-chunk-14-1.png" width="3205" /> --- Wrapping the entire thing in two functions. A reusable template function: ```r epfl_exts_template <- function(chart, title) { chart %>% ggsave(filename="chart.png", width = 8, height = 3) chart <- image_read("chart.png") height <- image_data(chart) %>% attributes() %>% .$dim %>% .[[3]] chart %>% image_border("white", "0x50") %>% image_border("red", "0x6") %>% image_border("white", "0x170") %>% image_chop(stringr::str_glue("0x{170}+0+{height+170+6*2+50*2}")) %>% image_border("grey", "2x2") %>% image_annotate(title, font="Rubik", size = 70, color = "darkgrey", gravity="northwest", location = "+50+40") %>% image_composite(image_read_svg("epfl-exts-logo.svg", height = 110), offset = "+1870+30") } ``` A chart generator example using the template function: ```r epfl_exts_line_chart <- function(data, x, y, color, title) { data %>% { ggplot(., aes_string(x=x, y=y)) + geom_line(stat="identity", aes_string(color=color), alpha = 0.7, size=1.5) + theme_minimal() + theme(text = element_text(family = "Rubik"))} %>% * epfl_exts_template(title) } ``` --- Testing the final function. ```r epfl_exts_line_chart(data=data, x="Week", y="`# of questions`", color="Assignee", title="Let's try another title!") ``` <img src="deck_files/figure-html/unnamed-chunk-17-1.png" width="3205" /> --- class: inverse, center, middle # Animating charts in R --- ### Animating ggplot2 chart Back to our basic line chart. ```r data <- readRDS("questions_per_week.Rds") (data %>% ggplot(aes(x=Week, y=`# of questions`)) + geom_line(stat="identity", aes(color=Assignee), alpha = 0.7, size=1.5) + theme_minimal() -> chart) ``` ![](deck_files/figure-html/unnamed-chunk-18-1.svg)<!-- --> --- [David Robinson](https://twitter.com/drob)'s [`gganimate` package](https://github.com/dgrtwo/gganimate) provides simple R bindings for the animation features of ImageMagick. ```r *library(gganimate) data %>% * ggplot(aes(x=Week, y=`# of questions`, frame=Week)) + geom_line(stat="identity", * aes(color=Assignee, cumulative=TRUE), alpha = 0.7, size=2) + theme_minimal() + theme(text = element_text(size=30)) -> chart animation::ani.options(interval = 0.5) *gganimate(chart, "chart.gif", ani.width=1600, ani.height=600) ``` ![](./chart.gif) --- class: middle, center # But this looks still a bit jerky... ![](chart.gif) No worries, we have [Thomas Lin Pedersen](https://twitter.com/thomasp85)'s [`tweenr` package](https://github.com/thomasp85/tweenr)! --- [`tweenr`](https://github.com/thomasp85/tweenr) lets us create all the intermediary data points needed for a smooth transitions. From this dataframe:
To this:
--- ```r data %>% mutate(Week = lubridate::week(Week)) %>% split(.$Assignee) %>% purrr::map(function(d) { * purrr::map(seq(nrow(d)), * ~d[c(seq(.x), rep(.x, nrow(d) - .x)), ])}) %>% * purrr::map(~tweenr::tween_states(., 5, 2, 'cubic-in-out', 100)) %>% {do.call(rbind, .)} -> tweenr_data tweenr_data %>% ggplot(aes(Week, `# of questions`, frame = .frame)) + geom_path(aes(color=Assignee), alpha = 0.7, size=2) + theme_minimal() + theme(text = element_text(size=30))-> tween_chart animation::ani.options(interval = 0.05) *gganimate(tween_chart, "tween_chart.gif", title_frame = FALSE, * ani.width=1600, ani.height=600) ``` ![](./tween_chart.gif) --- class: middle, center # And Thomas is not done yet... <section> <img width="300" src="./transformr.gif"> </section> He is currently working on a [`transformr` package](https://github.com/thomasp85/transformr)<br /> for smooth polygon transformations! --- class: middle, center # And Thomas is not done yet... As well as on a complete grammar of animation for ggplot2! <img src="./thomas.gif" width="22%"> <blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">More grammar arrives:<br><br>ggplot(mtcars, aes(factor(cyl), mpg)) + <br> geom_boxplot() + <br> geom_point() +<br> transition_states(am, transition_length = 4, state_length = 1) + <br> view_follow() </p>— Thomas Lin Pedersen (@thomasp85) <a href="https://twitter.com/thomasp85/status/988863035388366851?ref_src=twsrc%5Etfw">April 24, 2018</a></blockquote> --- class: inverse, center, middle # 2. Using extensions --- # Where to find extensions? For extensions and themes, the best place to start is [ggplot2-exts.org](http://www.ggplot2-exts.org/gallery/). In addition to `gganimate`, there are 40 packages to discover. ![](./ggplot-extension.png) --- # `ggrepel` for smart label placement [Kamil Slowikowski](https://slowkow.com)'s [`ggrepel` package](https://github.com/slowkow/ggrepel) was one of the first extension to gain fame. It places your label for you so they don't overlap. ```r ggplot(data=mtcars, aes(x=mpg, y=wt)) + geom_point() + geom_text(label=rownames(mtcars)) + theme_minimal() + labs(title="Labels without ggrepel...") ``` ![](deck_files/figure-html/unnamed-chunk-23-1.svg)<!-- --> --- # `ggrepel` for smart label placement [Kamil Slowikowski](https://twitter.com/slowkow)'s [`ggrepel` package](https://github.com/slowkow/ggrepel) was one of the first extension to gain fame. It places your label for you so they don't overlap (lots of examples [on the vignette](https://cran.r-project.org/web/packages/ggrepel/vignettes/ggrepel.html)). ```r ggplot(data=mtcars, aes(x=mpg, y=wt)) + geom_point() + * ggrepel::geom_text_repel(label=rownames(mtcars)) + theme_minimal() + labs(title="Labels with ggrepel...") ``` ![](deck_files/figure-html/unnamed-chunk-24-1.svg)<!-- --> --- # `ggrepel` for smart label placement Since `ggrepel::geom_text_repel` is just another `geom` you can select subsets the same way as usual. ```r *subdata <- subset(mtcars, mpg>25 | wt>4) ggplot(data=mtcars, aes(x=mpg, y=wt)) + geom_point() + * ggrepel::geom_text_repel(label=rownames(subdata), data=subdata, * force=3, min.segment.length=0) + theme_minimal() + labs(title="Labels with ggrepel...") ``` ![](deck_files/figure-html/unnamed-chunk-25-1.svg)<!-- --> --- # `ggrepel` for smart label placement And if you want to manually select the points to label, you still have [Alicia Schep](https://twitter.com/aliciaschep)'s newly released [`gglabeller` package](https://github.com/AliciaSchep/gglabeller). ```r # Run the code below for live demo library(ggplot2) library(ggrepel) library(gglabeller) ggplot(mtcars, aes(mpg, wt)) + geom_point() + theme_minimal() -> p gglabeller(p, aes(label = rownames(mtcars))) ``` ### << `gglabeller` quick demo >> --- class: inverse, center, middle # 3. Spelunking ggplot internals --- # Word of caution > The internals of ggplot2 are dark and full of terrors... -- > ...or at least very sparsely documented. --- ## Three objects to explore If `p` is a plot done with `ggplot`. - The actual plot: `View(p)` - The build: `View(ggplot_build(p))` - The gtable: `View(ggplot_gtable(ggplot_build(p)))` -- ## A few nice ressources: - [Baptiste Auguie](http://baptiste.github.io)'s [Unofficial guide to gtable](https://cran.r-project.org/web/packages/gridExtra/vignettes/gtable.html) - [Bob Rudis](https://twitter.com/hrbrmstr)'s [guide to ggplot extensions](https://rud.is/books/creating-ggplot2-extensions/demystifying-ggplot2.html) - [Hadley Wickham](https://twitter.com/hadleywickham)'s official [ggplot2 extensions vignette](https://cran.r-project.org/web/packages/ggplot2/vignettes/extending-ggplot2.html) - Desperate Stack Overflow questions --- ## A (very) small dive in the internal ```r ggplot(mpg, aes(displ, hwy, colour = class)) + geom_point() -> gg str(ggplot_build(gg)) # Not for the faint-hearted... ``` ``` ## List of 3 ## $ data :List of 1 ## ..$ :'data.frame': 234 obs. of 10 variables: ## .. ..$ colour: chr [1:234] "#C49A00" "#C49A00" "#C49A00" "#C49A00" ... ## .. ..$ x : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ... ## .. ..$ y : num [1:234] 29 29 31 30 26 26 27 26 25 28 ... ## .. ..$ PANEL : int [1:234] 1 1 1 1 1 1 1 1 1 1 ... ## .. ..$ group : atomic [1:234] 2 2 2 2 2 2 2 2 2 2 ... ## .. .. ..- attr(*, "n")= int 7 ## .. ..$ shape : num [1:234] 19 19 19 19 19 19 19 19 19 19 ... ## .. ..$ size : num [1:234] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ... ## .. ..$ fill : logi [1:234] NA NA NA NA NA NA ... ## .. ..$ alpha : logi [1:234] NA NA NA NA NA NA ... ## .. ..$ stroke: num [1:234] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ... ## $ layout:Classes 'Layout', 'ggproto' <ggproto object: Class Layout> ## facet: <ggproto object: Class FacetNull, Facet> ## compute_layout: function ## draw_back: function ## draw_front: function ## draw_labels: function ## draw_panels: function ## finish_data: function ## init_scales: function ## map: function ## map_data: function ## params: list ## render_back: function ## render_front: function ## render_panels: function ## setup_data: function ## setup_params: function ## shrink: TRUE ## train: function ## train_positions: function ## train_scales: function ## vars: function ## super: <ggproto object: Class FacetNull, Facet> ## finish_data: function ## get_scales: function ## map: function ## map_position: function ## panel_layout: data.frame ## panel_ranges: list ## panel_scales: list ## render: function ## render_labels: function ## reset_scales: function ## setup: function ## train_position: function ## train_ranges: function ## xlabel: function ## ylabel: function ## super: <ggproto object: Class Layout> ## $ plot :List of 9 ## ..$ data :Classes 'tbl_df', 'tbl' and 'data.frame': 234 obs. of 11 variables: ## .. ..$ manufacturer: chr [1:234] "audi" "audi" "audi" "audi" ... ## .. ..$ model : chr [1:234] "a4" "a4" "a4" "a4" ... ## .. ..$ displ : num [1:234] 1.8 1.8 2 2 2.8 2.8 3.1 1.8 1.8 2 ... ## .. ..$ year : int [1:234] 1999 1999 2008 2008 1999 1999 2008 1999 1999 2008 ... ## .. ..$ cyl : int [1:234] 4 4 4 4 6 6 6 4 4 4 ... ## .. ..$ trans : chr [1:234] "auto(l5)" "manual(m5)" "manual(m6)" "auto(av)" ... ## .. ..$ drv : chr [1:234] "f" "f" "f" "f" ... ## .. ..$ cty : int [1:234] 18 21 20 21 16 18 18 18 16 20 ... ## .. ..$ hwy : int [1:234] 29 29 31 30 26 26 27 26 25 28 ... ## .. ..$ fl : chr [1:234] "p" "p" "p" "p" ... ## .. ..$ class : chr [1:234] "compact" "compact" "compact" "compact" ... ## ..$ layers :List of 1 ## .. ..$ :Classes 'LayerInstance', 'Layer', 'ggproto' <ggproto object: Class LayerInstance, Layer> ## aes_params: list ## compute_aesthetics: function ## compute_geom_1: function ## compute_geom_2: function ## compute_position: function ## compute_statistic: function ## data: waiver ## draw_geom: function ## finish_statistics: function ## geom: <ggproto object: Class GeomPoint, Geom> ## aesthetics: function ## default_aes: uneval ## draw_group: function ## draw_key: function ## draw_layer: function ## draw_panel: function ## extra_params: na.rm ## handle_na: function ## non_missing_aes: size shape colour ## optional_aes: ## parameters: function ## required_aes: x y ## setup_data: function ## use_defaults: function ## super: <ggproto object: Class Geom> ## geom_params: list ## inherit.aes: TRUE ## layer_data: function ## map_statistic: function ## mapping: NULL ## position: <ggproto object: Class PositionIdentity, Position> ## compute_layer: function ## compute_panel: function ## required_aes: ## setup_data: function ## setup_params: function ## super: <ggproto object: Class Position> ## print: function ## show.legend: NA ## stat: <ggproto object: Class StatIdentity, Stat> ## aesthetics: function ## compute_group: function ## compute_layer: function ## compute_panel: function ## default_aes: uneval ## extra_params: na.rm ## finish_layer: function ## non_missing_aes: ## parameters: function ## required_aes: ## retransform: TRUE ## setup_data: function ## setup_params: function ## super: <ggproto object: Class Stat> ## stat_params: list ## subset: NULL ## super: <ggproto object: Class Layer> ## ..$ scales :Classes 'ScalesList', 'ggproto' <ggproto object: Class ScalesList> ## add: function ## clone: function ## find: function ## get_scales: function ## has_scale: function ## input: function ## n: function ## non_position_scales: function ## scales: list ## super: <ggproto object: Class ScalesList> ## ..$ mapping :List of 3 ## .. ..$ x : symbol displ ## .. ..$ y : symbol hwy ## .. ..$ colour: symbol class ## ..$ theme : list() ## ..$ coordinates:Classes 'CoordCartesian', 'Coord', 'ggproto' <ggproto object: Class CoordCartesian, Coord> ## aspect: function ## distance: function ## expand: TRUE ## is_linear: function ## labels: function ## limits: list ## range: function ## render_axis_h: function ## render_axis_v: function ## render_bg: function ## render_fg: function ## train: function ## transform: function ## super: <ggproto object: Class CoordCartesian, Coord> ## ..$ facet :Classes 'FacetNull', 'Facet', 'ggproto' <ggproto object: Class FacetNull, Facet> ## compute_layout: function ## draw_back: function ## draw_front: function ## draw_labels: function ## draw_panels: function ## finish_data: function ## init_scales: function ## map: function ## map_data: function ## params: list ## render_back: function ## render_front: function ## render_panels: function ## setup_data: function ## setup_params: function ## shrink: TRUE ## train: function ## train_positions: function ## train_scales: function ## vars: function ## super: <ggproto object: Class FacetNull, Facet> ## ..$ plot_env :<environment: R_GlobalEnv> ## ..$ labels :List of 3 ## .. ..$ x : chr "displ" ## .. ..$ y : chr "hwy" ## .. ..$ colour: chr "class" ## ..- attr(*, "class")= chr [1:2] "gg" "ggplot" ``` --- ```r mpg %>% group_by(manufacturer) %>% summarize_at("cyl", "mean") %>% ggplot(aes(forcats::fct_reorder(manufacturer,cyl,.desc=TRUE), cyl)) + geom_point() + labs(x="manufacturer") -> gg gg + geom_text(aes(label=round(cyl,1)))# ``` ![](deck_files/figure-html/unnamed-chunk-28-1.svg)<!-- --> We could add a fixed `nudge_y` value to prevent point/label overlapping. But is there a more flexible way? Once you know where to look (which is the hardest part), there are gems to be found in ggplot internals. Below, we can get the final x and y axis ranges after they have been dynamically generated by `ggplot2`. ```r ggplot_build(gg) -> gb (gb$layout$panel_ranges[[1]]$y.range -> yrange) ``` ``` ## [1] 3.8 8.2 ``` --- Now use this information for your next geom. Rather than using fixed nudges... ```r *gg + geom_text(aes(label=round(cyl,1)), nudge_y=0.5) ``` ![](deck_files/figure-html/unnamed-chunk-30-1.svg)<!-- --> ... try this! Your optimal nudge distance is **automatically calculated**! ```r *gg + geom_text(aes(label=round(cyl,1)), nudge_y=(yrange[2]-yrange[1])/10) ``` ![](deck_files/figure-html/unnamed-chunk-31-1.svg)<!-- --> This solution scales to as many geoms as you need. --- class: inverse, middle ## In summary, recipe for happier ggplotting: #### 1. Don't forget you have <u>post-processing</u> options! -- #### 2. Remember that <u>extensions</u> are here to help! -- #### 3. Check these <u>internals</u> before thinking something is not doable! --- class: middle, centered # Thank you! #### Xavier Adam (twitter: [@xvrdm](http://twitter.com/xvrdm/)) <img src="./epfl-exts-logo.png" width="30%">