ggplot2 3.5.0

  Teun van den Brand

We’re tickled pink to announce the release of ggplot2 3.5.0. ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

You can install it from CRAN with:

install.packages("ggplot2")

This blog post covers a number of new features included in the latest release. In addition to rewriting the guide system, we made progress on supporting newer R graphics capabilities, re-purposed the use of I(), and introduced an improved polar coordinate system, along with other improvements. As the release is quite large, we are publishing a series of blog posts covering the major changes.

You can see a full list of changes in the release notes.

Guide rewrite

Axes and legends, collectively called guides, are an important component of plots, as they allow the translation of visual information back to data qualities. The extension mechanism of ggplot2 allows others to develop their own layers, facets, coords and scales through the ggproto object-oriented system. Finally, after years of being the only major system in ggplot2 still clinging to the S3 system, guides have been rewritten to use ggproto. With this rewrite, guides officially become an extension point that lets developers implement their own guides. We have added a section to the Extending ggplot2 vignette on how to develop a new guide.
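As a quick illustration of what the rewrite means in practice, the sketch below inspects the objects returned by two guide constructors. It only uses exported ggplot2 functions; the variable names are ours and purely illustrative.

```r
library(ggplot2)

# Guide constructors return ggproto objects as of ggplot2 3.5.0
axis_guide   <- guide_axis()
legend_guide <- guide_legend()

# Print the class hierarchy of the axis guide
class(axis_guide)

# Both guides inherit from the common `Guide` parent class
inherits(axis_guide, "Guide")
inherits(legend_guide, "Guide")
```

A custom guide would extend one of these classes with ggproto(), as described in the vignette.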

Alongside the rewrite, we made a slew of improvements to guides. As these are somewhat meaty and focused topics, we are going to cover them in separate blog posts about axes and legends.

Patterns and gradients

Patterns and gradients are provided by the grid package, which ggplot2 builds on top of. They were first introduced in R 4.1.0 and were refined in R 4.2.0 to support multiple patterns and gradients. If your graphics device supported it, theme elements could already be set to patterns or gradients, even before this release.

Note: On Windows machines, the default device in RStudio and in the knitr package is png(), which does not support patterns. In RStudio, you can go to ‘Tools > Global Options > General > Graphics’ and choose the ‘ragg’ or ‘Cairo PNG’ device from the dropdown menu to display patterns.

library(ggplot2)
library(grid) # provides linearGradient(), radialGradient() and pattern()

gray_gradient <- linearGradient(scales::pal_grey()(10))

ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  theme(panel.background = element_rect(fill = gray_gradient))

Scatterplot of engine displacement versus highway miles per gallon. The panel background is a colour gradient starting from dark grey in the bottom-left corner ending at light grey in the upper-right corner.

We are pleased to report that, as of this release, patterns can be used as the fill aesthetic in most layers. To use one, first build a gradient with grid’s linearGradient() or radialGradient() functions, or a pattern with the pattern() function. Because patterns and gradients are handled very similarly, we treat gradients as if they were patterns: when we say ‘pattern’ below, we mean patterns and gradients alike. These patterns can be passed to a layer as the fill aesthetic. Below, you can see two behaviours of the linearGradient() pattern, depending on its group argument. With group = FALSE, the gradient is displayed in every rectangle separately; with group = TRUE, a single gradient is applied to all rectangles together.

colours <- scales::viridis_pal()(10)
grad_ungroup <- linearGradient(colours, group = FALSE)
grad_grouped <- linearGradient(colours, group = TRUE)

ungroup <- ggplot(mpg, aes(factor(cyl))) +
  geom_bar(fill = grad_ungroup) +
  labs(title = "Ungrouped gradient")

grouped <- ggplot(mpg, aes(factor(cyl))) +
  geom_bar(fill = grad_grouped) +
  labs(title = "Grouped gradient")

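# The `|` operator for placing plots side by side comes from the patchwork package
library(patchwork)
ungroup | grouped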

Two barplots showing the counts of number of cylinders. The first plot is titled 'Ungrouped gradient' and shows individual gradients in the bars. The second is titled 'Grouped gradient' and shows a single gradient along all bars.
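By default, linearGradient() runs from the bottom-left to the top-right corner. As a small sketch, assuming you would rather have a bottom-to-top gradient inside each bar, the start and end points can be set explicitly through the x1, y1, x2 and y2 arguments, which are interpreted in ‘npc’ units by default. The object names are just for illustration.

```r
library(ggplot2)
library(grid)

colours <- scales::viridis_pal()(10)

# Start at the bottom centre and end at the top centre of each bar
grad_vertical <- linearGradient(
  colours,
  x1 = 0.5, y1 = 0, x2 = 0.5, y2 = 1,
  group = FALSE
)

p_vertical <- ggplot(mpg, aes(factor(cyl))) +
  geom_bar(fill = grad_vertical) +
  labs(title = "Vertical gradient")
```

Printing p_vertical on a device that supports patterns should show every bar running through the full colour range from bottom to top.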

Besides passing a static pattern as the fill aesthetic, it is also possible to map values to patterns using scale_fill_manual(). To map values to patterns, pass a list of patterns to the values argument of the scale. When providing patterns as a list, the list can be a mix of patterns and plain colours, like "limegreen" in the plot below. We are excited that people may come up with nice pattern palettes that can be used in similar fashion.

patterns <- list(
  linearGradient(colours, group = FALSE),
  "limegreen",
  radialGradient(colours, group = FALSE),
  pattern(
    rectGrob(x = c(0.25, 0.75), y = c(0.25, 0.75), width = 0.5, height = 0.5),
    width = unit(5, "mm"), height = unit(5, "mm"), extend = "repeat",
    gp = gpar(fill = "limegreen")
  )
)

ggplot(mpg, aes(factor(cyl), fill = factor(cyl))) +
  geom_bar() +
  scale_fill_manual(values = patterns)

Barplot showing counts of number of cylinders with the bars filled by a linear gradient, a plain green colour, a radial gradient and a green checkerboard pattern.
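To give an idea of what such a pattern palette could look like, here is a minimal sketch. The fade_palette() helper is hypothetical and not part of ggplot2 or grid; it simply builds one gradient per base colour and returns them as a list that scale_fill_manual() can use.

```r
library(ggplot2)
library(grid)

# Hypothetical helper: one gradient per base colour, running from `to` at the
# bottom of a bar to the base colour at the top
fade_palette <- function(base_colours, to = "white") {
  lapply(base_colours, function(col) {
    linearGradient(
      c(to, col),
      x1 = 0.5, y1 = 0, x2 = 0.5, y2 = 1,
      group = FALSE
    )
  })
}

# `cyl` has four levels in `mpg`, so we supply four base colours
fades <- fade_palette(c("firebrick", "steelblue", "forestgreen", "goldenrod"))

p_fade <- ggplot(mpg, aes(factor(cyl), fill = factor(cyl))) +
  geom_bar() +
  scale_fill_manual(values = fades)
```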

The largest obstacle we had to overcome to support gradients in ggplot2 was applying the alpha aesthetic consistently to patterns. The regular scales::alpha() function does not work with patterns, so we implemented a new fill_alpha() function that applies the alpha aesthetic to patterns as well. By replacing fill = alpha(fill, alpha) with fill = fill_alpha(fill, alpha) in the grid::gpar() function, extension developers can enable pattern fills in their own layer extensions.
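The following is a rough sketch of how fill_alpha() behaves for the two kinds of input, assuming the active device supports patterns. The gradient and object names are made up for the example; in a real layer, the fill values would come from the layer data instead.

```r
library(ggplot2)
library(grid)

# For plain colours, fill_alpha() gives the same result as scales::alpha()
plain <- fill_alpha("red", 0.5)

# For a list of patterns, it returns the patterns with alpha applied
grad  <- linearGradient(c("navy", "skyblue"))
faded <- fill_alpha(list(grad), 0.5)

# In a draw method, the result can be handed to gpar() as the fill
tile <- rectGrob(gp = gpar(fill = faded[[1]]))
```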

The fill_alpha() function checks whether the active device supports patterns and, when it does not, emits a friendlier warning or error. Extension developers who want to use newer graphics features can reuse the check_device() function to test feature availability or to throw messages in a similar fashion.

# The currently active device is the ragg::agg_png() device
check_device(feature = "patterns", action = "test")
#> [1] TRUE
check_device(feature = "glyphs", action = "abort")
#> Error:
#> ! The agg_png device does not support typeset glyphs.
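One way to put the "test" action to use is to fall back to a flat colour when patterns are unavailable. This is only a sketch of that idea; the colours and object names are arbitrary choices of ours.

```r
library(ggplot2)
library(grid)

# TRUE when the active device can draw patterns, FALSE otherwise
supports_patterns <- isTRUE(check_device("patterns", action = "test"))

# Use a gradient when it can be drawn and a plain colour when it cannot
bar_fill <- if (supports_patterns) {
  linearGradient(c("grey20", "grey80"), group = FALSE)
} else {
  "grey50"
}

p_safe <- ggplot(mpg, aes(factor(cyl))) +
  geom_bar(fill = bar_fill)
```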

Ignoring scales

In this release, ggplot2 has changed how the plots interact with variables created with I() (‘AsIs’ variables). The change is somewhat subtle, so it takes a bit of explaining.

It used to be the case that ‘AsIs’ variables automatically added an identity scale to the plot. Identity scales in ggplot2 preserve the original input, without mapping or transforming it. For example, if you give literal colour names as the colour aesthetic, the plot will use these exact colours.

set.seed(42)
my_colours <- sample(c("red", "green", "blue"), nrow(mpg), replace = TRUE)

ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = my_colours)) +
  scale_colour_identity()

Scatterplot of engine displacement versus highway miles per gallon with points in red, green and blue.

However, because identity scales are true scales, you cannot combine literal colours in one layer with mapped colours in the next. Trying to do so will confront you with the ‘unknown colour name’ error.

ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = drv), shape = 1, size = 5) +
  geom_point(aes(colour = my_colours)) +
  scale_colour_identity()
#> Error in `geom_point()`:
#> ! Problem while converting geom to grob.
#>  Error occurred in the 1st layer.
#> Caused by error:
#> ! Unknown colour name: f

To prevent such clashes between identity scales that map nothing and regular scales, we have changed how ‘AsIs’ variables interact with scales. Instead of adding an identity scale, ‘AsIs’ variables are now ignored altogether by the scale system. On the surface, the new behaviour is very similar to the old one, in that, for example, literal colours are used as given. However, with ‘AsIs’ variables ignored, you can now freely combine layers with ‘AsIs’ input with layers that map input. If you need a legend for the literal variable, we recommend using the identity scale mechanism instead.

ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = drv), shape = 1, size = 5) +
  geom_point(aes(colour = I(my_colours)), show.legend = FALSE)

Scatterplot of engine displacement versus highway miles per gallon. Every point has two circles: a smaller one in red, green or blue and a larger one mapped to the 'drv' variable.
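To see the difference between the two layers, one can look at the data that each layer ends up drawing. The sketch below uses layer_data() for this; it rebuilds the plot from the earlier example so that it runs on its own.

```r
library(ggplot2)

set.seed(42)
my_colours <- sample(c("red", "green", "blue"), nrow(mpg), replace = TRUE)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = drv), shape = 1, size = 5) +
  geom_point(aes(colour = I(my_colours)), show.legend = FALSE)

# First layer: colours were assigned by the default discrete colour scale
mapped_colours <- unique(layer_data(p, 1)$colour)

# Second layer: the literal colour names, left untouched by the scale
literal_colours <- unique(as.character(layer_data(p, 2)$colour))
```

The first vector should contain hex codes from the default palette, whereas the second should contain only the colour names we supplied.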

Perhaps more salient than avoiding scale clashes is that the same applies to the x and y position aesthetics. There has never been a scale_x_identity() or scale_y_identity() function, so what this means may be unexpected. Internally, scales transform every continuous variable to the 0-1 range before drawing the graphics. ‘AsIs’ position aesthetics work in the same way: you can use numbers between 0 and 1 to set the position. These positions are relative to the plot’s panel, and this mechanism opens up a great way to add plot annotations that are independent of the data.

t <- seq(0, 2 * pi, length.out = 100)

ggplot(mpg, aes(displ, hwy)) +
  geom_point(colour = "grey50") +
  annotate(
    "rect", 
    xmin = I(0.05), xmax = I(0.95),
    ymin = I(0.05), ymax = I(0.95),
    fill = NA, colour = "red"
  ) +
  annotate(
    "path",
    x = I(cos(t) / 2 + 0.5), y = I(sin(t) / 2 + 0.5),
    colour = "blue"
  ) +
  annotate(
    "text", 
    label = "Text in the middle",
    x = I(0.5), y = I(0.5),
    size = 8
  )

Scatterplot of engine displacement versus highway miles per gallon. The plot has a red rectangle slightly smaller than the panel, a blue circle touching the panel edges and text in the middle that reads: 'text in the middle'.

Please note that discrete variables used as ‘AsIs’ position aesthetics have no interpretation and will likely result in errors.
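A common use case for relative positions is a label pinned to a corner of the panel. The following is a minimal sketch of that idea; the label text and offsets are arbitrary.

```r
library(ggplot2)

p_corner <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  annotate(
    "text",
    label = "Panel A",
    # 2% in from the left edge and 2% down from the top edge of the panel
    x = I(0.02), y = I(0.98),
    # Anchor the text at its top-left corner
    hjust = 0, vjust = 1
  )
```

Because the position scales ignore the annotation, the axis limits are still determined by the data alone, and the label stays in the corner if the data or limits change.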

Other improvements

Coordinating text sizes between the theme and geom_text()/geom_label() has been a hassle, since the theme uses text sizes in points (pt) and the geoms use text sizes in millimetres. Now, you can control what the size aesthetic means for text by setting the size.unit argument.

p <- ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars)))
p +
  geom_text(size = 10, size.unit = "pt") +
  theme(axis.text = element_text(size = 10))

A plot showing weight versus miles per gallon with individual cars labelled by text. The text in the plot has the same size as the text labelling the axes.
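For comparison, this sketch shows the manual conversion that size.unit makes unnecessary. It relies on the .pt constant exported by ggplot2; the object names are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars)))

# `.pt` is the number of points per millimetre (72.27 / 25.4)
size_mm <- 10 / .pt

# Converting 10 pt to millimetres by hand
p_manual <- p + geom_text(size = size_mm)

# Stating the unit directly, as of 3.5.0
p_unit <- p + geom_text(size = 10, size.unit = "pt")
```

Both plots should render the labels at the same size.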

Two improvements have been made to geom_label(). The first is that it now obeys an angle aesthetic.

p + geom_label(aes(angle = runif(nrow(mtcars), -45, 45)))

A plot showing weight versus miles per gallon with individual cars labelled by textboxes. The textboxes are displayed in different angles.

In addition, geom_label()’s label.padding argument can be controlled individually for every side of the text by using the margin() function. The legend keys for labels have also changed to reflect the geom more accurately.

p + geom_label(
  aes(colour = factor(cyl)), 
  label.padding = margin(t = 2, r = 20, b = 1, l = 0)
)

A plot showing weight versus miles per gallon with individual cars labelled by textboxes. The textboxes have a large margin on the right.

Like geom_density() before it, geom_violin() now gains a bounds argument to restrict the range wherein density is estimated.

df <- data.frame(
  x = c(rbeta(100, 0.5, 0.5), rbeta(100, 1, 1), rbeta(100, 2, 2)),
  group = rep(c("A", "B", "C"), each = 100)
)

ggplot(df, aes(group, x)) +
  geom_violin(bounds = c(0, 1))

Violin plot showing random numbers drawn from beta distributions with different parameters. The ends of the first two violins are flat at the top and bottom.
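Bounds do not have to be finite on both sides. As a sketch, assuming strictly positive data such as simulated waiting times, a lower bound alone can be set with bounds = c(0, Inf). The data and names below are invented for the example.

```r
library(ggplot2)

set.seed(1)
waits <- data.frame(
  group = rep(c("A", "B"), each = 200),
  x = c(rexp(200, rate = 1), rexp(200, rate = 3))
)

# trim = FALSE lets the tails extend past the data range, so it is the
# lower bound that stops the density estimate at zero
p_bounded <- ggplot(waits, aes(group, x)) +
  geom_violin(bounds = c(0, Inf), trim = FALSE)

# Lowest position at which the density was evaluated
lowest <- min(layer_data(p_bounded)$y)
```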

geom_boxplot() has acquired an option to remove (rather than hide) outliers. Setting outliers = FALSE removes outliers so that the plot limits do not take them into account. To hide (and not remove) outliers, you can still set outlier.shape = NA. It has also gained a staplewidth argument that can be used to draw staples: horizontal lines at the ends of the boxplot whiskers. The default, staplewidth = 0, suppresses the staples, so your current box plots continue to look the same.

ggplot(diamonds, aes(cut, price)) +
  geom_boxplot(outliers = FALSE, staplewidth = 0.5)

Boxplot showing the price of diamonds per cut. The y-axis does not go much beyond the whiskers, and whiskers are decorated with a staple.
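The difference between removing and hiding outliers shows up in the limits of the y scale. The sketch below compares the two; the object names are ours.

```r
library(ggplot2)

base <- ggplot(diamonds, aes(cut, price))

# Outliers removed: they no longer contribute to the scale limits
removed <- base + geom_boxplot(outliers = FALSE)

# Outliers hidden: they are not drawn, but still stretch the scale limits
hidden <- base + geom_boxplot(outlier.shape = NA)

# Upper limit of the y scale for each variant
removed_max <- layer_scales(removed)$y$get_limits()[2]
hidden_max  <- layer_scales(hidden)$y$get_limits()[2]

removed_max < hidden_max
```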

The scale functions now do a better job of reporting which scale has encountered an error.

scale_colour_brewer(breaks = 1:5, labels = 1:4)
#> Error in `scale_colour_brewer()`:
#> ! `breaks` and `labels` must have the same length.

ggplot(mpg, aes(class, displ)) +
  geom_boxplot() +
  scale_x_continuous()
#> Error in `scale_x_continuous()`:
#> ! Discrete values supplied to continuous scale.
#>  Example values: "compact", "compact", "compact", "compact", and "compact"

ggplot(msleep, aes(bodywt - 1, brainwt)) +
  geom_point(na.rm = TRUE) +
  scale_x_log10()
#> Warning in transformation$transform(x): NaNs produced
#> Warning in scale_x_log10(): log-10 transformation introduced infinite values.

Scatterplot showing body weight minus one versus brain weight of mammals. The x-axis is log-transformed.

Acknowledgements

Thank you to all people who have contributed issues, code and comments to this release:

@92amartins, @a-torgovitsky, @aarongraybill, @aavogt, @agila5, @ahcyip, @AlexanderCasper, @alexkrohn, @alofting, @andrewgustar, @antagomir, @aphalo, @Ari04T, @AroneyS, @Asa12138, @ashgreat, @averissimo, @bakerwm, @balling-dev, @banbh, @barracuda156, @BartJanvanRossum, @beansrowning, @benimwolfspelz, @bfordAIMS, @bguiastr, @bnicenboim, @BrianDiggs, @bsgerber, @burrapreeti, @bwiernik, @ccsarapas, @CGlemser, @chiajungTung, @chipsin87, @cjvanlissa, @CorradoLanera, @danielneilson, @danli349, @DasHammett, @davidhodge931, @DavisVaughan, @dieghernan, @Ductmonkey, @edent, @Elham-adabi, @ELICHOS, @eliocamp, @ellisp, @emuise, @erikdeluca, @f2il-kieranmace, @FDylanT, @fkohrt, @francisbarton, @fredcallaway, @frezza-metabolomics, @GabrielHoffman, @gaospecial, @garyzhubc, @gavinsimpson, @Generalized, @ghost, @giadasp, @GMSL1, @grantmcdermott, @hadley, @hlynurhallgrims, @holgerbrandl, @hpages, @HRodenhizer, @hub-shale, @hughjonesd, @ibuiltthis, @ingewortel, @isaacvock, @Istalan, @istvankleijn, @jacobkasper, @jammainen, @jan-glx, @JaredAllen2, @jashapiro, @jimjam-slam, @jmuhlenkamp, @jonspring, @JorisChau, @joshhwuu, @jpeasari, @jromanowska, @jsacerot, @jtlandis, @jtr13, @jttoivon, @karchern, @klin333, @kmavrommatis, @kramerrs, @krlmlr, @kylebutts, @larmarange, @latot, @lhami, @liang09255, @linzi-sg, @lionel-, @lnarwhale, @manjumc1975, @mariadelmarq, @matanhakim, @math-mcshane, @mattgalbraith, @matthewjnield, @mcwayrm, @melissagwolf, @MichaelChirico, @MikkoVihtakari, @MjelleLab, @mjskay, @mkoohafkan, @mmokrejs, @modmost, @moodymudskipper, @morrisseyj, @mps9506, @Nh-code, @njtierney, @oliviercailloux, @olivroy, @otaviolovison, @pablobernabeu, @paulatn240, @phauchamps, @quantixed, @ralmond, @ramiromagno, @reallzg, @retodomax, @robbiebatley, @Rong-Zh, @rossellhayes, @RoyalTS, @rvalieris, @s-andrews, @s-elsheikh, @schloerke, @Sckende, @sdmason, @sirallen, @slowkow, @spaette, @steveharoz, @sunroofgod, @szimmer, @tbates, @teunbrand, @tfjaeger, @thomasp85, @TimBMK, @TimTaylor, @tjebo, 
@trekonom, @tungttnguyen, @twest820, @UliSchopp, @vnijs, @warnes, @wbvguo, @willgearty, @Yann-C-INN, @yannk-lm, @Yunuuuu, @yutannihilation, @yuw444, @zekiakyol, and @zhenglukai.