dplyr 0.8.1

Photo by Sophie Elvis

Introduction

We’re delighted to announce the release of dplyr 0.8.1 on CRAN 🎉 !

This is a minor release that addresses follow-ups from the community after the release of version 0.8.0.

group_map() and group_modify()

Shortly after the release of 0.8.0, we were notified by several members of the community that group_map() was great, except it didn’t do what they had expected 😬.

Because the function was (and still is) marked as experimental, we allowed ourselves to rectify the situation:

  • The name group_map() is now used for iterating over the groups of a grouped tibble. Each group is still described by .x (the rows of the group) and .y (its key), as before, but group_map() makes no assumptions about the return type of each operation and combines the results in a list. We can see this as iterating over the groups in the purrr::map() sense.
library(dplyr, warn.conflicts = FALSE)

# a list of vectors
iris %>%
  group_by(Species) %>%
  group_map(~ quantile(.x$Petal.Length, probs = c(0.25, 0.5, 0.75)))
#> [[1]]
#>   25%   50%   75% 
#> 1.400 1.500 1.575 
#> 
#> [[2]]
#>  25%  50%  75% 
#> 4.00 4.35 4.60 
#> 
#> [[3]]
#>   25%   50%   75% 
#> 5.100 5.550 5.875
  • The previous behaviour now lives in group_modify(), named to loosely echo purrr::modify(). In particular, group_modify() always returns a grouped tibble: it combines the tibbles returned by each operation and reconstructs the grouping structure.
# to use group_modify() the lambda must return a data frame
iris %>%
  group_by(Species) %>%
  group_modify(~ {
     quantile(.x$Petal.Length, probs = c(0.25, 0.5, 0.75)) %>%
     tibble::enframe(name = "prob", value = "quantile")
  })
#> # A tibble: 9 x 3
#> # Groups:   Species [3]
#>   Species    prob  quantile
#>   <fct>      <chr>    <dbl>
#> 1 setosa     25%       1.4 
#> 2 setosa     50%       1.5 
#> 3 setosa     75%       1.58
#> 4 versicolor 25%       4   
#> 5 versicolor 50%       4.35
#> 6 versicolor 75%       4.6 
#> 7 virginica  25%       5.1 
#> 8 virginica  50%       5.55
#> 9 virginica  75%       5.88
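
To make the contrast concrete, here is a minimal sketch that runs both functions on the same grouped tibble and uses .y, the one-row tibble holding the group key. The names by_species, labels, means and mean_petal are ours, chosen for illustration only.

```r
library(dplyr, warn.conflicts = FALSE)

by_species <- iris %>% group_by(Species)

# group_map(): the lambda may return anything; the results are collected in a list.
# .x holds the rows of the current group, .y is a one-row tibble with its key.
labels <- by_species %>%
  group_map(~ paste(.y$Species, nrow(.x)))

# group_modify(): the lambda must return a data frame; the result is a grouped
# tibble with the grouping column added back.
means <- by_species %>%
  group_modify(~ tibble::tibble(mean_petal = mean(.x$Petal.Length)))
```

Here labels is a plain list with one element per species, whereas means is a tibble still grouped by Species, so it can be piped straight into further dplyr verbs.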

Attention to detail in column-wise functions

As we phase out funs() in favour of purrr-style lambda functions in column-wise verbs, we had missed a few subtleties.

Specifically, lambdas can now refer to:

  • local variables (from the scope):
to_inch <- function(data, ...) {
  # the local variable `inch` can be used in the lambda
  inch <- 0.393701
  data %>% 
    mutate_at(vars(...), ~ . * inch)
}
iris %>% 
  as_tibble() %>% 
  to_inch(-Species)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1         2.01        1.38        0.551      0.0787 setosa 
#>  2         1.93        1.18        0.551      0.0787 setosa 
#>  3         1.85        1.26        0.512      0.0787 setosa 
#>  4         1.81        1.22        0.591      0.0787 setosa 
#>  5         1.97        1.42        0.551      0.0787 setosa 
#>  6         2.13        1.54        0.669      0.157  setosa 
#>  7         1.81        1.34        0.551      0.118  setosa 
#>  8         1.97        1.34        0.591      0.0787 setosa 
#>  9         1.73        1.14        0.551      0.0787 setosa 
#> 10         1.93        1.22        0.591      0.0394 setosa 
#> # … with 140 more rows
  • other columns of the data (from the data mask):
iris %>% 
  as_tibble() %>% 
  mutate_at(vars(starts_with("Sepal")), ~ . / Petal.Width)
#> # A tibble: 150 x 5
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>  
#>  1         25.5       17.5           1.4         0.2 setosa 
#>  2         24.5       15             1.4         0.2 setosa 
#>  3         23.5       16             1.3         0.2 setosa 
#>  4         23.0       15.5           1.5         0.2 setosa 
#>  5         25         18             1.4         0.2 setosa 
#>  6         13.5        9.75          1.7         0.4 setosa 
#>  7         15.3       11.3           1.4         0.3 setosa 
#>  8         25         17             1.5         0.2 setosa 
#>  9         22         14.5           1.4         0.2 setosa 
#> 10         49         31             1.5         0.1 setosa 
#> # … with 140 more rows
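
Both kinds of reference can appear in the same lambda. The sketch below uses a hypothetical helper, scale_by_petal(), whose multiplier argument is a local variable while Petal.Width is looked up in the data mask; neither name comes from dplyr itself.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: rescale the Sepal columns relative to Petal.Width.
scale_by_petal <- function(data, multiplier = 10) {
  data %>%
    # `multiplier` is found in the function's scope,
    # `Petal.Width` is found among the columns of `data`.
    mutate_at(vars(starts_with("Sepal")), ~ . * multiplier / Petal.Width)
}

scaled <- iris %>%
  as_tibble() %>%
  scale_by_petal(multiplier = 2)
```

Because Petal.Width is not one of the columns being modified, every lambda call sees its original values.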

Thanks

Thanks to all contributors for this release.

@abalter, @ambevill, @amitusa17, @AntoineHffmnn, @anuj2054, @batpigandme, @behrman, @billdenney, @burchill, @cgrandin, @clemenshug, @codetrainee, @ColinFay, @dan-reznik, @davidsjoberg, @DesiQuintans, @dirkschumacher, @earowang, @echasnovski, @eipi10, @grabear, @grandtiger, @gregorp, @hadley, @hanyroze, @hidekoji, @huftis, @iago-pssjd, @javierluraschi, @jennybc, @jgellar, @jhrcook, @jimhester, @joel23888, @JohnMount, @johnmous, @jonathan-g, @jwbeck97, @jzadra, @karimn, @kendonB, @koncina, @kperkins, @kputschko, @krlmlr, @kyzphong, @lionel-, @llrs, @mariodejung, @MichaelAdolph, @michaelwhammer, @MilesMcBain, @mjherold, @moodymudskipper, @msberends, @mvkorpel, @nathancday, @nicokuz, @nolistic, @oscci, @paulponcet, @PhilippRuchser, @philstraforelli, @psychometrician, @Ranonymous, @rinebob, @romagnolid, @romainfrancois, @rvg02010, @slyrus, @snp, @sowla, @ThiAmm, @thothal, @wfmackey, @will458, @wkdavis, @yutannihilation, @ZahraEconomist, and @zooman.
