tidyr 0.8.0

Hadley Wickham

We are pleased to announce that tidyr 0.8.0 is now available on CRAN. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the tidy data vignette. Install it with:

install.packages("tidyr")

This release mainly contains a bumper crop of small bug fixes and minor improvements, and a considerable increase in test coverage (84% to 99%). For the full details, see the release notes. Here we’ll highlight an important bug fix that might change existing code, and one new feature to try out.

API changes

There was a bug in separate() where negative values of sep had an off-by-one error. Now sep = -1 correctly refers to the first position between characters counting from the right-hand side, so the split falls just before the last character.

library(tidyr)
library(tibble)

df <- tibble(x = c("male1", "female2", "male2"))
df %>% separate(x, c("gender", "number"), sep = -1)
#> # A tibble: 3 x 2
#>   gender number
#>   <chr>  <chr>
#> 1 male   1
#> 2 female 2
#> 3 male   2
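
To make the position rule concrete, here is a minimal base R sketch of the arithmetic. The helper split_at() is hypothetical and is not how tidyr implements separate(); it only illustrates that a negative position is counted from the right, assuming positions index the gaps between characters.

```r
# Hypothetical helper (not part of tidyr): split each string at one position.
# A positive `pos` counts gaps between characters from the left,
# a negative `pos` counts them from the right.
split_at <- function(x, pos) {
  n <- nchar(x)
  # Convert the position into "number of characters kept on the left"
  cut <- if (pos < 0) pmax(n + pos, 0) else pmin(pos, n)
  data.frame(
    left = substr(x, 1, cut),
    right = substr(x, cut + 1, n),
    stringsAsFactors = FALSE
  )
}

x <- c("male1", "female2", "male2")
last_one <- split_at(x, -1)  # left: male, female, male; right: 1, 2, 2
last_two <- split_at(x, -2)  # left: mal, femal, mal;   right: e1, e2, e2
```

With -1 the right-hand piece is always exactly one character long, whatever the length of the string, which is the behaviour the fixed separate() output above shows.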

New features

Thanks to the suggestion of Andrew Bray, tidyr can now “uncount” a data frame, duplicating each row according to a count column (the reverse of aggregating with dplyr::count()):

df <- tibble(x = c("a", "b", "c"), n = c(2, 3, 1))
df %>% uncount(n)
#> # A tibble: 6 x 1
#>   x
#>   <chr>
#> 1 a
#> 2 a
#> 3 b
#> 4 b
#> 5 b
#> 6 c
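
For intuition, the same expansion can be sketched in base R by repeating row indices. This is only an illustration of the idea, assuming a plain data.frame with a count column n; it is not tidyr's implementation.

```r
# Same toy data as above, as a plain data.frame
df <- data.frame(x = c("a", "b", "c"), n = c(2, 3, 1), stringsAsFactors = FALSE)

# Repeat each row index n times: 1 1 2 2 2 3
idx <- rep(seq_len(nrow(df)), times = df$n)

# Keep every column except the count, mirroring the output shown above
expanded <- df[idx, "x", drop = FALSE]
rownames(expanded) <- NULL

expanded$x  # "a" "a" "b" "b" "b" "c"

# Counting the rows again recovers the original n for each value of x
recovered <- as.vector(table(expanded$x))
```

The round trip in the last line is what makes uncount() the counterpart of counting: tabulating the expanded rows gives back 2, 3 and 1.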

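If you want an identifier for each duplicated row, use the .id argument. It numbers the copies made from each original row, so it restarts at 1 for every row of the input: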

df %>% uncount(n, .id = "id")
#> # A tibble: 6 x 2
#>   x        id
#>   <chr> <int>
#> 1 a         1
#> 2 a         2
#> 3 b         1
#> 4 b         2
#> 5 b         3
#> 6 c         1
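
The id column above follows the same pattern as base R's sequence(), which can be used to sketch the numbering by hand. The code below is an illustration under that assumption, not tidyr's implementation; the key column is a hypothetical addition showing how to get a value that is unique across the whole result.

```r
df <- data.frame(x = c("a", "b", "c"), n = c(2, 3, 1), stringsAsFactors = FALSE)

expanded <- data.frame(
  x  = rep(df$x, times = df$n),  # a a b b b c
  id = sequence(df$n),           # 1 2 1 2 3 1: restarts for each original row
  stringsAsFactors = FALSE
)

# id alone repeats across groups; pairing it with x gives a key
# that is unique across all rows (hypothetical extra step)
key <- paste(expanded$x, expanded$id, sep = "_")
all_unique <- anyDuplicated(key) == 0
```

Because the numbering restarts per input row, combine id with the remaining columns whenever you need a key for the full table.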