haven 2.1.0


We’re delighted to announce that haven 2.1.0 is now on CRAN. haven enables R to read and write various data formats used by other statistical packages by wrapping the ReadStat C library written by Evan Miller. For a full account of updates in this release, see the Changelog.

Improved labelling

Both labelled() and labelled_spss() now allow NULL labels. This makes both classes more flexible, because you can use them purely for their other attributes, such as the variable label or user-defined missing values. labelled() also now tests that value labels are unique.
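
As a minimal sketch of what this allows, the snippet below builds vectors with no value labels at all and then triggers the uniqueness check. The variable label "Score" and the missing-value code 99 are made up for illustration, and the exact wording of the error message is not something to rely on.

```r
library(haven)

# No value labels; the vector only carries a variable label
x <- labelled(c(1, 2, 3), labels = NULL, label = "Score")
attr(x, "label")

# No value labels; the vector only carries a user-defined missing value
y <- labelled_spss(c(1, 2, 99), labels = NULL, na_values = 99)
is.na(y)

# Two labels pointing at the same value should be rejected with an error;
# tryCatch() captures the message instead of stopping the script
res <- tryCatch(
  labelled(c(1, 2), labels = c(A = 1, B = 1)),
  error = function(e) conditionMessage(e)
)
res
```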

labelled objects now get pretty printing that shows the labels and NA values when they are columns in a tibble (tbl_df). You can turn this behaviour off with options(haven.show_pillar_labels = FALSE).

tibble::tibble(s = haven::labelled(c(1, 10), labels = c("A" = 1, "B" = 10)))
#> # A tibble: 2 x 1
#>           s
#>   <dbl+lbl>
#> 1     1 [A]
#> 2    10 [B]
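
If you prefer the plain display, a sketch along these lines switches the labels off for one print and then restores the previous setting. It relies only on base R's options(), which returns the old values so they can be put back afterwards; the object name df is arbitrary.

```r
library(haven)
library(tibble)

df <- tibble(s = labelled(c(1, 10), labels = c("A" = 1, "B" = 10)))

# Turn label display off, keeping the previous setting so it can be restored
old <- options(haven.show_pillar_labels = FALSE)
print(df)

# Restore the previous setting; labels are shown again
options(old)
print(df)
```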

Minor improvements and fixes

This release is updated to the latest version of Evan Miller’s ReadStat, which includes the following changes:

  • read_por() can now read files from SPSS 25.
  • read_por() uses base-10 instead of base-30 for the exponent.
  • read_sas() can read zero-column files.
  • read_sav() now reads long strings, and has a greater memory limit, allowing it to read more labels.
  • read_spss() reads long variable labels.
  • write_sav() no longer creates incorrect column names when there are more than 10,000 columns.
  • write_sav() no longer crashes when writing long label names.
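
One way to try the SPSS changes on your own machine is a small round trip through a temporary .sav file. This is only a sketch: the column names and the 300-character string are invented for illustration, and it exercises just the long-string case from the list above.

```r
library(haven)

# A string longer than 255 characters, the traditional SPSS short-string limit
long_text <- strrep("a", 300)
df <- data.frame(
  id = c(1, 2),
  note = c(long_text, "short"),
  stringsAsFactors = FALSE
)

# Write to a temporary file and read it back
path <- tempfile(fileext = ".sav")
write_sav(df, path)
back <- read_sav(path)

# Inspect the length of each string after the round trip
nchar(back$note)

unlink(path)
```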


Thank you to Evan Miller, as well as @armenic, @beckerbenj, @caayala, @gergness, @jeffeaton, @philstraforelli, @thays42, and @visseho for their contributions to this release.
