readr 1.4.0

readr 1.4.0 is now available on CRAN! Learn more about readr at https://readr.tidyverse.org. Detailed notes are always in the change log.

The readr package makes it easy to get rectangular data out of comma-separated (csv), tab-separated (tsv), or fixed-width (fwf) files and into R. It is designed to flexibly parse many types of data found in the wild, while still failing cleanly when data unexpectedly changes. If you are new to readr, the best place to start is the data import chapter in R for Data Science.

Install readr with

install.packages("readr")

And load it with

library(tidyverse)

#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──

#> ✔ ggplot2 3.3.2     ✔ purrr   0.3.4
#> ✔ tibble  3.0.3     ✔ dplyr   1.0.2
#> ✔ tidyr   1.1.2     ✔ stringr 1.4.0
#> ✔ readr   1.4.0     ✔ forcats 0.5.0

#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag()    masks stats::lag()

Breaking changes

Argument name consistency

The first argument to all of the write_*() functions, like write_csv(), had previously been path. However, the first argument to all of the read_*() functions is file. As of readr 1.4.0 the first argument to both the read_*() and write_*() functions is file, and path is now deprecated.
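As a minimal sketch of the renamed argument, the snippet below writes a small data frame and reads it back, naming file explicitly in both calls. The data frame and the temporary file are made up for illustration.

```r
library(readr)

# Hypothetical example data and a temporary output location.
df <- data.frame(id = 1:3, label = c("a", "b", "c"))
out <- tempfile(fileext = ".csv")

# Name the destination `file` rather than the deprecated `path`.
write_csv(df, file = out)

# read_csv() uses the same argument name for its source.
df_back <- read_csv(file = out, col_types = "ic")
```

Code that passes the destination positionally, as in write_csv(df, out), should be unaffected by the rename.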

NaN behavior

Some floating point operations can produce a NaN value, e.g. 0 / 0. Previously write_csv() always wrote these values as NaN, and the write_csv(na=) argument had no effect on them. Now NaN is written the same way as NA, so its output is controlled by the na argument. This is a breaking change because the same code can produce different output, but it should rarely matter in practice.
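One way to check how this affects your own output is to format a small column containing both kinds of value. This sketch uses format_csv(), which returns the CSV text as a string instead of writing a file; the data and the "missing" marker are invented for the example.

```r
library(readr)

# Hypothetical data: an ordinary value, a NaN from 0 / 0, and a missing value.
df <- data.frame(x = c(1, 0 / 0, NA))

# format_csv() returns the CSV text as a string, which makes the
# effect of the `na` argument easy to inspect.
csv_text <- format_csv(df, na = "missing")
cat(csv_text)
```

Under the behavior described above, both the NaN row and the NA row should show the na string. Other readr versions may differ, so it is worth running this against the version you have installed.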

New features

Generate column specifications from datasets

Using as.col_spec() on any data.frame or tibble object will now generate a column specification with the column types in the data.

library(palmerpenguins)
spec <- as.col_spec(penguins)
spec

#> cols(
#>   species = col_factor(levels = c("Adelie", "Chinstrap", "Gentoo"), ordered = FALSE, include_na = FALSE),
#>   island = col_factor(levels = c("Biscoe", "Dream", "Torgersen"), ordered = FALSE, include_na = FALSE),
#>   bill_length_mm = col_double(),
#>   bill_depth_mm = col_double(),
#>   flipper_length_mm = col_integer(),
#>   body_mass_g = col_integer(),
#>   sex = col_factor(levels = c("female", "male"), ordered = FALSE, include_na = FALSE),
#>   year = col_integer()
#> )

You can also convert the column specification to a condensed textual representation with as.character():

as.character(spec)

#> [1] "ffddiifi"
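A generated specification can plausibly be used to read a dataset back with the same column types it had in memory. The sketch below assumes the built-in iris data frame in place of penguins so that it needs no extra packages, and passes the specification to the col_types argument of read_csv().

```r
library(readr)

# Build a specification from an existing data frame
# (iris ships with R, so no extra package is needed).
spec <- as.col_spec(iris)

# Condensed form: one character per column.
short <- as.character(spec)

# Write the data out, then read it back using the generated specification
# so the factor column is restored as a factor rather than as character.
out <- tempfile(fileext = ".csv")
write_csv(iris, file = out)

iris_back <- read_csv(out, col_types = spec)
```

Without col_types, the Species column would be guessed as character, so supplying the specification is what preserves the factor levels.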

Writing end of line characters

Write functions now take an eol argument to control the end-of-line characters. Previously readr only supported a single newline (\n) character. You can now specify any sequence of characters, though the Windows carriage return plus line feed (\r\n) is by far the most common alternative.
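For instance, a file with Windows-style line endings could be written as in this sketch. The data frame and temporary file are hypothetical, and the last step reads the raw text back only to confirm which line endings were written.

```r
library(readr)

# Hypothetical example data and a temporary output location.
df <- data.frame(x = 1:2, y = c("a", "b"))
out <- tempfile(fileext = ".csv")

# Ask for Windows-style line endings instead of the default "\n".
write_csv(df, file = out, eol = "\r\n")

# Read the file back as a single raw string to inspect the line endings.
raw_text <- readChar(out, nchars = file.size(out), useBytes = TRUE)
print(raw_text)
```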

cli package is now used for messages

The cli package is now used for messages. The most prominent place you will notice this is printing the column specifications. Previously these functions used message(), which in RStudio prints the text in red.

While cli still uses message objects, they will now be more naturally colored, which hopefully will make them easier to read.

Rcpp dependency removed

The Rcpp dependency has been removed in favor of cpp11. Compiling readr should now take less time and use less memory.

Acknowledgements

As usual, many more changes and bug fixes were included in this release; see the change log for details.

Thank you to the 132 contributors who made this release possible by opening issues or submitting pull requests: @adamroyjones, @aetiologicCanada, @ailich, @antoine-sachet, @archenemies, @ashuchawla, @Athanasiamo, @bastianilso, @batpigandme, @Ben-Cox, @bergen288, @boshek, @bovender, @bransonf, @brianrice2, @briatte, @c30saux, @cboettig, @cderv, @cdhowe, @ceresek, @charliejhadley, @chipkoziara, @cwolk, @damianooldoni, @dan-reznik, @DanielleQuinn, @DarwinAwardWinner, @dhmontgomery, @djbirke, @dkahle, @dmitrienka, @dmurdoch, @dpprdan, @dwachsmuth, @EarlGlynn, @edo91, @ellessenne, @Fernal73, @firasm, @fjuniorr, @frahimov, @frousseu, @GegznaV, @georgevbsantiago, @geotheory, @greg-minshall, @hadley, @hidekoji, @huashan, @ifendo, @ijlyttle, @isaactpetersen, @jangorecki, @jdblischak, @jemunro, @jennahamlin, @jesse-ross, @jimhester, @jmarshallnz, @jmcloughlin, @jmobrien, @jnolis, @jokedurnez, @jpwhitney, @jssa98, @juangomezduaso, @junqi108, @JustGitting, @jxu, @kainhofer, @katgit, @kbzsl, @keesdeschepper, @kiernann, @knausb, @krlmlr, @kvittingseerup, @lambdamoses, @leopoldsw, @lsaravia, @MihaiBabiac, @mkearney, @mlaunois, @mmuurr, @moodymudskipper, @MZellou, @nacnudus, @natecobb, @NFA, @NikKrieger, @njtierney, @nogeel, @orderlyquant, @oscci, @Ozan147, @pcgreen7, @perog, @phil-grayson, @pralitp, @psychelzh, @QuLogic, @r2evans, @Rajesh-Ramasamy, @ralsouza, @rcragun, @romainfrancois, @salim-b, @sfrenk, @Shians, @shrektan, @skaltman, @sonhan18, @StevenMMortimer, @thays42, @ThePrez, @tmalsburg, @TrentLobdell, @ttimbers, @vnijs, @wch, @we-hop, @wehopkins, @wibeasley, @wolski, @wwgordon, @xianwenchen, @xiaodaigh, @xinyue-li, @yutannihilation, @Zack-83, and @zenggyu.