readxl 1.4.0

  Jenny Bryan

We’re pleased to announce the release of readxl 1.4.0. The readxl package makes it easy to get tabular data out of Excel files and into R with code, not mouse clicks. It supports both the legacy .xls format and the modern XML-based .xlsx format. readxl is designed to be easy to install (so: no external dependencies) and to cope with many of the less savory features of Excel files created by humans and third-party applications.

The easiest way to install the latest version from CRAN is to install the whole tidyverse.

install.packages("tidyverse")

Alternatively, install just readxl from CRAN:

install.packages("readxl")

Regardless, you will still need to attach readxl explicitly. It is not a core tidyverse package, i.e. readxl is NOT attached via library(tidyverse). Instead, do this in your script:

library(readxl)
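
As a minimal sketch of what that looks like in practice, the snippet below attaches readxl and reads one sheet from an example workbook. It assumes the `datasets.xlsx` file that ships with the package (located via `readxl_example()`) and its `iris` sheet; substitute your own path and sheet name.

```r
# Attach readxl explicitly; library(tidyverse) does not do this for you
library(readxl)

# Path to an example workbook bundled with readxl (an assumption for
# illustration; point this at your own .xls or .xlsx file instead)
path <- readxl_example("datasets.xlsx")

# List the worksheets, then read one of them into a tibble
sheets <- excel_sheets(path)
iris_xlsx <- read_excel(path, sheet = "iris")
```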

This release has practically no changes that a typical user should notice. Internally, however, there have been extensive updates that set the stage for future user-facing improvements. This post is therefore quite short, and its main purpose is to encourage readxl users to kick the tires. We set out to upgrade the foundation to support building new features and we’d love to hear about any unintended regressions.

You can see a full list of changes in the release notes.

Updated libxls

readxl now embeds libxls v1.6.2 (the previous release embedded v1.5.0). The libxls project is maintained by Evan Miller and is hosted at https://github.com/libxls/libxls, where you can read more in its release notes. These accumulated releases fix a number of edge cases, allowing readxl to read even more weird and wonderful .xls files.
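
One low-effort way to exercise the updated .xls support is to read a legacy workbook and check that the result looks as expected. The sketch below assumes the `datasets.xls` example file bundled with readxl; a real check would use .xls files of your own.

```r
library(readxl)

# Legacy-format example workbook bundled with readxl (assumed here for
# illustration; swap in one of your own .xls files)
xls_path <- readxl_example("datasets.xls")

# excel_format() reports which format was detected: "xls" or "xlsx"
fmt <- excel_format(xls_path)

# read_excel() dispatches on the detected format, so this call goes
# through the embedded libxls code path
iris_xls <- read_excel(xls_path, sheet = "iris")
dim(iris_xls)
```

If a file that used to read correctly now fails or comes back different, that is the kind of regression worth reporting.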

Switch from Rcpp to cpp11

Thanks to Shelby Bearrows, readxl now uses cpp11. Shelby is a new member of the tidyverse team and she blogged about this project during her 2021 summer internship.

Other small improvements and what’s next

“Date or Not Date”: readxl’s understanding of number formats has gotten more sophisticated (thanks @nacnudus and @reviewher!). Non-datetime formats that incorporate colours or currencies should no longer be confused with datetime formats. We anticipate this will result in more accurate guessing of cell and column types.
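
If type guessing matters for your workbooks, it can help to look at what readxl guessed and, where needed, to state the column types yourself. This is only a sketch: it uses the bundled `datasets.xlsx` example, which is an assumption for illustration and contains no date columns. For a real datetime column you would use `"date"` in `col_types`.

```r
library(readxl)

path <- readxl_example("datasets.xlsx")

# Let readxl guess the column types, then inspect the result
guessed <- read_excel(path, sheet = "iris")
guessed_types <- vapply(guessed, function(x) class(x)[1], character(1))

# Override the guess with an explicit specification, one entry per column.
# Valid entries include "numeric", "text", "date", "logical" and "guess".
explicit <- read_excel(
  path,
  sheet = "iris",
  col_types = c("numeric", "numeric", "numeric", "numeric", "text")
)
```

Comparing `guessed_types` before and after upgrading is one simple way to spot a change in guessing behaviour on your own files.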

What’s coming next? I won’t go so far as to promise that 2022 is the year of readxl 😉. But I can say that top priorities include equipping readxl with better problem reporting and column specification, making its interface feel more similar to that of readr and vroom.

Acknowledgements

Thanks to the 103 people who have contributed to readxl since we last blogged about it (upon the release of version 1.2.0 in December 2018) by reporting bugs and suggesting new features: @abcdef123ghi, @acvelozo, @ahbon123, @ajit555, @artinmg, @aswansyahputra, @averiperny, @batpigandme, @ben1787, @benmatthewsed, @benwatsoncpa, @benzipperer, @bhive01, @bjorn81, @boshek, @brkbrc, @Brunox13, @cderv, @DavisVaughan, @ddekadt, @dkgaraujo, @donnekgit, @druedin, @dxbhans, @elephann, @eringrand, @estern95, @fary90, @fermumen, @fndemarqui, @gaborcsardi, @gbganalyst, @ghost, @hadley, @hammao, @hannes101, @hddao, @hidekoji, @HughParsonage, @idontgetoutmuch, @j-sirgo, @jennybc, @jeromyanglim, @jimhester, @jmcurran, @josh-m-sharpe, @jwhendy, @jzadra, @kfhk, @kiernann, @ksetdekov, @kwebihaf-github, @llrs, @loureynolds, @lucasmation, @lucifersFall1n1, @luisvalenzuelar, @matthiasgomolka, @MeoWoo6, @MichaelChirico, @mine-cetinkaya-rundel, @misea, @mkoohafkan, @moodymudskipper, @msgoussi, @nacnudus, @narayanana, @nfultz, @nickschurch, @nlneas1, @nqkhanh2209, @ntsigilis, @pitakakariki, @pmallot, @qdread, @queleanalytics, @ramay, @ramiromagno, @Rindrics, @rsbivand, @rwbaer, @saanasum, @sbearrows, @Sbirch556, @seanchrismurphy, @Shicheng-Guo, @Sibojang9, @simowaves, @smsaladi, @songc-93, @SteveDeitz, @struckma, @sureshvigneshbe, @tfulge, @topepo, @ucb, @vchouraki, @wanttobenatural, @wgrundlingh, @WilDoane, @zerogetsamgow, @zhangbs92, and @zx8754.