In an effort to keep the community up to date with the evolution of the tidyverse, we’ll be doing regular roundups cataloging the latest developments.
tidyverse package updates
- tidyverse 1.2.0 (release notes)
- rlang 0.1.4 (release notes)
- tidyselect 0.2.3 (release notes)
tidyselect provides a common back-end for tidyr::gather(), as well as for modelling packages. It is also the source of selection helpers, such as starts_with(). tidyselect allows you to create selecting verbs that are consistent across tidyverse packages.
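As a rough illustration of that consistency, the sketch below uses the same starts_with() helper in dplyr::select() and tidyr::gather(), and then in a small selecting verb of its own. The built-in iris data, the keep_cols() name, and the use of tidyselect::eval_select() are assumptions made for this example; eval_select() needs tidyselect 1.0.0 or later, which is newer than the release listed above.

```r
library(dplyr)
library(tidyr)

# The same helper picks the same columns in two different verbs.
sepal_cols <- select(iris, starts_with("Sepal"))
sepal_long <- gather(iris, key = "measure", value = "value", starts_with("Sepal"))

# keep_cols() is a hypothetical selecting verb, not part of any package.
# eval_select() resolves the selection to named column positions.
keep_cols <- function(.data, ...) {
  pos <- tidyselect::eval_select(rlang::expr(c(...)), .data)
  .data[pos]
}

petal_cols <- keep_cols(iris, starts_with("Petal"))
names(petal_cols)
```

Because keep_cols() hands the selection to tidyselect, it accepts the same helpers and bare column names that select() does.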
yardstick allows you to easily create tidy performance estimates. Using a syntax similar to dplyr’s, you can compute common performance metrics, such as precision and recall for classification or numeric metrics for regression, and have them returned in a tidy data frame.
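A minimal sketch of what this looks like, assuming a recent yardstick release in which results come back with a .estimate column. The preds and reg data frames are toy values invented for illustration, not results from a real model.

```r
library(yardstick)

# Toy predictions invented for illustration. "yes" is the first factor level,
# which yardstick treats as the event of interest by default.
lvls <- c("yes", "no")
preds <- data.frame(
  truth    = factor(c("yes", "yes", "yes", "no", "no", "no"), levels = lvls),
  estimate = factor(c("yes", "yes", "no", "yes", "no", "no"), levels = lvls)
)

# Each metric takes the data frame plus the truth and estimate columns,
# and returns a one-row tidy data frame.
prec <- precision(preds, truth = truth, estimate = estimate)
rec  <- recall(preds, truth = truth, estimate = estimate)

# Regression metrics follow the same pattern with numeric columns.
reg <- data.frame(truth = c(1, 2, 3), estimate = c(1.5, 2, 2.5))
err <- rmse(reg, truth = truth, estimate = estimate)
```

Since every metric returns a data frame of the same shape, results from several metrics can be stacked with dplyr::bind_rows() for comparison.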
recipes is an extensible framework for feature selection and the creation of preprocessed design matrices, which can then be applied to statistical and machine learning models. The updated version of recipes includes a tidy method for many of the step functions. The tidy method returns relevant information about the step, such as estimated parameters or which variables were affected by the step. (NEWS)
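The tidy method can be sketched as follows. The choice of the built-in mtcars data and of the centering and scaling steps is an assumption made for the example; other steps return different columns.

```r
library(recipes)

# Center and scale every predictor of mpg in the built-in mtcars data.
rec <- recipe(mpg ~ ., data = mtcars)
rec <- step_center(rec, all_predictors())
rec <- step_scale(rec, all_predictors())

# prep() estimates the means and standard deviations from the training data.
prepped <- prep(rec, training = mtcars)

steps   <- tidy(prepped)              # one row per step in the recipe
centers <- tidy(prepped, number = 1)  # estimated means used by step_center()
```

Here steps summarizes the recipe as a whole, while centers lists each affected variable alongside the mean estimated for it.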
rsample’s major upgrade from caret is that it allows for nested resampling. The goal is to have a modular, extensible set of methods that can be used across R packages for traditional resampling techniques and for estimating model performance. rsample can be used to create objects containing resamples of the original data, allowing you to create a model and optimization parameters with placeholders for features to be defined later. The package website has examples for resampling time series, survival models, and neural networks.
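A minimal sketch of nested resampling with rsample, assuming the built-in mtcars data and arbitrary choices of 5 outer folds and 10 inner bootstrap resamples:

```r
library(rsample)

set.seed(123)  # resampling is random, so fix the seed for reproducibility

# Outer loop: 5-fold cross-validation. Inner loop: 10 bootstrap resamples
# drawn from the analysis set of each outer fold.
nested <- nested_cv(
  mtcars,
  outside = vfold_cv(v = 5),
  inside  = bootstraps(times = 10)
)

first_outer <- nested$splits[[1]]
train_outer <- analysis(first_outer)    # data available for tuning and fitting
test_outer  <- assessment(first_outer)  # data held out for this outer fold
first_inner <- nested$inner_resamples[[1]]
```

The inner resamples would typically be used to choose tuning parameters, and the outer assessment sets to estimate the performance of the tuned model.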
tidyposterior is used to conduct Bayesian post hoc analyses of resampling results generated by models. It can be considered an upgraded version of caret::resamples(). Though it works natively with rsample, it can be used with any data frame of results.
While the tidyverse consists of highly opinionated tools for data science, r-lib contains mostly unopinionated infrastructure tools with fewer dependencies.