We’re excited to announce the release of broom 0.7.0 on CRAN!
broom is a package for summarizing statistical model objects in tidy tibbles. While several compatibility updates have been released in recent months, this is the first major update to broom in almost two years. This update includes many new tidier methods, bug fixes, improvements to existing tidier methods and their documentation, and improvements to maintainability and internal consistency. The full list of changes is available in the package release notes.
This release was made possible in part by the RStudio internship program, which has allowed one of us (Simon Couch) to work on broom full-time for the last month.
You can install the most recent broom update with the following code:
Then attach it for use with:
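A minimal sketch of those two steps, assuming the standard CRAN workflow (installation is needed once per library, attaching once per session):

```r
# Install the latest released version of broom from CRAN
install.packages("broom")

# Attach the package so that tidy(), glance(), and augment() are available
library(broom)
```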
We’ll outline some of the more notable changes below!
For one, this release includes support for several new model objects—many of these additions came from first-time contributors to broom!
- anova objects from the
- pam objects from the
- drm objects from the
- summary_emm objects from the
- epi.2by2 objects from the
- fixest objects from the
- regsubsets objects from the
- lm.beta objects from the
- rma objects from the
- betamfx objects from the
- glmrob objects from the
- sarlm objects from the
- speedglm objects from the
- svyglm objects from the
- We have restored a simplified version of
This update also features many bug fixes and improvements to existing tidiers. Some of the more notable ones:
- Many improvements to the consistency of
- If you pass a dataset to augment() via the data or newdata arguments, you are now guaranteed that the augmented dataset will have exactly the same number of rows as the original dataset. This differs from previous behavior primarily when there are missing values. Previously, augment() would drop rows containing NA. This should no longer be the case. As a result, augment.*() methods no longer accept an
- In previous versions, several augment.*() methods inherited the augment.lm() method, but required additions to the augment.lm() method itself. We have shifted away from this approach in favor of re-implementing many augment.*() methods as standalone methods making use of internal helper functions. As a result, augment.lm() and some related methods have deprecated (previously unused) arguments.
- The .resid column in the output of augment.*() methods is now consistently defined as y - y_hat.
- augment() tries to give an informative error when data isn’t the original training data.
- glance.*() methods have been refactored in order to return a one-row tibble even when the model matrix is rank-deficient.
- glance() methods now return a nobs column, which contains the number of data points used to fit the model!
- Various warnings resulting from changes to the tidyr API in v1.0.0 have been fixed.
- Added options to provide additional columns in the outputs of
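Several of the augment() and glance() changes in this list can be checked directly. The following is a small sketch, assuming broom 0.7.0 or later and using the built-in mtcars data; the model formula and the new_cars data frame are illustrative choices, not taken from the release notes.

```r
library(broom)

# Illustrative model on a built-in dataset
fit <- lm(mpg ~ wt + hp, data = mtcars)

# New data with a missing predictor value in the second row
new_cars <- head(mtcars, 5)
new_cars$wt[2] <- NA

# augment() keeps every row of newdata; the row with the
# missing predictor gets an NA prediction instead of being dropped
aug_new <- augment(fit, newdata = new_cars)
rows_match <- nrow(aug_new) == nrow(new_cars)

# On the training data, .resid is the observed response minus .fitted
aug <- augment(fit)
resid_check <- all.equal(as.numeric(aug$.resid), aug$mpg - aug$.fitted)

# glance() returns a one-row tibble that includes a nobs column
gl <- glance(fit)
gl$nobs
```

Passing the data through newdata here keeps the example independent of how the model's na.action was set when it was fitted.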
This release also contains a number of breaking changes and deprecations meant to improve maintainability and internal consistency.
- We have changed how we report degrees of freedom for lm objects. This is especially important for instructors in statistics courses. Previously, glance.lm() reported the rank of the design matrix. Now it reports the degrees of freedom of the numerator of the overall F-statistic. This is equal to the rank of the model matrix minus one (unless you omit an intercept column), so the new df should be the old df minus one.
- We are moving away from supporting summary.*() objects. In particular, we have removed tidy.summary.lm() as part of a major overhaul of internals. Instead of calling tidy() on summary-like objects, please call tidy() directly on model objects moving forward.
- We have removed all support for the quick argument in tidy() methods. This simplifies internals and improves maintainability. We anticipate this will not affect many users, as few people seemed to use it. If this majorly cramps your style, let us know, as we are considering a new verb to return only model parameters. In the meantime, tibble::enframe() provides most of the functionality of tidy(..., quick = TRUE).
- All conf.int arguments now default to FALSE, and all conf.level arguments now default to 0.95. This should primarily affect tidy.survreg(), which previously always returned confidence intervals, although there are some others.
- Tidiers for emmeans-objects use the arguments
conf.level instead of relying on the argument names native to the
multcomp-tidiers now include a call to summary() as previous behavior was akin to setting the now removed argument quick = TRUE. Both families of tidiers now use the adj.p.value column name when appropriate. Finally, TukeyHSD-tidiers now consistently use the column names
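A few of these breaking changes are easy to see on a small model. The sketch below assumes broom 0.7.0 or later and the tibble package; the model is illustrative, and pairing tibble::enframe() with stats::coef() is an assumption about how to obtain the named coefficient vector, not something prescribed by the release notes.

```r
library(broom)
library(tibble)

# Illustrative model with an intercept and two predictors (rank 3)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# glance.lm() now reports the numerator degrees of freedom of the
# overall F-statistic, i.e. the model rank minus one with an intercept
new_df <- glance(fit)$df
rank_minus_one <- fit$rank - 1

# Call tidy() on the model object itself, not on summary(fit).
# Confidence intervals are off by default and must be requested.
td_default <- tidy(fit)
td_ci <- tidy(fit, conf.int = TRUE, conf.level = 0.95)

# Rough stand-in for the removed tidy(fit, quick = TRUE):
# coef() supplies a named vector, enframe() turns it into a tibble
# (assumed pairing, shown for illustration only)
quick_like <- enframe(coef(fit), name = "term", value = "estimate")
quick_like
```

Code that relied on confidence intervals being returned by default will need an explicit conf.int = TRUE after upgrading.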
This release of broom also deprecates several helper functions as well as tidier methods for a number of non-model objects, each in favor of more principled approaches from other packages (outlined in the NEWS file). Notably, though, tidiers have been deprecated for data frames, rowwise data frames, vectors, and matrices. Further, we have moved forward with the planned transfer of tidiers for mixed models to
Almost all unit testing for the package is now supported by the modeltests package!
Also, we have revised several vignettes and moved them to the tidymodels website. For backward compatibility, the existing vignettes will now simply link to the revised versions.
Finally, the package’s website has moved from its previous tidyverse domain to broom.tidymodels.org.
Most notably, the broom dev team is changing the process for adding new tidying methods to the package. We now ask that issues/PRs requesting support for new model objects be directed to the model-owning package (i.e. the package that the model is exported from) rather than to broom. If the maintainers of those packages are unable or unwilling to provide tidying methods in the model-owning package, it might be possible to add the new tidier to broom. broom is near its limit of tidiers; adding more may make the package unsustainable.
For developers exporting tidying methods directly from model-owning packages, we are actively working to provide resources to both ease the process of writing new tidier methods and reduce the dependency burden of taking on broom generics and helpers. As for the first point, we recently posted an article on the tidymodels website providing notes on best practices for writing tidiers. This article will be kept up to date as we develop new resources for easing the process of writing new tidier methods. As for the latter, the r-lib/generics package provides lightweight dependencies for the main broom generics. We hope to soon provide a coherent suite of helper functions for use in external broom methods.
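As a rough illustration of what exporting tidiers from a model-owning package can look like, the sketch below defines tidy() and glance() methods against the generics package rather than broom. The mean_fit class and the fit_mean() constructor are hypothetical stand-ins for a real model object, and the column choices follow common broom naming rather than any required specification.

```r
library(generics)  # lightweight source of the tidy() and glance() generics
library(tibble)

# Hypothetical "model": the sample mean with its standard error
fit_mean <- function(x) {
  structure(
    list(estimate = mean(x), std.error = sd(x) / sqrt(length(x)), n = length(x)),
    class = "mean_fit"
  )
}

# S3 method for generics::tidy(): one row per estimated term
tidy.mean_fit <- function(x, ...) {
  tibble(term = "mean", estimate = x$estimate, std.error = x$std.error)
}

# S3 method for generics::glance(): exactly one row of model-level summaries
glance.mean_fit <- function(x, ...) {
  tibble(nobs = x$n)
}

m <- fit_mean(c(1, 2, 3))
tidy_out <- tidy(m)
glance_out <- glance(m)
tidy_out
```

In a package, the same methods would be registered as S3 methods for the generics imported from generics, so users who also attach broom get identical behavior without the package depending on broom itself.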
We anticipate that the most active development on the broom package, looking forward, will center on improving augment() methods. We are also hoping to change our CRAN release cycle and to provide incremental updates every several months rather than major changes every couple of years.
This release features work and input from over 140 contributors (over 50 of them first-time contributors) since the last major release. See the package release notes for more specific notes on contributions. Thank you all for your thoughtful comments, patience, and hard work!
@abbylsmith, @acoppock, @ajb5d, @aloy, @AndrewKostandy, @angusmoore, @anniew, @aperaltasantos, @asbates, @asondhi, @asreece, @atyre2, @bachmeil, @batpigandme, @bbolker, @benjbuch, @bfgray3, @BibeFiu, @billdenney, @BrianOB, @briatte, @bruc, @brunaw, @brunolucian, @bschneidr, @carlislerainey, @CGMossa, @CharlesNaylor, @ChuliangXiao, @cimentadaj, @crsh, @cwang23, @DavisVaughan, @dchiu911, @ddsjoberg, @dgrtwo, @dmenne, @dylanjm, @ecohen13, @economer, @EDiLD, @ekatko1, @ellessenne, @ethchr, @florencevdubois, @GegznaV, @gershomtripp, @grantmcdermott, @gregmacfarlane, @hadley, @haozhu233, @hasenbratan, @HenrikBengtsson, @hermandr, @hideaki, @hughjonesd, @iago-pssjd, @ifellows, @IndrajeetPatil, @Inferrator, @istvan60, @jamesmartherus, @JanLauGe, @jasonyang5, @jaspercooper, @jcfisher, @jennybc, @jessecambon, @jkylearmstrongibx, @jmuhlenkamp, @JulianMutz, @Jungpin, @jwilber, @jyuu, @karissawhiting, @karldw, @khailper, @krauskae, @kuriwaki, @kyusque, @KZARCA, @Laura-O, @ldlpdx, @ldmahoney, @lilymedina, @llendway, @lrose1, @ltobalina, @LukasWallrich, @lukesonnet, @lwjohnst86, @malcolmbarrett, @margarethannum, @mariusbarth, @MatthieuStigler, @mattle24, @mattpollock, @mattwarkentin, @mine-cetinkaya-rundel, @mkirzon, @mlaviolet, @Move87, @namarkus, @nlubock, @nmjakobsen, @ns-1m, @nt-williams, @oij11, @petrhrobar, @PirateGrunt, @pjpaulpj, @pkq, @poppymiller, @QuLogic, @randomgambit, @riinuots, @RobertoMuriel, @Roisin-White, @romainfrancois, @rsbivand, @serina-robinson, @shabbybanks, @Silver-Fang, @Sim19, @simonpcouch, @sjackson1236, @softloud, @stefvanbuuren, @strengejacke, @sushmitavgopalan16, @tcuongd, @thisisnic, @topepo, @tyluRp, @vincentarelbundock, @vjcitn, @vnijs, @weiyangtham, @william3031, @x249wang, @xieguagua, @yrosseel, and @zoews