tune 0.1.2

  tidymodels, tune, workflows

  Max Kuhn

We’re pleased to announce the release of version 0.1.2 of the tune package. tune is a tidy interface for optimizing model tuning parameters.

You can install it from CRAN with:

install.packages("tune")

There is a lot to discuss, so much that this is the first of three blog posts. Here, we’ll show off most of the new features. The other two posts will cover how to benefit from sparse matrices with tidymodels and the improvements to parallel processing.

Pick a class level

Deciding how to define the event of interest for two-class classification models is a major pain. Sometimes the second level of the factor is assumed to be the event of interest, but this is a vestigial notion almost entirely driven by how things were in The Old Days when outcome classes were encoded as zero and one. Thankfully, we’ve evolved significantly since those days. By default, tidymodels assumes that the first factor level is the event.

However, we want to accommodate multiple preferences. Previously, there was a global option that you could set to decide whether the first or second factor level is the event. We have come to realize that this was not the best idea from a technical standpoint. The new approach uses control arguments to the tune functions to make this specification. For example, control_grid(event_level = "second") would change the default when using tune_grid().
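As a rough sketch of how this might look in practice, the code below tunes a small decision tree and asks for the second factor level to be treated as the event. The data set (two_class_dat from modeldata), the rpart model, the grid size, and the chosen metrics are all assumptions made for illustration; only control_grid(event_level = "second") and tune_grid() come from the text above.

```r
library(tidymodels)

# Illustrative data (an assumption): a two-class outcome named Class
# whose factor levels are "Class1" and "Class2".
data(two_class_dat, package = "modeldata")

set.seed(2020)
folds <- vfold_cv(two_class_dat, v = 5)

# A simple model with one tuning parameter, chosen only for the example.
tree_spec <-
  decision_tree(cost_complexity = tune()) %>%
  set_engine("rpart") %>%
  set_mode("classification")

# Treat the second factor level ("Class2") as the event of interest.
ctrl <- control_grid(event_level = "second")

tree_res <- tune_grid(
  tree_spec,
  Class ~ .,
  resamples = folds,
  grid = 3,
  metrics = metric_set(sensitivity, specificity),
  control = ctrl
)

collect_metrics(tree_res)
```

With this control object, metrics that depend on the event definition, such as sensitivity and specificity, should be computed with the second level as the event rather than the first.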

Adding variables

There is a new variable specification interface in the workflows package called add_variables(). This can be a good approach to use if you are not interested in using a recipe or formula to declare which columns are outcomes or predictors. You can now use this interface with the tune package.
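A minimal sketch of this interface, assuming the built-in mtcars data and a plain linear regression (both picked purely for illustration): the outcome and predictors are declared with add_variables() and the workflow is then resampled with fit_resamples() from tune.

```r
library(tidymodels)

lm_spec <- linear_reg() %>% set_engine("lm")

# Declare the outcome and predictor columns directly, with no recipe
# or formula. The column choices here are illustrative assumptions.
lm_wflow <-
  workflow() %>%
  add_variables(outcomes = mpg, predictors = c(cyl, disp)) %>%
  add_model(lm_spec)

set.seed(2020)
folds <- vfold_cv(mtcars, v = 5)

lm_res <- fit_resamples(lm_wflow, resamples = folds)

collect_metrics(lm_res)
```

The predictors argument accepts tidyselect-style selections, so helpers such as everything() or starts_with() can be used when listing columns by hand is impractical.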

Gaussian process options

For Bayesian optimization, you can now pass options to GPfit::GP_fit() through tune_bayes(). If you are a “Go Matérn covariance function or go home” person, this is a nice addition.
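For illustration, here is a hedged sketch of requesting a Matérn covariance function. It assumes a tune version, such as the one discussed here, in which the extra arguments to tune_bayes() are forwarded to GPfit::GP_fit(); the data, model, and the values for initial and iter are arbitrary choices for the example. The corr list follows the format documented for GP_fit().

```r
library(tidymodels)

# Illustrative data and model (assumptions, not from the post).
data(two_class_dat, package = "modeldata")

set.seed(2020)
folds <- vfold_cv(two_class_dat, v = 5)

tree_spec <-
  decision_tree(cost_complexity = tune()) %>%
  set_engine("rpart") %>%
  set_mode("classification")

set.seed(2021)
bayes_res <- tune_bayes(
  tree_spec,
  Class ~ .,
  resamples = folds,
  initial = 4,
  iter = 3,
  metrics = metric_set(roc_auc),
  # Passed along to GPfit::GP_fit(): use a Matern covariance function
  # instead of the default exponential one.
  corr = list(type = "matern", nu = 5/2)
)

show_best(bayes_res, metric = "roc_auc")
```

Later releases of tune may handle these extra arguments differently, so it is worth checking the tune_bayes() documentation for the installed version.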

Augmenting tune objects

There is now an augment() method for tune_* objects. Unlike other augment() methods you may have used, this one does not have a data argument; it returns the out-of-sample predictions for the object. For objects produced by last_fit(), the function returns the test set results.
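The sketch below shows both cases under some assumptions: the mtcars data, a linear regression, and the split and fold settings are illustrative only. For the resampled object, the predictions are saved via save_pred = TRUE so that there is something for augment() to return.

```r
library(tidymodels)

lm_spec <- linear_reg() %>% set_engine("lm")

# Case 1: last_fit() fits on the training set and predicts the test set.
set.seed(2020)
car_split <- initial_split(mtcars)
final_res <- last_fit(lm_spec, mpg ~ cyl + disp, split = car_split)

# No data argument: the test set rows come back with predictions added.
test_preds <- augment(final_res)

# Case 2: resampling results, with the holdout predictions retained.
set.seed(2021)
folds <- vfold_cv(mtcars, v = 5)
cv_res <- fit_resamples(
  lm_spec,
  mpg ~ cyl + disp,
  resamples = folds,
  control = control_resamples(save_pred = TRUE)
)

# Each row was held out once, so each row gets one out-of-sample prediction.
cv_preds <- augment(cv_res)

head(test_preds)
head(cv_preds)
```

In both cases the original columns are returned alongside a .pred column, which makes it straightforward to plot observed versus predicted values.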

Acknowledgements

Thanks to everyone who contributed code or filed issues since the last version: @AndrewKostandy, @bloomingfield, @cespeleta, @cimentadaj, @DavisVaughan, @dmalkr, @EmilHvitfeldt, @hnagaty, @jcpsantiago, @juliasilge, @kbzsl, @kelseygonzalez, @matthewrspiegel, @mdneuzerling, @MxNl, @SeeNewt, @simonschoe, @Steviey, @topepo, @trevorcampbell, and @UnclAlDeveloper.