dials 0.0.3

  tidymodels, dials

  Max Kuhn

A new version of dials is on CRAN. The package contains basic frameworks for managing the tuning parameters of models, and this release is a significant update. The major change is that parameter objects are now generated by functions (as opposed to the prototype objects in the previous version). For example, to make a dials object for the number of PCA components in a model:

# previously
pca_comps <- num_comp

# now
pca_comps <- num_comp()

For numeric parameters, the range of values can be set using the first argument:

library(tidymodels)
## ── Attaching packages ──────────────────────────────────────── tidymodels 0.0.2 ──
## ✔ broom     0.5.2       ✔ purrr     0.3.2  
## ✔ dials     0.0.3       ✔ recipes   0.1.7  
## ✔ dplyr     0.8.3       ✔ rsample   0.0.5  
## ✔ ggplot2   3.2.1       ✔ tibble    2.1.3  
## ✔ infer     0.4.0.1     ✔ yardstick 0.0.4  
## ✔ parsnip   0.0.3.1
## ── Conflicts ─────────────────────────────────────────── tidymodels_conflicts() ──
## ✖ purrr::discard()  masks scales::discard()
## ✖ dplyr::filter()   masks stats::filter()
## ✖ dplyr::lag()      masks stats::lag()
## ✖ ggplot2::margin() masks dials::margin()
## ✖ dials::offset()   masks stats::offset()
## ✖ recipes::step()   masks stats::step()
num_comp()
## # Components  (quantitative)
## Range: [1, ?]
num_comp(range = c(2, 10))
## # Components  (quantitative)
## Range: [2, 10]
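Once a range is fully specified, it can be read back and turned into candidate values. The following is a minimal sketch, assuming the dials helpers range_get() and value_seq() behave as documented in the package reference; the object name comps is only illustrative.

```r
library(dials)

# A parameter object with both ends of its range specified
comps <- num_comp(range = c(2, 10))

# range_get() returns the bounds as a list with `lower` and `upper` elements
bounds <- range_get(comps)
bounds

# value_seq() generates a regular sequence of candidate values across the range
candidates <- value_seq(comps, n = 5)
candidates
```

Printing comps on its own shows the same range as in the output above.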

Sets of tuning parameters can be created and managed:

boosting_set <- param_set(list(trees(), splits = tree_depth(), min_n()))
boosting_set
## Collection of 3 parameters for tuning
## 
##      id parameter type object class
##   trees          trees    nparam[+]
##  splits     tree_depth    nparam[+]
##   min_n          min_n    nparam[+]
# modifying the parameter range:
boosting_set %>% update(trees = trees(c(100, 1000)))
## Collection of 3 parameters for tuning
## 
##      id parameter type object class
##   trees          trees    nparam[+]
##  splits     tree_depth    nparam[+]
##   min_n          min_n    nparam[+]

Note that the tree depth parameter has a user-defined identifier (splits). This can come in handy when there are multiple tuning parameters of the same type. For example, suppose two variables (x1 and x2) were modeled using splines. The flexibility of each could be represented in a parameter set:

splines <- param_set(list(x1_df = deg_free(), x2_df = deg_free()))
splines
## Collection of 2 parameters for tuning
## 
##     id parameter type object class
##  x1_df       deg_free    nparam[+]
##  x2_df       deg_free    nparam[+]

This version of dials also contains two functions for creating space-filling designs, a technique from statistical experimental design theory. The two functions are grid_max_entropy() and grid_latin_hypercube().

svm_set <- param_set(list(rbf_sigma(), cost()))
set.seed(463)
me_grid <- grid_max_entropy(svm_set, size = 20) %>% mutate(type = "max entropy")
ls_grid <- grid_latin_hypercube(svm_set, size = 20) %>% mutate(type = "latin hypercube")
rn_grid <- grid_random(svm_set, size = 20) %>% mutate(type = "random")

bind_rows(me_grid, ls_grid, rn_grid) %>% 
  ggplot(aes(x = cost, y = rbf_sigma)) + 
  geom_point() + 
  facet_wrap( ~ type) +
  scale_x_log10() + 
  scale_y_log10()  + 
  coord_fixed(ratio = 1/4)
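One rough way to put a number on how evenly a grid covers the space is the smallest distance between any two of its points, where larger values mean the points are less clumped. The sketch below uses only base R. The helper min_dist() is hypothetical and not part of dials, and the three-row toy_grid is made-up illustration data, not output from the code above. Distances are taken on the log10 scale to match the plot axes.

```r
# Hypothetical helper (not part of dials): smallest pairwise Euclidean
# distance between grid points, computed on the log10 scale.
min_dist <- function(grid) {
  pts <- log10(as.matrix(grid[, c("cost", "rbf_sigma")]))
  min(dist(pts))
}

# Made-up toy grid with the same column names as the grids above
toy_grid <- data.frame(
  cost      = c(0.01, 1, 10),
  rbf_sigma = c(1e-5, 1e-3, 1e-1)
)

min_dist(toy_grid)
```

Applied to the grids created above, for example min_dist(me_grid) and min_dist(rn_grid), this gives a crude comparison of the three designs. It is one of several possible criteria, not the one the grid functions themselves optimize.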

dials will be central to the upcoming framework for optimizing tuning parameters, so there is much more to come regarding this package.