Tidyverse packages

Installation and use

  • Install all the packages in the tidyverse by running install.packages("tidyverse").

  • Run library(tidyverse) to load the core tidyverse and make it available in your current R session.
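As a quick sketch of the two steps above: the install line is commented out so the snippet can be re-run safely, and tidyverse_packages() is used only to show what was installed. The variable name pkgs is just an illustration.

```r
# One-time setup; uncomment if the tidyverse is not installed yet.
# install.packages("tidyverse")

# Attach the core tidyverse packages to the current R session.
library(tidyverse)

# List every package that belongs to the tidyverse,
# including the ones that library(tidyverse) does not attach.
pkgs <- tidyverse_packages()
head(pkgs)
```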

Learn more about the tidyverse package at http://tidyverse.tidyverse.org.

ggplot2

ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details. Learn more ...
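A minimal sketch of that workflow, assuming the mpg dataset that ships with ggplot2: the data is mpg, aes() maps variables to aesthetics, and geom_point() is the graphical primitive. The choice of variables and labels is only an example.

```r
library(ggplot2)

# Data: mpg (bundled with ggplot2).
# Aesthetics: engine displacement on x, highway mileage on y, car class as colour.
# Primitive: points.
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = class)) +
  geom_point() +
  labs(x = "Engine displacement (litres)", y = "Highway miles per gallon")

# Draw the plot on the active graphics device.
print(p)
```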

dplyr

dplyr provides a grammar of data manipulation: a consistent set of verbs that solve the most common data manipulation challenges. Learn more ...
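To give a feel for those verbs, here is a small sketch on the built-in mtcars data. The kpl column and the 0.425 conversion factor (approximate miles per gallon to kilometres per litre) are illustrative choices, not part of dplyr.

```r
library(dplyr)

by_cyl <- mtcars %>%
  filter(hp > 100) %>%                         # keep rows: cars over 100 horsepower
  mutate(kpl = mpg * 0.425) %>%                # add a column: approximate km per litre
  group_by(cyl) %>%                            # group by number of cylinders
  summarise(n = n(), mean_kpl = mean(kpl)) %>% # collapse each group to one row
  arrange(desc(mean_kpl))                      # order rows: most efficient group first

by_cyl
```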

tidyr

tidyr provides a set of functions that help you get to tidy data. Tidy data is data with a consistent form: in brief, every variable goes in a column, and every column is a variable. Learn more ...
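As one illustration, the made-up table below stores a variable (the year) in its column names, so it is not tidy. A sketch using pivot_longer() moves the years into their own column; the country, year, and cases names are invented for this example.

```r
library(tidyr)
library(tibble)

# Toy data: the column names 2019 and 2020 are really values of a "year" variable.
wide <- tibble(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 22)
)

# Reshape so that every variable (country, year, cases) has its own column.
long <- pivot_longer(
  wide,
  cols = c(`2019`, `2020`),
  names_to = "year",
  values_to = "cases"
)

long
```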

readr

readr provides a fast and friendly way to read rectangular data (like csv, tsv, and fwf). It is designed to flexibly parse many types of data found in the wild, while still cleanly failing when data unexpectedly changes. Learn more ...
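A minimal sketch of reading a csv, assuming readr 2.0 or later (which accepts literal data wrapped in I()). The inline data and column names are made up. Declaring col_types up front is one way to have readr report a parsing problem if a column stops matching what you expect.

```r
library(readr)

# Toy csv content supplied inline instead of a file path.
csv_text <- "id,name,score\n1,Ada,9.5\n2,Grace,8.25\n"

scores <- read_csv(
  I(csv_text),
  col_types = cols(
    id = col_integer(),
    name = col_character(),
    score = col_double()
  )
)

scores
```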

purrr

purrr enhances R’s functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors. Once you master the basic concepts, purrr allows you to replace many for loops with code that is easier to write and more expressive. Learn more ...
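For example, here is a for loop and a roughly equivalent map_dbl() call side by side. The list cols is toy data invented for this sketch.

```r
library(purrr)

# Toy input: a named list of numeric vectors.
cols <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# The for loop: allocate the output, then fill it element by element.
means_loop <- numeric(length(cols))
for (i in seq_along(cols)) {
  means_loop[i] <- mean(cols[[i]])
}

# The purrr version: apply mean() to each element and return a double vector.
# map_dbl() also keeps the names of the input list.
means_map <- map_dbl(cols, mean)

means_map
```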

tibble

tibble is a modern re-imagining of the data frame, keeping what time has proven to be effective, and throwing out what it has not. Tibbles are data.frames that are lazy and surly: they do less and complain more, forcing you to confront problems earlier, typically leading to cleaner, more expressive code. Learn more ...
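Two small illustrations of "do less and complain more", sketched with a one-column toy table: a tibble does not partially match column names, and single-column subsetting does not silently drop to a vector. The warning is captured with tryCatch() here only so that its message can be inspected.

```r
library(tibble)

df <- data.frame(value = 1:3)
tb <- tibble(value = 1:3)

# A base data frame partially matches `val` to the `value` column.
df_partial <- df$val

# A tibble refuses and warns instead; capture the warning message as text.
tb_message <- tryCatch(tb$val, warning = function(w) conditionMessage(w))

# Single-column subsetting: the data frame drops to a vector, the tibble stays a tibble.
df_col <- df[, "value"]
tb_col <- tb[, "value"]
```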

The tidyverse also includes many other packages with more specialised usage. They are not loaded automatically with library(tidyverse), so you’ll need to load each one with its own call to library().

Import

As well as readr, for reading flat files, the tidyverse includes:

  • readxl for .xls and .xlsx sheets.
  • haven for SPSS, Stata, and SAS data.
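As a sketch of reading a spreadsheet, the snippet below uses the example workbook that is bundled with readxl, so no file of your own is needed. The sheet name "iris" is assumed to be one of the sheets in that example file.

```r
library(readxl)

# Path to an example workbook shipped inside the readxl package.
path <- readxl_example("datasets.xlsx")

# See which sheets the workbook contains.
sheets <- excel_sheets(path)

# Read one sheet into a tibble.
iris_xl <- read_excel(path, sheet = "iris")
```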

There are a handful of other packages that are not in the tidyverse, but are tidyverse-adjacent. They are very useful for importing data from other sources:

  • jsonlite for JSON.

  • xml2 for XML.

  • httr for web APIs.

  • rvest for web scraping.

  • DBI for relational databases. To connect to a specific database, you’ll need to pair DBI with a specific backend like RSQLite, RPostgres, or odbc. Learn more at http://db.rstudio.com.
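To show how DBI pairs with a backend, here is a minimal sketch that uses RSQLite with a temporary in-memory database. The table name and query are only for illustration; a real database would need its own connection details.

```r
library(DBI)

# Open a connection; the RSQLite backend is supplied as the first argument.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Copy a built-in data frame into the database as a table.
dbWriteTable(con, "mtcars", mtcars)

# Run a SQL query and get the result back as a data frame.
heavy <- dbGetQuery(con, "SELECT mpg, wt FROM mtcars WHERE wt > 5")

# Always close the connection when you are done.
dbDisconnect(con)

heavy
```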

Wrangle

As well as tidyr and dplyr, there are five packages designed to work with specific types of data:

  • stringr for strings.
  • lubridate for dates and date-times.
  • forcats for categorical variables (factors).
  • hms for time-of-day values.
  • blob for storing blob (binary) data.
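A brief sketch of three of these packages in action, using toy inputs invented for this example: stringr for pattern matching, lubridate for date arithmetic, and forcats for reordering factor levels.

```r
library(stringr)
library(lubridate)
library(forcats)

# stringr: which strings contain the pattern "an"?
has_an <- str_detect(c("apple", "banana"), "an")

# lubridate: parse a date and add two days (2024 is a leap year).
d <- ymd("2024-02-28") + days(2)

# forcats: reorder factor levels from most to least frequent.
f <- fct_infreq(factor(c("b", "a", "b", "c", "b", "a")))
levels(f)
```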

Program

As well as purrr, which facilitates functional programming, there are three tidyverse packages that help with general programming challenges:

  • rlang provides tools to work with core language features of R and the tidyverse.

  • magrittr provides the pipe, %>%, used throughout the tidyverse. It also provides a number of more specialised piping operators (like %$% and %<>%) that can be useful in other places.

  • glue provides an alternative to paste() that makes it easier to combine data and strings.
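The pipes and glue() mentioned above can be sketched as follows. All variable names and values are made up for illustration.

```r
library(magrittr)
library(glue)

# %>% passes the left-hand side as the first argument of the next call.
total <- c(1, 4, 9) %>% sqrt() %>% sum()

# %$% exposes the columns of a data frame to the expression on the right.
r <- mtcars %$% cor(mpg, wt)

# %<>% pipes and then assigns the result back to the left-hand side.
x <- c(4, 9)
x %<>% sqrt()

# glue() interpolates R expressions written inside braces.
pkg <- "glue"
msg <- glue("The name '{pkg}' has {nchar(pkg)} characters.")
msg
```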

Model

Modelling within the tidyverse is largely a work in progress. You can see some of the pieces in the recipes and rsample packages but we do not yet have a cohesive system that solves a wide range of challenges. This work will largely replace the modelr package used in R4DS.

You may also find broom to be useful: it turns models into tidy data which you can then wrangle and visualise using the tools you already know.
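For instance, a minimal sketch with a simple linear model on the built-in mtcars data: tidy() returns one row per model term, and glance() returns a one-row summary of the whole fit. The model formula is just an example.

```r
library(broom)

# Fit an ordinary linear model with base R.
fit <- lm(mpg ~ wt, data = mtcars)

# One row per coefficient, with estimate, standard error, statistic, and p-value.
coefs <- tidy(fit)

# One row of model-level summaries such as R squared.
fit_summary <- glance(fit)

coefs
```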
