Learn the tidyverse


R for data science

The best place to start learning the tidyverse is R for Data Science (R4DS for short), an O’Reilly book written by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund. It’s designed to take you from knowing nothing about R or the tidyverse to having all the basic tools of data science at your fingertips. You can read it online for free, or buy a physical copy.

We highly recommend pairing R4DS with the Posit cheatsheets. These cheatsheets have been carefully designed to pack a lot of information into a small amount of space. You can keep them handy at your desk and quickly jog your memory when you get stuck. Most of the cheatsheets have been translated into multiple languages.

Books

Workshops

  • Mastering the Tidyverse by Jumping Rivers. This course will show you how you can use R to efficiently clean and wrangle your data into a format that’s ready for analysis. You will learn about the Tidyverse, what tidy data really is, and how to practically achieve it with packages such as dplyr, tidyr, lubridate, and forcats.
  • Learn R for Data Analysis by Locke Data. Attend this two-day course to get hands-on with the R programming language. Learn how to connect to different data sources, wrangle the data into the shape you need, visualise it, and compile everything into reports.

Teaching materials

Data Science in a Box contains the complete materials for teaching a semester-long, undergraduate-level introductory data science course. The “box” includes slide decks, homework assignments, guided labs, sample exams, and a final project assignment, as well as instructor materials such as pedagogical tips and information on computing infrastructure, technology stack, and course logistics. The website exposes the source materials, which live in a GitHub repository and use datasets from the dsbox package.