Dawn Koffman bio
Boriana Pratt bio
This workshop introduces two modern R packages, both developed by Hadley Wickham and collaborators as part of R’s “tidyverse,” that provide intuitive tools for handling common data management tasks. The first package, tidyr, provides functions that reshape data so it conforms to a specific “tidy” structure, where each variable is saved in its own column, each observation is saved in its own row, and each type of observational unit is stored in a separate table. The second package, dplyr, provides a set of functions (referred to as “verbs”) that allow you to easily subset observations, re-order observations, select specific variables, add new variables, group observations, and summarize groups of observations.
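As a rough illustration of the two packages working together, the sketch below reshapes a small made-up table with tidyr and then applies several dplyr verbs to it. The data frame, its column names (country, year, cases), and the threshold are hypothetical and are not taken from the workshop materials; the sketch assumes tidyr 1.0.0 or later, where pivot_longer() is available.

```r
library(tidyr)
library(dplyr)

# Hypothetical "wide" data: one row per country, one column per year.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(12, 18),
  check.names = FALSE
)

# tidyr: reshape so that each row holds one country-year observation.
long <- pivot_longer(wide, cols = -country,
                     names_to = "year", values_to = "cases")

# dplyr verbs: add a variable, select variables, subset rows,
# re-order rows, group, and summarize each group.
result <- long %>%
  mutate(year = as.integer(year)) %>%
  select(country, year, cases) %>%
  filter(cases >= 12) %>%
  arrange(country, year) %>%
  group_by(country) %>%
  summarise(mean_cases = mean(cases), n_obs = n())

print(long)
print(result)
```

Here the year, originally spread across column names, becomes a variable in its own column, which is what allows the dplyr verbs to filter and group on it.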
Participants will walk away with both a general understanding of “tidy” representations of data and practical knowledge of how to work with them in R.
Participants should have at least basic familiarity with R and RStudio – this session is not appropriate for people with no prior R experience.
This session is heavily hands-on. To follow along with the exercises, participants should have both R and RStudio installed on their laptops. Instructions for how to do this can be found on the advance setup guide for PICSciE virtual workshops. Ideally, participants will also have installed the tidyr and dplyr packages in advance.
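One possible way to install the two packages ahead of time from the R console is sketched below. It is not taken from the setup guide; it assumes a working internet connection and a default CRAN mirror, and the variable names are illustrative.

```r
# Packages used in the workshop.
needed <- c("tidyr", "dplyr")

# Install only those that are not already present.
to_install <- setdiff(needed, rownames(installed.packages()))
if (length(to_install) > 0) {
  install.packages(to_install)
}

# Confirm that both packages can be found; TRUE means the package is installed.
installed_ok <- sapply(needed, requireNamespace, quietly = TRUE)
print(installed_ok)
```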
Alternatively, participants who prefer to run RStudio remotely on one of Princeton’s systems can do so via the “myadroit” web interface to the Adroit cluster. To do so, first register for an account on Adroit, as described in the advance setup guide for PICSciE virtual workshops. Then connect to “myadroit” and start an RStudio session, as described here.
Lecture, discussion, and hands-on exercises
All presentation materials are here (see links at the bottom of that page).
This session was NOT recorded.