Created for use in NYU Abu Dhabi’s Data Analysis course (POLSC-AD 209)
The tidyverse is a set of R packages designed to streamline common data science procedures. The functions, classes, and methods available in these packages follow the philosophy of “tidy data”, in which each variable is a column and each observation is a row. Packages range in application from reading and scraping data to merging, modeling, and plotting it.
Using tidyverse packages makes life easier. Functions in the tidyverse are designed to be quick to learn, and using them can often make code faster, clearer to read, and easier to write.
To learn more about the tidyverse, check out the lessons and exercises in R for Data Science.
These materials are designed to supplement an introductory Data Analysis course taught in NYU Abu Dhabi’s political science program. They assume a basic understanding of R and some familiarity with its data structures, particularly data frames. Where possible, the examples also introduce (social science) data terminology to help readers who want to look up more information on these topics.
To follow along with the lessons, clone or download the GitHub repository and open tidy_intro.Rproj in RStudio. You’ll be able to view a lesson by opening its HTML file in a web browser, and you can run the code in its corresponding .Rmd file.
If you’d just like to read through the lessons, you can follow the links here:
Your suggestions and contributions are more than welcome. Please submit pull requests to the tidy_intro repo; you can direct email correspondence to coletl@nyu.edu. Good data sets for teaching and practicing aren’t easy to find, so if you come across any you’d like to share, please let me know. It’d be great to have a small selection of data sets available here, each picked for a different topic. I am trying to orient the examples toward political science, but I’m happy to host demonstrations and data (smaller than 100MB) for any social science application of the tidyverse packages.