1 Preface

These notes are a first draft and quite preliminary. If you find errors, please let me know by email.

An example of my code for this course is available in my GitHub repository at https://github.com/r-introtodatascience/sample_repo.

The goal of these short courses is to give graduate students a very preliminary introduction to coding in R. While having an ‘instructor’ teach you these things may be helpful, I believe that coding, like math, is something you have to learn by doing. So I highly encourage you to follow these notes and then play around with the code. Change things and see how the output changes. I think this is the best way to learn.

Over the next 5 ‘sessions’ we will be looking into county-level unemployment data. The goal is to provide examples and descriptions of the commands and techniques I have found most useful while working with data. Specifically, I hope that by the end of the sessions you will know how to:

  • download data straight into R
  • read in data (Rda, csv, xlsx, dta)
  • merge datasets
  • subset datasets
  • create summary statistics and plots
  • run basic regressions

all while paying particular attention to file paths and to exporting graphics and tables from R in a form that can be read directly into LaTeX (so that they can easily be included in papers and presentations).

Below is an example of an interactive plot you can make with a package called plotly. Try interacting with it! In the plot you can select or deselect the states you want to view. There is also a link to a short post I wrote about using GIS with R. I show these to illustrate a bit of what is possible in R once we have familiarized ourselves with the basic functions.