Data Manipulation and Scripting in R: Becoming an R-expert
- Start Date: May 27, 2014
- End Date: May 29, 2014
- Time: 8:30am - 4:30pm
- City: Revelstoke, BC
- Venue: The Hillcrest Hotel
- Instructor: Dr. Carl Schwarz, Department of Statistics and Actuarial Science, Simon Fraser University
R is a free, open source statistical package that is increasingly being used in many fields. While R is free, it is not cheap — meaning that mastering R requires some time investment. It is particularly helpful to have some guidance in the more complex applications of R.
Class Size: Limited to 16 people.
Bring: Laptop computer pre-loaded with software (see below)
Prerequisites: It is assumed that participants have a basic familiarity with R; this is NOT a course for beginners in R. For example, it will be assumed that you can read basic data into a data frame using read.table() or read.csv(), that you can compute basic statistics (e.g. using the mean() function), and that you have a basic understanding of plotting (e.g. using the basic plot() command).
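As a rough self-check of the prerequisites, you should be comfortable reading and writing something like the sketch below. The data, the file, and the names (csv_file, fish, len_mm) are made up for illustration and are not part of the course materials.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("site,len_mm", "A,10", "A,12", "B,14", "B,20"), csv_file)

# Read the file into a data frame and inspect its structure.
fish <- read.csv(csv_file)
str(fish)

# Compute a basic statistic on one column.
mean_len <- mean(fish$len_mm)
print(mean_len)

# Draw a basic plot of the same column.
plot(fish$len_mm, ylab = "Length (mm)", main = "Made-up fish lengths")
```

If each of these lines looks familiar, you are likely at the level the course assumes.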
Course outline
1. Quick review of basic R
- data frames vs. vectors vs. matrices vs. lists
- selecting rows/columns of objects
- more advanced functions, e.g. grep, recode
- dealing with dates and times
2. Basic model fitting
- The basic ANOVA and regression models using the lm() function.
- Multiple comparisons using the lsmeans package
- Predictions; confidence intervals for the mean and prediction intervals
3. Basic Plotting
- Basic plotting using the Base R plot(), hist(), stripchart(), and boxplot() functions
- Saving these for use in your reports
4. Better graphing via ggplot – much better than Base R graphics
- Basic qplot and ggplot commands
- More advanced features of ggplot
- Saving ggplot graphics
5. Improve your R output
- making nicer tables and text output
- generating nicer reports using R Markdown (but not using Sweave)
6. Functions – generalize your work
- how to write functions
- different data structures for input and output (e.g. data frames, lists, etc)
- passing data among functions; scoping rules
- debugging your functions
- sourcing and function management
7. Subgroup processing
- the plyr package – processing for sub-groups of your data
8. Bootstrapping/simulation studies
- how to find standard errors for non-standard cases
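As a taste of the last topic, here is a minimal base-R sketch of a bootstrap standard error for a sample median, a statistic with no simple textbook formula. The data are simulated purely for illustration, the names (x, boot_medians, boot_se) are hypothetical, and this is not taken from the course manual.

```r
# Simulated, skewed data standing in for a real sample (illustration only).
set.seed(2014)
x <- rexp(50, rate = 0.2)

# Resample the data with replacement many times and
# recompute the median on each bootstrap sample.
n_boot <- 2000
boot_medians <- replicate(n_boot, median(sample(x, replace = TRUE)))

# The standard deviation of the bootstrap medians estimates
# the standard error of the sample median.
boot_se <- sd(boot_medians)
print(boot_se)
```

The same pattern applies to other statistics: replace median() with the function that computes the quantity of interest.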
Dr. Carl Schwarz, Department of Statistics and Actuarial Science, Simon Fraser University. Carl has taught many courses with CMI, and is back by popular request!
Required software
- Base R (available for Mac, Linux, and Windows systems)
- Adobe Reader
- Microsoft Excel
- More details about software requirements are below
Preparation for the course
About 2 weeks before the course you will be sent a web link where you can download pre-reading, a course manual, a set of practice exercises to load on your computer before the class, and a reminder about where to get the course software.
You will need to bring your own laptop pre-loaded with the required software and downloaded files.
Consider bringing along an external monitor and an external keyboard if you have a small laptop.
You will need to make your own hotel booking, and remember to ask for the rates we’ve arranged for people attending this course (see below).
** The course starts at 8:30 a.m. sharp; you will need to arrive before that so you can set up your computer.
Required Software details
Base R (available for Mac, Linux, and Windows systems).
You should have version 3.0.2 or later.
Visit http://www.r-project.org and install the most recent version on your machine.
*** IMPORTANT *** In order to install packages after R is installed, you will need
either Administrative Access to your machine or a personal library on your desktop
where you can install packages on the fly.
If you want a personal package library, follow the instructions at:
Install the following packages PRIOR to the workshop.
for instructions on how to install packages.
This will also test if your personal library for packages is working.
car, doBy, gdata, ggplot2, gplots, knitr,
lme4, lmerTest, lsmeans, nlme, plyr, xlsx
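The packages listed above can be installed in one step from the R console. The following is a minimal sketch, assuming an internet connection and a writable package library; the names course_pkgs and to_install are illustrative, and some packages (for example xlsx, which relies on Java) may need extra setup on your machine.

```r
# Packages requested for the workshop.
course_pkgs <- c("car", "doBy", "gdata", "ggplot2", "gplots", "knitr",
                 "lme4", "lmerTest", "lsmeans", "nlme", "plyr", "xlsx")

# Show the library that install.packages() uses by default
# (the first entry of the library search path).
print(.libPaths()[1])

# Work out which of the packages are not yet installed.
to_install <- setdiff(course_pkgs, rownames(installed.packages()))
print(to_install)

# Install only the missing ones. The interactive() guard means nothing
# is downloaded when the script is run non-interactively (e.g. via Rscript).
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If the installation fails with a permissions error, that usually indicates you have neither administrative access nor a working personal library.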
A VERY brief (interactive) tutorial on R is available at http://tryr.codeschool.com. Earn some badges!
RStudio. Visit http://www.rstudio.com and install it on your machine.
Adobe Reader – Your laptop needs to have Adobe Reader to access the course manual during the course. The manual is about 65 MB and will be supplied as a download prior to the course. You need at least 65 MB of free space so you can copy the manual to the hard drive and read from the hard drive during the course.
MS Excel – You will need to unzip the course exercise files before the course. Your laptop needs to have MS Excel installed to open the unzipped files.
Tip: Allow sufficient time for your tech support people to approve and install the software.
For questions about course content and software, please contact: Dr. Carl Schwarz at firstname.lastname@example.org
For questions about registration, contact the CMI office, email@example.com