Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. The lessons below were designed for those interested in working with ecology data in R.
This is an introduction to R designed for participants with no programming experience. These lessons can be taught in a day (~6 hours). They start with some basic information about R syntax and the RStudio interface, then move through how to import CSV files, the structure of data frames, how to deal with factors, how to add/remove rows and columns, how to calculate summary statistics from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from R.
This lesson assumes no prior knowledge of R or RStudio and no programming experience.
Data Carpentry’s teaching is hands-on, and to follow this lesson learners must have R and RStudio installed on their computers. They also need to be able to install a number of R packages, create directories, and download files.
To avoid troubleshooting during the lesson, learners should follow the instructions below to download and install everything beforehand. If they are using their own computers this should be no problem, but if the computer is managed by their organization’s IT department they might need help from an IT administrator.
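R and RStudio are two separate pieces of software: R is the programming language and computing environment, and RStudio is an integrated development environment (IDE) that makes working with R easier.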
If you don’t already have R and RStudio installed, follow the instructions for your operating system below. You have to install R before you install RStudio.
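Windows: run the .exe file that was just downloaded.
MacOS: use the .pkg file for the latest R version.
Linux: you can install R with your distribution’s package manager (for Debian/Ubuntu, sudo apt-get install r-base, and for Fedora, sudo yum install R), but we don’t recommend this approach as the versions provided this way are usually out of date. In any case, make sure you have at least R 3.3.1.
Linux (RStudio): install the downloaded RStudio package (e.g., on Debian/Ubuntu, sudo dpkg -i rstudio-x.yy.zzz-amd64.deb at the terminal).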
If you already have R and RStudio installed, check if your R and RStudio are up to date:
To check which version of R you are running, type sessionInfo() into the console. If your R version is 4.0.0 or later, you don’t need to update R for this lesson. If your version of R is older than that, download and install the latest version of R from the R project website for Windows, for MacOS, or for Linux.
To update RStudio, go to Help > Check for updates. If a new version is available, quit RStudio and follow the instructions on screen.
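As a complement to reading the version by eye, the R version check can also be scripted. The sketch below is one possible way to do it with base R's R.version.string and getRversion(); the variable name r_is_recent is made up for this example, and the 4.0.0 threshold is the one stated above.

```r
# Show the version string of the running R session.
R.version.string

# getRversion() returns a version object that can be compared
# directly with a version written as a string.
r_is_recent <- getRversion() >= "4.0.0"

# TRUE means R does not need to be updated for this lesson;
# FALSE means a newer version of R should be installed first.
r_is_recent
```

Pasting these lines into the RStudio console prints the version string followed by TRUE or FALSE.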
Note: It is not necessary to remove old versions of R from your system, but if you wish to do so you can check How do I uninstall R?
During the course we will need a number of R packages. Packages contain useful R code written by other people. We will use the packages
To install these packages, open RStudio and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press Enter (Windows and Linux) or Return (MacOS) to execute the command.
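A minimal sketch of what such a command can look like is shown here. The two package names are assumptions chosen only for illustration and should be replaced with the packages required for the lesson; the repos value is the general CRAN cloud mirror, and the names pkgs and to_install are made up for this example.

```r
# Hypothetical package names, used only for illustration.
pkgs <- c("tidyverse", "RSQLite")

# Keep only the packages that are not yet installed on this machine,
# so that running the command again does not reinstall everything.
to_install <- setdiff(pkgs, rownames(installed.packages()))

# Download and install the missing packages from CRAN.
# This step needs a working internet connection.
if (length(to_install) > 0) {
  install.packages(to_install, repos = "https://cloud.r-project.org")
}
```

If all the packages are already present, to_install is empty and nothing is downloaded.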
Alternatively, you can install the packages using RStudio’s graphical user interface by going to Tools > Install Packages and typing the names of the packages separated by a comma.
R tries to download and install the packages on your machine. When the installation has finished, you can try to load the packages by pasting the following code into the console:
If you do not see an error like there is no package called ‘...’, you are good to go!
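The same check can be run for several packages at once without stopping at the first failure. The sketch below is one way to do this, assuming the same illustrative package names as placeholders (replace them with the packages required for the lesson). It uses base R's requireNamespace(), which returns TRUE or FALSE instead of raising the error that library() would; the names pkgs and available are made up for this example.

```r
# Hypothetical package names, used only for illustration.
pkgs <- c("tidyverse", "RSQLite")

# For each package, test whether it can be loaded.
# requireNamespace() returns TRUE on success and FALSE otherwise.
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Named logical vector: one TRUE/FALSE entry per package.
available

# Names of the packages that still need to be installed (if any).
names(available)[!available]
```

An empty result on the last line means every package in pkgs can be loaded.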
We will download the data directly from R during the lessons. However, if you are expecting problems with the network, it may be better to download the data beforehand and store it on your machine.
The data files for the lesson can be downloaded manually here: https://doi.org/10.6084/m9.figshare.1314459
The list of contributors to this lesson is available here.
Page built on: 📆 2021-05-05 ‒ 🕢 21:43:38