# Code Handout - Introduction to R

This document contains all of the functions that were covered in the Introduction to R workshop. Each function is presented alongside an example of how it can be used.

## Creating Objects

• <- – the “assignment arrow”, assigns a value (a single value, a vector, or a data frame) to a variable name

### R

x <- 3
y <- c(1, 2, 3)
z <- x + y
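As a small illustration of what the assignments above produce, the sketch below prints z. Because x holds a single value and y holds three, R adds x to each element of y.

```r
x <- 3
y <- c(1, 2, 3)
z <- x + y   # the single value in x is added to every element of y

z
# [1] 4 5 6
```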
• c() – the “concatenate” function combines inputs to form a vector. All values in a vector share one data type, so mixed inputs are converted to a common type.

### R

animals <- c("bird", "cat", "dog")
numbers <- c(1, 14, 57, 89)
logicals <- c(TRUE, FALSE, TRUE, TRUE)
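To see why the values in a vector need to share a data type, here is a minimal sketch of what c() does with mixed inputs. The object names mixed and mostly_numbers are made up for this illustration.

```r
# Mixing numbers, text, and logicals: everything is converted to character
mixed <- c(1, "cat", TRUE)
class(mixed)
# [1] "character"

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
mostly_numbers <- c(1, 14, TRUE, FALSE)
class(mostly_numbers)
# [1] "numeric"
```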

## Inspecting Objects

• str() – compact display of the structure of an R object

### R

str(animals)
• class() – returns the class of an R object (for example "numeric", "character", or "logical")

### R

class(logicals)
• typeof() – returns the data type or storage mode of any R object

### R

typeof(numbers)
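class() and typeof() often give the same answer, but not always. The sketch below compares them on the numbers vector from earlier and on a vector of whole numbers written with the L suffix (the name whole is just for this example).

```r
numbers <- c(1, 14, 57, 89)
class(numbers)    # "numeric"
typeof(numbers)   # "double", how R stores these numbers internally

whole <- c(1L, 14L)   # the L suffix asks R to store integers
class(whole)      # "integer"
typeof(whole)     # "integer"
```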

## Functions in R

• args() – returns the arguments of a function

### R

args(round)
• named arguments – arguments supplied in the form name = value, using the argument names the function expects (as shown by args())
• You can choose not to name your arguments if you know the exact order they should be in!
• However, we generally discourage this.

### R

## Either of these works, since the digits argument is named explicitly.
round(3.14159, digits = 2)
round(digits = 2, 3.14159)

## This runs without an error but does not do what we want: the arguments are
## not named and are in the wrong order, so R rounds 2 and uses 3.14159 as digits.
round(2, 3.14159)
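The sketch below stores the result of each style of call so they can be compared side by side. The object names by_position, by_name, and swapped are made up for this illustration.

```r
# Unnamed arguments are matched by position: x first, then digits
by_position <- round(3.14159, 2)

# Named arguments can be given in any order
by_name <- round(digits = 2, x = 3.14159)

# Unnamed and in the wrong order: R rounds the number 2 instead
swapped <- round(2, 3.14159)

by_position   # 3.14
by_name       # 3.14
swapped       # 2
```

Note that the swapped call produces no warning, which is one reason naming arguments is the safer habit.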

## Functions to Summarize Data

• sqrt() – returns the square root of a numeric variable

### R

sqrt(numbers)
• mean() – returns the mean of a numeric variable
• You can set na.rm = TRUE to remove NA values before calculating the mean.

### R

mean(numbers)
• max() – returns the maximum of a numeric variable
• You can set na.rm = TRUE to remove NA values before calculating the max.

### R

max(numbers)
• sum() – returns the sum of a numeric variable
• You can set na.rm = TRUE to remove NA values before calculating the sum.

### R

sum(numbers)
• length() – returns the length of a vector (of any data type)

### R

length(animals)
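The na.rm argument mentioned for mean(), max(), and sum() only matters when a vector contains NA values. Here is a minimal sketch using a made-up vector, with_gaps, that has one missing value.

```r
with_gaps <- c(4, NA, 10)

mean(with_gaps)                 # NA, because one value is missing
mean(with_gaps, na.rm = TRUE)   # 7
max(with_gaps, na.rm = TRUE)    # 10
sum(with_gaps, na.rm = TRUE)    # 14

length(with_gaps)               # 3, since length() counts NA values too
```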

## Subsetting Data

• [] – used to subset elements from a vector

### R

animals[3]
## selects the third element

animals[2:3]
## selects the second and third elements

animals[c(1, 3)]
## selects the first and third elements
• relational operators – return logical values indicating where a relation is satisfied. The most commonly used relational operators for data analysis are as follows:
• == means “equal to”
• != means “not equal to”
• > or < means “greater than” or “less than”
• >= or <= means “greater than or equal to” or “less than or equal to”

### R

animals == "dog"

animals != "cat"

numbers > 4

numbers <= 12
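A relational operator returns one TRUE or FALSE per element, and that logical vector can be placed inside [] to keep only the elements where it is TRUE. A minimal sketch, where the name is_big is made up for this example:

```r
numbers <- c(1, 14, 57, 89)

is_big <- numbers > 4
is_big
# [1] FALSE  TRUE  TRUE  TRUE

# Keep only the elements where the comparison is TRUE
numbers[is_big]
# [1] 14 57 89
```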
• logical operators – join subset criteria together
• & means “and” – where two criteria must both be satisfied
• | means “or” – where at least one criterion must be satisfied

### R

numbers > 4 & numbers < 20

animals == "dog" | animals == "cat"
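Combined criteria can be used inside [] in the same way as a single criterion. The sketch below reuses the animals and numbers vectors from earlier; the names between and pets are made up for this example.

```r
animals <- c("bird", "cat", "dog")
numbers <- c(1, 14, 57, 89)

# Both conditions must hold: greater than 4 AND less than 20
between <- numbers[numbers > 4 & numbers < 20]
between
# [1] 14

# At least one condition must hold: "dog" OR "cat"
pets <- animals[animals == "dog" | animals == "cat"]
pets
# [1] "cat" "dog"
```

The result keeps the original order of the vector, which is why "cat" comes before "dog".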
• %in% – the “inclusion operator”, tests whether each element of a search vector (on the left-hand side) is found in the target vector (on the right-hand side). It returns one logical value per element of the search vector.
• The values to match against must be supplied as a vector (created with c()).

### R

possessions <- c("car", "bicycle", "radio", "television", "mobile_phone")

possessions %in% c("car", "bicycle", "motorcycle")
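As a sketch of how the result of %in% can be used, the example below stores the logical vector and then uses it to subset. The name found is made up for this illustration.

```r
possessions <- c("car", "bicycle", "radio", "television", "mobile_phone")

# One TRUE/FALSE per element of possessions
found <- possessions %in% c("car", "bicycle", "motorcycle")
found
# [1]  TRUE  TRUE FALSE FALSE FALSE

# Keep only the possessions that appear in the target vector
possessions[found]
# [1] "car"     "bicycle"
```

"motorcycle" is in the target vector but not in possessions, so it never appears in the output.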

## Missing Data

• is.na() – returns a vector of logical values indicating which elements of a vector have NA values
• Often combined with !, which negates the logical values that follow it (e.g. !TRUE is equal to FALSE).

### R

missing <- c(1, 3, NA, 7, 12, NA)

is.na(missing)

!is.na(missing)
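Two common uses of is.na() are counting missing values and keeping only the values that are present. A minimal sketch, where the names n_missing and present are made up for this example:

```r
missing <- c(1, 3, NA, 7, 12, NA)

# TRUE counts as 1 and FALSE as 0, so sum() counts the NA values
n_missing <- sum(is.na(missing))
n_missing
# [1] 2

# Keep only the elements that are not NA
present <- missing[!is.na(missing)]
present
# [1]  1  3  7 12
```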
• na.omit() – removes the observations with NA values

### R

na.omit(missing)
• complete.cases() – returns a vector of logical values indicating which elements of a vector are not missing (NA); for a data frame, it indicates which rows have no missing values

### R

complete.cases(missing)
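na.omit() and complete.cases() are most useful on data frames, where an observation is a whole row. The sketch below uses a small made-up data frame called survey, with hypothetical columns id, age, and pet, to show that both approaches keep the same rows.

```r
# A hypothetical data frame with missing values in two different rows
survey <- data.frame(
  id  = 1:3,
  age = c(25, NA, 40),
  pet = c("dog", "cat", NA)
)

# TRUE only for rows with no NA in any column
complete.cases(survey)
# [1]  TRUE FALSE FALSE

# Two ways to keep only the complete rows
kept_by_subset <- survey[complete.cases(survey), ]
kept_by_omit   <- na.omit(survey)

nrow(kept_by_subset)   # 1
nrow(kept_by_omit)     # 1
```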