Vector Basics

  • Remember that all values in R have a type
  • A vector is a sequence of values that all have the same type
  • Create using the c() function, which stands for “combine”
states <- c("FL", "FL", "GA", "SC")
  • The str function we learned last time shows that this is a vector of 4 character strings
str(states)
  • Select pieces of a vector by slicing the vector (like slicing a pizza)
  • Use square brackets []
  • In general [] in R means, “give me a piece of something”
  • states[1] gives us the first value in the vector
  • states[1:3] gives us the first through the third values
  • 1:3 works by making a vector of the whole numbers 1 through 3
  • So states[1:3] is the same as states[c(1, 2, 3)]
  • You can use a vector of positions to get any subset or order you want, e.g., states[c(4, 1, 3)]
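  • A minimal sketch of the slicing examples above, using the same states vector; the names first_three, same_three, and reordered are only illustrative and are not part of the lesson

```r
# Slicing by position, using the states vector from the notes
states <- c("FL", "FL", "GA", "SC")

states[1]                          # "FL"

first_three <- states[1:3]         # "FL" "FL" "GA"
same_three <- states[c(1, 2, 3)]   # 1:3 builds the same positions as c(1, 2, 3)
identical(first_three, same_three) # TRUE

# A vector of positions can also pick values in any order
reordered <- states[c(4, 1, 3)]    # "SC" "FL" "GA"
reordered
```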

  • Many functions in R take a vector as input and return a value
  • This includes the function length which determines how many items are in a vector
length(states)
  • We can also calculate common summary statistics
  • For example, if we have a vector of population counts
count <- c(9, 16, 3, 10)
mean(count)
max(count)
min(count)
sum(count)

Do Basic Vectors.

Null values

  • So far we’ve worked with vectors that contain no missing values
  • But most real world data has values that are missing for a variety of reasons
  • For example, kangaroo rats don’t like being caught by humans and are pretty good at escaping before you’ve finished measuring them
  • Missing values, known as “null” values, are written in R as NA (short for “not available”), with no quotes
  • So a vector of 4 population counts with the third value missing would look like
count_na <- c(9, 16, NA, 10)
  • If we try to take the mean of this vector we get NA
mean(count_na)
  • Hard to say what a calculation including NA should be
  • So most calculations return NA when NA is in the data
  • Can tell many functions to remove the NA before calculating
  • Do this using an optional argument, which is an argument that we don’t have to include unless we want to modify the default behavior of the function
  • Add optional arguments by providing their name (na.rm), =, and the value that we want those arguments to take (TRUE)
mean(count_na, na.rm = TRUE)
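  • A short sketch showing that the same optional argument works with the other summary functions used earlier (sum, max, min); the variable names total, largest, smallest, and average are only illustrative

```r
# The count vector from the notes, with the third value missing
count_na <- c(9, 16, NA, 10)

# Without na.rm the NA carries through to the result
sum(count_na)                            # NA

# With na.rm = TRUE the NA is dropped before calculating
total <- sum(count_na, na.rm = TRUE)     # 35
largest <- max(count_na, na.rm = TRUE)   # 16
smallest <- min(count_na, na.rm = TRUE)  # 9
average <- mean(count_na, na.rm = TRUE)  # 35 / 3, about 11.67

# Same answer as taking the mean of only the three known values
average == mean(c(9, 16, 10))            # TRUE
```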

Do Nulls in Vectors.

Working with multiple vectors

  • Build on example where we have information on states and population counts by adding areas
states <- c("FL", "FL", "GA", "SC")
count <- c(9, 16, 3, 10)
area <- c(3, 5, 1.9, 2.7)

Vector math

  • Perform the same mathematical operation on each value in a vector by treating it like we would a single value
  • So if we wanted to double all of the values in the area vector
area * 2
  • This works because when we do this multiplication, R multiplies the first value in the vector by 2, then multiplies the second value in the vector by 2, and so on
  • Element-wise: operating on one element at a time

  • Remember - this isn’t saved unless we store it
  • So area hasn’t changed
area
  • If we want to keep the results of the calculation, store them in a new variable
doubled_area <- area * 2
doubled_area
  • We can also do element-wise math with multiple vectors of the same length
  • Let’s divide the count vector by the area vector to get a vector of the density of individuals in that area
density <- count / area
  • When we divide the two vectors, R divides the first value in the first vector by the first value in the second vector, then divides the second values in each vector, and so on
  • Element-wise: operating on one element at a time
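  • To make “element-wise” concrete, this sketch compares the vector division with the same divisions written out one pair at a time; by_hand is an illustrative name, not part of the lesson

```r
# Vectors from the notes
count <- c(9, 16, 3, 10)
area <- c(3, 5, 1.9, 2.7)

# Element-wise division: first by first, second by second, and so on
density <- count / area

# The same calculation written out one pair at a time
by_hand <- c(9 / 3, 16 / 5, 3 / 1.9, 10 / 2.7)

all.equal(density, by_hand)  # TRUE
length(density)              # 4, one result per pair of values
density[1]                   # 3
```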

Filtering

  • Subsetting or “filtering” is done using []
  • Like with slicing, the [] say “give me a piece of something”
  • Selects parts of vectors based on “conditions” not position
  • Get the density values for sites in Florida
density[states == 'FL']
  • == is how we indicate “equal to” in most programming languages.
  • Not =. = is used for assignment.

  • Can also do “not equal to”
density[states != 'FL']
  • Numerical comparisons like greater or less than
  • Select states that meet some restriction on density
states[density > 3]
states[density < 3]
states[density <= 3]
  • Can subset a vector based on itself
  • If we want to look at the densities greater than 3
  • density is both the vector being subset and part of the condition
density[density > 3]
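  • A sketch of what the filters above return for the example data; the first density is exactly 3, so it shows the difference between < and <=. The names above_3, below_3, and at_most_3 are only illustrative

```r
# Vectors from the notes
states <- c("FL", "FL", "GA", "SC")
count <- c(9, 16, 3, 10)
area <- c(3, 5, 1.9, 2.7)
density <- count / area        # 3, 3.2, about 1.58, about 3.70

density[states == "FL"]        # 3.0 3.2

above_3 <- states[density > 3]     # "FL" "SC"
below_3 <- states[density < 3]     # "GA"
at_most_3 <- states[density <= 3]  # "FL" "GA" (the first site is exactly 3)
```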

Do Shrub Volume Vectors 1-3.

  • What’s actually happening when we subset vectors this way?
  • Let’s look at the piece of the code inside the []
density > 3
  • This does an element-wise check to see if each value is > 3
  • If it is, the result is TRUE; if not, it is FALSE
  • The density[] part of the code then keeps those values in the density vector where this inner vector is TRUE
  • You don’t need to remember this last piece now, we’ll come back to it
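  • A sketch of the two steps done separately, assuming the same example data; keep is an illustrative name for the inner TRUE/FALSE vector

```r
# Vectors from the notes
count <- c(9, 16, 3, 10)
area <- c(3, 5, 1.9, 2.7)
density <- count / area

# Step 1: the element-wise check gives one TRUE or FALSE per value
keep <- density > 3
keep                           # FALSE TRUE FALSE TRUE

# Step 2: [] keeps the values where the inner vector is TRUE
density[keep]

# Both forms give the same result
identical(density[keep], density[density > 3])  # TRUE
```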