### Basic `for` loop

• Loops are the fundamental structure for repetition in programming
• `for` loops perform the same action for each item in a list of things
``````for (item in list_of_items) {
  do_something(item)
}
``````
• To see an example of this let’s calculate masses from volumes using a loop
• Need `print()` to display values inside a loop or function
``````volumes = c(1.6, 3, 8)
for (volume in volumes){
  mass <- 2.65 * volume ^ 0.9
  print(mass)
}
``````
• Code in the loop will run once for each value in volumes
• Everything between the curly brackets is executed each time through the loop
• Code takes the first value from `volumes` and assigns it to `volume` and does the calculation and prints it
• Then it takes the second value from `volumes` and assigns it to `volume` and does the calculation and prints it
• And so on
• So, this loop does the same exact thing as
``````volume <- volumes[1]
mass <- 2.65 * volume ^ 0.9
print(mass)
volume <- volumes[2]
mass <- 2.65 * volume ^ 0.9
print(mass)
volume <- volumes[3]
mass <- 2.65 * volume ^ 0.9
print(mass)
``````

Do Tasks 1 & 2 in Basic For Loops.

### Looping with an index & storing results

• R loops iterate over a series of values in a vector or other list-like object
• When we use that value directly this is called looping by value
• But there is another way to loop, which is called looping by index
• Looping by index loops over a list of integer index values, typically starting at 1
• These integers are then used to access values in one or more vectors at the position indicated by the index
• If we modified our previous loop to use an index it would look like this
• We often use `i` to stand for “index” as the variable we update with each step through the loop
``````volumes = c(1.6, 3, 8)
for (i ...)
``````
• We then create a vector of position values starting at 1 (for the first value) and ending with the length of the object we are looping over
``````volumes = c(1.6, 3, 8)
for (i in 1:3)
``````
• We don’t want to have to know the length of the vector and it might change in the future, so we’ll look it up using the `length()` function
``````volumes = c(1.6, 3, 8)
for (i in 1:length(volumes)){

}
``````
• Then inside the loop, instead of doing the calculation on the index (which is just a number between 1 and 3 in our case), we use square brackets and the index to get the appropriate value out of our vector
``````volumes = c(1.6, 3, 8)
for (i in 1:length(volumes)){
  mass <- 2.65 * volumes[i] ^ 0.9
  print(mass)
}
``````
• This gives us the same result, but it’s more complicated to understand
• So why would we loop by index?
• The advantage to looping by index is that it lets us do more complicated things

• One of the most common things we use this for is storing the results we calculated in the loop
• To do this we start by creating an empty object the same length as the results will be before the loop starts
• To store results in a vector we use the function `vector` to create an empty vector of the right length
• `mode` is the type of data we are going to store
• `length` is the length of the vector
``````masses <- vector(mode = "numeric", length = length(volumes))
masses
``````
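As a quick illustration of what `vector()` returns, the sketch below creates empty vectors of a few common modes. The variable names are made up for this example. The point is that "empty" here means filled with a default value for the mode (zeros, empty strings, or `FALSE`), which the loop later overwrites.

```r
# Hypothetical names, used only for this illustration
empty_numeric <- vector(mode = "numeric", length = 3)
empty_integer <- vector(mode = "integer", length = 3)
empty_character <- vector(mode = "character", length = 3)
empty_logical <- vector(mode = "logical", length = 3)

print(empty_numeric)    # three zeros
print(empty_integer)    # three integer zeros
print(empty_character)  # three empty strings
print(empty_logical)    # three FALSE values
```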
• Then add each result in the right position in this vector
• For each trip through the loop put the output into the empty vector at the `i`th position
``````for (i in 1:length(volumes)){
  mass <- 2.65 * volumes[i] ^ 0.9
  masses[i] <- mass
}
masses
``````
• Walk through iteration in debugger
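One way to check the stored results, sketched here on the assumption that the same formula can also be applied to the whole vector at once, is to compare them with the vectorized calculation. `expected` is a name introduced only for this check.

```r
volumes <- c(1.6, 3, 8)

# Fill a pre-allocated vector one position at a time
masses <- vector(mode = "numeric", length = length(volumes))
for (i in 1:length(volumes)) {
  masses[i] <- 2.65 * volumes[i] ^ 0.9
}

# Reference values from applying the formula to the whole vector
expected <- 2.65 * volumes ^ 0.9

# TRUE when the loop stored every result in the right position
print(all.equal(masses, expected))
```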

Do Tasks 3-4 in Basic For Loops.

End of 1 hour class

### Looping over multiple values

• Looping with an index also allows us to access values from multiple vectors
``````as <- c(2.65, 1.28, 3.29)
bs <- c(0.9, 1.1, 1.2)
volumes = c(1.6, 3, 8)
masses <- vector(mode="numeric", length=length(volumes))
for (i in 1:length(volumes)){
  mass <- as[i] * volumes[i] ^ bs[i]
  masses[i] <- mass
}
``````

Do Task 5 in Basic For Loops.

### Looping with functions

• It is common to combine loops with functions by calling one or more functions as a step in our loop
• For example, let’s take `est_mass_max`, a non-vectorized version of our `est_mass` function that returns an estimated mass if `volume < 5` and `NA` if it’s not.
``````est_mass_max <- function(volume, a, b){
  if (volume < 5) {
    mass <- a * volume ^ b
  } else {
    mass <- NA
  }
  return(mass)
}
``````
• We can’t pass the vector to the function and get back a vector of results because of the `if` statements
• So let’s loop over the values
• First we’ll create an empty vector to store the results
• And then loop by index, calling the function for each value of `volumes`
``````masses <- vector(mode="numeric", length=length(volumes))
for (i in 1:length(volumes)){
  mass <- est_mass_max(volumes[i], as[i], bs[i])
  masses[i] <- mass
}
``````
• This is the for loop equivalent of an `mapply` statement
``````masses_apply <- mapply(est_mass_max, volumes, as, bs)
``````
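To make the problem with passing a whole vector concrete, the sketch below calls `est_mass_max` on the full `volumes` vector and records what happens, then confirms that the loop and `mapply` agree. Whether R signals an error or a warning for a condition of length greater than one depends on the R version, so the sketch catches both instead of assuming one. `outcome` and `masses_loop` are names introduced only for this example.

```r
est_mass_max <- function(volume, a, b) {
  if (volume < 5) {
    mass <- a * volume ^ b
  } else {
    mass <- NA
  }
  return(mass)
}

as <- c(2.65, 1.28, 3.29)
bs <- c(0.9, 1.1, 1.2)
volumes <- c(1.6, 3, 8)

# Passing the whole vector gives `if` a condition of length 3.
# Catch either an error or a warning, since this varies by R version.
outcome <- tryCatch(
  {
    est_mass_max(volumes, as, bs)
    "no problem signalled"
  },
  warning = function(w) "warning",
  error = function(e) "error"
)
print(outcome)

# Calling the function once per position works in both forms
masses_loop <- vector(mode = "numeric", length = length(volumes))
for (i in 1:length(volumes)) {
  masses_loop[i] <- est_mass_max(volumes[i], as[i], bs[i])
}
masses_apply <- mapply(est_mass_max, volumes, as, bs)
print(all.equal(masses_loop, masses_apply))
```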

### Looping over data frames

• By default when R loops over a data frame it loops over the columns
``````data <- data.frame(a = as, b = bs, volume = volumes)
for (i in data) {
  print(i)
}
``````
• To loop over rows, loop by index and subset
``````for (i in 1:nrow(data)) {
  print(data[i, ])
}
``````
• If we want to use values from specific columns, index by row and by column name
``````masses <- vector(mode="numeric", length=nrow(data))
for (i in 1:nrow(data)) {
  mass <- est_mass_max(data[i, "volume"], data[i, "a"], data[i, "b"])
  masses[i] <- mass
}
``````

### Looping over files

• Repeat same actions on many similar files
``````download.file("http://www.datacarpentry.org/semester-biology/data/locations.zip",
"locations.zip")
unzip("locations.zip")
``````
• Now we need to get the names of each of the files we want to loop over
• We do this using `list.files()`
• If we run it without arguments it will give us the names of all files in the directory
``````list.files()
``````
• But we just want the data files, so we’ll add the optional `pattern` argument to get only the files whose names contain `"locations-"`, which here are the files that start with it
``````data_files = list.files(pattern = "locations-")
``````
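The `pattern` argument of `list.files()` is a regular expression, so `"locations-"` matches that text anywhere in a file name. The sketch below uses made-up file names in a temporary directory (not the real data) to show the difference between matching anywhere and anchoring the match to the start of the name with `^`.

```r
# Made-up, empty demo files in a temporary directory
demo_dir <- file.path(tempdir(), "list_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
demo_names <- c("locations-a.txt", "locations-b.txt",
                "old-locations-c.txt", "notes.txt")
file.create(file.path(demo_dir, demo_names))

# Matches "locations-" anywhere in the file name
contains_match <- list.files(demo_dir, pattern = "locations-")
print(contains_match)

# "^" anchors the match to the start of the file name
starts_match <- list.files(demo_dir, pattern = "^locations-")
print(starts_match)
```

For the files in this lesson the two forms are expected to return the same names, since the source describes the data files as starting with `"locations-"`.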
• Once we have this list we can loop over it and count the number of observations in each file
• First create an empty vector to store those counts
``````num_files = length(data_files)
results <- vector(mode = "integer", length = num_files)
``````
• Then write our loop
``````for (i in 1:num_files){
  filename <- data_files[i]
  # Read the current file (assumes comma-separated data)
  data <- read.csv(filename)
  count <- nrow(data)
  results[i] <- count
}
``````
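To try the pattern without downloading anything, here is a minimal self-contained sketch. It writes two small made-up CSV files to a temporary directory and counts their rows. The file names, column names, and values are invented for the example and are not the lesson's data.

```r
# Made-up demo files in a temporary directory (not the real location data)
demo_dir <- file.path(tempdir(), "loop_files_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(lat = c(26.7, 26.9), long = c(-81.1, -81.3)),
          file.path(demo_dir, "locations-demo-1.csv"), row.names = FALSE)
write.csv(data.frame(lat = c(27.1, 27.4, 27.8), long = c(-80.9, -81.0, -81.2)),
          file.path(demo_dir, "locations-demo-2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv() can open from any directory
data_files <- list.files(demo_dir, pattern = "locations-", full.names = TRUE)
num_files <- length(data_files)

# Pre-allocate one slot per file, then fill it in the loop
results <- vector(mode = "integer", length = num_files)
for (i in 1:num_files) {
  data <- read.csv(data_files[i])
  results[i] <- nrow(data)
}
print(results)
```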

Do Task 1 of Multiple-file Analysis. Exercise uses different collar data

### Storing loop results in a data frame

• We often want to calculate multiple pieces of information in a loop making it useful to store results in things other than vectors
• We can store them in a data frame instead by creating an empty data frame and storing the results in the `i`th row of the appropriate column
• Associate the file name with the count
• Also store the minimum latitude
• Start by creating an empty data frame
• Use the `data.frame` function
• Provide one argument for each column
• “Column Name” = “an empty vector of the correct type”
``````results <- data.frame(file_name = vector(mode = "character", length = num_files),
count = vector(mode = "integer", length = num_files),
min_lat = vector(mode = "numeric", length = num_files))
``````
• Now let’s modify our loop from last time
• Instead of storing `count` in `results[i]` we need to specify the `count` column, either with `$` (`results$count[i]`) or by row and column name (`results[i, "count"]`)
• We also want to store the filename, which is `data_files[i]`
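``````for (i in 1:num_files){
  filename <- data_files[i]
  # Read the current file (assumes comma-separated data with a `lat` column)
  data <- read.csv(filename)
  count <- nrow(data)
  min_lat = min(data$lat)
  results[i, "file_name"] <- filename
  results[i, "count"] <- count
  results[i, "min_lat"] <- min_lat
}
``````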

Do Task 2 Multiple-file Analysis. Exercise uses different collar data

### Subsetting Data (optional)

• Loops can subset in ways that are difficult with things like `group_by`
• Look at some data on trees from the National Ecological Observatory Network
``````library(ggplot2)
library(dplyr)

# Assumes `neon_trees` is already loaded as a data frame
# with `easting` and `northing` columns
ggplot(neon_trees, aes(x = easting, y = northing)) +
  geom_point()
``````
• Look at a north-south gradient in number of trees
• Need to know number of trees in each band of y values
• Start by defining the size of the window we want to use
• Use the grid lines which are 2.5 m
``````window_size <- 2.5
``````
• Then figure out the edges for each window
``````south_edges <- seq(4713095, 4713117.5, by = window_size)
north_edges <- south_edges + window_size
``````
• But we don’t want to go all the way to the far edge
``````south_edges <- seq(4713095, 4713117.5 - window_size, by = window_size)
north_edges <- south_edges + window_size
``````
• Set up an empty vector to store the output
``````counts <- vector(mode = "numeric", length = length(south_edges))
``````
• Loop over the south edges and subset the data occurring within each window
``````for (i in 1:length(south_edges)) {
  data_in_window <- filter(neon_trees, northing >= south_edges[i], northing < north_edges[i])
  counts[i] <- nrow(data_in_window)
}
counts
``````
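The windowing logic can be tried without the NEON data. The sketch below uses a handful of made-up `northing` values and base R logical indexing in place of `dplyr::filter`, so it runs with no extra packages. The `trees` data frame and its values are assumptions for illustration only.

```r
# Made-up coordinates, not the NEON data
trees <- data.frame(northing = c(0.5, 1.0, 2.4, 2.5, 4.9, 7.4))

window_size <- 2.5
# Windows cover [0, 2.5), [2.5, 5) and [5, 7.5)
south_edges <- seq(0, 7.5 - window_size, by = window_size)
north_edges <- south_edges + window_size

counts <- vector(mode = "numeric", length = length(south_edges))
for (i in 1:length(south_edges)) {
  # TRUE for trees at or above the south edge and below the north edge
  in_window <- trees$northing >= south_edges[i] & trees$northing < north_edges[i]
  counts[i] <- sum(in_window)
}
print(counts)
```

A point exactly on a shared edge (2.5 here) is counted once, in the window whose south edge it sits on, because the comparison is `>=` on the south side and `<` on the north side.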

### Nested Loops (optional)

• Sometimes need to loop over multiple things in a coordinated fashion
• Pass a window over some spatial data
• Look at full spatial pattern not just north-south gradient

• Basic nested loops work by putting one loop inside another one
``````for (i in 1:10) {
  for (j in 1:5) {
    print(paste("i = " , i, "; j = ", j))
  }
}
``````
• Loop over x and y coordinates to create boxes
• Already have the south and north edges, so also need west and east edges
``````west_edges <- seq(731752.5, 731772.5 - window_size, by = window_size)
east_edges <- west_edges + window_size
``````
• Redefine our storage
``````output <- matrix(nrow = length(south_edges), ncol = length(west_edges))
``````
``````for (i in 1:length(south_edges)) {
  for (j in 1:length(west_edges)) {
    data_in_window <- filter(neon_trees,
                             northing >= south_edges[i], northing < north_edges[i],
                             easting >= west_edges[j], easting < east_edges[j])
    output[i, j] <- nrow(data_in_window)
  }
}
output
``````
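To see how the two indices map onto matrix cells, and in what order the cells get filled, here is a small sketch with no spatial data. `visit_order` and `step` are names made up for this example. Each cell records the step at which the nested loops reached it.

```r
# A 2 x 3 matrix filled with NA until the loops assign values
visit_order <- matrix(nrow = 2, ncol = 3)
step <- 0

for (i in 1:2) {        # outer loop: rows
  for (j in 1:3) {      # inner loop: columns, runs fully for each row
    step <- step + 1
    visit_order[i, j] <- step
  }
}
print(visit_order)
```

The first row holds steps 1 to 3 and the second row steps 4 to 6, which shows that the inner loop finishes all of its values before the outer loop moves on.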

### Sequence along (optional)

• `seq_along()` generates a vector of numbers from 1 to `length(volumes)`
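A short sketch of how `seq_along()` compares with `1:length()`. For a non-empty vector the two give the same indices. They differ for an empty vector, where `1:length()` counts down from 1 to 0 and `seq_along()` gives an empty sequence, so the loop body never runs. `no_volumes` and `runs` are hypothetical names for this example.

```r
volumes <- c(1.6, 3, 8)
print(seq_along(volumes))      # same values as 1:length(volumes)

no_volumes <- numeric(0)       # hypothetical empty input
print(1:length(no_volumes))    # counts down from 1 to 0
print(seq_along(no_volumes))   # empty sequence

# Count how many times the loop body runs for the empty input
runs <- 0
for (i in seq_along(no_volumes)) {
  runs <- runs + 1
}
print(runs)
```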