Working with Spatial Data

Learning Objectives

Following this assignment students should be able to:

import, view properties, and plot a raster

perform simple raster math

extract points from a raster using a shapefile

evaluate a time series of rasters

Reading

Topics
- raster
- Raster math
- Plotting spatial images
- Shapefile import
- Integrate raster and vector data
Readings
Additional information
- Overview of Coordinate Reference Systems (CRS) in R
- Rasters in R
- Vectors in R
  - Part I
  - Part II
- Projections in R
- Combining rasters and vectors in R

Lecture Notes

Exercises

Canopy Height from Space (30 pts)

The National Ecological Observatory Network has invested in high-resolution airborne imaging of their field sites. Elevation models generated from LiDAR can be used to map the topography and vegetation structure at the sites.

Check to see if there is a data directory in your workspace with an SJER subdirectory in it. If not, download the data and extract it into your working directory. The SJER directory contains raster data for a digital terrain model (sjer_dtmcrop.tif) and a digital surface model (sjer_dsmcrop.tif), and vector data on plot locations (sjer_plots.shp) and the site boundary (sjer_boundar.shp) for the San Joaquin Experimental Range.
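The directory check described above can be done from R itself. This is only a minimal sketch: it assumes the layout is `data/SJER` relative to the current working directory, and the variable names `sjer_dir` and `has_sjer` are made up for illustration.

```r
# Assumed location of the SJER data relative to the working directory
sjer_dir <- file.path("data", "SJER")

# TRUE if the directory is already present, FALSE otherwise
has_sjer <- dir.exists(sjer_dir)

if (!has_sjer) {
  message("No ", sjer_dir, " directory found in ", getwd(),
          "; download and extract the data first.")
}
```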
1. Map the digital terrain model for SJER using the viridis color ramp.
2. Create and map the canopy height model for SJER using the viridis color ramp. To do this subtract the values in the digital terrain model from the values in the digital surface model using raster math (chm = dsm - dtm).
3. Create a map that shows the SJER boundary and the plot locations colored by plot_type.
4. Transform the plot data to have the same CRS as the CHM and create a map that shows the canopy height model from (2) with the plot locations on top.
5. Extract the mean canopy heights at each plot location for SJER and display the values.
6. Add the canopy height values from (5) to the spatial data frame you created for the plots and display the full data frame.
7. Create a map that shows the SJER boundary and the plot locations colored by the canopy height values.
8. Create a map that shows the canopy height model raster, but in cm rather than m (i.e., multiply the canopy height model by 100).
9. Create a map that shows the digital terrain model raster, the plot locations, and the SJER boundary, using transparency as needed to allow all three layers to be seen. Remember all three layers will need to have the same CRS.
10. Conduct an analysis of the relationship between elevation and canopy height at the SJER plots. Start by extracting the mean elevations (i.e., the values from the digital terrain model) at each plot location for SJER and adding them to the spatial plots data so that this data now includes both the elevations and the canopy heights. Then make a scatter plot showing the relationship between elevation and canopy height using this data. Color the points by plot_type and fit a linear model through all of the points together (not separately by plot_type). Finally, use dplyr to calculate the average canopy height and average elevation for the two different plot types. Give the axes good labels.
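The raster math and extraction steps above can be tried on made-up data before working with the SJER files. The sketch below is not a solution to the exercise: it assumes the raster package, builds two small synthetic rasters in place of the real DTM and DSM, and uses hypothetical plot coordinates that are already in the raster's CRS. The 15 m buffer is an arbitrary choice for illustration.

```r
library(raster)

set.seed(1)

# Synthetic 10 x 10 terrain model with 10 m cells (stand-in for sjer_dtmcrop.tif)
dtm <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 100, ymn = 0, ymx = 100,
              crs = "+proj=utm +zone=11 +datum=WGS84")
values(dtm) <- runif(ncell(dtm), min = 300, max = 400)

# Synthetic surface model: ground elevation plus 0-20 m of vegetation
dsm <- dtm
values(dsm) <- values(dtm) + runif(ncell(dtm), min = 0, max = 20)

# Raster math works cell by cell
chm <- dsm - dtm        # canopy height in m
chm_cm <- chm * 100     # the same heights in cm

# Hypothetical plot centers, assumed to share the raster's CRS
plots <- data.frame(plot_id = c("A", "B", "C"),
                    x = c(25, 55, 85),
                    y = c(25, 55, 85))

# Mean canopy height within 15 m of each plot center, added as a new column
plots$canopy_height <- raster::extract(chm, as.matrix(plots[, c("x", "y")]),
                                       buffer = 15, fun = mean)
plots
```

Calling `raster::extract` with the package prefix avoids a clash with `tidyr::extract` when the tidyverse is also loaded.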
Species Occurrences Map (40 pts)

A colleague of yours is working on a project on banner-tailed kangaroo rats (Dipodomys spectabilis) and is interested in what elevations these rodents tend to occupy in the continental United States. You offer to help them out by getting some coordinates for specimens of this species and looking up the elevation of these coordinates.

Start by getting banner-tailed kangaroo rat occurrences from GBIF, the Global Biodiversity Information Facility, using the spocc R package, which is designed to retrieve species occurrence data from various openly available data resources. Use the following code to do so:
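```
library(spocc)

# Query GBIF for up to 1000 records that have coordinates
dipo_df = occ(query = "Dipodomys spectabilis",
              from = "gbif",
              limit = 1000,
              has_coords = TRUE)
# Keep only the GBIF results as a plain data frame
dipo_df = data.frame(dipo_df$gbif$data)
```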
1. Clean up the data as follows:
  - Filter the data to only include those specimens with a Dipodomys_spectabilis.basisOfRecord that is PRESERVED_SPECIMEN and a Dipodomys_spectabilis.countryCode that is US
  - Remove points with values of 0 for Dipodomys_spectabilis.latitude or Dipodomys_spectabilis.longitude
  - Remove all of the columns from the dataset except Dipodomys_spectabilis.latitude and Dipodomys_spectabilis.longitude and rename these columns to latitude and longitude using select. You can rename while selecting columns using a format like this: select(new_column_name = old_column_name)
  - Use the head() function to show the top few rows of this cleaned dataset
2. Do the following to display the locations of these points on a map of the United States:
  - Get data for a US map using usmap = map_data("usa")
  - Plot it using geom_polygon. In the aesthetic use group = group to avoid weird lines crossing your graph. Use fill = "white" and color = "black".
  - Plot the kangaroo rat locations
  - Use coord_quickmap() to automatically use a reasonable spatial projection
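The cleaning pattern in step 1 (filter rows, drop zero coordinates, then rename while selecting) can be practiced on a tiny made-up table first. The sketch below assumes dplyr; the four records in `occ_raw` are invented for illustration and only borrow the column names used in the exercise.

```r
library(dplyr)

# Invented records, not real GBIF data
occ_raw <- data.frame(
  Dipodomys_spectabilis.basisOfRecord = c("PRESERVED_SPECIMEN", "HUMAN_OBSERVATION",
                                          "PRESERVED_SPECIMEN", "PRESERVED_SPECIMEN"),
  Dipodomys_spectabilis.countryCode = c("US", "US", "MX", "US"),
  Dipodomys_spectabilis.latitude = c(32.5, 33.1, 30.2, 0),
  Dipodomys_spectabilis.longitude = c(-106.7, -107.2, -105.9, 0)
)

occ_clean <- occ_raw %>%
  # keep preserved specimens collected in the US
  filter(Dipodomys_spectabilis.basisOfRecord == "PRESERVED_SPECIMEN",
         Dipodomys_spectabilis.countryCode == "US") %>%
  # drop placeholder coordinates of 0
  filter(Dipodomys_spectabilis.latitude != 0,
         Dipodomys_spectabilis.longitude != 0) %>%
  # select() keeps only the named columns and renames them in one step
  select(latitude = Dipodomys_spectabilis.latitude,
         longitude = Dipodomys_spectabilis.longitude)

head(occ_clean)
```

Only the first invented record passes every filter, so the result has a single row with the two renamed columns.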
Species Occurrences Elevation Histogram (30 pts)

This is a follow up to Species Occurrences Map.

Now that you’ve mapped some species occurrence data, you want to understand how environmental factors influence the species’ distribution.
1. The raster package comes with some datasets, including one of global elevations, that can be retrieved with the getData function as follows:
```
 elevation = getData("alt", country = "US")
 elevation = elevation[[1]]
```
  Create a new version of the map from Species Occurrences Map that also shows the elevation data. Plotting the elevation data may take a while because the dataset contains many data points. Pay attention to the order in which the geom_ objects are plotted. The name of the elevation variable is USA1_msk_alt.

  If the website is down, download a copy from the course site at http://www.datacarpentry.org/semester-biology/data/wc10.zip, unzip it into your home directory (/home/username on Mac and Linux, C:\Users\username\Documents on Windows), and use the command elevation = getData("alt", country = "US", path = ".")
2. Turn the dipo_df data frame from Species Occurrences Map into a SpatialPointsDataFrame, making sure that its projection matches that of the elevation dataset, and extract the elevation values for all of the kangaroo rat occurrences. Turn this subset of elevation values into a data frame and plot a histogram of the elevations.
3. Part 2 showed us the elevations where banner-tailed kangaroo rats occur, but without context it’s hard to tell how important elevation is. Make a new graph that shows histograms for all elevations in the US in gray and the kangaroo rat elevations in red. Plot the kangaroo rat elevations on top of the full elevations and make them transparent so that you can see the overlap. To get the histograms on the same scale we need to plot the density of points instead of the total number of points. This can be done in ggplot using code like:
```
 ggplot() +
   geom_histogram(data = elevations, aes(x = USA1_msk_alt, y = after_stat(density)))
```
  Label the x axis elevation and add the title “Kangaroo rat habitat elevation relative to background”.
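The point extraction and overlaid density histograms described in steps 2 and 3 can be sketched on synthetic data. This is an illustration under stated assumptions, not the expected answer: it assumes the raster and ggplot2 packages, replaces the `getData` elevation layer with a small random raster named `elevation`, and uses three invented occurrence points. `after_stat(density)` is the newer ggplot2 equivalent of `..density..`.

```r
library(raster)   # also attaches sp, which provides SpatialPointsDataFrame
library(ggplot2)

set.seed(42)

# Synthetic elevation raster in longitude/latitude (stand-in for the real layer)
elev <- raster(nrows = 20, ncols = 20, xmn = -110, xmx = -100, ymn = 30, ymx = 40)
values(elev) <- runif(ncell(elev), min = 0, max = 3000)
names(elev) <- "elevation"

# Invented occurrence points inside the raster extent
pts <- data.frame(longitude = c(-106.7, -107.2, -104.3),
                  latitude = c(32.5, 33.1, 35.8))

# Build the spatial points object with the same CRS as the raster
pts_sp <- SpatialPointsDataFrame(coords = pts[, c("longitude", "latitude")],
                                 data = pts,
                                 proj4string = crs(elev))

# Elevation at each occurrence, and all background elevations
occ_elev <- data.frame(elevation = raster::extract(elev, pts_sp))
all_elev <- as.data.frame(elev, na.rm = TRUE)

# Density histograms: background in gray, occurrences in transparent red on top
p <- ggplot() +
  geom_histogram(data = all_elev, aes(x = elevation, y = after_stat(density)),
                 fill = "gray", bins = 30) +
  geom_histogram(data = occ_elev, aes(x = elevation, y = after_stat(density)),
                 fill = "red", alpha = 0.5, bins = 30) +
  labs(x = "Elevation")
```

Because both layers plot density rather than counts, the three occurrence points and the 400 background cells end up on a comparable vertical scale.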

Assignment submission & checklist
