Learning Objectives

Following this assignment, students should be able to:

• properly structure a computational project
• use good style
• start to build more complex computational tasks

1. Climate Space (40 pts)

Understanding how environmental factors influence species distributions can be aided by determining which areas of the available climate space a species currently occupies. You are interested in showing how much and what part of the available global temperature and precipitation range is occupied by some common tree species. Create three graphs, one each for Quercus alba, Picea glauca, and Ceiba pentandra. Each graph should show a scatterplot of the mean annual temperature and mean annual precipitation for points around the globe and highlight the values for 1000 locations of the plant species. Start by decomposing this exercise into small manageable pieces.

Here are some tips that will be helpful along the way:

• Climate data is available from the WorldClim dataset. Using `getData('worldclim', var = 'bio', res = 10)` (from the `raster` package) will download all of the bioclim variables. The two variables you need are `bio1` (temperature) and `bio12` (precipitation).
• There are over 500,000 global data points, which can make plotting slow. You can choose to plot a random subset of 10,000 points (e.g., using `sample_n` from the `dplyr` package) to limit the time it takes to generate the plot.
• Choose good labels and make the points transparent to see their density.
• You might notice that the temperature values seem large. Storing decimal values uses more space than storing integers, so the WorldClim creators provide temperature values multiplied by 10. For example, 19.5 is stored as 195. Make sure to display the actual temperatures, not the raw values provided. See the WorldClim documentation for more information about units.
• Species’ occurrence data is available from GBIF using the `spocc` package. An example of how to get the data you need is available in the Species Occurrences Map exercise.
• To extract climate values for each occurrence from the climate data, you will need a data frame of occurrences that contains only longitude and latitude columns.
• If the projections for WorldClim and the species occurrence data aren’t the same, you will need a `SpatialPointsDataFrame`.
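
The conversion, subsampling, and extraction tips above can be sketched on a small stand-in dataset. This is only an illustration under stated assumptions: the tiny raster stack with random values replaces the real WorldClim download, the three occurrence coordinates are invented, and the object names (`climate`, `global_climate`, `occurrences`, `species_climate`) are hypothetical.

```r
library(raster)  # raster(), setValues(), stack(), rasterToPoints(), extract()
library(dplyr)   # mutate(), sample_n()

set.seed(42)

# Stand-in for the WorldClim stack: two coarse global layers with made-up values.
# With the real data, `climate` would be the object returned by getData().
template <- raster(nrows = 18, ncols = 36,
                   xmn = -180, xmx = 180, ymn = -90, ymx = 90)
bio1 <- setValues(template, round(runif(ncell(template), -300, 300)))  # temperature x 10
bio12 <- setValues(template, round(runif(ncell(template), 0, 4000)))   # precipitation
climate <- stack(bio1, bio12)
names(climate) <- c("bio1", "bio12")

# One row per raster cell (columns x, y, bio1, bio12), then undo the x10
# scaling and keep a random subset to speed up plotting.
global_climate <- as.data.frame(rasterToPoints(climate)) %>%
  mutate(temperature = bio1 / 10) %>%
  sample_n(200)  # use 10,000 with the real data

# Invented occurrences: only longitude and latitude columns, in that order.
occurrences <- data.frame(longitude = c(-85.2, -77.0, 10.5),
                          latitude = c(38.1, 40.3, 51.2))

# Look up the climate values in the cell under each occurrence point.
species_climate <- as.data.frame(raster::extract(climate, occurrences)) %>%
  mutate(temperature = bio1 / 10)
```

Calling `raster::extract` with its package prefix avoids picking up a function of the same name from another loaded package.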

Challenge (optional): If you want to challenge yourself, try making a single plot with all three species, either all on the same plot or split over three faceted subplots.
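
One possible shape for the faceted version is sketched below. It assumes you already have one data frame of global climate values and one of per-species climate values; both are replaced here by made-up numbers, and the column and object names are hypothetical. Because the background data frame has no `species` column, `ggplot2` repeats it in every facet.

```r
library(ggplot2)

set.seed(1)

# Made-up stand-ins for the global background and the per-species values.
global_climate <- data.frame(temperature = runif(2000, -25, 30),
                             precipitation = runif(2000, 0, 4000))
species_climate <- data.frame(
  species = rep(c("Quercus alba", "Picea glauca", "Ceiba pentandra"), each = 50),
  temperature = c(rnorm(50, 12, 3), rnorm(50, -2, 3), rnorm(50, 25, 2)),
  precipitation = c(rnorm(50, 1100, 150), rnorm(50, 500, 100), rnorm(50, 2200, 400))
)

climate_plot <- ggplot(mapping = aes(x = temperature, y = precipitation)) +
  # Background layer: transparent gray points show the density of available climates
  geom_point(data = global_climate, color = "gray50", alpha = 0.2) +
  # Highlight layer: the species' locations, drawn on top
  geom_point(data = species_climate, color = "red", alpha = 0.5) +
  # One panel per species
  facet_wrap(~species) +
  labs(x = "Mean annual temperature (\u00B0C)",
       y = "Mean annual precipitation (mm)")
```

Run `print(climate_plot)` to draw the figure. Mapping `color = species` in the second layer and dropping `facet_wrap()` would give the single-panel alternative.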

2. Create a plot showing histograms of masses for extant mammals and those that went extinct during the Pleistocene (`extant` and `extinct` in the `status` column). There should be one sub-plot for each continent, and each sub-plot should show the histograms for both groups. Don’t include islands (`Insular` and `Oceanic` in the `continent` column) and only include continents with species that went extinct in the Pleistocene. Scale the x-axis logarithmically and stack the sub-plots vertically like in the original paper (but don’t worry about the order of the sub-plots being the same). Use good axis labels.
4. The 3rd figure in the original paper explores Australia as a case study. Australia is interesting because there is good data on both Pleistocene extinctions (`extinct` in the `status` column) and more modern extinctions occurring over the last 300 years (`historical` in the `status` column). Make a plot similar to the previous plots that compares these three different categories (`extinct`, `extant`, and `historical`). Has the size pattern in extinctions changed for more modern extinctions?
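
The filtering and faceting described in the histogram parts above could look roughly like the sketch below. It runs on randomly generated data with no biological meaning. The `mass` column name, its unit, and the continent codes `AF`, `SA`, and `EA` are assumptions made for illustration; only `status`, `continent`, `Insular`, and `Oceanic` come from the exercise text.

```r
library(dplyr)
library(ggplot2)

set.seed(7)

# Random stand-in data; the real data would come from the mammal size dataset.
mammals <- data.frame(
  continent = sample(c("AF", "SA", "EA", "Insular", "Oceanic"), 600, replace = TRUE),
  status = sample(c("extant", "extinct", "historical"), 600, replace = TRUE,
                  prob = c(0.7, 0.2, 0.1)),
  mass = 10^runif(600, 0, 6),  # hypothetical column name
  stringsAsFactors = FALSE
)
# Give one continent no Pleistocene extinctions so the filter has something to drop.
mammals$status[mammals$continent == "EA" & mammals$status == "extinct"] <- "extant"

mainland <- mammals %>%
  # Drop islands and keep only the two groups being compared
  filter(!continent %in% c("Insular", "Oceanic"),
         status %in% c("extant", "extinct")) %>%
  # Keep only continents with at least one Pleistocene extinction
  group_by(continent) %>%
  filter(any(status == "extinct")) %>%
  ungroup()

mass_plot <- ggplot(mainland, aes(x = mass, fill = status)) +
  # Overlaid, semi-transparent histograms so both groups stay visible
  geom_histogram(bins = 25, alpha = 0.6, position = "identity") +
  scale_x_log10() +
  # A single column stacks the sub-plots vertically
  facet_wrap(~continent, ncol = 1) +
  labs(x = "Mass (unit assumed: g)", y = "Number of species", fill = "Status")
```

Run `print(mass_plot)` to draw the figure. For the Australia comparison, the same pattern should apply with the filter changed to a single continent and `status %in% c("extant", "extinct", "historical")`.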