#### Learning Objectives

After completing this assignment, students should be able to:

• use, modify, and write custom functions
• use the output of one function as the input of another
• understand and use the basic relational operators
• use an `if` statement to evaluate conditionals

#### Topics

• Functions
• Conditionals

1. #### Writing Functions (5 pts)

Write a function that converts pounds to grams (there are 453.592 grams in one pound). It should take a value in pounds as the input and return the equivalent value in grams (i.e., the number of pounds times 453.592). Use that function to calculate how many grams there are in 3.75 pounds.
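As a reminder of the general shape of a conversion function in R, here is a minimal sketch that uses a different conversion (inches to centimeters, at 2.54 cm per inch), so the exercise itself is still yours to write. The function name `convert_inches_to_cm` is hypothetical and not part of the assignment.

```r
# Hypothetical example: convert a length in inches to centimeters.
# One inch is defined as exactly 2.54 centimeters.
convert_inches_to_cm <- function(inches) {
  cm <- inches * 2.54
  return(cm)
}

# Call the function and store the result in a variable
length_cm <- convert_inches_to_cm(10)
print(length_cm)
```

The same pattern (one input, one calculation, one returned value) applies to any unit conversion.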

2. #### Use and Modify (10 pts)

The length of an organism is typically strongly correlated with its body mass. This is useful because it allows us to estimate the mass of an organism even if we only know its length. This relationship generally takes the form:

Mass = a * Length^b

Where the parameters `a` and `b` vary among groups. This allometric approach is regularly used to estimate the mass of dinosaurs since we cannot weigh something that is only preserved as bones.
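To see how a power-law formula of this form translates into R arithmetic, here is a small sketch with made-up values (`a = 2`, `b = 3`, and a length of 3), chosen only to keep the numbers easy to check by hand. They are not real allometric parameters.

```r
# Illustrative values only, not taken from any dinosaur group
a <- 2
b <- 3
length_m <- 3

# Exponentiation binds tighter than multiplication,
# so this computes a * (length_m ^ b), not (a * length_m) ^ b
mass <- a * length_m ^ b
print(mass)  # 2 * 27 = 54

# R parses `**` as `^`, so both spellings give the same value
print(a * length_m ** b == mass)
```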

The following function estimates the mass of an organism in kg based on its length in meters for a particular set of parameter values, those for Theropoda (where `a` has been estimated as `0.73` and `b` has been estimated as `3.63`; Seebacher 2001).

```r
get_mass_from_length_theropoda <- function(length){
  mass <- 0.73 * length ** 3.63
  return(mass)
}
```
1. Add a comment to this function so that you know what it does.
2. Use this function to print out the mass of a Spinosaurus that is 16 m long based on its reassembled skeleton. Spinosaurus is a predator that is bigger, and therefore, by definition, cooler, than that stupid Tyrannosaurus that everyone likes so much.
3. Create a new version of this function called `get_mass_from_length()` that estimates the mass of an organism in kg based on its length in meters by taking `length`, `a`, and `b` as parameters. To be clear, we want to pass the function all three values that it needs to estimate a mass as parameters. This makes it much easier to reuse for all of the non-theropod species. Use this new function to estimate the mass of a Sauropoda (`a = 214.44`, `b = 1.46`) that is 26 m long.
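The step from a hard-coded function to one that takes its constants as parameters can be sketched with an unrelated example. The functions `get_cost_fixed_rate()` and `get_cost()` below are hypothetical, as are the numbers. They only illustrate the pattern of replacing fixed values with arguments.

```r
# Version with a hard-coded hourly rate (hypothetical example)
get_cost_fixed_rate <- function(hours) {
  cost <- 15 * hours
  return(cost)
}

# General version: the rate and a flat fee are passed in as parameters
get_cost <- function(hours, rate, fee) {
  cost <- fee + rate * hours
  return(cost)
}

# Arguments can be matched by position...
print(get_cost(4, 15, 0))
# ...or by name, which is easier to read when there are several of them
print(get_cost(hours = 4, rate = 20, fee = 10))
```

Naming the arguments in the call is optional, but it makes it harder to mix up values that play different roles.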
3. #### Combining Functions (10 pts)

This is a follow up to Use and Modify.

Measuring things using the metric system is the standard approach for scientists, but when communicating your results more broadly it may be useful to use different units (at least in some countries). Write a function that converts kilograms into pounds (there are 2.205 pounds in a kilogram). Use that function along with your dinosaur mass function from Use and Modify to estimate the weight, in pounds, of a 12 m long Stegosaurus (12 m is about as big as they come and nothing gets folks excited like a giant dinosaur). In Stegosauria, `a` has been estimated as `10.95` and `b` has been estimated as `2.64` (Seebacher 2001).
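Using the output of one function as the input of another can be done with an intermediate variable or by nesting the calls. The sketch below uses two hypothetical time-conversion functions rather than the dinosaur functions, so the exercise is left intact.

```r
# Hypothetical helper functions used only for illustration
hours_to_minutes <- function(hours) {
  return(hours * 60)
}

minutes_to_seconds <- function(minutes) {
  return(minutes * 60)
}

# Option 1: store the intermediate result, then pass it on
minutes <- hours_to_minutes(2)
seconds <- minutes_to_seconds(minutes)
print(seconds)

# Option 2: nest the calls; the inner call is evaluated first
seconds_nested <- minutes_to_seconds(hours_to_minutes(2))
print(seconds_nested)
```

Both options give the same result. The intermediate variable is often easier to debug, while nesting is more compact.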

4. #### Choice Operators (10 pts)

Create the following variables.

```r
w <- 10.2
x <- 1.3
y <- 2.8
z <- 17.5
dna1 <- "attattaggaccaca"
dna2 <- "attattaggaacaca"
colors <- c("green", "pink", "red")
```

Use them to print whether the following statements are `TRUE` or `FALSE`.

1. `w` is greater than 10
2. `"green"` is in `colors`
3. `x` is greater than `y`
4. 2 * `x` + 0.2 is equal to `y`
5. `dna1` is the same as `dna2`
6. `dna1` is not the same as `dna2`
7. `w` is greater than `x`, and `y` is greater than `z`
8. `x` times `w` is between 13.2 and 13.5
9. `dna1` is longer than 5 bases (use `nchar()` to figure out how long a string is), or `z` is less than `w` * `x`
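The operators needed for statements like these can be sketched on a separate set of variables. All names and values below are hypothetical and chosen so that they do not overlap with the exercise. The last two lines show a general caution: testing decimal numbers for exact equality with `==` can be affected by floating-point rounding, and `all.equal()` compares within a small tolerance instead.

```r
# Hypothetical variables for illustration
a <- 5
b <- 7.5
word <- "carpentry"
fruits <- c("apple", "pear")

print(a > 3)                 # greater than: TRUE
print(a >= b)                # greater than or equal to: FALSE
print("pear" %in% fruits)    # membership in a vector: TRUE
print(word != "Carpentry")   # not equal (case-sensitive): TRUE
print(nchar(word) > 5)       # string length is 9: TRUE
print(a > 3 & b < 5)         # and: TRUE & FALSE gives FALSE
print(a > 3 | b < 5)         # or: TRUE | FALSE gives TRUE

# Floating-point caution
print(0.1 + 0.2 == 0.3)                    # FALSE due to rounding
print(isTRUE(all.equal(0.1 + 0.2, 0.3)))   # TRUE within tolerance
```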
5. #### Simple If Statement (10 pts)

To determine if a file named `thesis_data.csv` exists in your working directory, you can use the following code to get a list of available files and directories:

```r
list.files()
```
1. Use the `%in%` operator to write a conditional statement that checks to see if `thesis_data.csv` is in this list.
2. Write an `if` statement that loads the file using `read.csv()` only if the file exists.
3. Add an `else` clause that prints “OMG MY THESIS DATA IS MISSING. NOOOO!!!!” if the file doesn’t exist.
4. Make sure your actual thesis data is backed up.
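The general `if`/`else` pattern for acting on a file only when it is present can be sketched as follows. This sketch works in a temporary directory and uses a hypothetical file name, `example_data.csv`, so it does not touch any real data. The variable names are assumptions made for illustration.

```r
# Create a fresh temporary directory containing one small CSV file
example_dir <- tempfile("if_demo_")
dir.create(example_dir)
write.csv(data.frame(id = 1:3),
          file.path(example_dir, "example_data.csv"),
          row.names = FALSE)

# List the files in that directory and test for membership
available <- list.files(example_dir)

if ("example_data.csv" %in% available) {
  # This branch runs only when the condition is TRUE
  example_data <- read.csv(file.path(example_dir, "example_data.csv"))
  status <- "loaded"
} else {
  # This branch runs only when the condition is FALSE
  status <- "missing"
}
print(status)
```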
6. #### Size Estimates by Name (20 pts)

This is a follow up to Use and Modify.

To make it even easier to work with your dinosaur size estimation functions you decide to create a function that lets you specify which dinosaur group you need to estimate the size of by name and then have the function automatically choose the right parameters.

Create a new function `get_mass_from_length_by_name()` that takes two arguments, the `length` and the name of the dinosaur group. Inside this function use `if`/`else if`/`else` statements to check to see if the name is one of the following values and if so set `a` and `b` to the appropriate values.

• Stegosauria: `a` = `10.95` and `b` = `2.64` (Seebacher 2001).
• Theropoda: `a` = `0.73` and `b` = `3.63` (Seebacher 2001).
• Sauropoda: `a` = `214.44` and `b` = `1.46` (Seebacher 2001).

Once the function has assigned `a` and `b`, have it run `get_mass_from_length()` with the appropriate values and return the estimated mass.

Run the function for:

1. A Stegosauria that is 10 meters long.
2. A Theropoda that is 8 meters long.
3. A Sauropoda that is 12 meters long.

Challenge (optional): If the name doesn’t match any of these values have the function return `NA` and print out a message that it doesn’t know how to convert that group.
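The pattern of choosing parameters by name with `if`/`else if`/`else` and then handing them to another function can be sketched with unit conversions instead of dinosaur groups. The function names `to_grams()` and `convert_to_grams_by_name()` are hypothetical. The pound factor is the one given in Writing Functions, and 1000 grams per kilogram is the standard metric definition.

```r
# Hypothetical helper: multiply a value by a conversion factor
to_grams <- function(value, grams_per_unit) {
  return(value * grams_per_unit)
}

# Choose the conversion factor from the unit name, then call the helper
convert_to_grams_by_name <- function(value, unit) {
  if (unit == "lb") {
    grams_per_unit <- 453.592
  } else if (unit == "kg") {
    grams_per_unit <- 1000
  } else {
    # Unknown name: report it and return a missing value
    print(paste("No conversion known for unit:", unit))
    return(NA)
  }
  return(to_grams(value, grams_per_unit))
}

print(convert_to_grams_by_name(2, "kg"))
print(convert_to_grams_by_name(1, "stone"))
```

Returning early inside the `else` branch means the final line only runs when a factor has actually been assigned.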

7. #### DNA or RNA (15 pts)

Write a function, `dna_or_rna()`, that takes a `sequence` as an argument and determines whether a sequence of base pairs is DNA, RNA, or impossible to classify given the sequence provided. Since all the function will know about the material is the sequence, the only way to tell the difference between DNA and RNA is that RNA has the base Uracil (`"u"`) instead of the base Thymine (`"t"`). You can check if a string contains a character (or a longer substring) in R using `grepl(substring, string)`. Have the function return one of three outputs: `"DNA"`, `"RNA"`, or `"UNKNOWN"`. Use the function to test each of the following sequences.

```r
seq1 <- "ttgaatgccttacaactgatcattacacaggcggcatgaagcaaaaatatactgtgaaccaatgcaggcg"
seq2 <- "gauuauuccccacaaagggagugggauuaggagcugcaucauuuacaagagcagaauguuucaaaugcau"
seq3 <- "gaaagcaagaaaaggcaggcgaggaagggaagaagggggggaaacc"
```

Optional: For a little extra challenge, make your function work with both upper and lower case letters, or even strings with mixed capitalization.
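As a quick illustration of how `grepl()` behaves, the sketch below searches an ordinary word rather than a sequence. The variable names are hypothetical. The first argument of `grepl()` is treated as a regular expression by default, so `fixed = TRUE` is used here to match it as a literal substring, and `tolower()` is one way to make a search ignore capitalization.

```r
word <- "banana"

# grepl() returns TRUE if the pattern occurs anywhere in the string
has_ana <- grepl("ana", word, fixed = TRUE)
has_z <- grepl("z", word, fixed = TRUE)
print(has_ana)  # TRUE
print(has_z)    # FALSE

# Converting to lower case first makes the search case-insensitive
mixed <- "BaNaNa"
has_ana_mixed <- grepl("ana", tolower(mixed), fixed = TRUE)
print(has_ana_mixed)  # TRUE
```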

1. Create a function to download occurrence data and extract the corresponding climate data, which should return a dataset of all the bioclim variables for each species. Because the latitude and longitude column names will differ between occurrence datasets, generalize them by using the column index, instead of the column name, to get only those columns (e.g., `select(longitude = 2, latitude = 3)`).
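The idea of picking columns by position and giving them standard names can be sketched in base R without any packages. The data frame below is made up for illustration, including its column names and values. The `select()` call in the text is presumably from dplyr, which is an assumption; this sketch does not use it.

```r
# Hypothetical occurrence table; column names vary between data sources
occurrences <- data.frame(
  name = c("sp_a", "sp_a"),
  decimalLongitude = c(-82.3, -81.9),
  decimalLatitude = c(29.6, 30.1)
)

# Keep columns 2 and 3 by position, then assign standard names
coords <- occurrences[, c(2, 3)]
names(coords) <- c("longitude", "latitude")
print(coords)
```

Selecting by position only works if every dataset stores longitude and latitude in the same column order, so it is worth checking that before relying on it.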