Introduction

  • Large language models are a form of machine learning, or AI, that can be used for generating text in response to written prompts
  • Who’s heard about ChatGPT and other similar models?
  • One of the kinds of text these models can generate is code

  • The very simplistic version of how these models work is that they look at a string of words and figure out what word is most likely to come next
  • They learn the most likely next word by looking at millions of examples from the internet
  • In other words, they are an advanced form of autocomplete or a parrot
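
The "most likely next word" idea can be sketched with a toy example in R. This is only an illustration under strong simplifying assumptions: the three-sentence corpus and the helper predict_next() are hypothetical, and real LLMs use neural networks trained on vastly more text rather than a simple table of counts.

```r
# Toy "training data": a hypothetical handful of sentences
corpus <- c(
  "the cat sat on the mat",
  "the cat ate the fish",
  "the dog sat on the rug"
)

# Split each sentence into its words
words_by_sentence <- strsplit(corpus, " ")

# Build a table of (word, next word) pairs across all sentences
pairs <- do.call(rbind, lapply(words_by_sentence, function(w) {
  data.frame(
    word = head(w, -1),       # every word except the last
    next_word = tail(w, -1),  # the word that follows each of them
    stringsAsFactors = FALSE
  )
}))

# Hypothetical helper: return the word that most often follows `word`
predict_next <- function(word) {
  followers <- pairs$next_word[pairs$word == word]
  counts <- table(followers)
  names(counts)[which.max(counts)]
}

predict_next("the")  # "cat" (follows "the" twice, more than any other word)
predict_next("sat")  # "on"
```

The sketch never checks whether its output is true or sensible. It only repeats whatever followed a word most often in its examples.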

  • Since there is lots of code written by software developers on the internet they are pretty good at generating code
  • And since there are lots of lessons on how to learn to code on the internet they can also be good at generating text that explains code

Examples

  • Open ChatGPT
  • Let’s prompt ChatGPT to solve something like we’ve been working on
  • Enter the following prompt

How do you calculate the sum of the vector numbers <- c(2.1, 2.7, 2.7, 3.2, 2.9, NA, 3.9, 2.1, 4.5, 2.6) in R?

  • ChatGPT will produce a code answer with a variable holding the sum
  • This looks like the right answer and it even saw the NA, handled it appropriately, and explained that
  • Let’s prompt ChatGPT for the result of this code

What is the value of ?

  • Copy the code into R and run it
  • The result returned is currently wrong, but that could change
  • In this case the LLM is wrong
  • It can’t run code and it doesn’t know how to do math; it just knows that when the word “sum” is used with a string of numbers that looks roughly like this one, a number that looks roughly like 24.7 tends to follow
  • So, LLMs can be powerful, but also wrong
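
As a reference for this demonstration, here is a minimal sketch of the kind of code such a prompt tends to produce, together with a check of the claimed result. The exact code ChatGPT returns varies from run to run, so the variable names total and llm_claim are hypothetical, and 24.7 is simply the wrong value mentioned in the notes above.

```r
numbers <- c(2.1, 2.7, 2.7, 3.2, 2.9, NA, 3.9, 2.1, 4.5, 2.6)

# `total` is a hypothetical name; ChatGPT may choose a different one
# na.rm = TRUE drops the NA before adding; without it sum() returns NA
total <- sum(numbers, na.rm = TRUE)
print(total)  # [1] 26.7

# Compare what R computed with the value the LLM claimed
llm_claim <- 24.7
isTRUE(all.equal(total, llm_claim))  # FALSE
```

Running the code is the only reliable check. The generated code can be correct even when the number the model reports for it is not.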

Right answer wrong approach

  • Because LLMs aren’t specifically designed for this course, they may show you ways to do things that we aren’t learning

  • Start a new Chat
  • Type the following prompt

In the R programming language use code to print the sum of the following vector.

numbers <- c(2.1, 2.7, 2.7, 3.2, 2.9, NA, 3.9, 2.1, 4.5, 2.6, 2.9, 3.1)

  • Because of the small differences in the phrasing of the question (and the stochasticity of LLMs) we get a different answer
  • It still works, but it’s more complicated, and it’s not the approach that we’re learning
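
To make "more complicated" concrete, here is a hypothetical illustration of the kind of answer meant here. It is not the actual ChatGPT output, which changes from run to run. It assumes the model chose a loop instead of a single call to sum().

```r
numbers <- c(2.1, 2.7, 2.7, 3.2, 2.9, NA, 3.9, 2.1, 4.5, 2.6, 2.9, 3.1)

# A more complicated approach: add the values one at a time, skipping NA
total <- 0
for (value in numbers) {
  if (!is.na(value)) {
    total <- total + value
  }
}
print(total)  # [1] 32.7

# A one-line alternative that gives the same result
print(sum(numbers, na.rm = TRUE))  # [1] 32.7
```

Both versions produce the same sum, but the loop is harder to read and uses constructs we may not have covered yet.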

Using LLMs for learning

  • There are a variety of meaningful ethical concerns about using LLMs
    • They use a lot of energy to train and run and therefore put a lot of CO2 in the atmosphere
    • They use millions of people’s work without credit or payment, arguably in violation of copyright and licenses
    • And since they parrot what’s on the internet they often reproduce a lot of bias and bigotry
  • That said, LLMs can be useful for learning and you are welcome to use them for this in this course
  • Using them to directly answer the exercises won’t help you learn, because humans need practice to learn
  • That’s the only reason we have exercises
  • So what are useful ways to use them?
  • You can prompt them to explain things you don’t understand and they will parrot relevant advice from material on the web
  • This can be easier, especially for folks learning to code, than trying to search for a specific site that has the answer
  • You can also use them to help debug your code, which we’ll talk about more when we talk about debugging

Using LLMs for work

  • Once you’ve finished the course you can use them to automate things you already know how to do
  • But LLMs are parrots, so you need to know enough to check that the model produced a valid result