Teaching: 10 min
Exercises: 0 min
  • What is OpenRefine useful for?

  • Describe OpenRefine’s uses and applications.

  • Differentiate data cleaning from data organization.

  • Experiment with OpenRefine’s user interface.

  • Locate helpful resources to learn more about OpenRefine.


Motivations for the OpenRefine Lesson

Before we get started

The following setup is necessary before we can get started (instructions here).

Do you need help with any of the following?

What is OpenRefine?

It can help you:

OpenRefine is a powerful, free and open source tool with a large, growing community of practice. More help can be found at

Basics of OpenRefine

You can find out a lot more about OpenRefine online and check out some great introductory videos. There is a Google Group that can answer a lot of beginner questions and problems. There is also an OpenRefine Google Plus community where you can find a lot of help, especially from community members in the life sciences. OpenRefine recipes, scripts, projects, and extensions are also available; you can copy them into your OpenRefine instance to run on your dataset.

The OpenRefine GitHub wiki page has a reference for the General Refine Expression Language (GREL).


Key Points

  • OpenRefine is a powerful, free and open source tool that can be used for data cleaning.

  • OpenRefine will automatically track any steps you take in working with your data.