OpenRefine for Social Science Data: Setup


The data for this lesson is a part of the Data Carpentry Social Sciences workshop. It is a teaching version of the Studying African Farmer-Led Irrigation (SAFI) database. The SAFI dataset represents interviews of farmers in two countries in eastern sub-Saharan Africa (Mozambique and Tanzania). These interviews were conducted between November 2016 and June 2017 and probed household features (e.g. construction materials used, number of household members), agricultural practices (e.g. water usage), and assets (e.g. number and types of livestock).

The data used in this lesson is a subset of the teaching version that has been intentionally ‘messed up’ for this lesson.

Download the data file to your computer.


For this lesson you will need OpenRefine (formerly Google Refine) and a web browser. Basic installation steps are provided on this page. The OpenRefine installation manual provides more details about installation, upgrades and configuration.

Note: OpenRefine is a Java program that runs on your own machine (not in the cloud). Its interface opens in your web browser, but no internet connection is needed for this lesson.
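
Because OpenRefine runs locally, you can confirm that it has started by checking whether anything is accepting connections on your own machine. The sketch below is optional and is not part of the lesson. It assumes OpenRefine is using its default local address, `127.0.0.1` on port `3333`; the function name `is_local_port_open` is made up for this illustration. If your installation uses a different port, pass that port instead.

```python
import socket

# Assumption: OpenRefine is listening on its default local address.
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 3333


def is_local_port_open(host=DEFAULT_HOST, port=DEFAULT_PORT, timeout=1.0):
    """Return True if something accepts TCP connections at host:port."""
    try:
        # create_connection raises OSError if the connection is refused
        # or does not complete within the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_local_port_open():
        print("Something is listening on 127.0.0.1:3333; OpenRefine is probably running.")
    else:
        print("Nothing is listening on 127.0.0.1:3333; OpenRefine may not be started yet.")
```

This check only shows that some program is listening on that port. It does not prove the program is OpenRefine, so opening the address in your browser remains the simplest confirmation.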

You do not need administrative rights on the computer to install OpenRefine. However, if anti-malware software blocks OpenRefine when you try to start it, you may need administrative rights to allow OpenRefine to run. OpenRefine is safe to run.