Exporting and Saving Data from OpenRefine
Overview
Teaching: 10 min
Exercises: 5 min
Questions
- How can we save and export our cleaned data from OpenRefine?
Objectives
- Save an OpenRefine project.
- Export cleaned data from an OpenRefine project.
Saving and Exporting a Project
In OpenRefine you can save or export a project. Doing so stores the data together with a record of all the cleaning and transformation steps you have applied. Once you have saved a project, you can reopen it and pick up exactly where you left off.
By default, OpenRefine saves your project. If you close OpenRefine and open it up again, you’ll see a list of your projects. You can click on any one of them to open it up again.
You can also export a project. This is helpful, for instance, if you want to send your raw data and cleaning steps to a collaborator, or share this information as a supplement to a publication.
- Click the Export button in the top right and select the option to export the project.
- A tar.gz file will download to your default Download directory. The tar.gz extension tells you that this is a compressed archive, which means that it contains multiple files. You can double-click on the tar.gz file and it will expand into a directory. A folder icon will now appear.
  - On Windows, opening tar.gz files may require additional software such as 7-zip or WinZip. Download and run the installer of your choice.
  - Double-click the exported tar.gz file. If Windows asks how you want to open the file, check the ‘Always use this app to open .gz files’ box, then select ‘More apps’.
  - If your chosen application is not listed, select ‘Look for another app on this PC’.
  - In the file browser, navigate to C:\Program Files, find the application you installed, and double-click on its executable (7zFM, for example).
- Look at the files that appear in this folder. What files are here? What information do you think these files contain?
You should see:
- A history folder which contains three zip files. Each of these files itself contains a change.txt file. These change.txt files are the records of each individual transformation that you did to your data.
- A data.zip file. When expanded, this zip file includes a file called data.txt, which is a copy of your raw data. You may also see other files.
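If you would rather not install extra software, the exported archive can also be inspected from a script. The following is a minimal sketch using Python's standard tarfile module. The function names are made up for illustration, and the exact file names inside your archive may differ from the ones listed above. Only extract archives that come from a source you trust.

```python
import tarfile
from pathlib import Path


def list_archive_members(archive_path):
    """Return the names of all entries inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return archive.getnames()


def extract_archive(archive_path, target_dir):
    """Extract a .tar.gz archive into target_dir and return that directory."""
    target = Path(target_dir)
    # Create the destination folder if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Unpacks every entry; use this only on archives you trust.
        archive.extractall(target)
    return target
```

For example, calling list_archive_members with the path to your downloaded file (a hypothetical my-project.openrefine.tar.gz) should return a list of names that includes the history entries and data.zip described above. Calling extract_archive with the same path and a folder name unpacks the archive into that folder.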
You can import an existing project into OpenRefine by clicking Open... in the upper right > Import Project and selecting the exported tar.gz project file. This project will include all of the raw data and cleaning steps that were part of the original project.
Exporting Cleaned Data
You can also export just your cleaned data, rather than the entire project.
- Click Export in the top right and select the file type you want to export the data in. Tab-separated values (tsv) or comma-separated values (csv) would be good choices.
- That file will be exported to your default Download directory. It can then be opened in a spreadsheet program or imported into programs like R or Python, which we’ll be discussing later in our workshop.
Remember from our lesson on Spreadsheets that using widely-supported, non-proprietary file formats like csv improves the ability of yourself and others to use your data.
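An exported file can be read in Python with only the standard library. Here is a minimal sketch using the built-in csv module. The function name is hypothetical, and the sketch assumes the export is UTF-8 encoded with column names in the first row.

```python
import csv


def read_exported_table(path, delimiter=","):
    """Read a delimited text export and return (column_names, rows).

    Each row is a dict mapping column name to the cell value as a string.
    Pass delimiter="\t" for a tab-separated (tsv) export.
    """
    # Assumption: the file is UTF-8 encoded and its first row is a header.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        rows = list(reader)
        return reader.fieldnames, rows
```

For a csv export saved under a hypothetical name such as cleaned-data.csv, read_exported_table("cleaned-data.csv") returns the column names and one dictionary per row. All values come back as text, so numeric columns still need converting before any calculation.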
Key Points
- Cleaned data or entire projects can be exported from OpenRefine.
- Projects can be shared with collaborators, enabling them to see, reproduce and check all data cleaning steps you performed.