Exporting and Saving Data from OpenRefine
Last updated on 2023-07-24
- How can we get our cleaned data out of OpenRefine?
- How can we save the whole project with all history as a file?
- Export cleaned data from an OpenRefine project.
- Save an OpenRefine project as a shareable file.
When you have completed the cleaning steps, you will probably want to save the cleaned dataset as a new file so that you can analyse the data further in other applications. OpenRefine lets you do this by exporting the data in various file formats.
- Click “Export” in the top right and select the file type you want to export the data in. Tab-separated values (tsv) or Comma-separated values (csv) would be good choices.
- OpenRefine creates a file whose name is based on the project name and asks the browser to download it. Depending on your browser settings, this file is automatically saved in the default location for downloaded files, or you see a dialog window to choose where you want to save the file.
The downloaded file can then be opened in a spreadsheet program or imported into programs written in R or Python, for example.
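As a minimal sketch of the Python route, the example below reads a tab-separated export with the standard library `csv` module. The function name `read_export` and the sample column names and values are made up for illustration; an in-memory string stands in for the downloaded file.

```python
import csv
import io


def read_export(handle, delimiter="\t"):
    """Return the rows of an exported table as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter=delimiter))


# Stand-in for an exported tsv file; the columns and values are invented.
sample = io.StringIO("name\tyear\nalpha\t2016\nbeta\t2017\n")
rows = read_export(sample)
print(len(rows), "rows; columns:", list(rows[0].keys()))
```

To read a real export, open the downloaded file instead of the sample, for example with `open("my-export.tsv", newline="", encoding="utf-8")` (a hypothetical file name, and assuming the export is UTF-8 encoded), and pass `delimiter=","` for a csv file.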
Remember from our lesson on Spreadsheets that using widely-supported, non-proprietary file formats like tsv or csv improves the ability of yourself and others to use your data.
OpenRefine only operates on rows that match all enabled filters. This is also true for exporting data. So if you want to export a selection from a larger dataset, you can use filters and facets to select what data you want to export.
However, if you want to export all data and forget to reset all facets and filters, the exported dataset will be incomplete. OpenRefine does not warn you about enabled filters when you export data.
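Because there is no warning, one way to catch an incomplete export is to compare the number of rows in the exported file with the row count you expect from the project. The following is only a sketch of that check: the function names and the sample data are hypothetical, and you supply the expected row count yourself.

```python
import csv
import io


def count_data_rows(handle, delimiter="\t"):
    """Count the rows in an exported table, not counting the header line."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # skip the header row
    return sum(1 for _ in reader)


def export_is_complete(handle, expected_rows, delimiter="\t"):
    """Return True if the export has exactly as many data rows as expected."""
    return count_data_rows(handle, delimiter) == expected_rows


# Invented example: the project has 3 rows, but a facet was still enabled
# during export, so only 2 rows were written to the file.
exported = io.StringIO("name\tyear\nalpha\t2016\nbeta\t2017\n")
complete = export_is_complete(exported, expected_rows=3)
print("Export complete:", complete)
```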
In addition to exporting the data, you can export the project itself. When you export the project, OpenRefine creates a single file that includes the data and all the information about the cleaning and data transformation steps that you have taken.
You can use this file as a project backup, transfer it to another computer to continue working on the data or share it with a collaborator who can open it to see what you did and continue the work.
By default, OpenRefine saves your project continuously while you work on it. If you close OpenRefine and open it again, you can see a list of your projects when you select “Open Project” on the start screen. You can open an existing project by clicking on its title.
In this exercise, we will export the project and examine the contents of the exported file.
- Click the “Export” button in the top right and select “OpenRefine project archive to file”.
- OpenRefine then presents a tar.gz file for download. Depending on your browser you may have to specify where you want to save the file, or it may be downloaded to your default directory for downloaded files. The tar.gz extension tells you that this is a compressed file. The downloaded tar.gz file is actually a folder of files which have been compressed. Linux and Mac machines will have software installed to automatically expand this type of file when you double-click on it. For Windows-based machines you may have to install a utility like ‘7-zip’ in order to expand the file and see the files in the folder.
- After you have expanded the file, look at the files that appear in this folder. What files are here? What information do you think these files contain?
You should see:
- A history folder which contains a collection of zip files. Each of these files itself contains a change.txt file. These change.txt files are the records of each individual transformation that you did to your data.
- A data.zip file. When expanded, this zip file includes a file called data.txt which is a copy of your raw data. You may also see other files.
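If you prefer to inspect the archive without a separate utility, Python's standard `tarfile` and `zipfile` modules can list its contents. The sketch below is illustrative only: it builds a tiny stand-in archive in memory that follows the layout described above, and the member name `history/1.change.zip`, the helper names, and the placeholder contents are all made up. A real archive will have different file names and may contain additional files.

```python
import io
import tarfile
import zipfile


def list_archive(archive_bytes):
    """Return the member names of a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()


def list_nested_zip(archive_bytes, member_name):
    """Return the file names inside a zip file stored in the tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        zipped = tar.extractfile(member_name).read()
    with zipfile.ZipFile(io.BytesIO(zipped)) as nested:
        return nested.namelist()


def make_zip(inner_name, text):
    """Build a zip file in memory holding one text file (demo helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(inner_name, text)
    return buffer.getvalue()


def make_demo_archive():
    """Build a small stand-in for a project archive (invented contents)."""
    members = {
        "data.zip": make_zip("data.txt", "placeholder data"),
        "history/1.change.zip": make_zip("change.txt", "placeholder change"),
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, payload in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


demo = make_demo_archive()
print(list_archive(demo))
print(list_nested_zip(demo, "data.zip"))
print(list_nested_zip(demo, "history/1.change.zip"))
```

To try this on your own export, read the downloaded file in binary mode, for example `open("my-project.tar.gz", "rb").read()` (a hypothetical file name), and pass the resulting bytes to `list_archive`.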
You can import an existing project into OpenRefine by clicking “Open...” in the upper right, then opening the “Import Project” tab and selecting the