
A Data Carpentry Workshop

National Institutes of Health, NIH Library Training Room

May 26-27, 2016

9:00 am - 4:30 pm

Instructors: Jason Williams, Ryan Dale, Adam Thomas

Helpers: Vinai Roopchansingh, John Lee

General Information

Data Carpentry workshops are for any researcher who has data they want to analyze, and no prior computational experience is required. This hands-on workshop teaches basic concepts, skills and tools for working more effectively with data.

We will cover cloud computing and the command line for genomics, as well as data analysis and visualization in R. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at graduate students and other researchers.

Where: 10 Center Dr, Bethesda, MD 20892. Get directions with OpenStreetMap or Google Maps.

The workshop is in the NIH Library training room in Building 10 (interior map). To get to the Library, enter Building 10 through the South Entrance; the Library is the only door down the left corridor.

Requirements: Participants must bring a laptop with a few specific software packages installed (listed below). They are also required to abide by Data Carpentry's Code of Conduct.

Contact: Please mail for more information. Registration is directly through NIH at

Pre-Survey: Please take this pre-survey prior to the workshop.

We will use results collected by May 20th, 2016 to make fine adjustments to the agenda.


Day 2: 9AM - 4:30PM

Morning: Using Linux to organize and process Genomics Data
Intro to the Linux Shell - Searching and Metadata Ryan
Project Organization and Documentation Adam
'For' loops - QC of Sequencing Data Jason
Afternoon: Using Linux to Automate
Automating Analyses - Shell Scripting Adam
Creating Workflows - Variant Calling Workflow Jason
Workshop Conclusion: Please take the post-survey

Post Workshop

How to Make This Work on Your Own
Launching Your Own Cloud Instances On Your Own

We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


To participate in a Data Carpentry workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of your workshop. You should bring and use your own laptop to ensure that your tools are properly set up for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions.

We maintain a list of common issues that occur during installation on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may be useful to participants as well.