
A Genomics Data Carpentry Workshop

Stony Brook University

January 19-20

9:00 am - 4:30 pm

Instructors: Jason Williams, Sheldon McKay, Matthew Aiello-Lammens

Helpers: Joslynn Lee, Laura Graham, Jonathan Borrelli

General Information

This Genomics Data Carpentry workshop is for learning how to manage and analyze genomic data. This hands-on workshop requires no prior computational experience and teaches basic concepts, skills and tools for working more effectively with genomic data.

We will cover data and project organization, cloud computing and the command line, running bioinformatics pipelines at the command line, and data analysis and visualization with R. Participants should bring their laptops and plan to participate actively. By the end of the workshop, learners should be able to manage and analyze data more effectively and to apply the tools and approaches directly to their ongoing research.

Who: The course is aimed at graduate students and other researchers.

Where: Wang Center - Lecture Hall 2; 100 Nicolls Road, Stony Brook, NY 11790. Get directions with OpenStreetMap or Google Maps.

Requirements: Participants must bring a laptop with a few specific software packages installed (to be listed below soon). They are also required to abide by Data Carpentry's Code of Conduct.

Contact: Please mail for more information.

Fee: Fees will cover coffee and snacks. Waivers can be requested by contacting the organizer.

Agenda and Lessons

Day 1: 9AM - 5PM

Intro, Data Processing, and Organization
Pre-Survey (only take if you did not complete the link sent via email)
Intro to Data Carpentry - Jason
Introduction - Jason
Data tidiness - Jason
Getting started with the data - Jason
Getting your project started - Jason
Command line exploration of the data (Unix shell) - Sheldon
Quality Control of NGS Data - Jason

Day 2: 9AM - 5PM

NGS Data analysis
Know your data - Jason
Automating a workflow with a shell script - Sheldon
Automating a variant calling workflow - Sheldon
Data analysis and visualization with R
Data analysis and visualization with R - Matthew
Moving Your Data - Jason
Please complete the post-workshop Survey - Everyone

Post-Workshop (or using these lessons on your own)

Launching cloud instances after the workshop
Launching your own instance
Note: Start here if you are working on the lessons on your own
On your own

We will use this Etherpad for chatting, taking notes, and sharing URLs and bits of code.


To participate in a Data Carpentry workshop, you will need working copies of the described software. Please make sure to install everything (or at least to download the installers) before the start of your workshop. You should bring and use your own laptop to ensure the proper setup of tools for an efficient workflow once you leave the workshop.

Please follow these Setup Instructions

We maintain a list of common issues that occur during installation on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but may also be useful to you.