Explain that this lesson teaches data organization, and that we’re using
spreadsheets because most people enter their data in spreadsheets or
already have data in them.
Emphasize that we are teaching good practice in data organization and that
this is the foundation of their research practice. Without organized and clean
data, it will be difficult for them to apply the things we’re teaching in the
rest of the workshop to their data.
Much of their time as researchers will be spent on this ‘data wrangling’ stage, but
some of it can be avoided with good strategies for data collection up front.
Tell learners that we’re not teaching data analysis or plotting in spreadsheets, because doing so is
very manual and not reproducible. That’s why we’re teaching bash shell scripting!
Now talk about spreadsheets. When we say spreadsheets, we mean any spreadsheet
program, such as Excel, LibreOffice, or OpenOffice. Most learners are probably using Excel.
Ask the audience about things they’ve accidentally done in spreadsheets. Share an example of your own, such as accidentally sorting a single column without the rest
of the data in the spreadsheet. What are the pain points?
As people answer, highlight some of these issues with spreadsheets.
Go through the point about keeping track of your steps and keeping raw data raw.
Go through the cardinal rule of spreadsheets about columns, rows, and cells.
Hand them a messy data file and have them pair up and work together to clean up the data.
Learners should not actually download the ENA files in the “Downloading a few sequencing files: EMBL-EBI” section.
Now your data is organized so that a computer can read and understand it. This
lets you use the full power of the computer for your analyses as we’ll see in the
rest of the workshop.
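If it helps to make the single-column sort example concrete, here is a minimal sketch of what goes wrong. It uses plain Python lists rather than a spreadsheet, and the sample names and read counts are made up purely for illustration.

```python
# Hypothetical sample data: each row is (sample_id, read_count).
rows = [
    ("sample_C", 300),
    ("sample_A", 100),
    ("sample_B", 200),
]

# The mistake: sort only the first column and leave the second one in place.
ids_sorted = sorted(sample_id for sample_id, _ in rows)
counts_untouched = [count for _, count in rows]
broken = list(zip(ids_sorted, counts_untouched))

# The fix: sort whole rows so each count stays with its own sample.
correct = sorted(rows)

print("single column sorted:", broken)
print("whole rows sorted:   ", correct)
```

In the first result every sample is paired with another sample's count, and nothing in the data itself signals the error. This is the same silent corruption that happens when only one spreadsheet column is selected before sorting.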
Technical tips and tricks
Provide information on setting up your environment for learners to view your
live coding (increasing text size, changing text color, etc), as well as
general recommendations for working with coding tools to best suit the
Excel looks and acts different on different operating systems
The main challenge with this lesson is that Excel looks very different, and even how you
do things differs, between Mac and PC and between different versions of
Excel. So the presenter’s environment will match that of only some of the learners.
We need better notes and screenshots of how things work on both Mac and PC. But we
likely won’t be able to cover all the different versions of Excel.
If you have a helper who has experience with the OS you don’t use, it would be good
to prep them to help with this lesson and to tell people how to do things in that OS.
People are not interactive or responsive on the exercises
This lesson depends on people working on the exercise and reporting the things
they fixed. If your audience is reluctant to participate, start out with
some things on your own, or ask a helper for their answers. This generally gets
even a reluctant audience started.