File Organization: Naming

Overview

Teaching: 30 min
Exercises: 10 min
Questions
  • What are the common file organization errors?

  • What are best practices for file organization?

Objectives
  • Highlight common SNAFUs

  • Learn three principles for naming files

Names matter

NO

myabstract.docx
Joe’s Filenames Use Spaces and Punctuation.xlsx
figure 1.png
fig 2.png
JW7d^(2sl@deletethisandyourcareerisoverWx2*.txt

YES

2014-06-08_abstract-for-sla.docx
joes-filenames-are-getting-better.xlsx
fig01_scatterplot-talk-length-vs-interest.png
fig02_histogram-talk-attendance.png
1986-01-28_raw-data-from-challenger-o-rings.txt

Three principles for (file) names:

  1. Machine readable
  2. Human readable
  3. Plays well with default ordering

Awesome file names :)


Machine readable


Globbing

Excerpt of complete file listing:

[Figure: file listing excerpt]


Example of globbing to narrow file listing:

[Figure: file listing narrowed with a glob pattern]
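
A minimal sketch of globbing in Python, assuming the "YES" file names from earlier stand in for a directory listing. `fnmatch.filter` applies shell-style wildcards to a list of names; against a real directory, `glob.glob("fig*.png")` would do the same matching on disk.

```python
import fnmatch

# File names taken from the "YES" list above; they stand in for a directory listing.
names = [
    "2014-06-08_abstract-for-sla.docx",
    "joes-filenames-are-getting-better.xlsx",
    "fig01_scatterplot-talk-length-vs-interest.png",
    "fig02_histogram-talk-attendance.png",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
]

# "*" matches any run of characters, so this keeps every PNG whose name starts with "fig".
figures = fnmatch.filter(names, "fig*.png")
print(figures)
```

The pattern only works because the figure files share a consistent prefix and extension.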


Same using Mac OS Finder search facilities:

[Figure: Mac OS Finder search results]


Same using regex in R:

[Figure: file names filtered with a regex in R]
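
The original example is in R; as a rough Python equivalent, the sketch below uses the standard `re` module to keep only names that start with a date. The file names are the "YES" names from earlier, and the pattern is an illustrative choice, not the one used in the lesson.

```python
import re

# File names taken from the "YES" list above.
names = [
    "2014-06-08_abstract-for-sla.docx",
    "joes-filenames-are-getting-better.xlsx",
    "fig01_scatterplot-talk-length-vs-interest.png",
    "fig02_histogram-talk-attendance.png",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
]

# Four digits, two digits, two digits, then the "_" that ends the date field.
iso_date_prefix = re.compile(r"^\d{4}-\d{2}-\d{2}_")

dated = [name for name in names if iso_date_prefix.search(name)]
print(dated)
```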


Punctuation

Deliberate use of “-” and “_” allows recovery of metadata from the file names:

[Figures: metadata recovered from the file names]

This happens to be R but also possible in the shell, Python, etc.
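
As one possible Python version, the sketch below assumes that "_" separates a leading field from the rest of the name and that "-" separates words. The helper `parse_name` and its dictionary keys are hypothetical names chosen for illustration.

```python
from pathlib import PurePath


def parse_name(filename):
    """Split 'field_word-word.ext' into its parts (hypothetical helper)."""
    path = PurePath(filename)
    # "_" separates the leading field from the descriptive part of the name.
    field, _, description = path.stem.partition("_")
    return {
        "field": field,
        "words": description.split("-"),  # "-" separates words in the description
        "ext": path.suffix,
    }


info = parse_name("fig01_scatterplot-talk-length-vs-interest.png")
print(info)
```

A name with spaces or ad hoc punctuation would need a custom parser for every file; a consistent delimiter scheme needs only one.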


Recap: machine readable

Human readable


Example

Which set of file(name)s do you want at 3 a.m. before a deadline?

[Figure: two sets of file names, side by side]


Embrace the slug
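
A small sketch, reading "slug" as the descriptive, human-readable part of a name (an assumption, since this section does not define the term). The hypothetical helper `make_slug` turns free text into lowercase words joined by hyphens, which keeps the name machine readable as well.

```python
import re


def make_slug(title):
    """Turn free text into a lowercase, hyphen-separated slug (hypothetical helper)."""
    text = title.lower()
    text = re.sub(r"['’]", "", text)         # drop apostrophes so "joe’s" becomes "joes"
    text = re.sub(r"[^a-z0-9]+", "-", text)  # any other run of non-alphanumerics becomes one "-"
    return text.strip("-")


print(make_slug("Joe’s Filenames Use Spaces and Punctuation"))
```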



Recap: Human readable

Easy to figure out what the heck something is, based on its name


Plays well with default ordering

Examples

Chronological order:

[Figure: file listing in chronological order]


Logical order: Put something numeric first

[Figure: file listing with numeric prefixes]


Dates: Use the ISO 8601 standard for dates: YYYY-MM-DD

[Figure: file names with ISO 8601 dates]
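
A quick check of why this matters, sketched in Python. With YYYY-MM-DD at the start of the name, plain alphabetical sorting is also chronological sorting. The third ISO name and both MM-DD-YYYY names are hypothetical, added only for contrast.

```python
from datetime import date

iso_names = [
    "2014-06-08_abstract-for-sla.docx",
    "1986-01-28_raw-data-from-challenger-o-rings.txt",
    "2013-12-01_hypothetical-notes.txt",  # hypothetical name, added for illustration
]


def leading_date(name):
    """Parse the YYYY-MM-DD prefix (the first 10 characters) into a date."""
    return date.fromisoformat(name[:10])


# Alphabetical order and chronological order agree for ISO 8601 prefixes.
print(sorted(iso_names) == sorted(iso_names, key=leading_date))

# Hypothetical MM-DD-YYYY names: alphabetical order puts 2014 before 2013.
us_names = ["12-01-2013_hypothetical-notes.txt", "06-08-2014_abstract-for-sla.docx"]
print(sorted(us_names))
```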


[Figure: image from Twitter]


Left pad other numbers with zeros

[Figure: file listing with zero-padded numbers]

If you don’t left pad, you get this:

 10_final-figs-for-publication.R
 1_data-cleaning.R
 2_fit-model.R

which is just sad :(
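
The ordering above, and one way to repair it, can be reproduced in a few lines of Python. The helper `pad_prefix` is hypothetical, and the default width of 2 assumes fewer than 100 numbered files.

```python
def pad_prefix(name, width=2):
    """Zero-pad the number before the first "_" (hypothetical helper)."""
    number, sep, rest = name.partition("_")
    return number.zfill(width) + sep + rest


names = ["10_final-figs-for-publication.R", "1_data-cleaning.R", "2_fit-model.R"]

# Unpadded names sort character by character, so "10_" lands before "1_".
print(sorted(names))

# Padded names sort in the intended numeric order.
padded = sorted(pad_prefix(name) for name in names)
print(padded)
```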


Recap: Plays well with default ordering

Recap


Three principles for (file) names

  1. Machine readable
  2. Human readable
  3. Plays well with default ordering

Pros


Go forth and use awesome file names :)


Key Points

  • File organization is important: choose file names that are machine readable, human readable, and play well with default ordering.