Reason number 1: reproducibility helps to avoid disaster
Reason number 2: reproducibility makes it easier to write papers
Reason number 3: reproducibility helps reviewers see it your way
Reason number 4: reproducibility enables continuity of your work
Reason number 5: reproducibility helps to build your reputation
Whom do we need to share with?
peer reviewers & journal editors
broad scientific community
the general public
For research to be reproducible, the research products (data, code) need to be publicly available in a form that people can find and understand.
Catalog the artifacts you produced this morning
What needs to be published?
What does not need to be published?
Anything that cannot be published?
starting data set (raw data)
data cleaning steps
processed / cleaned data
confidential (e.g., patient) data
material already published
pre-existing restrictive license
passwords, private keys
Advice: One way to determine what you need to publish is to redo the analyses in your paper, noting the data, code, and notes you needed along the way. Make sure all of that is available. This might seem time-consuming, but it ensures that what you think you did is what you actually did.
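As a rough illustration of that advice, the sketch below records every file in an analysis folder with its size and SHA-256 checksum, then checks a later copy against that record. It is only one possible approach, not a prescribed workflow: the function names `build_manifest` and `missing_files` and the folder layout are hypothetical, and it assumes all of the data, code, and notes for the analysis live under a single directory.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Return a sorted list of (relative_path, size_bytes, sha256) for files under root."""
    root = Path(root)
    manifest = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            data = path.read_bytes()
            manifest.append(
                (
                    path.relative_to(root).as_posix(),
                    len(data),
                    hashlib.sha256(data).hexdigest(),
                )
            )
    return manifest


def missing_files(manifest, root):
    """Return (relative_path, problem) pairs for files that are absent or changed."""
    root = Path(root)
    problems = []
    for rel, _size, digest in manifest:
        path = root / rel
        if not path.is_file():
            problems.append((rel, "missing"))
        elif hashlib.sha256(path.read_bytes()).hexdigest() != digest:
            problems.append((rel, "changed"))
    return problems
```

One way to use it: build the manifest in the folder where the analysis was redone, then call `missing_files` on the copy you intend to publish. An empty result suggests that the published copy contains the same files you actually needed.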
The Open Definition sets out principles that define “openness” in relation to data and content. It makes precise the meaning of “open” in the terms open data, open content, and open source:
“Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness).”
“Open data and content can be freely used, modified, and shared by anyone for any purpose”
CC0 enables scientists, educators, artists and other creators and owners of copyright- or database-protected content to waive those interests in their works and thereby place them as completely as possible in the public domain, so that others may freely build upon, enhance and reuse the works for any purposes without restriction under copyright or database law.
[…] in the scholarly research community the act of citation is a commonly held community norm when reusing another community member’s work.
Community norms can be a much more effective way of encouraging positive behaviour, such as citation, than applying licenses. A well functioning community supports its members in their application of norms, whereas licences can only be enforced through court action and thus invite people to ignore them when they are confident that this is unlikely.
Licenses are legal instruments.
There are legal implications to your choices.
Citation is a professional norm in science.
We have good systems for ensuring proper citation.
Would you try to sue someone in court who fails to cite you properly?
Keep it simple by choosing the least restrictive license possible.
Let scientists do science without having to talk to lawyers.
Challenges and concerns about publishing data and code
What are some of the challenges of publishing research products? What are some of the concerns that people have?