Case studies in reproducible research in bioinformatics
and other disciplines
This web page was created to provide convenient links to
resources on reproducible research in bioinformatics.
The term "reproducible research" is used here, without formal definition,
to refer to a scheme of conducting and publishing quantitative research.
Hallmarks of the scheme are:
- publication of versioned instances of numerical and textual data and
metadata underlying quantitative analyses;
- publication of computer program code used to generate numerical
summaries and statistical reports;
- publication of documentation that links the results of programming
tasks to elements of the analytical report;
- reasonable support for consumers of research who wish to
recreate and, possibly, enhance published research findings.
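As a rough illustration of the first hallmark, one way to pin down a "versioned instance" of data and code is to publish a checksum manifest alongside the files. The sketch below is only one possible approach, not a method taken from any of the resources linked here; the function names (sha256_of, build_manifest, changed_files, write_manifest) and the directory layout are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file under root (as a relative POSIX path) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict) -> list:
    """List paths that were added, removed, or modified relative to the manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(current) | set(manifest)
        if current.get(name) != manifest.get(name)
    )


def write_manifest(root: Path, out_file: Path) -> None:
    """Save the manifest as JSON; out_file is assumed to live outside root."""
    out_file.write_text(json.dumps(build_manifest(root), indent=2, sort_keys=True))
```

A producer might call write_manifest on the directory holding data and code at publication time, and a consumer might call changed_files on a downloaded copy to check that it matches the published version before attempting to recreate the findings.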
Reproducible research activities involve
considerable bookkeeping and coordination efforts for
producers, and may require nontrivial computational
resources and skills for consumers.
The distinction between
research producer and consumer is not always clear; adoption
of reproducible research disciplines is often found to be
useful in research production, regardless of the impact on
research consumers, principally through auditing
and archiving practices that support concrete reproducibility.
It would be useful
to establish categories of best practices in quantitative
scientific work that contribute to efficiencies of reproducible
research for both producers and consumers of research artifacts.
The resources linked on this page represent concrete instances
of reproducible research discipline, and literature describing the
discipline.
- Work of Coombes and Baggerly of M.D. Anderson Cancer Center
shows how published reports can be carefully examined for
validity; they use reproducible research discipline to document
difficulties encountered in reproducing published findings.
- Work of Gentleman and Temple Lang defines workflows and components for
creating reproducible research artifacts. A link is included to Gentleman's
compendium for the reproduction of a classic microarray analysis.
- Work of Peng and colleagues constitutes a compendium for
a large-scale epidemiologic study of air pollution and mortality.
- Work and sites created in other disciplines are collected in the
final link.
Made 22 Jan 2009 by VJ Carey stvjc at channing dot harvard dot edu.