Social Overtones in Model Reproducibility, Databasing, Sharing, and Dissemination
James Bassingthwaighte and Herbert Sauro
Bioengineering, University of Washington, Seattle
The Issue is: Most models aren’t reproducible!
• Publication ≠ reproduction, or even close.
• Databasing ≠ reproduction, but better.
• No standards for publication or review.
• Database formats are limited, mainly to ODEs.
• Database curation is incomplete.
• Attempts to disperse models widely have had limited success.
SIAM, Montreal 4 Aug 08
Why is the level of reproducibility so low?
• Are authors’ standards low? No pride in producing high-quality work? Or do they not know what it takes?
• Are numbers of publications more important than the quality of the science?
• Are funding agencies setting the wrong goals?
• Are universities setting the wrong priorities for tenure, and scientific societies the wrong criteria for senior fellows or other recognition?
• Or is it just too much work to make it right?
Checklist: Standards for reproducibility: Name and Description
Identification and description:
• Model name
• 1 to 2 line description
• A more detailed description: a paragraph
• The reference publication, e.g. Hodgkin-Huxley 1952d
• Pointer to the web source of the publication
• Related models: antecedents and successors
Checklist for reproducibility: Structure
Model Structure and Content:
• Domain definition (mito, cell, tissue, organ, system)
• Main variables (pressures, flows, concentrations, etc.), units
• Parameters, units, sources for values
• References for subsidiary models
• Inputs, outputs; maybe nodes, edges
• Ontology base for notation
• Numerical solvers used, conditions
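One way to make the structure checklist checkable by machine is to store each item in a record and flag whatever is left empty. The sketch below is illustrative only: the names `Parameter`, `ModelRecord`, and `missing_items`, and the choice of fields, are hypothetical and do not come from the talk or from any existing standard.

```python
from dataclasses import dataclass, field


@dataclass
class Parameter:
    """One model parameter with the three items the checklist asks for."""
    name: str
    value: float
    units: str
    source: str  # citation or dataset the value was taken from


@dataclass
class ModelRecord:
    """Hypothetical container for the name/description/structure checklist."""
    name: str = ""
    short_description: str = ""
    reference_publication: str = ""
    domain: str = ""                                 # e.g. cell, tissue, organ
    variables: dict = field(default_factory=dict)    # variable name -> units
    parameters: list = field(default_factory=list)   # list of Parameter
    solver: str = ""                                 # solver and its settings


def missing_items(record):
    """Return a list of checklist items that are still empty."""
    missing = []
    for attr in ("name", "short_description", "reference_publication",
                 "domain", "solver"):
        if not getattr(record, attr).strip():
            missing.append(attr)
    if not record.variables:
        missing.append("variables")
    for var, units in record.variables.items():
        if not units.strip():
            missing.append(f"units for variable {var}")
    if not record.parameters:
        missing.append("parameters")
    for p in record.parameters:
        if not p.units.strip():
            missing.append(f"units for parameter {p.name}")
        if not p.source.strip():
            missing.append(f"source for parameter {p.name}")
    return missing


if __name__ == "__main__":
    # Made-up, deliberately incomplete record used only to show the report.
    draft = ModelRecord(
        name="Example model",
        variables={"C": "mM"},
        parameters=[Parameter("k", 0.5, "1/s", "")],
    )
    for item in missing_items(draft):
        print("missing:", item)
```

A record like this could sit alongside the model code, so that a reviewer sees at a glance which items are still undocumented.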
Checklist for reproducibility: Verification
Verification of the mathematical representation:
• Unit balance on all equations, for parameters and variables.
• Mass balance, charge balance, energy balance, osmotic balance, thermodynamic balance. (Very difficult to achieve these. Most models achieve one at most.)
• Equations correct: same in code and paper.
• Numerical solutions checked against analytical solutions when these exist for limiting cases.
• Solutions show little dependence on step size.
• Running code supplied to reviewers and readers.
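Three of the verification items above (comparison with an analytical solution, step-size dependence, and mass balance) can be illustrated with a small test script. This is a minimal sketch under stated assumptions, not the authors' procedure: the first-order decay model, the closed two-compartment exchange model, the forward Euler integrator, and all parameter values are made up for illustration.

```python
import math


def euler_decay(c0, k, t_end, n_steps):
    """Integrate dC/dt = -k*C with forward Euler; return C(t_end)."""
    dt = t_end / n_steps
    c = c0
    for _ in range(n_steps):
        c += dt * (-k * c)
    return c


def step_size_study(c0=1.0, k=0.5, t_end=2.0, steps=(10, 20, 40, 80, 160)):
    """Compare the numerical result with C0*exp(-k*t) for several step counts."""
    exact = c0 * math.exp(-k * t_end)
    rows = []
    for n in steps:
        approx = euler_decay(c0, k, t_end, n)
        rows.append((n, approx, abs(approx - exact)))
    return exact, rows


def mass_drift(v1=1.0, v2=2.0, c1=1.0, c2=0.0, ps=0.3, dt=0.01, n_steps=1000):
    """Closed two-compartment exchange; return relative change in total mass.

    Flux ps*(c1 - c2) leaves compartment 1 and enters compartment 2, so the
    total v1*c1 + v2*c2 should stay constant apart from rounding error.
    """
    m0 = v1 * c1 + v2 * c2
    for _ in range(n_steps):
        flux = ps * (c1 - c2)
        c1 -= dt * flux / v1
        c2 += dt * flux / v2
    return abs(v1 * c1 + v2 * c2 - m0) / m0


if __name__ == "__main__":
    exact, rows = step_size_study()
    print(f"analytical C(t_end) = {exact:.6f}")
    for n, approx, err in rows:
        print(f"steps={n:4d}  numerical={approx:.6f}  abs error={err:.2e}")
    print(f"relative mass drift = {mass_drift():.2e}")
```

For a first-order method such as forward Euler, the error should roughly halve each time the step count doubles. A solution that does not converge this way, or a total mass that drifts, points to a coding or formulation error rather than to physiology.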
Checklist for reproducibility: Validation
Validation: Model is physiologically realistic
• Initial and boundary conditions defined and appropriate to the physiology and physical chemistry.
• Data testing the model provided, and the model demonstrated to be compatible with the data.
• Data include the anatomy, physico-chemical data, kinetic data on components, and other information known about the system.
• Model is descriptive first, then explanatory.
• Model is predictive, and therefore testable under the pressure of novel experiments.
• All parameter values justified, documented.
Checklist for reproducibility: Availability
Model code is available for download:
• Website or archive from which to retrieve model code, and to retrieve the original data used to test it. (MIRIAM)
• Code for a model should run on at least one standard system.
• Website or email for queries.
• Website for public commentary.
• References to alternative or successor models.
Benefits of Achieving High Standards
• The Nobel prizes go to novel reproducible work, likewise for election to National Academies, society leadership, etc.
• Demonstrably solid work leads the field. Brings good students and postdocs. Academic advancement is much faster.
• Allows others to build upon the work, enhancing the author’s prestige, especially since the work will be quoted, not just appreciated at a distance.
• Dissemination is more and more required by funding sources.
• Open source: Soon to be required by journals?
Obstructions to making models reproducible
• It’s more work, much more than writing the code, or the paper. (But should be done before the paper is submitted.)
• Effort not recognized by grant reviewers.
• Reveals trade secrets (= the lab’s advances over others).
• Inhibits commercialization?
Is it selfishness, or modesty, to keep one’s model half hidden?
• Takes too much time away from the next project?
• Author doesn’t feel that it’s good enough to have people work further on it. (The better is the enemy of the good, inhibiting information spread.)
• Can’t get a really solid detailed model published in a decent journal? (Things are improving.)
• Gives away your secrets? Lets everyone catch up with your (advanced) thinking? Loses your lead? Inhibits your work toward commercialization?
Complex Models are composed of reliable, reproducible modules
Multiscale modeling: (complicated)
• Work at highest level to compute at the speed of thought
• Reincorporate detailed levels when higher levels fail
Databasing and Collaboration
• Herb Sauro will discuss.