EDITORIAL

Trust and Recognition: Coming to Terms with Models

James E. Stahl, MD, CM, MPH

Special issues serve many purposes. Commonly they are used to bring attention to and focus on topics whose importance is increasing, unrecognized, or insufficiently recognized; to reprise the state of the art; or to provide an educational opportunity for an interested community. In this special issue, all of these goals are addressed. The articles by Enns and others, Jackson and others, and Kong explore developing methods used to calibrate models. O'Mahony and others and Goldhaber-Fiebert and Brandeau explore and elucidate time-related calibration issues. Alagoz and others, Eisemann and others, and Joranger and others use calibration methods to establish and report on foundational work for their modeling programs. An article by Kimmel and another by van Calster demonstrate the very different outcomes that can result when models are not calibrated appropriately, and the value of using local data to calibrate, validate, and refine models.

The word calibrate appears to have originated around the end of the early industrial age and the beginning of mass-produced artillery. It was important that the caliber of each cannon be uniform and consistent with the standard. The idea was then extended to other mechanical instruments. As one might imagine, in the age of steam, calibration was critical: poorly calibrated instruments could literally explode in your face.

Received 25 August 2014 from the Massachusetts General Hospital, Boston, MA (JES). Revision accepted for publication 14 November 2014. Address correspondence to James E. Stahl, Massachusetts General Hospital, MGH-Institute for Technology Assessment, 101 Merrimac St, 10th floor, Boston, MA 02114; e-mail: [email protected].

© The Author(s) 2014
Reprints and permission: http://www.sagepub.com/journalsPermissions.nav
DOI: 10.1177/0272989X14563080


Over time, calibration has come to be understood as comparing the output of a particular instrument to a known standard and adjusting the instrument so that its behavior, usually characterized by some output, is as close as possible to the standard.

As described in the recent SMDM-ISPOR consensus articles,1–5 calibration is central to the modeling world and embodies two main activities. First, it is used in its commonly understood form, involving a process of defining the set of parameters relevant to the problem, identifying calibration targets (often measures such as morbidity and mortality), selecting measures of goodness of fit, choosing a search strategy for identifying parameter values, and specifying stopping rules. Second, it is used to help interpolate parameter values that may not be directly measurable but are part of the explanatory model underpinning the analytic model.

There is another reason why calibration is so important: trust. Model-based research needs to demonstrate, even more than experimental or observational research,6 that models reflect reality and are robust and reproducible. Validation and calibration go hand in hand. Science is based on reproducibility and reliability, which are primary goals of calibration. Well-done, transparent calibration and validation of models are essential if they are to serve as the foundations of specific analyses or larger long-term research programs. To paraphrase Henri Poincaré, science is built from the accumulation of knowledge, brick by brick, but an accumulation of facts is not enough: both a strong foundation and a well-tested plan are necessary. Models synthesize the bricks of knowledge into a common construct, and calibration and validation test the soundness of that construct. But this too is not enough: science also demands that we show our work, so that the eyes of our peers can challenge and test our hypotheses and the explanatory models supporting them. Without this, errors can build over time, and our modern instruments too can blow up in our faces.
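To make the calibration workflow outlined above concrete (parameters, targets, goodness of fit, a search strategy, and a stopping rule), here is a minimal Python sketch. Everything in it is an assumption for illustration only: the toy disease model, the parameter ranges, the morbidity and mortality targets, and the random-search strategy are invented and do not come from any published model.

import math
import random

# 1. Parameters to calibrate: annual incidence and case fatality of a toy disease model.
PARAM_RANGES = {"incidence": (0.001, 0.05), "case_fatality": (0.01, 0.30)}

# 2. Calibration targets: hypothetical observed 10-year cumulative morbidity and
#    mortality per 1,000 people that the model should reproduce.
TARGETS = {"morbidity_per_1000": 180.0, "mortality_per_1000": 25.0}

def run_model(incidence: float, case_fatality: float, years: int = 10) -> dict:
    """Tiny stand-in for a simulation model: cumulative cases and deaths per 1,000."""
    healthy, cases, deaths = 1000.0, 0.0, 0.0
    for _ in range(years):
        new_cases = healthy * incidence
        healthy -= new_cases
        cases += new_cases
        deaths += new_cases * case_fatality
    return {"morbidity_per_1000": cases, "mortality_per_1000": deaths}

def goodness_of_fit(output: dict) -> float:
    """3. Goodness of fit: sum of squared relative deviations from the targets."""
    return sum(((output[k] - target) / target) ** 2 for k, target in TARGETS.items())

def calibrate(max_iter: int = 10_000, tolerance: float = 1e-4, seed: int = 1) -> tuple:
    """4. Search strategy: random search; 5. stopping rule: tolerance or iteration cap."""
    rng = random.Random(seed)
    best_params, best_fit = None, math.inf
    for _ in range(max_iter):
        params = {k: rng.uniform(*bounds) for k, bounds in PARAM_RANGES.items()}
        fit = goodness_of_fit(run_model(**params))
        if fit < best_fit:
            best_params, best_fit = params, fit
        if best_fit < tolerance:
            break
    return best_params, best_fit

if __name__ == "__main__":
    params, fit = calibrate()
    print(f"best-fitting parameters: {params}, goodness of fit: {fit:.5f}")

In practice the toy model would be replaced by the actual simulation, the targets by observed epidemiologic data, and the random search by whatever strategy (grid search, Nelder-Mead, Bayesian methods) the analysis calls for.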


If calibrated foundational work is so important, where are all the papers? This is where one of the gaps in the literature becomes apparent. It is a dilemma faced by researchers everywhere but particularly in the modeling community: Where do I publish the foundational articles for my research program that demonstrate my model's concordance with reality, its reliability, and its accuracy, and how can I share what I have learned?

Most model-based research programs start with a specific problem to be solved or a system to be explored. This is followed by the crafting of a representation of that problem or system in a particular modeling framework. The model is then validated and calibrated to ensure its trustworthiness. Only then is the model used to address the specific challenge. Most nonmodelers do not appreciate what a long and difficult process this is. A great deal of time and effort goes into a model before Popperian hypothesis testing is possible.

The mandate to "show your work" is often translated as "publish or perish." Without showing your work, you cannot get recognition, and without recognition there is rarely continued support. Yet journals also face demands: their readership must find their content valuable enough to keep reading and subscribing. Because of the seeming reluctance of many journals to publish this kind of work, many articles that lay the foundation for future work, often developed in the long process of model building, are never published. As a result, how many research programs (and careers) are cut short?

I would propose that there is intrinsic value for all researchers in looking at modeling projects from their very beginning and in understanding the methods and techniques used to lay the bricks and vet the building plan of a research program. How can an intelligent reader critique or have faith in a model if its underpinnings are obscure, unknown, or unavailable for examination? How many of us are familiar with the foundations upon which the modeling programs we refer to or consult are built? Without understanding the foundations of a model, how can we judge it or test it independently? How often does a model fail to adequately predict the future? Poor predictions reduce trust in models. Without access to peer-review criticism, how can models be vetted? Without access to core models and the trust generated through peer review and experience, science slows down. People in academia are understandably reluctant to share their models without being able to receive credit, establish priority, or demonstrate to funders and promotions committees the extent of their effort. Therefore, we need to provide a venue for foundational articles and the tools to evaluate them.

A common recommendation is to place this work in technical appendices, although this may be insufficient, as appendices are often not given the same recognition as the articles themselves. Providing the right venue should, in turn, give scientists an incentive to share their work earlier and to make it more robust, thereby broadening its acceptance and increasing the tempo of good science.

REFERENCES

1. Pitman R, Fisman D, Zaric G, et al. Dynamic transmission modeling: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force Working Group—5. Med Decis Making. 2012;32(5):712–21.
2. Karnon J, Stahl J, Brennan A, et al. Modeling using discrete event simulation: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—4. Med Decis Making. 2012;32(5):701–11.
3. Siebert U, Alagoz O, Bayoumi A, et al. State-transition modeling: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—3. Value Health. 2012;15(6):812–20.
4. Roberts M, Russell L, Paltiel A, et al. Conceptualizing a model: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—2. Med Decis Making. 2012;32(5):678–89.
5. Caro J, Briggs A, Siebert U, Kuntz K; ISPOR-SMDM Modeling Good Research Practices Task Force. Modeling good research practices—overview: a report of the ISPOR-SMDM Modeling Good Research Practices Task Force—1. Med Decis Making. 2012;32(5):667–77.
6. Ioannidis J. Why most published research findings are false. PLoS Med. 2005;2(8):e124.

SPECIAL ISSUE ARTICLES

Enns EA, Cipriano LE, Simons CT, Kong CY. Identifying best-fitting inputs in health-economic model calibration: a Pareto frontier approach. Med Decis Making. 2015;35(2):170–82.

Goldhaber-Fiebert JD, Brandeau ML. Modeling and calibration for exposure to time-varying, modifiable risk factors: the example of smoking behavior in India. Med Decis Making. 2015;35(2):196–210.

Jackson CH, Jit M, Sharples LD, De Angelis D. Calibration of complex models through Bayesian evidence synthesis: a demonstration and tutorial. Med Decis Making. 2015;35(2):148–61.

O'Mahony JF, van Rosmalen J, Mushkudiani NA, Goudsmit FW, Eijkemans MJC, Heijnsdijk EAM, Steyerberg EW, Habbema JDF. The influence of disease risk on the optimal time interval between screens for the early detection of cancer: a mathematical approach. Med Decis Making. 2015;35(2):183–95.

Kimmel A, Fitzgerald DW, Pape JW, Schackman BR. Performance of a mathematical model to forecast lives saved from HIV treatment expansion in resource-limited settings. Med Decis Making. 2015;35(2):230–42.


Van Calster B, Vickers AJ. Calibration of risk prediction models: impact on decision-analytic performance. Med Decis Making. 2015;35(2):162–69.

Codella J, Safdar N, Heffernan R, Alagoz O. An agent-based simulation model for Clostridium difficile infection control. Med Decis Making. 2015;35(2):211–29.

Eisemann N, Waldmann A, Garbe C, Katalinic A. Development of a microsimulation of melanoma mortality for evaluating the effectiveness of population-based skin cancer screening. Med Decis Making. 2015;35(2):243–54.


Joranger P, Nesbakken A, Hoff G, Sorbye H, Oshaug A, Aas E. Modeling and validating the cost and clinical pathway of colorectal cancer. Med Decis Making. 2015;35(2):255–65.
