EDITORIAL

Essentially all models are wrong, but some are useful

J. Marc Simard, MD, PhD

Correspondence to Dr. Simard: [email protected]

Neurology® 2015;85:210–211

There has been intense scrutiny of the failure to translate preclinical studies in stroke. By contrast, much less attention has been directed to asking how treatments get from an early-phase to late-phase clinical trial. While much of the focus has been on stroke, a similar failure to translate has been all too common in other neurologic disorders. In this issue of Neurology®, Kent et al.1 address decision-making regarding prioritization of early- to late-phase clinical trials, and the frequent failure of late-phase trials to replicate apparently positive early-phase results.

Pitchaiah Mandava and Thomas Kent have been studying this process for nearly a decade. Mandava, an electrical engineer turned vascular neurologist, and Kent, a vascular neurologist trained in basic and clinical pharmacology, have been on the outside looking in since their initial foray into modeling outcomes. Their first study, published in Neurology in 2007,2 used published outcome data and showed that a heterogeneous group of early-generation endovascular interventions, mostly case series at the time, yielded outcomes that were no better than the natural history. It would take another 6 years for large randomized controlled trials (RCTs) to reach the same conclusion.3 In their seminal 2007 study, when Mandava and Kent2 looked at the variation within the series, they also found that patients who had the most severe stroke fared best when they had the least aggressive intervention. Interestingly, a study from Japan showed that patients with low-dose endovascular lytic therapy had better outcomes than those with more aggressive intervention.4 Similarly, a reanalysis of data from the Interventional Management of Stroke III and Multicenter Randomized Clinical Trial of Endovascular Treatment for Acute Ischemic Stroke in the Netherlands trials5 showed that an NIH Stroke Scale (NIHSS) score similar to the one predicted by Mandava and Kent, NIHSS 20, was the threshold for benefit with endovascular intervention.
Both of these predictions by Mandava and Kent in 2007 were buried by the enthusiasm for Food and Drug Administration approval of endovascular devices, with enthusiasm powered by the understandable desire in the stroke community for something other than IV recombinant tissue plasminogen activator.

Since their initial study, Mandava and Kent have developed more sophisticated models based on data from additional RCTs, and have used these models to contrast their approach with standard statistical approaches.6–8 Because stroke is so heterogeneous and outcomes, as in many diseases, are dependent on baseline factors, nearly every clinical study leading up to a phase 3 trial has required some sort of statistical manipulation. Although statistical methods to account for heterogeneity are widely accepted, Mandava and Kent argue that these methods are not appropriate when the interaction among factors is more complex than a simple linear relationship. Even more problematic is that subjective clinical outcomes are plagued by errors, adding noise that may not be randomly distributed in small trials.9 Nearly every early-phase clinical trial that they surveyed had one or more problems: either it violated an assumption necessary for the valid application of traditional statistical manipulations, or it misinterpreted signals within the study as positive, when either the control arm did worse than expected or there was no signal of efficacy in the treatment arm. It is therefore easy to conclude that the heavy reliance on standard statistical manipulations has resulted in huge expenditures of resources for what, in retrospect, have been futile pursuits of new therapies.

Recognizing these shortcomings, clinical trialists have attempted to address heterogeneity with new designs. Adaptive designs and post hoc methods that are used to parse out subgroups, including the propensity score, are becoming popular. As noted by Mandava and Kent, these methods are more appropriate for large phase 3 trials, when the size of the population makes variation and noise more likely to be random.
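The point about sample size can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the method of Mandava and Kent: the function name `mean_baseline_imbalance`, the arm sizes, and the baseline severity distribution (mean 12, SD 6, clipped to the 0 to 42 range of the NIHSS) are all hypothetical choices made for illustration, and no real trial data are used. It estimates how far apart the mean baseline severity of two randomized arms tends to be, purely by chance, in small and large trials.

```python
import random
import statistics


def mean_baseline_imbalance(n_per_arm, n_trials=500, seed=0):
    """Average absolute gap in mean baseline severity between two
    randomized arms, across many simulated trials (hypothetical values)."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Assumed baseline severity: made-up scores with mean 12 and SD 6,
        # clipped to the 0-42 range of the NIHSS. Not real trial data.
        treat = [min(42, max(0, rng.gauss(12, 6))) for _ in range(n_per_arm)]
        ctrl = [min(42, max(0, rng.gauss(12, 6))) for _ in range(n_per_arm)]
        gaps.append(abs(statistics.fmean(treat) - statistics.fmean(ctrl)))
    return statistics.fmean(gaps)


# Compare a small early-phase-sized trial with a large phase 3-sized one.
small = mean_baseline_imbalance(n_per_arm=20)
large = mean_baseline_imbalance(n_per_arm=500)
print(f"20 per arm:  typical chance baseline gap {small:.2f} points")
print(f"500 per arm: typical chance baseline gap {large:.2f} points")
```

Under these assumptions the chance gap between arms shrinks as the arms grow, which is the sense in which variation is more likely to average out in a large trial. The sketch covers only random baseline imbalance; it does not model the outcome-scale errors or the nonlinear interactions among factors discussed above.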
Indeed, they cite literature showing that premature application of adaptive design risks investigator bias, which could be equally damaging to an objective assessment of outcome. Mandava and Kent6 have shown that their method would have correctly predicted the only accepted positive trial, the National Institute of Neurological Disorders and Stroke tissue plasminogen activator trial, and would have correctly predicted nearly all trials that turned out to be negative. Their models from the initial stent retriever case series suggested that the new-generation endovascular devices would prove beneficial, a finding more recently confirmed in multiple RCTs.10

While there is much to consider before implementing their method, such as whether an individual trial’s inclusion/exclusion criteria are similar enough to those in the Mandava and Kent model, the sheer number of studies now included suggests that it may be time to consider the application of their strategy more broadly. The recommendations with which their new article1 concludes are sobering. Instead of smaller increments of improvement, as long as subjective outcomes are used, we should be increasing the anticipated strength of effect in early phases, given all the potential sources of errors. However, the bright side, as the authors acknowledge, is that if we are so bad at telling that a treatment will be effective, we are probably just as bad when we conclude in an early trial that an agent would be negative. Indeed, they tease us with the suggestion that they have identified potential false-negatives that could be resurrected. That indeed would be a good outcome.

See page 274

From the Departments of Neurosurgery, Pathology, and Physiology, University of Maryland School of Medicine, Baltimore.

The sentiment in the title is attributed to the great English statistician, George E.P. Box (Box GEP, Draper NR. Empirical Model-Building and Response Surfaces. Hoboken, NJ: John Wiley & Sons, Inc.; 1987).

Go to Neurology.org for full disclosures. Funding information and disclosures deemed relevant by the authors, if any, are provided at the end of the editorial.

© 2015 American Academy of Neurology. Unauthorized reproduction of this article is prohibited.

STUDY FUNDING No targeted funding reported.

DISCLOSURE The author reports no disclosures relevant to the manuscript. Go to Neurology.org for full disclosures.

REFERENCES

1. Kent TA, Shah SD, Mandava P. Improving early clinical trial phase identification of promising therapeutics. Neurology 2015;85:274–283.
2. Mandava P, Kent TA. Intra-arterial therapies for acute ischemic stroke. Neurology 2007;68:2132–2139.
3. Broderick JP, Palesch YY, Demchuk AM, et al. Endovascular therapy after intravenous t-PA versus t-PA alone for stroke. N Engl J Med 2013;368:893–903.
4. Ezura M, Kagawa S. Selective and superselective infusion of urokinase for embolic stroke. Surg Neurol 1992;38:353–358.
5. Broderick JP, Dippel DW, Palesch YY, et al. Pooled analysis of the IMS III and MR CLEAN trials for patients with NIHSS of 20 or more. International Stroke Conference 2015. Available at: http://my.americanheart.org/professional/Sessions/InternationalStrokeConference/ScienceNews/ISC2015-Late-Breaking-Science-Oral-Abstracts_UCM_471621_Article.jsp#pooled. Accessed February 12, 2015.
6. Mandava P, Kent TA. A method to determine stroke trial success using multidimensional pooled control functions. Stroke 2009;40:1803–1810.
7. Mandava P, Kalkonde YV, Rochat RH, Kent TA. A matching algorithm to address imbalances in study populations: application to the National Institute of Neurological Diseases and Stroke recombinant tissue plasminogen activator acute stroke trial. Stroke 2010;41:765–770.
8. Mandava P, Murthy SB, Munoz M, et al. Explicit consideration of baseline factors to assess recombinant tissue-type plasminogen activator response with respect to race and sex. Stroke 2013;44:1525–1531.
9. Mandava P, Krumpelman CS, Shah JN, White DL, Kent TA. Quantification of errors in ordinal outcome scales using Shannon entropy: effect on sample size calculations. PLoS One 2013;8:e67754.
10. Campbell BC, Mitchell PJ, Kleinig TJ, et al. Endovascular therapy for ischemic stroke with perfusion-imaging selection. N Engl J Med 2015;372:1009–1018.

Neurology 85

July 21, 2015


Essentially all models are wrong, but some are useful. J. Marc Simard. Neurology 2015;85;210-211. Published online before print June 24, 2015. DOI 10.1212/WNL.0000000000001769. This information is current as of June 24, 2015.

Updated information and services, including high resolution figures, can be found at: http://www.neurology.org/content/85/3/210.full.html


Neurology® is the official journal of the American Academy of Neurology. Published continuously since 1951, it is now a weekly with 48 issues per year. Copyright © 2015 American Academy of Neurology. All rights reserved. Print ISSN: 0028-3878. Online ISSN: 1526-632X.
