Commentary

Virtual Populations, Real Decisions: Making Sense of Stochastic Simulation Studies

Allan B. Massie, PhD, MHS,1,2 Eric K.H. Chow, MS,1 and Dorry L. Segev, MD, PhD1,2

Received 18 November 2014. Accepted 3 December 2014.
1 Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD.
2 Department of Epidemiology, Johns Hopkins School of Public Health, Baltimore, MD.
The authors declare no funding or conflicts of interest.
Correspondence: Allan Massie, PhD, MHS, Johns Hopkins Medical Institutions, 720 Rutland Ave, Ross 34, Baltimore, MD 21205. ([email protected])
Copyright © 2015 Wolters Kluwer Health, Inc. All rights reserved.
ISSN: 0041-1337/15/9905-901
DOI: 10.1097/TP.0000000000000698

Transplantation • May 2015 • Volume 99 • Number 5

The truism that clinical trials are the “gold standard” in medical research often appears as an aside as researchers make the case for other, less rigorous designs. Clinical trials are impractical for most questions in medical research because of well-understood barriers of cost, time, and ethics.1 However, observational studies may also be impractical in many contexts. Stochastic simulation studies allow researchers to study virtual patient populations when data on actual populations are unavailable or cannot be easily measured.

In this issue of Transplantation, Nguyen and colleagues2 use a Markov decision process model to compare cost and postdonation graft loss from donor screening with bead-based assays (at various levels of mean fluorescence intensity) in addition to complement-dependent cytotoxicity (CDC) crossmatch, versus screening with CDC crossmatch alone. They report cost savings of $1,192,303 (Australian) per 100 transplants, and 7.5 fewer graft losses at 5 years per 100 transplants, under screening with bead-based assays at a threshold of 500 mean fluorescence intensity. Higher mean fluorescence intensity thresholds lead to similar 5-year graft survival, with slightly smaller cost savings. However, these results come from simulation; no actual dollars, or actual patients, were observed. How, then, should the scientific community evaluate the strength of evidence from Nguyen's study and other such stochastic simulations?

DESIGN OF A MARKOV DECISION PROCESS MODEL

A Markov decision process model consists of a number of states (e.g., posttransplant without rejection, rejection, dialysis, death) and a virtual population that progresses from state to state based partly on choices made by a decision-maker (in this case, which screening method to use) and partly on state transition probabilities (e.g., probability of graft loss for patients with antibody-mediated rejection). Nguyen and colleagues constructed a virtual population of transplant recipients of varying ages who received a transplant after being screened using CDC crossmatch or CDC crossmatch + bead-based assays. The state transition probabilities were determined from a variety of external sources; for example, probabilities of graft loss were derived from a previous observational study of graft loss in Australia and New Zealand. The study was run with a time horizon of 5 years for the outcome of graft loss, meaning that graft loss outcomes were simulated for up to 5 years after transplantation (the time horizon was 1 year for rejection and 20 years for quality-adjusted life years). A discount rate of 5% per year was applied to cost and benefit calculations; in other words, costs and benefits incurred in the future were weighted 5% less for each year of delay.

Several design choices in this study might affect the validity of inferences. First, virtual patients are identical except with respect to age and sensitivity to donor-specific antigens. The study cannot determine whether the benefit of screening varies by other characteristics, such as sex, race, or panel-reactive antibody. Second, the model does not distinguish between deceased-donor and living-donor transplants, despite the fact that costs and outcomes vary widely between the 2 modalities. Third, performance of the model depends heavily on the state transition probabilities: in other words, on predicted risks of rejection, graft loss, and death derived from previous studies. The Markov model therefore effectively inherits the limitations of all studies contributing to these state transition probabilities, with regard to both methodology and generalizability. To test the sensitivity of the model to uncertainty in the state transition probabilities, the authors performed, as is conventional, sensitivity analyses in which they varied some state transition probabilities. Perhaps most importantly, model performance depends on the question asked.
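The mechanics described above — a cohort of virtual patients stepping through states according to transition probabilities, with costs discounted at 5% per year — can be sketched in a few lines of Python. The states, transition probabilities, and costs below are hypothetical placeholders for illustration only, not the values used by Nguyen and colleagues.

```python
import random

# Hypothetical states and 1-year transition probabilities (illustrative only,
# not the transition probabilities from the Nguyen model).
TRANSITIONS = {
    "functioning": [("functioning", 0.90), ("rejection", 0.05),
                    ("dialysis", 0.03), ("death", 0.02)],
    "rejection":   [("functioning", 0.60), ("dialysis", 0.30), ("death", 0.10)],
    "dialysis":    [("dialysis", 0.90), ("death", 0.10)],
    "death":       [("death", 1.00)],
}
ANNUAL_COST = {"functioning": 15_000, "rejection": 40_000,
               "dialysis": 80_000, "death": 0}  # hypothetical annual costs
DISCOUNT_RATE = 0.05  # 5% per year, as in the study

def step(state, rng):
    """Draw the next state from the current state's transition probabilities."""
    r, cum = rng.random(), 0.0
    for nxt, p in TRANSITIONS[state]:
        cum += p
        if r < cum:
            return nxt
    return TRANSITIONS[state][-1][0]

def simulate_patient(years, rng):
    """Return (graft_lost, discounted_cost) for one virtual patient."""
    state, cost = "functioning", 0.0
    for year in range(years):
        # Costs in later years are weighted down by the discount factor.
        cost += ANNUAL_COST[state] / (1 + DISCOUNT_RATE) ** year
        state = step(state, rng)
    return state in ("dialysis", "death"), cost

rng = random.Random(42)  # fixed seed for reproducibility
results = [simulate_patient(5, rng) for _ in range(10_000)]
losses = sum(lost for lost, _ in results)
mean_cost = sum(c for _, c in results) / len(results)
print(f"5-year graft loss per 100 patients: {100 * losses / len(results):.1f}")
print(f"Mean discounted 5-year cost per patient: ${mean_cost:,.0f}")
```

Comparing two screening strategies would amount to running two such cohorts whose transition probabilities differ only where the intervention acts — which is what gives the design the exchangeability discussed below.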
The Nguyen study compared outcomes from screening with bead-based assays to screening with CDC crossmatching alone. They did not evaluate screening with flow cytometry, another common technique.3 They compared rejection, graft survival, quality-adjusted life years, and cost among transplant recipients; the outcomes of patients who remain on dialysis after a positive crossmatch were not considered, even though those patients bear burdens of cost, decreased quality of life, and mortality risk. For example, in our experience, highly sensitized patients who receive CDC-negative, bead-positive incompatible transplants have better survival than patients who remain on the waitlist in hopes of receiving a compatible kidney.4 An incompatible transplant can provide patient benefit, then, even if it is riskier and costlier than a compatible transplant for a different patient.

WHY SIMULATE?

These limitations can be summed up by saying that the simulated world of the Markov model is inevitably a crude (albeit sometimes very useful) approximation of real life. Although the contours of the current simulation were determined by specific design choices, the challenge of approximating reality is one that any simulation study will inevitably meet imperfectly. Why, then, conduct simulation studies? In this case, not only would a trial be infeasible; even observational data would probably be impossible to obtain. The screening method used for transplant candidates is determined at the center level (if not the national level), and nowadays, centers performing only CDC screening may be difficult to find. Markov decision process models and other stochastic simulation designs have previously been used to study other questions for which observational data are lacking or infeasible, such as risks of disease transmission after transplantation with CDC infectious risk donor kidneys,5 hypothetical changes to national organ allocation systems,6,7 and the potential impact of innovations such as kidney paired donation.8 Moreover, the simulation design gives researchers a greater level of control than observational studies, or even trials. By design, “patients” in the CDC + bead group of the Nguyen study are exactly the same as patients in the CDC-only group, except with regard to antibody strength; no clinical trial could match that level of exchangeability.

CONCLUSION

Simulation studies must be understood in the context of their limitations; however, the same is true of other study designs. The scientific community should not reject simulation studies out of hand because the data are not “real”; nor should it treat simulation results as infallible. As with observational studies, readers of simulation studies should carefully consider the study design and methodological assumptions to weigh the strength of evidence.

REFERENCES
1. Massie AB, Kucirka LM, Segev DL. Big data in organ transplantation: registries and administrative claims. Am J Transplant. 2014;14(8):1723–1730.
2. Nguyen HTD, Lim WH, Craig JC, et al. The relative benefits and costs of solid phase bead technology to detect preformed donor specific anti-human leukocyte antigen antibodies in determining suitability for kidney transplantation. Transplantation. 2015;99(5):957–964.
3. Karpinski M, Rush D, Jeffery J, et al. Flow cytometric crossmatching in primary renal transplant recipients with a negative anti-human globulin enhanced cytotoxicity crossmatch. J Am Soc Nephrol. 2001;12(12):2807–2814.
4. Montgomery RA, Lonze BE, King KE, et al. Desensitization in HLA-incompatible kidney recipients and survival. N Engl J Med. 2011;365(4):318–326.
5. Chow EK, Massie AB, Muzaale AD, et al. Identifying appropriate recipients for CDC infectious risk donor kidneys. Am J Transplant. 2013;13(5):1227–1234.
6. Ratcliffe J, Young T, Buxton M, et al. A simulation modelling approach to evaluating alternative policies for the management of the waiting list for liver transplantation. Health Care Manag Sci. 2001;4(2):117–124.
7. Gentry SE, Massie AB, Cheek SW, et al. Addressing geographic disparities in liver transplantation through redistricting. Am J Transplant. 2013;13(8):2052–2058.
8. Segev DL, Gentry SE, Warren DS, Reeb B, Montgomery RA. Kidney paired donation and optimizing the use of live donor organs. JAMA. 2005;293(15):1883–1890.
