ORIGINAL ARTICLE

Ranking Trauma Center Quality: Can Past Performance Predict Future Performance?

Laurent G. Glance, MD,∗ Dana B. Mukamel, PhD,† Turner M. Osler, MD, MS,‡ and Andrew W. Dick, PhD§

Objective: To explore whether trauma center quality metrics based on historical data can reliably predict future trauma center performance.

Background: The goal of the American College of Surgeons Trauma Quality Improvement Program is to create a new paradigm in which high-quality trauma centers can serve as learning laboratories to identify best practices. This approach assumes that trauma quality reporting can reliably identify high-quality centers using historical data.

Methods: We performed a retrospective observational study of 122,408 patients in 22 level I and level II trauma centers in Pennsylvania. We tested the ability of the Trauma Mortality Prediction Model to predict future hospital performance based on historical data.

Results: Patients admitted to the lowest performance hospital quintile had a 2-fold higher odds of mortality than patients admitted to the best performance hospital quintile using either 2-year-old data [adjusted odds ratio (AOR): 2.11; 95% confidence interval (CI): 1.36–3.27; P < 0.001] or 3-year-old data (AOR: 2.12; 95% CI: 1.34–3.21; P < 0.001). There was a trend toward increased mortality using 5-year-old data (AOR: 1.70; 95% CI: 0.98–2.95; P = 0.059). The correlation between hospital observed-to-expected mortality ratios in 2009 and 2007 demonstrated moderate agreement (intraclass correlation coefficient = 0.56; 95% CI: 0.22–0.77). The intraclass correlation coefficients for observed-to-expected mortality ratios obtained using 2009 data and 3-, 4-, or 5-year-old data were not significantly different from zero.

Conclusions: Trauma center quality based on historical data is associated with subsequent patient outcomes. Patients currently admitted to trauma centers that are classified as low-quality centers using 2- to 5-year-old data are more likely to die than patients admitted to high-quality centers. However, although the future performance of individual trauma centers can be predicted using performance metrics based on 2-year-old data, the performance of individual centers cannot be predicted using data that are 3 years or older.

Keywords: benchmarking, injury, quality of health care, quality reporting, trauma

(Ann Surg 2014;259:682–686)

Performance reporting is at the center of national efforts to incentivize quality under the Affordable Care Act through value-based purchasing.1 Although the quality of trauma care is not yet publicly reported and tied to hospital payments, the American College of Surgeons (ACS) has created the Trauma Quality Improvement Program to “measure and continually improve the quality of trauma care.”2(p254) The goal of this new program by the ACS is to create a new “paradigm” in which high-quality trauma centers can serve as learning laboratories to identify and share best practices with poor performers.3 This approach assumes that performance measurement can reliably identify high- and low-performing centers.

Quality measurement requires risk adjustment to account for differences in patient case mix across providers. One of the greatest challenges to the face validity of quality measurement is that different severity measures will frequently disagree on the identity of high- and low-quality hospitals.4,5 We have previously shown that risk adjustment models based on either the Injury Severity Score (ISS) or an empirical measure of injury severity, the Trauma Mortality Prediction Model (TMPM), exhibit substantial agreement on the quality of trauma centers.6 The finding that these very different risk-adjusted outcome measures, whether based on expert-based or empirical measures of injury severity, agree on trauma center quality is reassuring and suggests that trauma center quality can be objectively measured. What has not been shown is whether trauma center performance based on historical data can reliably predict future performance.

In theory, one would expect that hospitals delivering high-quality care would continue to do so in future years. If, however, hospital performance varies significantly over time, then the validity of using hospitals as learning laboratories to identify best practices may be open to question. We have used data from the Pennsylvania Trauma Outcome Study (PTOS) to explore whether hospital quality metrics based on prior years of data reliably predict future performance. Because it has been argued that the use of hierarchical modeling improves the reliability of performance measurement,7 we also compare the ability of hierarchical and nonhierarchical analytical approaches to forecast future outcomes. Using quality metrics based on data that are 2, 3, 4, or 5 years old, we examine the stability of hospital quality measurements over time. Our findings will help policy makers better understand the utility of identifying best practices by studying the processes of care at centers identified as high-performing hospitals to improve outcomes for injured patients.

From the ∗Department of Anesthesiology, University of Rochester Medical Center, Rochester, NY; †Department of Medicine, Center for Health Policy Research, University of California, Irvine; ‡Department of Surgery, University of Vermont Medical College, Burlington; and §RAND, RAND Health, Boston, MA.
Disclosure: Supported by a grant from the Agency for Healthcare Research and Quality (R01 HS 16737). The views presented in this article are those of the authors and may not reflect those of the Agency for Healthcare Research and Quality. These data were provided by the Pennsylvania Trauma Systems Foundation. The foundation specifically disclaims responsibility for any analyses, interpretations, or conclusions. The authors declare no conflicts of interest.
Reprints: Laurent G. Glance, MD, Department of Anesthesiology and Department of Community and Preventive Medicine, University of Rochester Medical Center, 601 Elmwood Ave, Box 604, Rochester, NY 14642. E-mail: Laurent_ [email protected].
Copyright © 2013 by Lippincott Williams & Wilkins
ISSN: 0003-4932/13/25904-0682
DOI: 10.1097/SLA.0000000000000334

METHODS

Data Source

This study was conducted using data from the Pennsylvania Trauma Systems Foundation (PTSF) on trauma patients admitted to PTSF-accredited level I and level II trauma centers between 2004 and 2009. Pennsylvania, which includes both rural and urban areas, provides a representative case mix of trauma patients.8 The PTOS database is a population-based statewide trauma registry. It includes information on all trauma admissions at accredited trauma centers meeting any one of the PTOS inclusion criteria: admission to the intensive care unit or step-down unit, hospital length of stay greater than 48 hours, hospital admissions transferred from another hospital, transfers out to an accredited trauma center, or trauma death.9 The PTOS database consists of deidentified data on patient demographics, Abbreviated Injury Score and ICD-9-CM (International Classification of Diseases, Ninth Revision, Clinical Modification) codes, mechanism of injury (based on ICD-9-CM E-codes), comorbidities, physiologic information, in-hospital mortality and complications, transfer status, processes of care, and encrypted hospital identifiers. E-codes were mapped to 1 of 6 injury mechanisms by an experienced trauma surgeon (T.M.O.): gunshot wound, stab wound, injuries sustained from low falls, blunt injury, motor vehicle crash, and pedestrian injury.10 Data quality is ensured through the use of standard abstraction software with automatic data checks, a data definition manual, and internal and external data auditing.11

Patient Population

The study sample consisted of trauma patients older than 16 years admitted to 22 level I or level II trauma centers contributing data to the PTOS database between 2004 and 2009, after excluding patients with burns, hypothermia, isolated hip fractures, superficial injuries, unspecified injuries, or a nontraumatic mechanism of injury, as well as patients transferred out to another hospital. We also excluded patients with missing information on transfer status (n = 191) or demographics (n = 102), patients with invalid Abbreviated Injury Score codes (n = 6514), and patients with missing blood pressure (n = 3421) or Glasgow Coma Scale motor information (n = 8557). The final study cohort consisted of 122,408 patients in 22 level I and level II trauma centers.

Risk Adjustment Models

Our goal was to examine the ability of hospital quality metrics based on prior years of data to reliably predict future performance. Because there is some controversy over the optimal statistical approach to use for risk adjustment,12,13 we used 3 different logistic regression models with in-hospital mortality as the dependent variable: (1) standard logistic regression; (2) fixed-effects logistic regression; and (3) hierarchical logistic regression. The standard logistic regression model was based on the previously validated TMPM14 enhanced with the addition of age, sex, mechanism of injury, transfer status, comorbidities,∗ motor component of the Glasgow Coma Scale, and systolic blood pressure.15 In prior work, the TMPM exhibited excellent discrimination, with a C statistic of 0.90, and very good calibration, as measured using graphical techniques and the Hosmer-Lemeshow statistic.14 The fixed-effects logistic regression model included all the same patient-level risk factors as the baseline model and incorporated hospital effects directly as fixed effects. The hierarchical logistic regression model was also based on the baseline model and included hospitals as a random intercept.16

∗Preexisting conditions included in the model: coronary artery disease, congestive heart failure, chronic obstructive pulmonary disease, renal disease (serum creatinine level >2 mg/dL), dialysis, cerebrovascular accident, insulin-dependent diabetes, acquired coagulopathy, warfarin therapy, HIV/AIDS, steroid therapy, active chemotherapy, bilirubin level more than 2 mg/dL (on admission), cirrhosis, and metastatic cancer.

We then calculated a hospital quality metric based on each of the 3 modeling approaches. We first calculated the observed mortality rate for each hospital. We then used the baseline model to calculate the predicted probability of death for each patient and averaged these over individual hospitals to calculate the expected mortality rate of each hospital. The hospital observed-to-expected mortality (OE) ratio,17 defined as the ratio of the observed mortality rate to the expected mortality rate, is used by New York State in its cardiac surgery reporting system,18 by the ACS Trauma Quality Improvement Program,2 and by the ACS and Veterans Affairs’ National Surgical Quality Improvement Program19,20 to report risk-adjusted hospital outcomes. Hospitals with OE ratios significantly greater than 1 are considered low-performance outliers, whereas high-performance hospitals have OE ratios significantly less than 1.

The fixed-effects model incorporates the hospital effect directly as a fixed effect in the regression model. We parameterized hospital fixed effects using an approach described by DeLong and colleagues17 so that the hospital adjusted odds ratio (AOR) represents the adjusted odds of mortality compared with the average hospital. The hospital AOR corresponds to the OE ratio and represents the risk of a patient dying in a specific hospital relative to the risk of mortality in an average hospital.17 In the random-effects model, in which hospitals are specified as a random effect, the shrinkage estimator for hospital effects can be exponentiated to yield an estimated odds of mortality attributable to each hospital that is robust to small hospital sample sizes.17
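To make the OE-ratio construction concrete, the following is a minimal sketch in Python. It is illustrative only, not the authors’ code: the data frame, the column names (died, hospital_id), and the use of statsmodels are assumptions, and the covariate list stands in for the full enhanced-TMPM specification described above.

```python
# Minimal sketch of the hospital observed-to-expected (OE) mortality
# ratio. Illustrative only: `df`, the column names ("died",
# "hospital_id"), and `covariates` are hypothetical; the actual
# baseline model is the enhanced TMPM described in the text.
import pandas as pd
import statsmodels.api as sm

def oe_ratios(df: pd.DataFrame, covariates: list[str]) -> pd.Series:
    # Fit a standard logistic regression for in-hospital mortality.
    X = sm.add_constant(df[covariates])
    fit = sm.Logit(df["died"], X).fit(disp=0)
    # Predicted probability of death for each patient.
    df = df.assign(p_death=fit.predict(X))
    by_hosp = df.groupby("hospital_id")
    observed = by_hosp["died"].mean()     # observed mortality rate
    expected = by_hosp["p_death"].mean()  # expected rate, given case mix
    return observed / expected            # OE > 1: worse than expected

# Hospitals whose OE ratio is significantly above 1 would be flagged
# as low-performance outliers, mirroring the definition above.
```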

Analysis

We performed 3 separate analyses to examine the ability of risk-adjusted measures of hospital performance to predict future hospital performance.

In the first analysis, we identified hospitals in the highest and lowest performance quintiles, based on the hospital OE ratio estimated using the baseline risk adjustment model and data from 2007. Using the approach described by Rogowski et al21 and Birkmeyer et al,22 we then estimated a logistic model using subsequent data from 2009 in which we controlled for all the patient characteristics included in the baseline model. We also included indicators, based on the 2-year-old data, for whether the hospital was ranked in the lowest performance quintile or the average performance group (3 middle quintiles), with the highest performance quintile serving as the reference group. The estimated odds ratio for the lowest performance hospital quintile relative to the highest performance group is a measure of the predictive performance of 2-year-old data. We repeated this analysis using 3-, 4-, and 5-year-old data and using the hospital performance metrics based on fixed- and random-effects models.

In the second analysis, also based on work described by Rogowski et al21 and Birkmeyer et al,22 we estimated the proportion of hospital-level variance in mortality in 2009 that was explained by rankings based on data from 2007. We estimated a random-effects logistic model using data from 2009 in which we controlled for all the patient characteristics included in the baseline model and added the hospital OE ratio estimated using the baseline model and 2-year-old data from 2007. We then reestimated this model using the 2009 data but without the hospital OE ratio, to estimate the proportion of hospital-level variance attributable to hospital rankings based on the 2-year-old data, as a measure of the predictive performance of hospital quality information. We repeated this analysis using 3-, 4-, and 5-year-old data and using the hospital performance metrics based on fixed- and random-effects models.

In the third analysis, we examined the level of agreement, measured by the intraclass correlation coefficient (ICC),23 between a trauma center’s performance based on data from 2009, as quantified using the hospital OE ratio, and its performance based on 2-year-old data from 2007; a sketch of this calculation appears below. We repeated this analysis using 3-, 4-, and 5-year-old data and using the hospital performance metrics based on fixed- and random-effects models. We used robust variance estimators to account for clustering of observations by hospitals. All statistical analyses were performed using Stata SE/MP (version 11.0; StataCorp, College Station, TX). This study was approved by the institutional review board at the University of Rochester.
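As a sketch of the third analysis, the agreement between two years of hospital OE ratios can be computed as a one-way random-effects ICC(1,1) in the sense of Fleiss,23 with k = 2 measurements per hospital. This is a generic ANOVA-based implementation under those assumptions, not the authors’ code, and it omits the confidence intervals, which would come from the F distribution.

```python
# One-way random-effects intraclass correlation, ICC(1,1), for k
# repeated quality measurements per hospital (here k = 2: a center's
# OE ratio in 2009 paired with its OE ratio from an earlier year).
import numpy as np

def icc_oneway(ratings: np.ndarray) -> float:
    """ratings: (n_hospitals, k) array of repeated quality metrics."""
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Mean squares from a one-way ANOVA with hospitals as groups.
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Example: ICC between 22 hospitals' 2009 and 2007 OE ratios.
# icc = icc_oneway(np.column_stack([oe_2009, oe_2007]))
```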


RESULTS

This study was based on data from 122,408 patients admitted to 22 level I and level II trauma centers in Pennsylvania (Table 1). The median age was 48 years (interquartile range: 30–70 years), 64% were men, and 29% of the patients were transferred in from other hospitals. Forty-two percent of the patients sustained blunt trauma, 25% were involved in motor vehicle crashes, and 13% sustained injuries from low falls; the remainder of traumatic injuries were due to gunshot wounds, stab injuries, or pedestrian accidents. The overall mortality rate was 6.1%.

TABLE 1. Patient Characteristics and Outcomes (N = 122,408)

  Demographics
    Age, median (IQR), yr                    48 (30–70)
    Sex
      Male                                   77,931 (63.7)
      Female                                 44,477 (36.3)
    Transfer in from other hospital
      Transferred in                         35,671 (29.1)
      Not transferred in                     86,737 (70.9)
  Glasgow Coma Scale motor
    1                                        9,871 (8.1)
    2                                        283 (0.23)
    3                                        439 (0.36)
    4                                        1,484 (1.21)
    5                                        3,482 (2.84)
    6                                        106,849 (87.3)
  Systolic blood pressure, mm Hg
    0–30                                     1,909 (1.56)
    31–60                                    370 (0.30)
    61–90                                    3,288 (2.69)
    91–160                                   93,805 (76.6)
    161–220                                  22,250 (18.2)
    >220                                     786 (0.64)
  Mechanism of trauma
    Blunt                                    51,617 (42.2)
    Motor vehicle accident                   31,133 (25.4)
    Gunshot                                  7,392 (6.04)
    Stab                                     4,682 (3.82)
    Pedestrian trauma                        11,533 (9.42)
    Injury from low falls                    16,051 (13.1)
  Mortality                                  6.12%

  Unless otherwise indicated, all values are number (percentage). GCS motor indicates motor component of the Glasgow Coma Scale; IQR, interquartile range.

We first examined the extent to which hospital quality quintiles based on 2-, 3-, 4-, and 5-year-old data predicted subsequent mortality (Table 2). We found that patients admitted to the lowest performance quintile, based on 2-year-old data, had a 2-fold higher odds of mortality than patients admitted to the best performance hospital quintile [AOR: 2.11; 95% confidence interval (CI): 1.36–3.27; P < 0.001] when the hospital quality metric was based on standard regression. These results were essentially unchanged when the risk-adjusted quality metric was based on fixed- or random-effects modeling (Table 2). Using 3-year-old data, only the quality metric based on standard regression was associated with a significant difference in mortality between patients admitted to high-performance hospitals and those admitted to low-performance hospitals (AOR: 2.12; 95% CI: 1.34–3.21; P < 0.001). Using 4- or 5-year-old data, none of the quality metrics were significantly associated with subsequent mortality in 2009. However, there was a trend toward increased mortality in high-mortality hospitals compared with low-mortality hospitals using the 5-year-old data and either standard regression (AOR: 1.70; 95% CI: 0.98–2.95; P = 0.059) or fixed-effects modeling (AOR: 1.69; 95% CI: 0.92–3.09; P = 0.088).

We then examined the proportion of hospital-level variation in 2009 explained by rankings based on prior years of data. Quality metrics based on 2-year-old data accounted for approximately 35% of the hospital-level variation in mortality in 2009 (Table 3). The ability of historical data to explain subsequent mortality did not vary as a function of whether standard regression, fixed-effects regression, or random-effects regression was used to create the quality metrics (Table 3). Quality metrics based on 3-, 4-, or 5-year-old data explained 10% or less of the hospital-level variation in 2009 (Table 3).

Finally, we examined the association between hospitals’ OE ratios based on 2009 data and 2-year-old data. The ICC of 0.56 (95% CI: 0.22–0.77) indicates moderate agreement between the OE ratios obtained using 2009 data and 2-year-old data (Fig. 1). Similar results were obtained comparing AORs based on 2009 data and 2-year-old data using either fixed-effects models (ICC = 0.51; 95% CI: 0.16–0.75) or random-effects models (ICC = 0.44; 95% CI: 0.15–0.68). Comparing OE ratios obtained using 2009 data and 3-, 4-, or 5-year-old data led to ICCs not significantly different from zero for all 3 models (Fig. 1).

DISCUSSION

We found that trauma patients admitted to trauma centers identified as high-mortality hospitals using report cards based on 2- or 3-year-old data had a 2-fold higher risk of death than patients admitted to hospitals identified as low-mortality hospitals. A similar trend was observed using 5-year-old data, although it was not statistically significant. When we examined the correlation between “current” hospital rankings and historical hospital rankings, we found that hospital rankings based on 2-year-old data were strongly associated with current hospital rankings. However, we did not find that performance rankings based on 3-, 4-, or 5-year-old data predicted current hospital performance. In other words, high-quality hospitals, as a group, tend to have better outcomes in the future than low-quality hospitals. But the future performance of individual hospitals is not predicted by their past performance if the “old” report cards are based on data that are 3 years or older.

TABLE 2. Odds Ratio for Mortality in 2009, “Worst” Versus “Best” Hospital Quintile

                                      Standard              Fixed Effects         Random Effects
                                    AOR    95% CI          AOR    95% CI          AOR    95% CI
  Hospitals ranked using 2007 data  2.11∗  1.36–3.27       2.07∗  1.34–3.21       2.07∗  1.34–3.21
  Hospitals ranked using 2006 data  2.12∗  1.34–3.36       1.62†  0.96–2.75       1.62†  0.96–2.75
  Hospitals ranked using 2005 data  1.48   0.91–2.41       1.16   0.66–2.04       1.10   0.60–2.03
  Hospitals ranked using 2004 data  1.70†  0.98–2.95       1.69†  0.92–3.09       1.38   0.76–2.51

  ∗P ≤ 0.001. †P < 0.10.


TABLE 3. Proportion of Hospital-Level Variation in Mortality in 2009 Explained by Rankings

                                      Standard    Fixed Effects    Random Effects
  Hospitals ranked using 2007 data     0.36          0.38             0.38
  Hospitals ranked using 2006 data     0.077         0.060            0.10
  Hospitals ranked using 2005 data     0.065         0.098            0.091
  Hospitals ranked using 2004 data     0.054         0.078            0.051

  The risk adjustment model adjusts for patient risk using the risk factors in the enhanced TMPM.

Other investigators have found that risk-adjusted mortality rates based on recent data can predict subsequent hospital performance for coronary artery bypass surgery, abdominal aortic aneurysm repair, pancreatic cancer resection, and gastric bypass.7,22,24 However, these studies did not report whether the historical performance of individual hospitals predicted future hospital performance, nor did they examine whether the predictive ability of quality metrics changes over time. Using New York State publicly reported measures of hospital performance for coronary artery bypass surgery, we previously found that patients admitted to low-quality hospitals based on 2-year-old data, but not 3-year-old data, had worse outcomes than patients admitted to average-quality hospitals. However, similar to our current study, the future performance of individual hospitals for coronary artery bypass surgery was not correlated with their past performance.25

So, the answer to “Can past performance predict future performance?” is both “yes” and “no.” Patients admitted to hospitals classified as low-quality hospitals based on historical data tend to have worse outcomes than patients admitted to high-quality hospitals, even when the data used to classify hospital quality are 5 years old. However, the ability of report cards to predict the future performance of individual trauma centers is good only using 2-year-old data and is poor using 3-, 4-, or 5-year-old data. Taken together, the findings of this exploratory study suggest that there is considerable noise in trauma center rankings and that individual hospital quality rankings may not be as robust as generally assumed, even under the best circumstances, when risk adjustment is based on accurate data and statistical models with excellent performance characteristics.

In theory, hospital performance measures can improve accountability, drive quality improvement, incentivize high-quality care, and serve as the basis for evidence-based referrals and patient choice.26 In practice, assuming our findings extend to other patient populations, caution needs to be exercised in using hospital rankings as the basis for selectively referring patients to high-performing hospitals and selectively avoiding low-performing hospitals. Furthermore, if we are to use hospitals as learning laboratories to identify best practices, it may be important to examine processes of care that are shared by a large number of high-quality hospitals, as opposed to a few select hospitals, because the reliability of individual hospital rankings may not be very good.

In light of the ongoing debate over whether to use conventional or hierarchical modeling for performance profiling, we also examined whether the predictive ability of hospital rankings was improved by the use of hierarchical modeling. Proponents of hierarchical modeling argue that such models are better able to accommodate providers with low case volumes and avoid “regression to the mean,” that is, the tendency for hospitals classified as high- or low-quality outliers in the past to be reclassified as nonoutliers in the future.27,28
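The shrinkage applied by hierarchical models can be stated compactly. In a textbook random-intercept formulation (a standard form, not taken from this article), the empirical Bayes estimate of hospital j’s effect is a precision-weighted average of its own raw estimate and the population mean:

```latex
\hat{\theta}_j = \lambda_j \,\hat{\theta}_j^{\mathrm{raw}}
               + (1 - \lambda_j)\,\mu ,
\qquad
\lambda_j = \frac{\tau^2}{\tau^2 + \sigma_j^2}
```

where \(\tau^2\) is the between-hospital variance and \(\sigma_j^2\) is the sampling variance of hospital j’s raw estimate, which falls roughly as 1/n_j with case volume n_j. For high-volume trauma centers, \(\lambda_j\) is close to 1 and shrinkage changes little, which is consistent with the comparison of modeling approaches reported below.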

FIGURE 1. Predictive ability of the hospital OE ratio based on the baseline regression model. The 45-degree line represents perfect agreement between the OE ratio based on 2009 data and historical data (2-, 3-, or 4-year-old data), and the dotted line represents the results of linear regression. Results for quality metrics based on fixed- and random-effects models are very similar and are not shown.


Recent work by Dimick and colleagues7 showed that hierarchical modeling led to more reliable forecasting of future mortality using 1- to 2-year-old data than did conventional modeling. In our current study, we tested the stability over time of risk-adjusted mortality estimates based on conventional modeling, fixed-effects modeling, and hierarchical modeling. We did not find evidence that the use of hierarchical modeling improved the stability of the quality signal over time compared with nonhierarchical modeling. However, this finding is not unexpected, given the high case volumes of individual trauma centers; hierarchical modeling “shrinks” provider mortality rates toward the population average to an extent inversely proportional to hospital case volume.29 Our study does not rule out the possibility that the stability of the quality signal for low-volume hospitals is improved with hierarchical modeling.

The findings of this study cannot be generalized to other medical and surgical conditions or to other risk adjustment models. Our analysis was based on the TMPM, as opposed to the better known trauma injury models based on the ISS,30–32 because our work has shown that the performance of the TMPM is superior to that of the ISS using traditional measures of statistical performance.14 It is possible, although not likely, that mortality prediction models based on the ISS would better predict future hospital performance. Furthermore, we explored only a single domain of quality, mortality; further work will be necessary to examine other quality domains such as complications and functional outcomes. Finally, our analysis is based on data from a single state and is not necessarily generalizable to other states. Our results should be considered exploratory in nature and need to be replicated by other researchers using other surgical procedures and medical conditions and larger hospital and patient samples.

CONCLUSIONS

Our findings suggest that patients currently admitted to high-quality trauma centers are less likely to die than patients admitted to low-quality centers, even when the data used to identify high- and low-quality trauma centers are up to 5 years old. However, although the future performance of individual trauma centers can be predicted using 2-year-old data, it cannot be reliably predicted using performance reports based on data that are 3, 4, or 5 years old.

REFERENCES
1. VanLare JM, Conway PH. Value-based purchasing: national programs to move from volume to value. N Engl J Med. 2012;367:292–295.
2. Hemmila MR, Nathens AB, Shafi S, et al. The Trauma Quality Improvement Program: pilot study and initial demonstration of feasibility. J Trauma. 2010;68:253–262.
3. Shafi S, Nathens AB, Cryer HG, et al. The Trauma Quality Improvement Program of the American College of Surgeons Committee on Trauma. J Am Coll Surg. 2009;209:521–530.e1.
4. Iezzoni LI. The risks of risk adjustment. JAMA. 1997;278:1600–1607.
5. Shahian DM, Wolf RE, Iezzoni LI, et al. Variability in the measurement of hospital-wide mortality rates. N Engl J Med. 2010;363:2530–2539.
6. Glance LG, Osler TM, Mukamel DB, et al. Expert consensus vs empirical estimation of injury severity: effect on quality measurement in trauma. Arch Surg. 2009;144:326–332; discussion 332.
7. Dimick JB, Staiger DO, Birkmeyer JD. Ranking hospitals on surgical mortality: the importance of reliability adjustment. Health Serv Res. 2010;45:1614–1629.
8. Mohan D, Rosengart MR, Farris C, et al. Assessing the feasibility of the American College of Surgeons’ benchmarks for the triage of trauma patients. Arch Surg. 2011;146:786–792.


9. Pennsylvania Trauma Systems Foundation. Pennsylvania Trauma Systems Foundation Annual Report 2007. Mechanicsburg, PA: Pennsylvania Trauma Systems Foundation; 2007.
10. Glance LG, Osler TM, Mukamel DB, et al. TMPM-ICD9: a trauma mortality prediction model based on ICD-9-CM codes. Ann Surg. 2009;249:1032–1039.
11. Rogers FB, Osler T, Lee JC, et al. In a mature trauma system, there is no difference in outcome (survival) between level I and level II trauma centers. J Trauma. 2011;70:1354–1357.
12. Glance LG, Dick A, Osler TM, et al. Impact of changing the statistical methodology on hospital and surgeon ranking: the case of the New York State cardiac surgery report card. Med Care. 2006;44:311–319.
13. Mukamel DB, Glance LG, Dick AW, et al. Measuring quality for public reporting of health provider quality: making it meaningful to patients. Am J Public Health. 2010;100:264–269.
14. Osler T, Glance L, Buzas JS, et al. A trauma mortality prediction model based on the anatomic injury scale. Ann Surg. 2008;247:1041–1048.
15. Glance LG, Osler TM, Mukamel DB, et al. Outcomes of adult trauma patients admitted to trauma centers in Pennsylvania, 2000–2009. Arch Surg. 2012;147:732–737.
16. Snijders TB, Bosker RJ. Multilevel Analysis. Newbury Park, CA: Sage; 1999.
17. DeLong ER, Peterson ED, DeLong DM, et al. Comparing risk-adjustment methods for provider profiling. Stat Med. 1997;16:2645–2664.
18. Hannan EL, Cozzens K, King SB III, et al. The New York State cardiac registries: history, contributions, limitations, and lessons for future efforts to assess and publicly report healthcare outcomes. J Am Coll Cardiol. 2012;59:2309–2316.
19. Khuri SF, Daley J, Henderson WG. The comparative assessment and improvement of quality of surgical care in the Department of Veterans Affairs. Arch Surg. 2002;137:20–27.
20. Hall BL, Hamilton BH, Richards K, et al. Does surgical quality improve in the American College of Surgeons National Surgical Quality Improvement Program: an evaluation of all participating hospitals. Ann Surg. 2009;250:363–376.
21. Rogowski JA, Horbar JD, Staiger DO, et al. Indirect vs direct hospital quality indicators for very low-birth-weight infants. JAMA. 2004;291:202–209.
22. Birkmeyer JD, Dimick JB, Staiger DO. Operative mortality and procedure volume as predictors of subsequent hospital performance. Ann Surg. 2006;243:411–417.
23. Fleiss JL. Reliability of measurement. In: Design and Analysis of Clinical Experiments. New York: Wiley; 1999:1–31.
24. Dimick JB, Osborne NH, Nicholas L, et al. Identifying high-quality bariatric surgery centers: hospital volume or risk-adjusted outcomes? J Am Coll Surg. 2009;209:702–706.
25. Glance LG, Dick AW, Mukamel DB, et al. How well do hospital mortality rates reported in the New York State CABG report card predict subsequent hospital performance? Med Care. 2010;48:466–471.
26. Institute of Medicine. Performance Measurement: Accelerating Improvement. Washington, DC: National Academies Press; 2007.
27. Krumholz HM, Brindis RG, Brush JE, et al. Standards for statistical models used for public reporting of health outcomes: an American Heart Association scientific statement from the Quality of Care and Outcomes Research Interdisciplinary Writing Group: cosponsored by the Council on Epidemiology and Prevention and the Stroke Council; endorsed by the American College of Cardiology Foundation. Circulation. 2006;113:456–462.
28. Dimick JB, Welch HG. The zero mortality paradox in surgery. J Am Coll Surg. 2008;206:13–16.
29. Shahian DM, Edwards FH, Jacobs JP, et al. Public reporting of cardiac surgery performance, part 2: implementation. Ann Thorac Surg. 2011;92(suppl):S12–S23.
30. Baker SP, O’Neill B, Haddon W Jr, et al. The Injury Severity Score: a method for describing patients with multiple injuries and evaluating emergency care. J Trauma. 1974;14:187–196.
31. Boyd CR, Tolson MA, Copes WS. Evaluating trauma care: the TRISS method. Trauma Score and the Injury Severity Score. J Trauma. 1987;27:370–378.
32. Osler T, Baker SP, Long W. A modification of the Injury Severity Score that both improves accuracy and simplifies scoring. J Trauma. 1997;43:922–925.
