ORIGINAL ARTICLE

Longitudinal assessment of colonoscopy quality indicators: a report from the Gastroenterology Practice Management Group

Lyndon V. Hernandez, MD, MPH,1 Thomas M. Deas, MD,2 Marc F. Catalano, MD,3 Nalini M. Guda, MD,3 Lin Huang, MD,4 Scott R. Ketover, MD,5 Kyle P. Etzkorn, MD,6 Kumar G. Gutta, MD,2 Steve J. Morris, MD,7 Michael J. Schmalz, MD,3 Dominic Klyve, PhD,8 John I. Allen, MD, MBA9

Milwaukee, Wisconsin, USA

Background: There is increasing demand for colonoscopy quality measures for procedures performed in ambulatory surgery centers. Benchmarks such as adenoma detection rate (ADR) are traditionally reported as static, one-dimensional point estimates at a provider or practice level.
Objective: To evaluate 6-year variability of ADRs for 370 gastroenterologists from across the nation.
Design: Observational cross-sectional analysis.
Setting: Collaborative quality metrics database from 2007 to 2012.
Patients: Patients who underwent colonoscopies in ambulatory surgery centers.
Interventions: Colonoscopy.
Main Outcome Measurements: The number of colonoscopies with an adenomatous polyp divided by the total number of colonoscopies (ADR-T), inclusive of indication and patient’s sex.
Results: Data from 368,157 colonoscopies were included for analysis from 11 practices. Three practice sites (5, 8, and 10) were significantly above and 2 sites (3, 7) were significantly below mean ADR-T, with a 95% confidence interval (CI). High-performing sites had 9.0% higher ADR-T than sites belonging to the lowest quartile (P < .001). The mean ADR-T remained stable for 9 of 11 sites. Regression analysis showed that the 2 practice sites where ADR-T varied had significant improvements in ADR-T during the 6-year period. For each, mean ADR-T improved an average of 0.5% per quarter for site 2 (P = .001) and site 3 (P = .021), which were average and low performers, respectively.
Limitations: Summary-level data, which does not allow cross-reference of variables at an individual level.
Conclusion: We found performance disparities among practice sites remaining relatively consistent over a 6-year period. The ability of certain sites to sustain their high-performance over 6 years suggests that further research is needed to identify key organizational processes and physician incentives that improve the quality of colonoscopy. (Gastrointest Endosc 2014;-:1-7.)

Abbreviations: ADR, adenoma detection rate; ADR-T, adenoma detection rate-total; ASC, ambulatory surgery centers; GPMG, Gastroenterology Practice Management Group; IT, information technologies.

DISCLOSURE: T. Deas is the chief medical officer for Sandlot Solutions and a member of the Physician Leadership Board for Surgical Care Affiliates. N. Guda is a consultant for Boston Scientific. S. Morris is on the Advising Board for CRH Medical Corp. J. Allen is an advisor for Pentax, Johnson and Johnson, and Myriad Genetics. All other authors disclosed no financial relationships relevant to this publication.

Copyright © 2014 by the American Society for Gastrointestinal Endoscopy 0016-5107/$36.00 http://dx.doi.org/10.1016/j.gie.2014.02.1043
Received October 24, 2013. Accepted February 27, 2014.

www.giejournal.org

Current affiliations: GI Associates and Medical College of Wisconsin, Milwaukee, Wisconsin (1); Gastroenterology Associates of North Texas, Fort Worth, Texas (2); GI Associates, LLC, Milwaukee, Wisconsin (3); Digestive Health Specialists, Tacoma, Washington (4); Minnesota Gastroenterology, Minneapolis, Minnesota (5); Borland-Groover Clinic, Jacksonville, Florida (6); Atlanta Gastroenterology Associates, Atlanta, Georgia (7); Central Washington University, Ellensburg, Washington (8); Yale University, New Haven, Connecticut (9). Reprint requests: Dr Lyndon Hernandez, Medical College of Wisconsin/GI Associates, 3111 W. Rawson Avenue, Suite 240, Franklin, WI 53132. If you would like to chat with an author of this article, you may contact Dr Hernandez at [email protected].

Volume -, No. - : 2014 GASTROINTESTINAL ENDOSCOPY 1


Since 1999, the number of ambulatory surgery centers (ASCs) in the United States has grown by 8.3% annually.1 Colonoscopy is one of the most common procedures performed in ASCs, and there is increasing demand for colonoscopy quality metrics for procedures performed in ASCs from hospital networks, commercial and government payers, and patients. Current information technology platforms have significant barriers that limit the ability of clinicians to measure performance metrics across ASCs. Adenoma detection rates (ADRs) have been linked to interval colon cancers and thus have emerged as an important measure of colonoscopy quality.2 A physician’s ADR is traditionally defined as the percentage of screening colonoscopy examinations in which the endoscopist identifies and removes an adenomatous polyp. ADRs usually are calculated as average point estimates that are obtained over a specified time period (usually annually) and are static and 1-dimensional. Little information exists concerning temporal changes in ADR, so it is unclear whether a physician’s individual or practice site ADR varies over time.3 Our aim was to evaluate the 370 members of the Gastroenterology Practice Management Group (GPMG), focusing on the variability of ADRs over a 6-year time period.

Take-home Message
• Adenoma detection rates (ADRs) inclusive of colonoscopy for all indications and patient sex (total number of colonoscopies [ADR-T]) can be a helpful quality metric for colonoscopies performed in ambulatory surgery centers. ADR-T can demonstrate temporal variability when analyzed over a 6-year period.
• There is an opportunity to streamline database extraction in ambulatory surgery centers and to identify key organizational processes and physician incentives that improve the quality of colonoscopy.

METHODS
This is a cross-sectional study of ongoing data collected by the GPMG from 2007 to the third quarter of 2012. GPMG is a national consortium of 13 large gastroenterology practices, representing 370 gastroenterologists. Part of the mission of GPMG is to share information on quality metrics under the broad umbrella of collaborating on best practices. The data are derived from diverse geographic areas of the United States (in alphabetical order: Colorado, Florida, Georgia, Illinois, Minnesota, Mississippi, Missouri, Nevada, Tennessee, Texas, Washington, Washington DC, Wisconsin) and represent large practices (10-75 gastroenterologists per practice) and high-volume ASCs (5000-60,000 endoscopic procedures per year per group) of similar operating structures. The Centers for Medicare & Medicaid Services defines an ASC as a distinct entity that operates exclusively for the purpose of furnishing outpatient surgical services to patients.4

Each group has its own quality management framework and provides predefined data elements on several quality measures at different time points. For each participating site, a research coordinator collected de-identified, group-level data on a quarterly basis from one or multiple ASCs. Database auditing is performed periodically for accuracy and consistency between quarters. Each practice site provides its physician members a performance report card on various quality metrics benchmarked against group and national averages on a yearly or quarterly basis, with sites beginning this process at different times from 2008 onward.

We defined ADR as the number of colonoscopies with an adenomatous polyp divided by the total number of colonoscopies (ADR-total or ADR-T). We divided the sites into 3 groups (high, middle, and low), based on their ADR-T performances. ADR-T was not stratified by patient sex or indication (such as screening, surveillance, or to evaluate for symptoms) in the summary-level database. The quality of bowel preparation was recorded as excellent, good, fair, or poor. Unusual reporting frequencies from each quarter, such as inconsistent data, were excluded from analysis. Because this unusual reporting may simply be a result of data entry error, we excluded only those measurements lying more than 5 standard deviations from those usually reported by a practice site.
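The ADR-T definition and the 5 standard deviation exclusion rule described in the Methods can be illustrated with a short Python sketch. This is not the authors' procedure: the function names and the leave-one-out comparison (each quarter is compared with the mean and standard deviation of the same site's other quarters) are assumptions made for illustration, because the text does not specify how the values "usually reported" by a site were determined.

```python
from statistics import mean, stdev
from typing import List, Sequence


def adr_total(colonoscopies_with_adenoma: int, total_colonoscopies: int) -> float:
    """ADR-T: colonoscopies with an adenomatous polyp / all colonoscopies."""
    if total_colonoscopies <= 0:
        raise ValueError("total_colonoscopies must be positive")
    return colonoscopies_with_adenoma / total_colonoscopies


def flag_extreme_quarters(values: Sequence[float], threshold: float = 5.0) -> List[int]:
    """Return indices of quarterly values lying more than `threshold`
    standard deviations from the site's other quarters.

    Assumption (not from the paper): the baseline for each quarter is the
    mean and sample SD of all *other* quarters reported by the same site.
    """
    values = list(values)
    flagged = []
    for i, value in enumerate(values):
        others = values[:i] + values[i + 1:]
        if len(others) < 2:
            continue  # too few other quarters to estimate a spread
        center, spread = mean(others), stdev(others)
        if spread == 0:
            # All other quarters are identical; any different value is extreme.
            if value != center:
                flagged.append(i)
            continue
        if abs(value - center) > threshold * spread:
            flagged.append(i)
    return flagged
```

The leave-one-out baseline is a deliberate choice in this sketch: when a value is included in its own mean and sample standard deviation, it can lie at most (n-1)/sqrt(n) standard deviations from that mean, which is below 5 for the roughly two dozen quarters available per site.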

Statistical analysis
We calculated the overall ADR-T and assessed consistency of performance by using the quarterly ADR-Ts of each practice to find a 95% t-confidence interval (CI) for each practice. The expectation is that an ASC's quarterly ADR-T will fall within this CI about 95% of the time. An ASC with significantly higher ADR-T than expected has a CI that lies entirely above the average ADR-T taken from all practices, whereas an ASC with significantly lower ADR-T than expected has a CI entirely below the average ADR-T in the study. To determine whether ASCs were improving over time, we also used regression analysis on the quarterly ADR-T values for each practice. The P value for the deterministic trends in Figure 3 gives the probability of observing an increase at least as large as the one seen if there were no true increase in the actual ADR values. IBM SPSS 21 for Windows package (IBM, Armonk, NY, USA) was used for statistical analysis.
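The interval-based comparison and the trend estimate described in this section can be sketched in Python as follows. The study itself used SPSS, so this is only an illustration under stated assumptions: the function names are hypothetical, the critical value `t_crit` is supplied by the caller from a t table for n-1 degrees of freedom (to keep the sketch within the standard library), and only the least-squares slope is computed, not its P value.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def t_confidence_interval(quarterly: Sequence[float], t_crit: float) -> Tuple[float, float]:
    """t-based CI for a site's mean quarterly ADR-T.

    `t_crit` is the two-sided critical value for len(quarterly) - 1 degrees
    of freedom (for example, 3.182 for a 95% CI with 3 degrees of freedom).
    """
    n = len(quarterly)
    if n < 2:
        raise ValueError("need at least two quarterly values")
    center = mean(quarterly)
    half_width = t_crit * stdev(quarterly) / sqrt(n)
    return center - half_width, center + half_width


def classify_site(ci: Tuple[float, float], overall_mean: float) -> str:
    """Compare a site's CI with the average ADR-T across all practices."""
    low, high = ci
    if low > overall_mean:
        return "above average"
    if high < overall_mean:
        return "below average"
    return "not significantly different"


def slope_per_quarter(quarterly: Sequence[float]) -> float:
    """Ordinary least-squares slope of ADR-T against quarter index 0, 1, 2, ..."""
    n = len(quarterly)
    if n < 2:
        raise ValueError("need at least two quarterly values")
    x_mean = (n - 1) / 2
    y_mean = mean(quarterly)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(quarterly))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    return sxy / sxx
```

For instance, a hypothetical site with quarterly values 0.20, 0.22, 0.24, and 0.26 would be classified with `classify_site(t_confidence_interval(values, 3.182), overall_mean)`, and `slope_per_quarter(values)` gives its average change in ADR-T per quarter.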

RESULTS
From a total of 418,978 colonoscopies performed in ASCs, quarterly data entries demonstrating unusual frequencies were excluded (n = 50,821), leaving 368,157 cases for analysis (Table 1). Of the 11 practice sites that remain, 9 had available data for ≥4 years. Table 2 shows the average distribution of patient age, sex, and screening indication for each practice site.

TABLE 1. Aggregate demographic and clinical characteristics of all practice sites

Variable                            No. (%)
Total colonoscopies performed       368,157
Age ≥50 y                           299,780 (81)
Male sex                            163,611 (44)
Indication
  Initial screening                 153,526 (42)
  Surveillance                      96,589 (27)
  Symptoms                          113,711 (31)
Quality of bowel preparation
  Excellent/good                    219,473 (85)
  Fair/poor                         39,146 (15)
Cecum reached
  Documented                        298,652 (96)
Withdrawal time, min
  <5                                8295 (4)
  ≥6                                183,745 (93)
  Unknown                           5230 (3)

TABLE 2. Summary statistics in percentages for each practice site (excluding sites 6 and 13)

Practice site    Male    Age ≥50 y    Initial screening
1                46      78           30
2                47      84           53
3                48      79           40
4                45      84           30
5                42      83           21
7                45      83           39
8                47      80           31
9                43      85           48
10               46      83           40
11               46      82           45
12               47      82           44

Figure 1 shows the ADR-T of each of the 11 de-identified groups, calculated as the mean of all recorded quarterly ADR-Ts. Three sites (5, 8, and 10) were significantly above, and 2 sites (3, 7) were significantly below, the average ADR-T based on 95% CIs. High performers, defined as sites in the highest quartile of ADR-T, had on average 9.0% higher ADR-Ts than sites in the lowest quartile (P < .001). Figure 2 shows the ADR-T time plot for all sites combined, showing a significant increase (P < .001 by linear regression) during the study period. Of the 7 sites reporting data in 2007, the combined ADR-T was 23.4%; by the third quarter of 2012, the combined ADR-T for the 10 sites reporting data was 32.7%. When we restricted analysis to just the 7 original sites, we had a very similar ADR-T of 32.6% in the third quarter of 2012. Thus, the sites that started to report data later were consistent with the original sites.

Practice sites were stratified according to ADR-T performance (Fig. 3). The mean ADR-T remained stable over the study period, with the exception of 2 practice sites. By using regression analysis, we showed that improvements in ADR-T for these 2 sites were 0.5% per quarter during the 6-year study period for each, and both were significant (practice site 2, P = .001; site 3, P = .021). At the beginning of the study period, mean ADR-Ts for sites 2 and 3 were less than or equal to the entire group average. Site 3 had the greatest improvement in ADR-T (mean increase of 0.22% per quarter; P = .001 by linear regression). No site demonstrated a significant decline in ADR-T.

We used a regression model to compare the relationship between several other variables and ADR-T, treating each group’s quarterly values as data points. No single variable by itself predicted ADR-T by univariate analysis. However, multivariate regression with 5 independent variables was significant (P = .01). Of these 5 variables, 3 were also individually significant in the model, whereas 2 others contributed significantly to the model as a whole (Table 3). Other variables, such as withdrawal time and bowel preparation, were not significantly associated with ADR-T.
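The combined ADR-T comparison reported in the Results (all reporting sites versus only the original sites) can be sketched in Python. This is an illustration rather than the authors' computation: the function name and data layout are hypothetical, and the combined value is assumed to be the unweighted mean across the sites that reported in a given quarter, with missing quarters represented as `None`.

```python
from statistics import mean
from typing import Dict, Iterable, List, Optional, Sequence


def combined_adr_by_quarter(
    site_series: Dict[str, Sequence[Optional[float]]],
    sites: Optional[Iterable[str]] = None,
) -> List[Optional[float]]:
    """Mean ADR-T per quarter across the sites that reported that quarter.

    `site_series` maps a site label to its quarterly ADR-T values, using
    None for quarters without data. `sites` optionally restricts the
    calculation to a subset (for example, the sites reporting from the start).
    """
    if sites is None:
        selected = dict(site_series)
    else:
        selected = {s: site_series[s] for s in sites}
    if not selected:
        return []
    n_quarters = max(len(series) for series in selected.values())
    combined: List[Optional[float]] = []
    for q in range(n_quarters):
        reported = [
            series[q]
            for series in selected.values()
            if q < len(series) and series[q] is not None
        ]
        # A quarter with no reporting site has no combined value.
        combined.append(mean(reported) if reported else None)
    return combined
```

Calling the function once with all sites and once with `sites` limited to the original reporters gives the two series needed to check whether late-joining sites shift the combined estimate.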

[Figure 1: plot not reproduced. Title: Adenoma Detection Rate (ADR-T). Y-axis: ADR-T (95% CI), 0 to 45. X-axis: Practice Site. Reference line: Average ADR.]
Figure 1. Percent adenoma detection rate-total (95% confidence interval) for each de-identified practice site. ADR, adenoma detection rate; ADR-T, adenoma detection rate-total.

[Figure 2: plot not reproduced. Title: ADR-T (all groups combined). Y-axis: 0% to 35%. X-axis: 2007 to 2013.]
Figure 2. Time plot series of percent ADR-T for all practice sites combined. Data from each quarter represent the mean from all sites that reported an ADR-T. ADR-T, adenoma detection rate-total.

[Figure 3: plots not reproduced. Three panels, each showing quarterly ADR-T from 2007 to the third quarter of 2012 with a 2010Q1 mean reference line. Group 1, low performers: sites 1, 2, 3, and 7. Group 2, middle performers: sites 10, 8, 4, and 5. Group 3, high performers: sites 11, 12, 13, and 9.]
Figure 3. Time plot series for percent adenoma detection rate (ADR-T) of each de-identified practice site, stratified according to level of performance. Among the low performers, site 3 (purple line) demonstrated the largest improvement in ADR-T per quarter (P = .001). Note that 12 sites are represented in these graphs, even though several sites did not report data every quarter. Gaps in the lines represent a lack of data. Practice site 6 reported data only for 2007 and 2008 and was excluded from analysis.

TABLE 3. Multivariate regression model for predicting ADR-T

Variable                      P value
Initial screening             .01
Surveillance                  .01
Symptoms                      .08
Age ≥50 y                     .09
Total no. of colonoscopies    .01

ADR-T, Adenoma detection rate-total.

DISCUSSION
With increasing numbers of colonoscopies performed in ASCs, there is a need for a systematic, standardized method for measuring outcomes in order to demonstrate high-value practices. The Centers for Medicare & Medicaid Services requires ASCs to migrate from a straight fee-for-service payment toward value-based reimbursement. As performance metrics for colonoscopy continue to evolve, clinical outcomes or surrogates for clinical outcomes will be emphasized. It is important to know whether surrogate measures such as ADR relate closely to patient outcomes, whether they are variable over time, and, if so, the reasons for variation (for example, random or intentional).

Assessing quality without considering temporal changes among high and low ranking ASCs could lead to erroneous conclusions about their performances. We found performance disparities among groups that, when analyzed more closely, showed a temporal trend. Low-ranking groups demonstrated improvement in performance over time. Ranking among the lowest in a single year may not truly represent the performance of an ASC when viewed over the long term. The awareness of performance assessment certainly can improve performance of individual endoscopists and perhaps the entire ASC medical staff to some degree. The Hawthorne effect has been linked to improving short-term quality of inspection technique,5 but the durability of its effect and whether or not it actually leads to an improvement of ADR-T remain unclear.

Another intrinsic challenge when measuring value-based outcomes for colonoscopies is how health information technologies (IT) function as silos that do not link across ASC entities.6 To mitigate these impediments, data definitions must be standardized and streamlined for easy and inexpensive data extraction. A significant advancement that addresses this issue is the establishment of the GI Quality Improvement Consortium to develop quality indicators and assist participating physicians and facilities to benchmark their performances.7 Yet, economic barriers to adoption persist in the community. There are no incentives for ASCs to extract and analyze quality metrics from their current electronic health records or ASC-based ITs. ASCs, unlike hospitals, are not eligible for electronic health record incentive payments; thus it is not surprising that only about one fourth of freestanding ASCs have adopted electronic health records.8 Many ASCs rely on legacy systems of manually translating quality metrics into spreadsheets; thus, building a detailed dataset on individual-level information necessitates updated ITs, which in turn requires substantial capital and ongoing administrative costs.

Our dataset has several limitations. GPMG has no information on each individual patient or physician; thus we are unable to cross-reference variables, nor can we analyze at a granular level to look at how physician turnover can influence ADR-Ts. Whereas it is possible that the group average might change solely from the departure or retirement of low performers and arrival of high performers, we believe this neither substantially changes the final outcome nor limits our conclusions. It has been shown that physicians’ lifetime colonoscopy experiences (including prior training) and case volumes had no significant correlation with ADR.9 Certainly, it is possible that an influx of new recruits who happen to be outliers can skew the mean ADR of the whole group, but this is probably unlikely given the size and longitudinal nature of the GPMG database. GPMG also relied on self-reporting, which makes it possible for certain ASCs to overrepresent their ADR-T. However, we did not observe any evidence of such practice, and this probably has minimal impact in a large registry. We excluded data from two practice sites because of inconsistent data compared with other sites and incomplete ADR-T values.
It is possible that this could have resulted from errors in data input, but despite periodic reminders to take corrective action, the 2 groups provided wide data ranges with several quarters of missing values during the study period. Although we acknowledge the limitations of our database, we identified extreme values and took corrective measures by excluding values lying more than 5 standard deviations from those usually reported by a practice site.


Prior studies that show an association between patient sex and ADR10 used individual patient-level data as separate data points. In contrast, our study used each practice site’s quarterly data as a separate data point. Thus, the effect size of patient sex and indication on the ADR may be consistent with previous studies but did not reach statistical significance because of the summary nature of our dataset. This could explain why, in the univariate analysis, patient sex and age did not influence ADR-T, unlike prior studies on ADR, although our multivariate analysis did show an association.

The ability of some practice sites to be consistently above average in ADR-T suggests underlying differences that affect how each group assures quality. Although each site provided its members a report card that included ADR-T, there were subtle differences in how the material was delivered. There are sites that provide physicians their mean ADR-Ts without comparing them with their other partners, whereas other sites explicitly rank each partner’s performance in reference to the whole group. Another area of variability is the manner of dealing with underperforming endoscopists. Although some sites do not take corrective actions, other sites mandate interventions such as formal eye examinations and remedial training by using instructional videos. The circumstances that led to significant improvements of practice sites 2 and 3 are unclear, because our database does not allow us to determine how multiple variables affected performance. Indeed, no endoscopist performance–related intervention has been shown in the literature to categorically improve ADRs.11

Our study, by using a collaborative nationwide registry of ASCs, highlights the multidimensional nature of ADRs and characterizes their temporal variability over a long-term period. In an environment of fragmented ASC-based health ITs, it is imperative to use data definitions that are simple to extract and benchmark. Calculating ADR-Ts can be a helpful tool as part of quality assurance in ASCs, but it is limited in scope for cross-referencing individual variables such as patient demographics and physician turnover. Determining the most effective manner of measuring quality and providing the appropriate intervention will need to evolve, and they are essential components for practice sites to transform into high-value organizations. Vigorous research is needed to identify key organizational processes and physician incentives that improve the quality of colonoscopy.

REFERENCES
1. MedPAC. Ambulatory surgical center services: assessing payment adequacy and updating payments. MedPAC data book, June 2006. Available at: http://www.medpac.gov/chapters/Mar12_Ch05.pdf. Accessed October 2013.
2. Kaminski MF, Regula J, Kraszewska E, et al. Quality indicators for colonoscopy and the risk of interval cancer. N Engl J Med 2010;362:1795-803.
3. Hernandez LV, Allen JI, Klyve D, et al. Colonoscopy quality indicators from a nationwide consortium of GI practices [abstract]. Gastrointest Endosc 2011;73:AB396.
4. Medicare Claims Processing Manual. Chapter 14. Available at: http://www.cms.gov/Regulations-and-guidance/guidance/manuals/downloads/clm104c14.pdf. Accessed September 2013.
5. Rex DK, Hewett DG, Raghavendra M, et al. The impact of video recording on the quality of colonoscopy performance: a pilot study. Am J Gastroenterol 2010;105:2312-7.
6. Mehrotra A, Dellon ES, Schoen RE, et al. Applying a natural language processing tool to electronic health records to assess performance on colonoscopy quality measures. Gastrointest Endosc 2012;75:1233-9.
7. GI Quality Improvement Consortium. Available at: http://giquic.gi.org/index.asp. Accessed September 2013.
8. Hing E, Hall MJ, Ashman JJ. Use of electronic medical records by ambulatory care providers: United States, 2006. Natl Health Stat Report 2010;22:1-21.
9. Adler A, Wegscheider K, Lieberman D, et al. Factors determining the quality of screening colonoscopy: a prospective study on adenoma detection rates, from 12,134 examinations (Berlin colonoscopy project 3, BECOP-3). Gut 2013;62:236-41.
10. Chen SC, Rex DK. Endoscopist can be more powerful than age and male gender in predicting adenoma detection at colonoscopy. Am J Gastroenterol 2007;102:856-61.
11. Corley DA, Jensen CD, Marks AR. Can we improve adenoma detection rates? A systematic review of intervention studies. Gastrointest Endosc 2011;74:656-65.
