
Original Research • Special Report

The Complementary Nature of Peer Review and Quality Assurance Data Collection¹

Olga R. Brook, MD; Janneth Romero, MD; Alexander Brook, PhD; Jonathan B. Kruskal, MD, PhD; Chun S. Yam, PhD; Deborah Levine, MD

Purpose:

To assess the complementary natures of (a) a peer review (PR)–mandated database for physician review and discrepancy reporting and (b) a voluntary quality assurance (QA) system for anecdotal reporting.

Materials and Methods:

This study was institutional review board approved and HIPAA compliant; informed consent was waived. Submissions to voluntary QA and mandatory PR databases were searched for obstetrics and gynecology–related keywords. Cases were graded independently by two radiologists, with final grades resolved via consensus. Errors were categorized as perceptional, interpretive, communication related, or procedural. Effect of errors was assessed in terms of clinical and radiologic follow-up.

Results:

There were 185 and 64 cases with issues attributed to 32 and 27 radiologists in the QA and PR databases, respectively; 23 and nine radiologists, respectively, had cases attributed only to them. Procedure-related entries were submitted almost exclusively through the QA database (62 of 64 [97%]). In the QA and PR databases, respectively, perceptional (47 of 185 [25%] and 27 of 64 [42%]) and interpretative (64 of 185 [34%] and 30 of 64 [47%]) issues constituted most errors. Most entries in both databases (104 of 185 [56%] in QA and 49 of 64 [76%] in PR) were considered minor events: wording in the report, findings already known from patient history, prior imaging, or concurrent follow-up imaging, or delay in diagnosing a benign finding. The databases had similar percentages of moderate events (28 of 185 [15%] in QA and nine of 64 [14%] in PR), such as recommending unnecessary follow-up imaging or radiation exposure in pregnancy without knowledge that the patient was pregnant. The PR database had fewer major events (one of 64 [1.6%]) than the QA database (32 of 185 [17%]).

Conclusion:

The two quality improvement systems are complementary, with the QA database yielding less frequent but more clinically important errors, while the PR database serves to establish benchmarks for error rate in radiologists’ performance.

¹ From the Department of Radiology, Beth Israel Deaconess Medical Center, 1 Deaconess Rd, Boston, MA 02215. Received December 23, 2013; revision requested February 18, 2014; revision received June 12; accepted July 3; final version accepted July 15. Address correspondence to O.R.B. (e-mail: [email protected]).

© RSNA, 2014

Online supplemental material is available for this article.




Quality improvement (QI) is an important aspect of providing optimal patient care (1,2). Medicine in general, and radiology in particular, has used the morbidity and mortality conference for voluntary reporting of clinically important cases. However, a more robust system is required for participation of all physicians, tracking of errors, and ongoing QI. Recognizing this, a peer review (PR) process (3,4) is required for accreditation by the Joint Commission (5,6), the Intersocietal Accreditation Commission, and the American College of Radiology (ACR) (7). A number of programs for PR are currently available, such as RADPEER (8) or peerVUE (9). However, these programs are based on a sample of cases and can thus miss rare events that are reported in other forms of QI endeavors, such as databases where quality issues are voluntarily reported as they come to the attention of individuals in a radiology department. Quality assurance (QA) databases are therefore important because they collect voluntarily reported errors (10,11), which are useful for creating strategies to avoid less common but important errors in the future. However, voluntary reporting is subject to subjective interpretation, underreporting, and reporting bias (12–14). In a 2012 ACR survey of PR, 33% of respondents believed that there was underreporting of significant disagreement in the PR process in their practices (15). Therefore, it is important to identify an optimal format for error detection, review, and prevention (16).

Advance in Knowledge

■ We used our obstetric and gynecologic imaging quality improvement (QI) databases to demonstrate that in a radiology QI program, it is important to track errors both in a peer review (PR)–mandated system that mandates physician review and reporting of discrepancies and in a separate voluntary system that tracks anecdotal reporting of less common but potentially more clinically important errors.


As monitoring of QI initiatives is undertaken, an important issue to address is whether the data being collected are useful. At our institution, two separate QI initiatives are conducted in parallel: a PR process and a voluntary QA reporting process. We wanted to assess whether each was serving its intended purpose and how the two complemented each other. Resources in terms of technologist, physician, and information technology support are needed to maintain and grow each of these databases; this information is therefore important for ongoing departmental and medical center support of our initiatives. To compare these types of quality reporting techniques, we needed a well-defined subset of cases that would allow for sufficient follow-up to obtain a reference standard diagnosis and a multimodality subject that spanned different clinical subspecialties. We chose obstetric and gynecologic imaging, since it fulfilled these criteria. Therefore, the purpose of our study was to assess the complementary nature of two types of QI databases, a PR-mandated system that mandates physician review and reporting of discrepancies and a separate voluntary QA system that tracks anecdotal reporting of less common but potentially more clinically important errors, by using our obstetric and gynecologic QI databases.

Materials and Methods

This retrospective study was conducted with the approval of our institutional review board and was compliant with Health Insurance Portability and Accountability Act regulations, with waiver of informed consent.

Implication for Patient Care

■ It is important to track errors both in a PR–mandated system that mandates physician review and reporting of discrepancies and in a separate voluntary system that tracks anecdotal reporting of less common but potentially more clinically important errors.

Database Descriptions and Cohort Selection

There are two QI databases instituted in our department: QA and PR. Our online QA database is a voluntary system implemented in May 2005 with the goal of using errors as learning opportunities (10). Entries can be submitted electronically by any member of the department—professional or technical staff—for either a clinical or a technical issue. Patient and clinician complaints are also used as material for QA entries. All entries are reviewed by a section-designated radiologist (for clinical issues) or technologist (for technical issues) responsible for QA and are presented at section QA meetings. During the case discussion, a mitigation and/or prevention mechanism may be suggested. When appropriate, interventions are followed by the QA team, which consists of a designated radiologist, technologist, and/or radiology nurse. Some issues may also lead to Performance QI projects. Submission to the online QA database is available from the departmental radiology Web site and from picture archiving and communication system workstations. The online program allows detailed submission, followed by disposition screens that include the system mitigation and prevention measures, filled in by the section-designated QA radiologist or technologist.

Published online before print: 10.1148/radiol.14132931  Radiology 2015;274:221–229

Abbreviations: ACR = American College of Radiology, PR = peer review, QA = quality assurance, QI = quality improvement

Author contributions: Guarantors of integrity of entire study, O.R.B., D.L.; study concepts/study design or data acquisition or data analysis/interpretation, all authors; manuscript drafting or manuscript revision for important intellectual content, all authors; approval of final version of submitted manuscript, all authors; literature research, O.R.B., D.L.; clinical studies, O.R.B., J.B.K., D.L.; experimental studies, C.S.Y.; statistical analysis, O.R.B., A.B., C.S.Y., D.L.; and manuscript editing, all authors

Conflicts of interest are listed at the end of this article.


When appropriate, the QA issues are submitted to the hospital QA system and reported to appropriate authorities.

Our PR program was developed in-house in February 2007 in response to Joint Commission and ACR requirements for accreditation (7). Every radiologist is required to review and grade a number of cases equivalent to 5% of his or her total yearly volume (up to 200 cases per year). The initial instruction was to look at prior imaging studies for the first set of cases being clinically interpreted during the workday, although the actual set of cases reviewed was left to the discretion of the individual radiologist. The program was based on the ACR RADPEER system (8) and is scored as follows: 1, concur with interpretation; 2, difficult diagnosis, not ordinarily expected to be assigned; 3, diagnosis should be assigned most of the time; and 4, diagnosis should be assigned almost every time. Personal performance and PR case submissions are clearly visible in the system. Category 3 and 4 errors are communicated directly to the readers involved in the case. In the obstetrics and gynecology section, the category 3 and 4 cases are then reviewed by a group of three radiologists (including the section QA-designated radiologist and two other specialists in the area of involvement of the case, chosen for each case by means of organ system and modality). Cases deemed by any one of the three radiologists to have QA issues that require additional review or necessitate more diffuse dissemination beyond the readers involved in the case are entered into the departmental QA database. Once in the QA database, these cases are then presented and discussed at the monthly section QA conference, similar to the entries that initially originate from the QA database. For the purpose of the current study, dual submissions to both databases were kept as PR submissions only.
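As a rough illustration only, the sketch below encodes the review quota and RADPEER-style scoring rules described above; the class and function names are hypothetical and not part of the in-house program (a minimal Python sketch, assuming the 5% quota is simply capped at 200 cases per year).

```python
from enum import IntEnum


class RadpeerScore(IntEnum):
    """RADPEER-style scores used in the PR program described above."""
    CONCUR = 1                 # concur with interpretation
    DIFFICULT_DIAGNOSIS = 2    # difficult diagnosis, not ordinarily expected to be assigned
    SHOULD_BE_MADE_MOSTLY = 3  # diagnosis should be assigned most of the time
    SHOULD_BE_MADE_ALWAYS = 4  # diagnosis should be assigned almost every time


def yearly_pr_quota(total_yearly_volume: int, fraction: float = 0.05, cap: int = 200) -> int:
    """Number of cases a radiologist must peer review per year:
    5% of the total yearly volume, capped at 200 cases (assumed simple cap)."""
    return min(round(fraction * total_yearly_volume), cap)


def needs_group_review(score: RadpeerScore) -> bool:
    """Category 3 and 4 discrepancies are communicated to the readers and
    reviewed by a three-radiologist group."""
    return score >= RadpeerScore.SHOULD_BE_MADE_MOSTLY


if __name__ == "__main__":
    print(yearly_pr_quota(3000))    # 150
    print(yearly_pr_quota(10000))   # 200 (capped)
    print(needs_group_review(RadpeerScore.DIFFICULT_DIAGNOSIS))  # False
```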



Figure 1:  Flowchart demonstrates cases reviewed in the study. obgyn = obstetric and gynecologic.

All submissions to the departmental QA and PR databases were searched in December 2012 with the following obstetric and gynecologic keywords: adnexa*, pelvi*, endometr*, fibroid, myom*, tub*, pregnan*, fet*, embryo*, OB (US), hsg, hystero*, sonoh*, uter*, fallop*, dermoid, gyneco*, salpinx, and ovar*. This search yielded 1049 submissions in the QA database and 220 cases in the PR database. Most of these cases (n = 892) were not related to obstetric and gynecologic imaging and were thus excluded. Duplicate cases (n = 83) within and between databases were excluded. Since any comment could be captured as a keyword, some category 1 PR submissions (n = 7) were captured in this search process; these were also excluded. This resulted in 202 cases in the QA database and 73 cases (27 category 2 cases, 31 category 3 cases, and six category 4 cases) in the PR database (Fig 1).
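As a hedged sketch of the kind of wildcard keyword filtering described above (not the actual search tool used for the departmental databases), the following Python snippet compiles the study's keyword list into a single regular expression; the entry format and function names are assumptions for illustration.

```python
import re

# Wildcard keywords from the study; a trailing * matches any continuation of the stem.
KEYWORDS = [
    "adnexa*", "pelvi*", "endometr*", "fibroid", "myom*", "tub*", "pregnan*",
    "fet*", "embryo*", "OB (US)", "hsg", "hystero*", "sonoh*", "uter*",
    "fallop*", "dermoid", "gyneco*", "salpinx", "ovar*",
]


def keyword_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword.
    Stems ending in * become prefix matches; others match as whole terms."""
    parts = []
    for kw in keywords:
        if kw.endswith("*"):
            parts.append(r"\b" + re.escape(kw[:-1]) + r"\w*")
        else:
            parts.append(r"\b" + re.escape(kw) + r"(?!\w)")
    return re.compile("|".join(parts), re.IGNORECASE)


PATTERN = keyword_pattern(KEYWORDS)


def is_obgyn_related(entry_text: str) -> bool:
    """Return True if a free-text QA/PR entry mentions any obstetric or gynecologic keyword."""
    return PATTERN.search(entry_text) is not None


if __name__ == "__main__":
    print(is_obgyn_related("Missed left adnexal cyst on pelvic US"))            # True
    print(is_obgyn_related("Pneumothorax not reported on chest radiograph"))    # False
```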

Evaluation of Cases

Review and grading of cases were performed independently by two radiologists (J.R. and D.L., with 6 and 19 years of experience in obstetric and gynecologic imaging, respectively). All cases were evaluated for category of error, effect of error based on the clinical and radiologic follow-up, and subjective assessment of the probability of the error being repeated, as adapted from prior reports of grading of errors (17–20) (Tables E1, E2, Appendix E1 [online]).


Cases in which the two readers disagreed were discussed and consensus was reached, with the final consensus grading used for further analysis. Seventeen cases from the QA database and nine cases from the PR database were felt not to be true QA issues according to consensus agreement and were thus excluded from further analysis. The final study group included 185 cases in the QA database and 64 cases in the PR database (Fig 1).

Clinical Information

The following information was obtained for each case: patient age, modality, date of imaging study, date of error reporting, physician submitting the study, radiologists interpreting the initial study, and, for ultrasonography (US) cases, location of the scan, as these could be performed on-site, off-site, or in the emergency department. For follow-up studies that were used to judge the severity of the case and the effect on outcome, we also acquired the date and modality of the follow-up study. Clinical correlation was obtained through surgical and pathologic confirmation, when available. The total number of cases with each modality for evaluation of pelvic organs in women at our institution during the time periods given for the PR and QA databases was determined by using Current Procedural Terminology codes.


The QI issues in each database were summarized to subjectively assess for trends in the types of issues reported in each database.
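A minimal sketch of how the per-case information listed above could be represented is shown below; the field names and types are assumptions for illustration and do not reflect the actual departmental database schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class QICase:
    """Illustrative record of the per-case information collected in this study."""
    patient_age: int
    modality: str                       # e.g., "US", "CT", "MR"
    study_date: date
    report_date: date                   # date the error was reported
    submitting_physician: str
    interpreting_radiologists: List[str] = field(default_factory=list)
    us_location: Optional[str] = None   # "on-site", "off-site", or "emergency department" for US cases
    followup_date: Optional[date] = None
    followup_modality: Optional[str] = None
    pathologic_confirmation: bool = False

    @property
    def days_to_reporting(self) -> int:
        """Time between the imaging study and error reporting, in days."""
        return (self.report_date - self.study_date).days


if __name__ == "__main__":
    case = QICase(
        patient_age=44,
        modality="US",
        study_date=date(2011, 3, 1),
        report_date=date(2011, 6, 14),
        submitting_physician="attending",
        interpreting_radiologists=["reader_A"],
        us_location="off-site",
    )
    print(case.days_to_reporting)  # 105
```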

Statistical Analysis

The χ² test was used to compare differences in distribution between various groups. The Wilcoxon rank sum test was used to compare the mean patient ages and the mean time period between study and error reporting. The κ coefficient and percentage agreement were used to assess agreement between observers. κ values were interpreted according to Landis and Koch: less than 0.20, slight agreement; 0.21–0.40, fair agreement; 0.41–0.60, moderate agreement; 0.61–0.80, good agreement; and 0.81–0.99, very good agreement. Statistical analysis was performed with Matlab software (MathWorks, Natick, Mass). The level for statistical significance was set at P < .05.
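The analysis itself was performed with Matlab; purely as an illustration of the tests named above, the following Python sketch runs a χ² test, a Wilcoxon rank sum test, and a κ calculation with Landis and Koch labeling on made-up toy data (the scipy and scikit-learn calls and all numbers are illustrative, not the study's data or code).

```python
import numpy as np
from scipy.stats import chi2_contingency, ranksums
from sklearn.metrics import cohen_kappa_score


def landis_koch(kappa: float) -> str:
    """Interpret a kappa value using the Landis and Koch scale quoted above."""
    if kappa < 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Toy counts (not the study data): a categorical variable tallied in each database.
pr_counts = np.array([13, 2, 46])     # e.g., CT, MR, US entries in the PR database
qa_counts = np.array([43, 18, 111])   # corresponding QA database entries
chi2, p_chi2, dof, _ = chi2_contingency(np.vstack([pr_counts, qa_counts]))

# Toy continuous data: days between study and error reporting in each database.
pr_days = [5, 40, 120, 300, 15]
qa_days = [2, 10, 60, 90, 400, 7]
stat, p_rank = ranksums(pr_days, qa_days)

# Toy reader grades for interobserver agreement.
reader1 = ["minor", "minor", "moderate", "major", "minor"]
reader2 = ["minor", "moderate", "moderate", "major", "minor"]
kappa = cohen_kappa_score(reader1, reader2)

print(f"chi-square P = {p_chi2:.3f}")
print(f"Wilcoxon rank sum P = {p_rank:.3f}")
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)} agreement)")
```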


Table 1: Clinical and Demographic Data

Parameter                          PR Database              QA Database              P Value
No. of cases                       64                       185                      ...
Mean age (y)*                      44 ± 18 (18–87)          42 ± 16 (17–88)          .41
Mean time between submission
  and reporting (d)*               298 ± 584 (1–2640)       152 ± 368 (0–2958)       .10
Submission period                  March 21, 2007 to        September 2, 2004 to
                                   November 9, 2012         November 9, 2012
Modality                                                                             .23
  CT                               13 (20)                  43 (23)
  MR imaging                       2 (3)                    18 (10)
  US                               46 (72)                  111 (60)
  Angiography                      0 (0)                    7 (4)
  Nuclear medicine                 0 (0)                    2 (1)
  Radiography                      1 (2)                    1 (1)
  Fluoroscopy                      2 (3)                    3 (2)
Examination location                                                                 <.001
  Inpatient                        2 (3)                    20 (11)
  Outpatient                       37 (58)                  139 (75)
  Emergency room                   25 (39)                  26 (14)
Outpatient US                                                                        .14
  Off-site                         11 (46)                  24 (30)
  On-site                          13 (54)                  57 (70)

Note.—Unless indicated otherwise, numbers in parentheses are percentages.
* Data are means ± standard deviations, with ranges in parentheses.

Results

Clinical and Demographic Data

There were 185 pelvic imaging submissions in the QA database and 64 cases in the PR database. Mean patient age ± standard deviation was 44 years ± 18 (range, 18–87 years) for the QA database versus 42 years ± 16 (range, 17–88 years) for the PR database (P = .41). The mean time period between study and error reporting was 298 days ± 584 (range, 1–2640 days) for the QA database versus 152 days ± 368 (range, 0–2958 days) for the PR database (P = .10) (Table 1). Neither mean age nor mean time period between study and error reporting was significantly different between the QA and PR databases.

The QA database included cases from US, computed tomography (CT), magnetic resonance (MR) imaging, angiography, radiography, and nuclear medicine, while in the PR database there were no entries from angiography or nuclear medicine (Table 1). Cases in the QA database were submitted by attending radiologists (71 of 185, 38%), fellows (79 of 185, 43%), residents (47 of 185, 25%), and technologists (five of 185, 3%). Twenty-four attending radiologists submitted cases into the QA database, with the number of cases submitted per radiologist ranging from one to 29 (median, one case). Four attending radiologists submitted QI cases only through the PR database, and nine submitted cases through the QA database only. There were seven radiologists who specialized in obstetric and gynecologic imaging out of the total pool of 52 radiologists involved in pelvic imaging during the study period. Most QI submissions were made by these seven radiologists in both the PR (46 of 64, 72%) and QA (47 of 71, 66%) databases; however, these same seven radiologists had less than 50% of the QI cases attributed to them, and this proportion was lower in the PR database (21 of 64, 33%) than in the QA database (90 of 185, 49%).

QI issues included cases from 27 attending radiologists in the PR database (with a mean of 4% of cases attributable to any single attending physician, ranging from 0% to 13%) and 32 attending physicians in the QA database (with a mean of 3% of cases attributable to any single attending physician, ranging from 0% to 15% of cases in the database). Nine radiologists had cases attributed to them in the PR database only and 23 in the QA database only. A graphic summary of the case distribution is shown in Figure E1 (online).

Entries for outpatient studies constituted most of the submissions in the QA database (139 of 185, 75%), similar to the PR database (37 of 64, 58%), while more emergency room studies were submitted to the PR database (25 of 64, 39%) than to the QA database (26 of 185, 14%) (P < .001). When compared with the total number of pelvic imaging studies performed during the respective submission periods, there was a higher proportion of QI issues in the QA database than in the PR database for CT (0.04% vs 0.1%, P = .003), MR imaging (0.05% vs 0.4%, P = .002), US (0.05% vs 0.1%, P = .004), and angiography (0 vs 3.3%, P = .03) (Table 2).


Table 2: Obstetric and Gynecologic Quality Issues Submitted to PR and QA Databases Compared with Total Pelvic Imaging Performed

Modality        No. of QI Issues in the PR Database*    No. of QI Issues in the QA Database†    P Value
CT              13/31 935 (0.04)                        43/42 367 (0.1)                         .003
MR imaging      2/3668 (0.05)                           18/4625 (0.4)                           .002
US              46/85 986 (0.05)                        111/112 982 (0.1)                       .0004
Angiography     0/144 (0)                               7/209 (3.3)                             .03
Fluoroscopy     2/908 (0.22)                            3/1249 (0.24)                           .92

Note.—Denominators are total number of cases. Numbers in parentheses are percentages.
* Study dates range from February 2007 to December 2012.
† Study dates range from May 2005 to December 2012.

Table 3: Clinical Effect of Errors

Effect of Error                                                          PR Database    QA Database
No effect                                                                0 (0)          18 (10)
Not gradable, insufficient follow-up                                     5 (8)          3 (2)
Minor event                                                              49 (76)        104 (56)
  Semantics, incidental, not important finding, or already known
    from prior imaging                                                   19             5
  Delay in diagnosis of benign finding                                   21             33
  Potentially important event, but no effect                             9              66
Moderate event                                                           9 (14)         28 (15)
  Unnecessary follow-up imaging recommended                              9              4
  Increased level of care or length of stay                              0              16
  Radiation exposure unknowingly recommended during pregnancy            0              8
Major event                                                              1 (2)          32 (17)
  Delay in diagnosis of a malignant finding or major fetal anomaly,
    which could have been seen, where care might have been different     1              25
  Permanent lessening of bodily functioning or surgical intervention
    required for complication                                            0              7
Catastrophic event                                                       0 (0)          0 (0)

Note.—Numbers in parentheses are percentages.

Table 4: Categories of Submitted Cases

Main Category and Subcategory    PR Database    QA Database
Perceptional                     27 (42)        47 (25)
  Technical limitations          2              0
  True perceptual error          25             47
Interpretive                     30 (47)        64 (34)
  False-positive                 3              6
  Misclassification              27             58
Communication related*           5 (8)          12 (6)
Procedural                       2 (3)          62 (34)
  Technical issue                2              53
  Complications                  0              9

Note.—Numbers in parentheses are percentages.
* All communication errors were output errors.

Categories of Errors

The distribution of error categories differed significantly between the PR and QA databases (P < .001, Table 3). Procedure-related entries were submitted almost exclusively through the QA database (62 of 64, 97%), with only two of 64 (3%) procedure-related issues submitted through the PR database. Perceptional errors (47 of 185 [25%] and 27 of 64 [42%]) and interpretative errors (64 of 185 [34%] and 30 of 64 [47%]) represented most errors in the QA and PR databases, respectively (Table 4). The most frequent errors in the PR database were 11 (17%) misclassified adnexal lesions, 10 (16%) missed adnexal lesions, 10 (16%) missed uterine lesions, and eight (12%) misclassifications of endometrial abnormality. In the QA database, the leading errors were 53 (29%) technical issues, 25 (14%) misclassified adnexal lesions, 18 (10%) missed adnexal abnormalities, 15 (8%) missed fetal anomalies, and 12 (6%) misclassified fetal anomalies (Table E3 [online], Figures 1–4).


Severity of Error Effect

The distribution of error effect was significantly different between the QA and PR databases (P < .001). Most entries in both databases (49 of 64 [76%] in the PR database and 104 of 185 [56%] in the QA database) were considered to be minor events, owing to wording in the report (a semantics issue), a finding that was already known from the patient's history or prior imaging or concurrent follow-up imaging, or a delay in diagnosis of a benign finding. Each database had a similar percentage of cases considered to be moderate events (nine of 64 [14%] in the PR database and 28 of 185 [15%] in the QA database), such as recommending follow-up imaging that was not needed or radiation exposure in pregnancy without knowledge that the patient was pregnant. The PR database had fewer entries considered to be major events (one of 64 [2%] of all category 2, 3, and 4 PR cases and one of 37 [3%] of category 3 and 4 PR cases) than the QA database (32 of 185 [17%]); these major events were predominantly due to delay in diagnosis of a malignancy (Figs 4, 5) or a major structural fetal abnormality (Table 3). No catastrophic events were reported in either database.



Figure 2:  Images represent an example of a perceptual error in a 37-year-old woman who was evaluated for infertility. (a) Radiograph from hysterosalpingography showed a filling defect in the uterine cavity (arrow), but the uterus was identified as normal. (b) Nine months later, the patient underwent sonohysterography for continued infertility issues. The sagittal sonohysterogram shows an endometrial polyp (arrow), which was missed previously.

Probability of Error Occurrence

There was a significant difference between the estimated probabilities of case recurrence. Most errors submitted to the PR database were assessed as frequent (44 of 64, 69%) or occasional (18 of 64, 28%) errors, while the submissions to the QA database were believed to be occasional (94 of 185, 51%) or uncommon (70 of 185, 38%) errors (P < .001, Table 5).

Interobserver Agreement

There was moderate agreement (κ values, 0.40–0.57) between the two readers regarding major error category and its effect. There was slight agreement (κ values, 0.09–0.12) regarding the probability of the error to occur (Table E4 [online]).

Figure 3: Images represent an example of a perceptual error in a 70-year-old woman who underwent MR cholangiopancreatography for follow-up of pancreatic cysts. (a) Coronal MR image obtained with half-Fourier rapid acquisition with relaxation enhancement demonstrates a cyst in the right ovary (arrow), seen just on the edge of the images, that was not noticed by the interpreting radiologist. (b) Nearly 2 years after the MR imaging examination, the patient was evaluated with pelvic US for bloating. The sagittal US image shows a complex solid and cystic mass in the right ovary, which at pathologic examination was shown to be mucinous ovarian adenocarcinoma.

Discussion

As expected, we found that our PR and QA databases are complementary to each other in identifying different aspects of QI. The QA database yielded more error identification, with more clinically important errors, than did the PR database. The QA database included all modalities, while PR focused predominantly on US (given the nature of our review of obstetric and gynecologic cases) and interpretative errors. Procedural complications were submitted through the QA database, but not through the PR database. While these findings are to be expected given the nature of the reporting mechanisms, the types of imaging studies included in this review, and the varied culture of self-reporting among individuals, they nevertheless highlight important issues.

When we focus on quality metrics and evaluate submissions to a database such as our PR database, we conform to institutional guidelines regarding participation in a QI process. This allows for reporting from all physicians and by its nature should allow for a spectrum of types of imaging studies. However, this type of reporting, based on a proportion of imaging studies interpreted, will by nature lead to important quality issues being missed. Thus, each type of reporting serves separate but overlapping purposes. The PR data are structured to provide an estimate of the incidence of interpretative errors and to give data that are important for institutional assessment of physician competency, offering information on the distribution and relative effect of errors in a sample of cases. In contrast, the QA database is structured for any type of error to be submitted (for technical performance of the study or for professional interpretation) but tends to attract the more clinically important errors. Thus, errors with perceived major effect represented 1% and 17% of the cases in our PR and QA databases, respectively. This is also supported by the finding that errors submitted through the PR database were issues that were believed to occur more commonly, while the QA database had a higher percentage of occasional and uncommon errors.


Table 5: Distribution of the Probability of the Error to Occur

Error Probability    PR Database    QA Database
Frequent             44 (69)        7 (4)
Occasional           18 (28)        94 (51)
Uncommon             1 (2)          75 (40)
Remote               1 (2)          9 (5)

Note.—Numbers in parentheses are percentages.

Figure 4:  Images demonstrate a misclassification error in a 65-year-old woman who underwent CT colonography after incomplete colonoscopy. (a) Coronal nonenhanced CT image shows bilateral pelvic lesions (arrows), which were called fibroids. (b) Coronal contrast material–enhanced CT image acquired 2½ years later for abdominal pain shows bilateral complex ovarian lesions (arrows), with pathologic findings of serous neoplasm in the right ovary and borderline serous cystadenoma in the left ovary.


Figure 5:  Images represent an example of a major event in a 58-year-old woman who was examined as part of follow-up of pancreatic cancer. (a) Coronal contrast-enhanced CT image of the pelvis shows a complex adnexal lesion (arrow) that was not reported. (b) Coronal contrast-enhanced CT image acquired 8 months later for follow-up of pancreatic cancer shows that the lesion (arrow) had grown in size and complexity. Pathologic findings demonstrated mucinous adenocarcinoma, likely metastatic from pancreatic carcinoma.

We have also seen a different pattern of pelvic imaging QI submissions by different radiologists—some exclusively used the QA database, and some only submitted PR cases. Twenty-three radiologists, all but one of whom were routinely involved in pelvic imaging, were found to have QI cases attributed to them only through the QA database, compared with nine radiologists who had cases attributed to them through the PR database only. The explanation for this is likely that quality reporting is performed differently by modality-based and organ-based sections, as well as differently by various attending physicians. PR cases are usually submitted from the routine work within the subspecialty and are designed to be representative of the volume of examinations reported; therefore, most PR submissions in this study were from radiologists who specialized in obstetric and gynecologic imaging, which is a US-based practice. QA cases can be submitted by any member of the radiology department; thus, any attending physician, regardless of his or her primary subspecialty, may submit a case. This highlights the complementary nature of the two QI databases and shows the benefit of evaluating the cases from both databases together, not just according to modality, but also according to organ system.

Since both of these methods provide data that can be helpful, it behooves departments to ensure that the time spent in submission and analysis of cases is optimized. McEnery et al (21) suggest integration of the PR system into the picture archiving and communication system workstation. In a direct extension of this idea, Sheu et al (22) propose implementation of a mathematic model to "optimally select the types of cases to be reviewed, for each radiologist undergoing review, on the basis of the past frequency of interpretive error, the likelihood of morbidity from an error, the financial cost of an error, and the time required for the reviewing radiologist to interpret the study."

One of the important issues in PR and QA is the subjective nature of the submission and evaluation of cases. Soffa et al (23) showed that radiologists disagree in the interpretation of 3.5% of cases. In our grading of category and effect of QI cases, there was only moderate agreement, with κ values of 0.40–0.57, showing that disagreements regarding the nature of errors are common. This may relate to the fact that many errors have multiple contributing causes, making it difficult to assign an exact classification of error cause and resulting in disagreement between readers. In an attempt to reduce bias, Liu et al (24) proposed establishing benchmarks for diagnostic performance in radiology by means of correlation with clinical and pathologic follow-up. This would decrease subjective bias in QI programs but would require substantial investment in the manpower to perform this follow-up.

The data on individual radiologists cannot be compared across subspecialty areas, as different subspecialties have different case mixes (eg, interventional cases with procedural complications or pelvic bony anatomy cases, where gynecologic findings are not suspected given the patient's presenting symptoms). However, our analysis allows for grouped data to show whether individual radiologists need feedback on their performance.

Our study had limitations. First, the time periods within the two databases differed. We decided to use all of the data available, and we partially corrected for this by comparing submissions to the total pelvic imaging cases in each time period. Second, submissions to the QA database could be made by any individual; thus, an attending physician might delegate cases for submission to a technologist or trainee, so accurate data on submission by a radiologist to the QA database are lacking. Third, we evaluated only category 2–4 PR cases if a comment containing a pelvic imaging keyword was made. It is possible that we have not included all obstetric and gynecologic cases (if the keyword was misspelled in the original comment or an unusual term was used). However, since we used the same list of keywords in both searches, the influence of this bias would be equal in the two databases. As mentioned previously, the grading of types of errors is variable. We chose a grading system that had not been validated previously, since we had no other structure for this type of review. We hope that with future studies of this nature, a grading system for types of errors will be accepted among radiology practices to allow for better tracking of errors and more defined interventions to improve patient care. Finally, the results of our study are limited to the experience of a single institution, and this assessment is limited to obstetric and gynecologic imaging. Results might therefore not be applicable to other types of practices or areas of imaging.

In conclusion, a voluntary QA database is an important mechanism for detection of more clinically relevant but less frequent cases, while a PR database may be useful for routine assessment of physician performance. Therefore, although recent regulations have encouraged the use of PR databases, we should continue our efforts to encourage voluntary entry of all QA cases, as these parallel processes have complementary functions in continuous QI.

Disclosures of Conflicts of Interest: O.R.B. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: author received an educational grant from Guerbet. Other relationships: disclosed no relevant relationships. J.R. disclosed no relevant relationships. A.B. disclosed no relevant relationships. J.B.K. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: author received a stipend from Up-To-Date. Other relationships: disclosed no relevant relationships. C.S.Y. disclosed no relevant relationships. D.L. Activities related to the present article: disclosed no relevant relationships. Activities not related to the present article: author received payment from Elsevier and Up-To-Date. Other relationships: disclosed no relevant relationships.

References

1. Newman-Toker DE, Pronovost PJ. Diagnostic errors—the next frontier for patient safety. JAMA 2009;301(10):1060–1062.
2. Neale G, Hogan H, Sevdalis N. Misdiagnosis: analysis based on case record review with proposals aimed to improve diagnostic processes. Clin Med 2011;11(4):317–321.
3. Halsted MJ. Radiology peer review as an opportunity to reduce errors and improve patient care. J Am Coll Radiol 2004;1(12):984–987.
4. Butler GJ, Forghani R. The next level of radiology peer review: enterprise-wide education and improvement. J Am Coll Radiol 2013;10(5):349–353.
5. Donnelly LF, Strife JL. Performance-based assessment of radiology faculty: a practical plan to promote improvement and meet JCAHO standards. AJR Am J Roentgenol 2005;184(5):1398–1401.
6. Steele JR, Hovsepian DM, Schomer DF. The Joint Commission practice performance evaluation: a primer for radiologists. J Am Coll Radiol 2010;7(6):425–430.
7. American College of Radiology. New accreditation physician peer-review requirements effective April 1, 2007. American College of Radiology Web site. http://gm.acr.org/SecondaryMainMenuCategories/quality_safety/radpeer/new_requirements.asp. Accessed May 30, 2014.
8. Borgstede JP, Lewis RS, Bhargavan M, Sunshine JH. RADPEER quality assurance program: a multifacility study of interpretive disagreement rates. J Am Coll Radiol 2004;1(1):59–65.
9. Jackson VP, Cushing T, Abujudeh HH, et al. RADPEER scoring white paper. J Am Coll Radiol 2009;6(1):21–25.
10. Kruskal JB, Anderson S, Yam CS, Sosna J. Strategies for establishing a comprehensive quality and performance improvement program in a radiology department. RadioGraphics 2009;29(2):315–329.
11. Melvin C, Bodley R, Booth A, Meagher T, Record C, Savage P. Managing errors in radiology: a working model. Clin Radiol 2004;59(9):841–845.
12. Lee JK. Quality—a radiology imperative: interpretation accuracy and pertinence. J Am Coll Radiol 2007;4(3):162–165.
13. Mahgerefteh S, Kruskal JB, Yam CS, Blachar A, Sosna J. Peer review in diagnostic radiology: current state and a vision for the future. RadioGraphics 2009;29(5):1221–1231.
14. Larson DB, Nance JJ. Rethinking peer review: what aviation can teach radiology about performance improvement. Radiology 2011;259(3):626–632.
15. Abujudeh H, Pyatt RS Jr, Bruno MA, et al. RADPEER peer review: relevance, use, concerns, challenges, and direction forward. J Am Coll Radiol 2014 May 16. [Epub ahead of print]
16. Jones DN, Benveniste KA, Schultz TJ, Mandel CJ, Runciman WB. Establishing national medical imaging incident reporting systems: issues and challenges. J Am Coll Radiol 2010;7(8):582–592.
17. Brook OR, Kruskal JB. In search of improvement: characterizing errors in diagnostic radiology. ARRS Categorical Course Syllabus: pitfalls in clinical imaging. Leesburg, Va: American Roentgen Ray Society, 2012.
18. DeRosier J, Stalhandske E, Bagian JP, Nudell T. Using health care Failure Mode and Effect Analysis: the VA National Center for Patient Safety's prospective risk analysis system. Jt Comm J Qual Improv 2002;28(5):248–267, 209.
19. Renfrew DL, Franken EA Jr, Berbaum KS, Weigelt FH, Abu-Yousef MM. Error in radiology: classification and lessons in 182 cases presented at a problem case conference. Radiology 1992;183(1):145–150.
20. Donald JJ, Barnard SA. Common patterns in 558 diagnostic radiology errors. J Med Imaging Radiat Oncol 2012;56(2):173–178.
21. McEnery KW, Suitor CT, Hildebrand S, Downs RL. Integration of radiologist peer review into clinical review workstation. J Digit Imaging 2000;13(2 Suppl 1):101–104.
22. Sheu YR, Feder E, Balsim I, Levin VF, Bleicher AG, Branstetter BF 4th. Optimizing radiology peer review: a mathematical model for selecting future cases based on prior errors. J Am Coll Radiol 2010;7(6):431–438.
23. Soffa DJ, Lewis RS, Sunshine JH, Bhargavan M. Disagreement in interpretation: a method for the development of benchmarks for quality assurance in imaging. J Am Coll Radiol 2004;1(3):212–217.
24. Liu PT, Johnson CD, Miranda R, Patel MD, Phillips CJ. A reference standard–based quality assurance program for radiology. J Am Coll Radiol 2010;7(1):61–66.
