Computed Tomography to Stage Lung Cancer: Approaching a Controversy Using Meta-analysis1-3

ROBERT E. DALES,4 RYAN M. STARK, and SANKARANARAYANAN RAMAN

SUMMARY The ability of computed tomography (CT) to detect mediastinal lymph node metastases from nonsmall cell bronchogenic lung cancer is highly controversial, as evidenced by reported accuracies ranging from 0.35 to 0.95 over the past eight years. We examined all studies on this matter published between January 1980 and April 1988, both to describe the overall experience and to identify characteristics (study design and methodology and CT scan techniques) that influenced reported accuracy. Of 79 relevant publications, 37 were excluded because they were review reports, assessed small cell lung cancer, or contained insufficient evidence to construct a contingency table (CT result versus node histology). The pooled, unweighted (weighted) results based on the remaining 42 studies were as follows: sensitivity, 0.79 (0.83); specificity, 0.78 (0.81); accuracy, 0.79 (0.81). Using a node size greater than 1.0 cm to define a "positive" CT result, as compared to a smaller diameter, was associated with significantly higher specificity, 0.89 versus 0.76, and accuracy, 0.86 versus 0.75 (p ≤ 0.005), but not sensitivity, 0.79 versus 0.75. The observed differences in accuracy between a fourth generation CT (0.83) and either a third or a second generation CT (0.77 and 0.78, respectively) were not significant at p < 0.05. No characteristics, either singly or in combination, resulted in accuracies exceeding 0.86. There exists random variation of individual study results around an overall mean accuracy of only 0.79, which is marginally improved by advances in CT technology and methods. Significant advances in the noninvasive detection of lymph node metastases must await an approach fundamentally different from CT-determined node size.
AM REV RESPIR DIS 1990; 141:1096-1101

(Received in original form June 6, 1989 and in revised form November 2, 1989)
1 From the Departments of Medicine, Epidemiology, and Community Health, University of Ottawa, and the Ottawa General Hospital, Ottawa, Ontario, Canada.
2 Supported by a grant from the Ontario Thoracic Society, Canada.
3 Correspondence and requests for reprints should be addressed to Robert Dales, M.D., Ottawa General Hospital, 501 Smyth Road, Ottawa, ON K1H 8L6, Canada.
4 Career Scientist, Ontario Ministry of Health, Canada.

Introduction

Worldwide, lung cancer is the most common cause of cancer in men, and sixth among women (1). Accurate staging, assessing the extent of local and distant disease, is necessary to determine resectability and overall prognosis. Preoperative mediastinoscopy and/or anterior mediastinotomy, with lymph node sampling, has been shown to reduce the number of unnecessary thoracotomies (2). If computed tomography (CT), a noninvasive test, could adequately evaluate the mediastinum for the presence or absence of metastases, the morbidity, mortality, and hospital costs associated with mediastinoscopy/mediastinotomy would be reduced considerably. Many studies assessing accuracy of CT in detecting mediastinal disease, however, have reported widely varying accuracies, ranging from 0.35 to 0.95 (3, 4). This uncertainty makes clinical decisions based on CT results difficult if not impossible.

Baron and colleagues (4) and Lewis and associates (5), in prospective studies of 75 and 98 patients, respectively, presented favorable results using CT for staging bronchogenic carcinoma. Considering only nodes larger than 1 cm as abnormal, Lewis and associates (5) reported a sensitivity of 0.91 and a specificity of 0.94. Similarly, Baron and colleagues (4) reported that CT correctly staged 0.94 of resectable lesions and concluded that patients should proceed directly to thoracotomy when no nodes larger than 1 cm are visualized because the incidence of mediastinal metastatic disease in this instance was found to be low. More recently, Glazer and coworkers (6) and Gross and colleagues (7) reported a sensitivity of 0.95 and a specificity of 0.82 using the 1-cm criterion. Other investigators reporting similar findings have also suggested that mediastinoscopy can be avoided if CT is negative and the patient is otherwise fit for surgical resection (4-13). Conversely, McKenna and coworkers (14) found a poor correlation between the size of lymph nodes and the presence of metastases. They concluded that newer imaging techniques had little impact in assessing patients and that surgical staging should proceed regardless of CT findings. Rhoads and colleagues (15) similarly reported CT to have both a poor sensitivity, 0.56, and accuracy, 0.64.

The results of these studies done over the past decade are difficult to reconcile. There is, between studies, wide variation in the size criteria used to define "positive" nodes, CT scanning techniques, and the extent of surgical exploration used to prove metastases.

The present study approaches this controversy using meta-analysis, a quantitative approach to integrating results from many studies. This technique allowed us to summarize the results of CT as a diagnostic test for mediastinal lymph node involvement by nonsmall cell carcinoma of the lung and to examine the influence of study methods on the observed results.

Methods

The methods used in our meta-analysis presented below followed those recommended by Sacks and colleagues (16) and L'Abbe and coworkers (17).

Literature Search

Medical literature in English and French (from January 1, 1980 through March 1, 1988) was searched for studies that examined mediastinal lymph node assessment by CT to stage nonsmall cell bronchogenic carcinoma of the lung. Current Contents and Index Medicus were extensively reviewed for potential reports. A computer search of the data bases of the National Library of Medicine was carried out using the key words "CT staging," "neoplasm," and/or "bronchogenic carcinoma" in the title, abstract, or text. References of each article found were also searched for appropriate studies.

Inclusion Criteria

Studies were included if they represented original research and presented the data required to construct a 2 x 2 contingency table (CT positive/negative versus histologic evidence of nonsmall cell bronchogenic carcinoma in lymph node positive/negative). Where one group of subjects appeared in more than one publication, the subjects were counted only once. A log was kept of all studies assessed, and the reasons for exclusion were recorded.

Information Extracted

Factors that could potentially influence CT staging results through precision or bias were extracted, where possible, from the publications. These were as follows: year of study, country, study center, study design, blinding of surgeon and radiologist, scanner generation, scan time, scan spacing, scan thickness, contrast enhancement used, lymph node size used to define a positive and negative test, thoroughness of mediastinal surgical dissection used to prove histologic existence of nodal metastases, and cancer location. Taking these factors into account, many of which presumably reflected study "quality," allowed us to assess quantitatively the differences between study results due to these factors and thus avoid making a subjective distinction between relatively "good" and relatively "poor" studies. For example, rather than excluding a priori studies in which the mediastinum was not completely dissected for lymph nodes, we compared the results between these studies and those that reported complete dissection.
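The 2 x 2 contingency table required for inclusion can be summarized with a few lines of code. The following is a minimal sketch, not the authors' software: the class name `ContingencyTable`, its field names, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContingencyTable:
    """2 x 2 table of CT result versus node histology (counts of subjects).

    Hypothetical helper for illustration; field names are assumptions.
    """
    true_pos: int   # CT positive, metastases proved histologically
    false_pos: int  # CT positive, nodes histologically negative
    false_neg: int  # CT negative, metastases proved histologically
    true_neg: int   # CT negative, nodes histologically negative

    @property
    def total(self) -> int:
        return self.true_pos + self.false_pos + self.false_neg + self.true_neg

    def sensitivity(self) -> float:
        # Proportion of diseased subjects with a positive CT.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of disease-free subjects with a negative CT.
        return self.true_neg / (self.true_neg + self.false_pos)

    def accuracy(self) -> float:
        # Proportion of all subjects correctly classified by CT.
        return (self.true_pos + self.true_neg) / self.total

    def prevalence(self) -> float:
        # Proportion of subjects with histologically proved metastases.
        return (self.true_pos + self.false_neg) / self.total


# Made-up counts for a study of 100 subjects (not taken from any cited report).
example = ContingencyTable(true_pos=30, false_pos=12, false_neg=8, true_neg=50)
summary = (example.sensitivity(), example.specificity(), example.accuracy())
```

A study that does not report enough data to fill all four cells cannot yield these three quantities, which is why such reports were excluded.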

TABLE 1
UNWEIGHTED SENSITIVITY, SPECIFICITY, AND ACCURACY OF COMPUTED TOMOGRAPHY PRESENTED BY INDIVIDUAL STUDY CHARACTERISTICS*

Characteristic                            Number of Studies  Sensitivity         Specificity         Accuracy
Study design
  Prospective                             38                 0.79 (0.29-0.96)    0.79 (0.38-0.98)    0.78 (0.35-0.95)
  Retrospective                           5                  0.77 (0.57-0.96)    0.80 (0.46-0.99)    0.84 (0.64-0.92)
Year study began
  1980-1984                               21                 0.75 (0.29-0.96)    0.78 (0.38-0.98)    0.77 (0.35-0.95)
  1985-1988                               22                 0.82 (0.57-0.96)    0.80 (0.46-0.99)    0.81 (0.59-0.93)
Tumor location
  Central                                 7†                 0.67 (0.50-0.98)    0.77 (0.25-0.98)    0.84 (0.73-0.93)
  Peripheral                              7                  0.86 (0.53-0.95)    0.74 (0.53-0.92)    0.81 (0.53-0.90)
  N/A                                     36                 0.77 (0.29-0.96)    0.79 (0.38-0.99)    0.78 (0.35-0.95)
CT generation
  Second, third                           28                 0.75 (0.79-0.95)‡   0.78 (0.38-0.99)    0.77 (0.35-0.93)
  Fourth                                  14                 0.84 (0.76-0.95)    0.82 (0.59-0.98)    0.83 (0.69-0.95)
  N/A                                     1                  0.94                0.63                0.76
CT scan time, s
  ≤ 4.0                                   15                 0.85 (0.73-0.96)    0.81 (0.50-0.99)    0.84 (0.69-0.95)
  > 4.0                                   14                 0.78 (0.79-0.96)    0.76 (0.38-0.97)    0.77 (0.35-0.90)
  N/A                                     13                 0.72 (0.37-0.96)    0.78 (0.50-0.98)    0.75 (0.59-0.92)
CT scan thickness, cm
  ≤ 1.0                                   30                 0.80 (0.37-0.96)    0.79 (0.46-0.99)    0.80 (0.59-0.95)
  > 1.0                                   8                  0.74 (0.29-0.96)    0.75 (0.38-0.97)    0.75 (0.35-0.84)
  N/A                                     4                  0.75 (0.61-0.83)    0.79 (0.67-0.91)    0.78 (0.71-0.88)
CT scan spacing, cm
  ≤ 1.0                                   29                 0.80 (0.37-0.96)    0.79 (0.46-0.99)    0.80 (0.59-0.95)
  > 1.0                                   11                 0.75 (0.29-0.96)    0.76 (0.38-0.97)    0.77 (0.35-0.89)
  N/A                                     2                  0.80 (0.77-0.83)    0.79 (0.67-0.91)    0.80 (0.71-0.88)
Contrast enhancement
  Used                                    22                 0.78 (0.29-0.96)    0.75 (0.38-0.99)    0.77 (0.35-0.92)
  Not used                                3                  0.84 (0.76-0.94)    0.84 (0.63-0.98)    0.84 (0.76-0.89)
  N/A                                     17                 0.79 (0.40-0.96)    0.82 (0.50-0.98)    0.81 (0.60-0.95)
Node size for "positive" CT, cm
  0.5-1.0                                 22                 0.75 (0.29-0.96)    0.76 (0.38-0.99)§   0.75 (0.35-0.93)§
  > 1.0                                   10                 0.79 (0.50-0.91)    0.89 (0.67-0.98)    0.86 (0.78-0.95)
  N/A                                     11                 0.86 (0.61-0.96)    0.73 (0.50-0.85)    0.81 (0.74-0.92)
Extent of mediastinal node dissection
  Occasional sampling                     3                  0.93 (0.88-0.96)    0.78 (0.68-0.94)    0.85 (0.79-0.92)
  Biopsy of visible, palpable nodes       23                 0.78 (0.29-0.96)    0.79 (0.38-0.98)    0.79 (0.35-0.93)
  Complete mediastinal dissection         6                  0.83 (0.60-0.96)    0.75 (0.50-0.99)    0.78 (0.59-0.91)
  N/A                                     11                 0.74 (0.40-0.91)    0.81 (0.46-0.98)    0.78 (0.61-0.95)

Definition of abbreviations: N/A = not available; CT = computed tomography.

* Range of values is in parentheses.
† Seven of the 42 studies stratified their results by tumor location and therefore appear twice in the rows titled central and peripheral.
‡ p ≤ 0.05.
§ p ≤ 0.005.

Method of Analysis

Sensitivity, specificity, and accuracy were used as indicators of CT performance as a diagnostic test because, unlike predictive values, they are independent of disease prevalence, which varied between studies. Sensitivity was defined as the probability of CT indicating mediastinal disease (i.e., test positive) when histologic evidence of mediastinal metastases existed (i.e., disease present). Specificity was defined as the probability of CT not indicating mediastinal disease (i.e., test negative) when histologic evidence of mediastinal metastases was absent (i.e., disease absent). Accuracy was defined as the probability that CT was positive when disease was present or CT was negative when disease was absent, i.e., the proportion of all subjects correctly classified.

Because methods of analysis could potentially influence findings and thus conclusions, we subjected the data to many different methods, looking for consistency of results as an indicator of validity. Studies were pooled to provide an overall experience, both unweighted and weighted. Weighting each study by the inverse of its variance, a commonly used method that reflects the certainty of reported results, would have systematically biased the weighted means toward better results because, above 0.5, the variance of an observed proportion decreases with its increasing magnitude. Thus, if in a study of 100 subjects the true accuracy was 0.7, studies reporting 0.5 would be given considerably less weight (weight = 400) than those reporting 0.9 (weight = 1250). To remove this extreme effect and stabilize the variance, results of each study were transformed by taking the arcsine of the square root so that the variances of the transformed results became proportional to the inverses of the sample sizes (18). Transformed results were then weighted by the inverse of the variance (now the sample size), averaged, and retransformed.

Group comparisons were made using t tests or F tests (ANOVA), depending on the number of groups, to compare sensitivity, specificity, and accuracy between the groups of interest (table 1). Studies were then divided into two groups, based on having higher or lower accuracy, and discriminant analyses were performed to determine which study characteristics were predictive of being in the higher accuracy group. Finally, subjects from all studies were pooled to create one large data set to effectively weight the overall experience by sample size, and logistic regression was performed. In this latter analysis, each subject was attributed the characteristics of the study that contained that subject, and thus a number of subjects shared identical characteristics. These naturally imposed correlations may influence the observed association between accuracy and any one study characteristic.
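The weighting scheme described above can be illustrated with a short sketch. It assumes the simple binomial variance p(1 - p)/n for an observed proportion and the angular transformation arcsin(sqrt(p)) measured in radians; the function names and the two-study example are hypothetical. Under that variance assumption the sketch reproduces the weight of 400 quoted for an accuracy of 0.5 in 100 subjects; the exact weight for 0.9 depends on the variance formula assumed.

```python
import math


def inverse_variance_weight(p: float, n: int) -> float:
    """Weight 1/variance, assuming the binomial variance p(1 - p)/n."""
    return n / (p * (1.0 - p))


def raw_inverse_variance_mean(proportions, sample_sizes) -> float:
    """Weighted mean of untransformed proportions (the biased approach)."""
    weights = [inverse_variance_weight(p, n)
               for p, n in zip(proportions, sample_sizes)]
    return sum(w * p for w, p in zip(weights, proportions)) / sum(weights)


def arcsine_weighted_mean(proportions, sample_sizes) -> float:
    """Transform, weight by sample size, average, and retransform.

    After arcsin(sqrt(p)) the variance is approximately 1/(4n), so the
    inverse-variance weight is proportional to n alone.
    """
    angles = [math.asin(math.sqrt(p)) for p in proportions]
    total_n = sum(sample_sizes)
    mean_angle = sum(n * a for n, a in zip(sample_sizes, angles)) / total_n
    return math.sin(mean_angle) ** 2


# Two hypothetical studies of 100 subjects each, reporting 0.5 and 0.9.
accuracies = [0.5, 0.9]
sizes = [100, 100]
biased = raw_inverse_variance_mean(accuracies, sizes)
stabilized = arcsine_weighted_mean(accuracies, sizes)
```

In this example the untransformed inverse-variance mean is pulled toward the study reporting 0.9, whereas the transformed mean gives the two equally sized studies equal weight.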

Publication Bias

As an indication of publication bias, a "funnel plot" was constructed to determine whether or not smaller studies (with larger random error) found in the published literature were more likely to report better or worse accuracy than larger studies (with less random error) (figure 1). See Vandenbroucke (1988) for a further example of this technique (19).

[Figure 1: accuracy (vertical axis, 0.0 to 1.2) plotted against sample size (horizontal axis, 20 to 220).]

Fig. 1. For each study, the accuracy and 95% confidence interval, represented on a vertical bar, are plotted against sample size. The weighted mean of the reported accuracy and its 95% confidence interval are represented by the three horizontal bars. The asymmetry of this latter confidence interval is a consequence of the transformation used (see text). Studies of similar sample size share the same vertical bar.
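The data behind a funnel plot of this kind can be tabulated as follows. This is a sketch under stated assumptions rather than the authors' procedure: the confidence interval is computed on the arcsin(sqrt(p)) scale with standard error 1/(2 sqrt(n)) and then retransformed, which is one common choice and is assumed here; the function names are hypothetical. The five sample sizes and accuracies are taken from table 4, and 0.79 is the unweighted mean accuracy.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def arcsine_confidence_interval(p: float, n: int, z: float = Z_95):
    """Approximate CI for a proportion via the arcsin(sqrt(p)) transform."""
    angle = math.asin(math.sqrt(p))
    half_width = z / (2.0 * math.sqrt(n))
    low = max(0.0, angle - half_width)
    high = min(math.pi / 2.0, angle + half_width)
    return math.sin(low) ** 2, math.sin(high) ** 2


# (first author, N, accuracy) for five of the studies listed in table 4.
STUDIES = [
    ("Ekholm", 35, 0.35),
    ("Glazer", 49, 0.78),
    ("Baron", 98, 0.95),
    ("Brion", 153, 0.61),
    ("Staples", 151, 0.91),
]


def funnel_rows(studies, overall_mean: float):
    """Return rows sorted by sample size, flagging CIs that cover the mean."""
    rows = []
    for name, n, accuracy in sorted(studies, key=lambda s: s[1]):
        low, high = arcsine_confidence_interval(accuracy, n)
        covers_mean = low <= overall_mean <= high
        rows.append((name, n, accuracy, low, high, covers_mean))
    return rows


rows = funnel_rows(STUDIES, overall_mean=0.79)
for name, n, accuracy, low, high, covers_mean in rows:
    print(f"{name:8s} n={n:3d} acc={accuracy:.2f} "
          f"CI=({low:.2f}, {high:.2f}) covers mean: {covers_mean}")
```

Plotting accuracy against N with these intervals as vertical bars gives the funnel shape: the intervals narrow as sample size grows, and an excess of small studies on one side of the mean would suggest publication bias.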

Results

Seventy-nine peer-reviewed articles in English and French addressing the CT evaluation of mediastinal lymph nodes in lung cancer were published between January 1, 1980 and March 1, 1988. Thirty-seven were excluded for the following reasons: review articles (n = 17), insufficient data presented to construct a contingency table (CT results versus node histology) (n = 16), and primary cancer type was small cell (n = 4). Thus, 42 studies contained sufficient information for inclusion in our study (3, 4, 6-15, 20-49) (table 4), the majority (30/42) of which were from the United States. One to three studies each came from Belgium, Canada, England, Italy, Japan, Netherlands, New Zealand, Scotland, Sweden, and Switzerland. Because of the small numbers, no conclusions about the association between accuracy and center/country could be drawn.

The unweighted mean (range) prevalence of mediastinal nodal metastases was 0.38 (0.20 to 0.91) in the 42 studies analyzed. The unweighted means of sensitivity, specificity, and accuracy were 0.79, 0.78, and 0.79, respectively, and very close to their respective weighted values (table 2). The corresponding proportions of false positive results (1 - specificity) and false negative results (1 - sensitivity) were 0.22 and 0.21. The overall mean positive and negative predictive values, 0.71 and 0.86, respectively, were not considered further in the analysis because their interpretation depends on prevalence, which varied widely between studies.
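The dependence of predictive values on prevalence follows directly from Bayes' rule and can be sketched as below. The function name is hypothetical; the inputs are the unweighted mean sensitivity and specificity and the lowest, mean, and highest prevalence reported above. Because the published predictive values are means of per-study values, a calculation from pooled means need not reproduce them exactly.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (positive, negative) predictive values via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test performance, three prevalences spanning the reported range.
by_prevalence = {
    prevalence: predictive_values(0.79, 0.78, prevalence)
    for prevalence in (0.20, 0.38, 0.91)
}
for prevalence, (ppv, npv) in by_prevalence.items():
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the negative predictive value falls steeply as prevalence rises, which is why the same CT result carries different meaning in study populations with different proportions of nodal disease.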

TABLE 2
POOLED UNWEIGHTED AND WEIGHTED SENSITIVITY, SPECIFICITY, AND ACCURACY

                                Sensitivity   Specificity   Accuracy
Unweighted analysis, n = 42
  Mean                          0.79          0.79          0.79
  95% confidence interval       0.74-0.84     0.74-0.84     0.75-0.83
Weighted analysis*, n = 42
  Mean                          0.83          0.82          0.80
  95% confidence interval       0.78-0.87     0.78-0.85     0.78-0.84

* Values were transformed by taking the arcsine and square root, weighted by inverse of variance, summed, and then retransformed.

Many studies, as expected, had missing values for a few of the many characteristics we assessed (table 1). No study contained all the variables considered; very few (n = 7) reported results stratified by tumor location (central/peripheral) or full details of the scanning technique used. Blinding of the radiologist to surgical staging was rarely mentioned (n = 3), and blinding of the surgeon and pathologist to CT results was never indicated. In one-quarter of the studies, criteria used to define a positive CT or the presence of disease were not described in detail. The use of contrast enhancement and actual scan time were often not mentioned. In general, accuracy tended to be slightly lower in those studies not reporting details of their methods as compared to those that did.

The higher accuracy observed with fourth-generation CT (0.83), as compared to third or second generation (0.77 and 0.78, respectively), was not significant at p < 0.05. The use of a larger size criterion (from 1.1 to 2.5 cm) to define a positive node was associated with better specificity and accuracy (p < 0.005 for both). The unexpectedly greater observed sensitivity associated with a large node size was small, not statistically significant, and must have resulted from other differences between studies that influence sensitivity.

Contrasting the ten studies reporting the highest accuracy (≥ 0.89) with the ten studies reporting the lowest (≤ 0.74) revealed no statistically significant differences in characteristics using the Fisher exact test at p < 0.05. The three differences of largest magnitude were as follows. Of the highest accuracy studies, 60% were performed between 1985 and 1988 inclusive, 50% used fourth generation scanners, and 38% used node sizes greater than 1 cm (between 1.1 and 2.5 cm) to define a positive test. Respective values for the lowest accuracy studies were 40, 10, and 0%.

Studies were then divided into "higher" and "lower" accuracy groups based on whether reported accuracies were higher or lower than the overall median value of 0.80. Linear discriminant functions included three to five variable combinations of the following: study designs, year study began, CT generation, scan spacing, size criteria used to define a "positive" node, CT scan thickness, scan time, extent of mediastinal dissection to establish nodal metastases.
The models classified studies into "higher" and "lower" groups with accuracies ranging from 0.60 to 0.77; the highest accuracy was obtained by a combination of the first five variables listed above and excluded five studies because of missing values. CT generation was consistently the most important variable. Using a stepwise approach, only generation would enter the model (F = 4.31, 1 and 38 degrees of freedom, p < 0.05). Strong correlations between many of the study characteristics reduced the ability of the models to distinguish between higher and lower accuracy groups and limited interpretation of individual coefficients, which are therefore not reported. Examples of correlations were as follows: CT generation and scan time, r = 0.82, p = 0.0001; scan time, scan thickness, and scan spacing, r = 0.60 to 0.64, p ≤ 0.001.

Logistic regression analysis (50), using the pooled data set composed of all subjects, confirmed the poor explanatory power of the study characteristics seen with both the univariate analyses (table 1) and the multivariate discriminant analyses. The adjusted odds ratios (derived from logistic regression) of correct classification (a true positive or true negative result) with a fourth generation versus a second or third generation CT ranged from 1.7 to 2.2 (both p < 0.0001), depending on the accompanying variables in the model. Thus, the observed higher accuracy of a fourth generation scanner (0.83) reached statistical significance where each subject was a separate observation (p < 0.001) but not by the previous ANOVA where each study was a separate observation (p > 0.05). Corresponding odds ratios for criteria used to define a positive node, extent of dissection, CT scan time, scan thickness, and scan spacing were all less than two.

Guided by the results of the previously described univariate and multivariate analyses, we assessed the reported accuracies of studies with various combinations of "favorable" characteristics (table 3). Again, no one combination was found to be greatly superior to another.

Figure 1 shows that many confidence intervals included the overall mean, indicating that they were not significantly different from the overall experience at p < 0.05. This funnel plot also demonstrated no evidence of publication bias, i.e., an excess of small studies with results either better or worse than the overall average.

TABLE 3
DIAGNOSTIC PERFORMANCE OF COMPUTED TOMOGRAPHY PRESENTED BY COMBINATIONS OF FAVORABLE STUDY CHARACTERISTICS

Variable Combination                                          Number of Studies  Sensitivity  Specificity  Accuracy
Fourth generation scanner, node size criterion > 1 cm         10                 0.84         0.83         0.84
Fourth generation scanner, node size criterion > 1 cm,
  occasional mediastinal node sampling                        7                  0.83         0.81         0.82
Fourth generation scanner, scan time ≤ 4 s,
  scan thickness ≤ 1.0 cm                                     12                 0.84         0.82         0.83
Fourth generation scanner, scan time ≤ 4 s,
  scan thickness ≤ 1.0 cm, scan spacing ≤ 1.0 cm              10                 0.84         0.83         0.84

TABLE 4
STUDIES USED FOR PRESENT META-ANALYSIS*

First Author (reference number)   Year   N†    Sensitivity  Specificity  Accuracy
Hirleman, MT (30)                 1980   50    0.96         0.73         0.84
Richardson, JV (43)               1980   50    0.61         0.81         0.74
Ekholm, S (3)                     1980   35    0.29         0.38         0.35
Rea, HH (9)                       1981   22    0.80         0.76         0.77
Faling, LJ (8)                    1981   49    0.88         0.94         0.92
Modini, C (36)                    1982   41    0.50         0.97         0.83
Osborne, DR (10)                  1982   42    0.94         0.63         0.76
Wouters, EFM (48)                 1982   21    0.40         0.92         0.68
Moak, GO (37)                     1982   59    0.49         0.94         0.60
Yee, ES (49)                      1982   302   0.96         0.50         0.92
Baron, RL (4)                     1982   98    0.91         0.98         0.95
Goldstraw, P (11)                 1983   39    0.77         0.85         0.82
Shevland, JE (45)                 1983   35    0.86         0.76         0.80
Richey, HM (44)                   1984   48    0.95         0.68         0.79
Libshitz, H (33)                  1984   50    0.37         0.81         0.64
Glazer, GM (6)                    1984   49    0.95         0.64         0.78
Friedman, PJ (27)                 1984   45    0.78         0.59         0.69
Breyer, RH (12)                   1984   56    0.82         0.97         0.91
Daly, BD (39)                     1984   97    0.80         0.91         0.88
Frederick, HM (26)                1984   74    0.86         0.77         0.80
Imhof, E (31)                     1985   57    0.81         0.78         0.79
Feigin, OS (24)                   1985   54    0.95         0.61         0.74
Webb, RW (47)                     1985   21    0.90         0.82         0.86
Brion, JP (21)                    1985   153   0.89         0.46         0.61
Conte, CC (22)                    1985   75    0.85         0.89         0.88
Heelan, RT (29)                   1985   20    0.91         0.70         0.81
Graves, G (28)                    1985   41    0.89         0.96         0.93
McKenna, RJ (14)                  1985   102   0.60         0.58         0.59
Khan, A (32)                      1985   50    0.83         0.91         0.88
Nakata, H (38)                    1986   59    0.85         0.91         0.90
Ferguson, MK (25)                 1986   61    0.68         0.91         0.87
Milroy, R (35)                    1986   66    0.72         0.82         0.77
Rhoads, AC (15)                   1986   75    0.57         0.69         0.64
Daly, B (23)                      1987   199   0.78         0.67         0.71
Patterson, GA (40)                1987   84    0.76         0.98         0.89
Matthews, JI (34)                 1987   174   0.86         0.78         0.81
Backer, CL (20)                   1987   77    0.95         0.91         0.92
Rendina, EA (42)                  1987   171   0.96         0.85         0.89
Platt, JF (41)                    1987   103   0.86         0.87         0.86
Gross, BH (7)                     1988   39    0.73         0.82         0.79
Ratto, GV (13)                    1988   100   0.96         0.50         0.63
Staples, CA (46)                  1988   151   0.79         0.99         0.91

* For each study, sensitivity, specificity, and accuracy were recalculated from raw data because inconsistencies were occasionally found in reported results.
† N represents the number of subjects we used in our analysis. It will be smaller than the number reported in the original report if some subjects already appeared in other studies.

Discussion

We found that the mean sensitivity, specificity, and accuracy of CT as a diagnostic test for lymph node metastases approximated 0.79, and results weighted by sample size were similar, ranging from 0.81 to 0.83. As might be expected a priori from improved technology, fourth generation CT scanners were associated with improved accuracy, statistically significant only when each subject was the unit of observation. Using a large node size criterion improved accuracy by a significant reduction in the false positive rate with little effect on the false negative rate. However, no single variable or combination of weighted and unweighted variables greatly improved accuracy, indicating that the majority of differences between studies represent unexplained random variation around a true mean of about 0.80, as illustrated in figure 1.

Between studies, there were many possible differences that could account for this variation in results. First, the wide range in prevalence of mediastinal nodal metastases between studies (see RESULTS) suggests that perhaps some study groups are investigated at a later stage in their disease, which may alter CT performance. Second, although we have shown that CT scan technique is important, we were unable to assess aspects of the radiologist's interpretation, except for lymph node size, used to define a nodal metastasis. Third, extensive mediastinal dissection, compared to sampling of only visible or palpable nodal abnormalities, did not appear to increase significantly the observed false negative or positive reports either by finding histologic evidence of metastases in "normal" sized lymph nodes or benign disease causing enlargement in "positive" nodes. However, the average sensitivity of 0.93 reported by the studies that performed only occasional sampling of nodes is difficult to interpret because "disease" was never well defined. Fortunately, these latter reports were few in number (n = 3). Fourth, blinding of the surgeon to the CT results, which could have considerably influenced the extent of dissection and contributed to unexplained variation between studies, was never mentioned. Presented with large mediastinal nodes on CT, one could easily envision a surgeon performing a more thorough mediastinoscopy in order to identify and biopsy these suspicious nodes than in the situation where the CT was unremarkable.

It is possible that our meta-analysis does not represent the true overall experience because of the studies we chose to consider or our methods of combination and summarization. To avoid a bias in selecting studies, we used minimal exclusion criteria, i.e., the inability to form a 2 x 2 contingency table comparing CT results with histologic findings. Rather than make subjective assessments of study "quality," usually recommended in meta-analysis, we stratified by characteristics related to quality such as extent of node dissection and prospective versus retrospective study designs. This way quality issues were assessed for their effects on reported accuracy rather than resulting in exclusions that provide no information. Publication bias, whereby unpublished studies have systematically different results than those published, can never be excluded, but the funnel plot showed no evidence of this, with equal scatter on either side of the overall estimated mean by both smaller and larger studies. The similarity of weighted to unweighted mean values suggests that they were representative of the true experience.

The purpose of our study was to provide an overview of the performance of CT as a test to diagnose lymph node metastases. We found that despite new generations of scanners and different node size criteria used for diagnosis, in the majority of situations, CT cannot stage mediastinal lymph nodes with sufficient accuracy when surgical resection of lung cancer is being considered. Contrary to our findings, Gross and colleagues (7) recently recommended that "... a negative mediastinal CT examination (when there are no distant metastases) should be followed by thoracotomy, and routine staging procedures such as mediastinoscopy should be bypassed." A more conservative suggestion by Daly and coworkers (23) was that only in the presence of a peripheral lung cancer, which has low prior (pretest) probability of mediastinal metastases, does a negative CT make the posterior (post-test) probability low enough that thoracotomy may proceed without mediastinoscopy. However, not all studies have reported such high negative predictive values (probability that metastases are absent when the CT is negative), which would make this latter approach reasonable. In a recent study by McKenna and associates (14), 13% (3 of 23) of presumed T2N0 lesions (by chest radiography) had mediastinal metastases and CT detected none of them. The mean negative predictive value from the 42 studies analyzed in the present report was 86%, indicating that 14% of patients with a negative CT may have mediastinal involvement at thoracotomy. This limitation of CT is explained by the observations of McKenna and colleagues (14): of the 25 cases of histologically proved nodal metastases found, ten cases had metastatic nodes less than 1 cm in greatest diameter.

In summary, we believe that the overall true accuracy of CT scanning of the mediastinum is only 80%, with approximately 20% false positive and 20% false negative results. Thus far, advances in CT over the past eight years have had little measurable impact in this respect, and although studies addressing this question continue to appear increasingly in the literature, we believe that no clinically important advances will be made until lymph node size is replaced by a fundamentally different indicator of lymph node pathology.

Acknowledgment

The writers wish to thank Nicole Quirouette, Erin McGowan, and Lise Lebrun for typing the manuscript.


References 1. Stjernsward J, Stanley K. Lung cancer-a world-wide health problem (abstract). Lung Cancer 1988; 4:11-2. 2. Center S. Does computed tomography aid in the staging of lung cancer? J Thorac Cardiovasc Surg 1981; 82:334. 3. Ekholm S, Albrechtsson V,Kugelberg J, Tylen V. Computed tomography in preoperative staging of bronchogenic carcinoma. J Com put Assist Tomogr 1980; 4:763-5. 4. Baron RL, Levitt RG, Sagel SS, White MJ, Roper CL, Marbager JP. Computed tomography in the preoperative evaluation of bronchogenic carcinoma. Radiology 1982; 145:727-32. 5. Lewis JW, Madrazo B, Gross G, et al. The value of radiographic and computed tomography in the staging of lung cancer. Ann Thorac Surg 1982; 34:553-8. 6. Glazer GM, Orringer MB, Gross BH, Quint LE. The mediastinum in non-small cell lung cancer: CT-surgical correlation. Am 1 Radiol 1984; 142:1101-5. 7. Gross BH, Glazer GM, Orringer MB, Spizarny DL, Flint A. Bronchogenic carcinoma metastatic to normal-sized lymph nodes: frequency and significance. Radiology 1988; 166:71-4. 8. Faling LJ, Pugatch RD, lung-Legg Y, et al. Computed tomographic scanning of the mediastinum in the staging of bronchogenic carcinoma. Am Rev Respir Dis 1981; 124:690-5. 9. Rea HH, Shevland JE, House AJS. Accuracy of computed tomographic scanning in assessment of the mediastinum in bronchial carcinoma. J Thorac Cardiovasc Surg 1981; 81:825-9. 10. Osborne DR, Korobkin M, Ravin CE, et al. Comparison of plain radiography, conventional tomography, and computed tomography in detecting intrathoracic lymph node metastases from lung carcinoma. Radiology 1982; 142:157-61. I I. Goldstraw P, Kunzer M, Edwards D. Preoperative staging of lung cancer: accuracy of computed tomography versus mediastinoscopy. Thorax 1983; 38:10-5. 12. BreyerRH, Karstaedt N, Mills SA, et al. Computed tomography for evaluation of mediastinal lymph nodes in lung cancer: correlation with surgical staging. Ann Thorac Surg 1984; 38:215-20. 13. Ratto GB, Mereu C, Motta G. 
The prognostic significance of peroperative assessment of mediastinal lymph nodes in patients with lung cancer. Chest 1988; 93:807-13. 14. McKenna RJ, Libshitz HI, Mountain CE, McMurtrey MJ. Roentgenographic evaluation of mediastinal nodes for preoperative assessment of lung cancer. Chest 1985; 85:206-10. 15. Rhoads AC, Thomas JH, Hermreck AS, Pierce GE. Comparative studies of computerized tomography and mediastinoscopy for the staging of bronchogenic carcinoma. Am J Surg 1986; 152:587-90. 16. Sacks HS, Bernier 1, Reitman 0, Ancona-Berk VA, Chalmers Te. Meta-analyses of randomized controlled trials. N Engl 1 Med 1987; 316:450-5. 17. L'Abbe KA, Detsky AS, O'Rourke K. Metaanalysis in clinical research. Ann Intern Med 1987; 107:224-33.

18. Snedecor GW, Cochran WG. Statistical methods. 7th ed. Ames, Iowa: Iowa State University Press, 1980; 290-1.
19. Vandenbroucke JP. Passive smoking and lung cancer: a publication bias? Br Med J 1988; 296:391-2.
20. Backer CL, Shields TW, Lockart CG, Vogelzang R, LoCicero J. Selective preoperative evaluation for possible N2 disease in carcinoma of the lung. J Thorac Cardiovasc Surg 1987; 93:337-43.
21. Brion JP, Depau L, Kuhn G, et al. Computed tomography and mediastinoscopy in preoperative staging of lung cancer. J Comput Assist Tomogr 1985; 9:480-4.
22. Conte CC, Buckman CA. Role of computerized tomography in assessment of the mediastinum in patients with lung carcinoma. Am J Surg 1985; 149:449-52.
23. Daly BDT, Faling LJ, Bite G, et al. Mediastinal lymph node evaluation by computed tomography in lung cancer. J Thorac Cardiovasc Surg 1987; 94:664-72.
24. Feigin DS, Friedman PJ, Liston SE, Haghighi P, Peters RM, Hill JG. Improving specificity of computed tomography in diagnosis of malignant mediastinal lymph nodes. J Comput Assist Tomogr 1985; 9:21-32.
25. Ferguson NK, MacMahon YH, Little AG, Golomb HM, Hoofman PC, Skinner DB. Regional accuracy of computed tomography of the mediastinum in staging of lung cancer. J Thorac Cardiovasc Surg 1986; 91:498-504.
26. Frederick HM, Bernandino ME, Baron M, Colvin R, Mansoer K, Miller J. Accuracy of chest computerized tomography in detecting malignant hilar and mediastinal involvement by squamous cell carcinoma of the lung. Cancer 1984; 54:2390-5.
27. Friedman PJ, Feigin DS, Liston SE, et al. Sensitivity of chest radiography, computed tomography, and gallium scanning to metastasis of lung carcinoma. Cancer 1984; 54:1300-6.
28. Graves WG, Martinez MJ, Carter DL, Barry MJ, Clarke JS. The value of computed tomography in staging bronchogenic carcinoma: a changing role for mediastinoscopy. Ann Thorac Surg 1985; 40:57-9.
29. Heelan RT, Martini N, Westcott JW, et al. Carcinomatous involvement of the hilum and mediastinum: computed tomographic and magnetic resonance evaluation. Radiology 1985; 156:111-5.
30. Hirleman MT, Yiu-Chiu VS, Chiu LC, Schapiro RL. The resectability of primary lung carcinoma: a diagnostic staging review. J Comput Assist Tomogr 1980; 4:146-63.
31. Imhof E, Perruchoud AP, Tan KG, Heitz M, Hasse J, Graedel E. Mediastinal staging of bronchial carcinoma: can computed tomography replace mediastinoscopy? Respiration 1985; 48:251-60.
32. Khan A, Gersten KC, Garvey J, Khan FA, Steinberg H. Oblique hilar tomography, computed tomography, and mediastinoscopy for prethoracotomy staging of bronchogenic carcinoma. Radiology 1985; 156:295-8.
33. Libshitz HI, McKenna RJ, Haynie TP, McMurtrey MJ, Mountain CF. Mediastinal evaluation in lung cancer. Radiology 1984; 151:295-9.

34. Matthews JI, Richey HM, Helsel RA, Grishkin BA. Thoracic computed tomography in the preoperative evaluation of primary bronchogenic carcinoma. Arch Intern Med 1987; 147:449-53.
35. Milroy R, Smith ML, Faichney A, et al. Mediastinal imaging in cancer. Q J Med 1986; 60:715-23.
36. Modini C, Passariello R, Iascone C, et al. TNM staging in lung cancer: role of computed tomography. J Thorac Cardiovasc Surg 1982; 84:509-74.
37. Moak GO, Cockerill EM, Farber MO, Yaw PH, Manfredi F. Computed tomography versus standard radiology in the evaluation of mediastinal adenopathy. Chest 1982; 82:69-75.
38. Nakata H, Ishimaru H, Nakayama C, Yoshimatsu H. Computed tomography for preoperative evaluations of lung cancer. J Comput Assist Tomogr 1986; 10:147-51.
39. Daly BDT, Faling LJ, Pugatch RD, Gale ME, Snider GL, Rheinlander HF. Computed tomography: an effective technique for mediastinal staging in lung cancer. J Thorac Cardiovasc Surg 1984; 88:486-94.
40. Patterson GA, Ginsberg RJ, Poon PY, et al. A prospective evaluation of magnetic resonance imaging, computed tomography, and mediastinoscopy in the preoperative assessment of mediastinal node status in bronchogenic carcinoma. J Thorac Cardiovasc Surg 1987; 94:679-84.
41. Platt JF, Glazer GM, Gross BH, Quint LE, Francis IR, Orringer MB. CT evaluation of mediastinal lymph nodes in lung cancer: influence of the lobar site of the primary neoplasm. Am J Radiol 1987; 149:683-6.
42. Rendina EA, Bognolo DA, Mineo TC, et al. Computed tomography for the evaluation of intrathoracic invasion by lung cancer. J Thorac Cardiovasc Surg 1987; 94:57-63.
43. Richardson JV, Zenk BA, Rossi NP. Preoperative non-invasive mediastinal staging in bronchogenic carcinoma. Surgery 1980; 88:382-5.
44. Richey HM, Matthews JI, Helsel RA, Cable H. Thoracic CT scanning in the staging of bronchogenic carcinoma. Chest 1984; 85:218-21.
45. Shevland JE, House AJ, Rea HH. Computed tomographic assessment of the mediastinum in patients with lung cancer. Australas Radiol 1983; 27:240-5.
46. Staples CA, Muller NL, Miller RR, Evans KG, Nelems B. Mediastinal nodes in bronchogenic carcinoma: comparison between CT and mediastinoscopy. Radiology 1988; 167:367-72.
47. Webb WR, Jensen BG, Solitto R, et al. Bronchogenic carcinoma: staging with MR compared with staging with CT and surgery. Radiology 1985; 156:117-24.
48. Wouters EFM, Oei TK, Van Englestoven JMA, Lemmens HAJ, Greve LH. Evaluation of the contribution of computed tomography to the staging of non-oat-cell primary bronchogenic carcinoma. Fortschr Rontgenstr 1982; 137:540-3.
49. Yee ES, Raper SE, Thomas AH, Ebert DA. Technical accuracy and clinical efficacy of thoracic computed tomography. Am J Surg 1982; 144:35-41.
50. Dixon WJ, ed. BMDP statistical software manual. Berkeley: University of California Press, 1985.
