Interpreting Chest Radiographs Without Visual Search 1

Diagnostic Radiology

Harold L. Kundel, M.D., and Calvin F. Nodine, Ph.D.

Ten radiologists were shown a series of 10 normal and 10 abnormal chest films under two viewing conditions: a 0.2-second flash and unlimited viewing time. The results were compared in terms of verbal content, diagnostic accuracy, and level of confidence. The overall accuracy was surprisingly high (70% true positives) considering that no search was possible. Performance improved as expected with free search (97% true positives). These data support the hypothesis that visual search begins with a global response that establishes content, detects gross deviations from normal, and organizes subsequent foveal checking fixations to conduct a detailed examination of ambiguities. The total search strategy then consists of an ordered sequence of interspersed global and checking fixations.

INDEX TERMS: Radiographs, interpretation • Radiology and Radiologists

Radiology 116:527-532, September 1975

RECENT RESEARCH in our laboratory has been concerned with the role of the visual search in pattern recognition (3, 4). In these studies, movements and fixations of the radiologist's eye were recorded as he viewed a radiograph, thus revealing which areas of the film received attention and the priority given to each one as indicated by the duration and number of fixations. Analysis of the sequential pattern of fixations has led us to speculate about the visual strategies radiologists use to detect abnormalities on the radiograph. However, many questions remain unanswered.

Analysis of a radiological image is an enormously complex perceptual task that only a highly trained human observer is able to perform, and this complexity is reflected by the patterns of eye movement: results indicate that 80 to 120 fixations occur during the 20 to 30 seconds normally spent in viewing a single film. Since this time is most likely spent in testing one or more hypotheses regarding the abnormality or disease under consideration and selecting the sites from which additional data should be collected, the pattern of eye fixations probably represents multiple changes in search strategies, making it difficult to draw inferences about them on the basis of patterns of eye movement alone.

It was with this problem in mind that we designed the present experiment. If our analysis is correct, one way around the problem of more than one strategy contributing to each eye movement record is to reduce the amount of viewing time per film. In addition, it seemed to us that a search sequence of eye fixations is initiated only after an overall impression or gestalt is formed by a preattentive global response similar to that proposed by gestalt psychologists and more recently resurrected by cognitive psychologists (7). Thus the important question was how much the radiologist sees in a single fixation, and we were hopeful that the answer would provide insight into not only the film reading process but also the role of template matching and feature extraction in pattern recognition.

In order to answer this question, we employed the tachistoscopic viewing technique, used by psychologists for more than 80 years, which consists of flashing a stimulus picture on and off without modifying the overall luminance of the display. The duration of the stimulus and its brightness can be varied and precisely controlled.

MATERIALS AND METHODS

Ten radiologists served as observers in the experiment; 3 were staff radiologists and 7 were residents with a minimum of two years of experience at Temple University Hospital. Twenty postero-anterior chest films were copied onto 35-mm slides; of these, 10 (representing 6 men and 4 women) were normal. Films ranged from somewhat light and contrasty to somewhat dark. Six of the abnormal films illustrated pulmonary abnormalities, depicted by the silhouettes in Figure 1; the remaining 4 showed enlarged hearts ranging from minimal to massive cardiomegaly, represented by the silhouettes in Figure 2. These silhouettes are used as illustrations instead of the original films for economy of space and because they illustrate the major abnormal features in a schematic way. Some of the films contained incidental findings, among them a bullet in the superior mediastinum in one of the cardiac cases and fibrocalcific densities in the right upper lobe of a patient with multiple pulmonary metastases. All films were displayed normal size using a two-channel projector tachistoscope. Prior to viewing, the screen was uniformly illuminated to the same average brightness as the films and a black dot was projected onto the spot corresponding to the center of the heart, just below the level of the hilar structures.

Each observer was tested individually. The observer sat in front of the viewing screen with his chin resting on a platform 60 cm from the screen. After being familiarized with the apparatus, he was told that he was going to be shown a series of chest films, some normal and some abnormal, and that these films would be flashed for only a fraction of a second. The observer initiated each display by pushing a button after fixating on the dot marking the center of the chest. When the button was pressed, the dot disappeared and the chest film flashed on the screen for 0.2 second, after which it was again replaced by the dot. Since the average duration of a fixation is about 0.3 second and the minimal interval between fixations is about 0.25 second, we found that a viewing time of 0.2 second minimized the possibility of a second fixation while still maintaining an interval reasonably close to the duration of a single fixation.

The observer was asked to report what he saw after each film presentation, and his response was recorded verbatim. No specific questions were asked by the experimenters and a differential diagnosis was not required, although the observer occasionally offered one voluntarily. We had originally planned to use specific questions or a check list to elicit responses, but we rejected this idea because we wanted to see how the observer responded under the above conditions.

After six weeks the observers were asked to return and were again shown the entire series of 20 films. The viewing conditions were identical except that the films were arranged in a new sequence and the subjects were given unlimited viewing time. As before, all responses were recorded verbatim.

1 From the Diagnostic Radiology Research Laboratory, Temple University School of Medicine, Philadelphia, Pa. Accepted for publication in March 1975. Supported by grant GM 14548 from the National Institutes of General Medical Sciences, USPHS.

Fig. 1. Silhouettes drawn from the 6 films of pulmonary abnormalities to show the major abnormal features. The numbers in the upper left corner correspond to the film numbers used in the text and tables. (It must be emphasized that the observers were shown slides of the actual chest films, not silhouettes.)

Fig. 2. Silhouettes drawn from the 4 films of enlarged hearts to show the variations in size used. The numbers in the upper left corner correspond to the film numbers used in the text and tables. Number 11 is a normal heart with a prominent aortic arch, and Number 12 is one of the 10 normal films included in the series.

Table I: Diagnostic, Descriptive, and Modifier Statements Made by 10 Subjects Viewing 6 Films Containing Lung Abnormalities

                                                        0.2-Second Flash                       Free Viewing
Film Description                                        Diagnostic   Descriptive  Modifier     Diagnostic   Descriptive  Modifier
1. Solitary lung nodule (carcinoma)                     0            0            0            0            10           7
2. Multiple tiny calcified nodules (histoplasmosis)     0            0            0            3            6            18
3. Infiltrate with small cavity (pneumonia)             3            7            0            5            5            9*
4. Mass (carcinoma)                                     1            9            12           0            10           20
5. Multiple bilateral nodules (metastases)              0            10           16           1            9            40†
6. Pneumothorax                                         0            8            4            9            4            6
Total                                                   4            34           32           18           44           100

* Nine subjects reported a cavity in free viewing; none reported it in flash viewing.
† Eight subjects reported bilateral nodules and all reported multiple nodules in free viewing; none reported bilateral nodules in flash viewing, although all indicated multiple nodules.

Table II: Diagnostic, Descriptive, and Modifier Statements Made by 10 Subjects Viewing 5 Films Containing Cardiovascular Abnormalities

                                                        0.2-Second Flash                       Free Viewing
Film Description                                        Diagnostic   Descriptive  Modifier     Diagnostic   Descriptive  Modifier
7. Massive cardiac enlargement (pericardial effusion)   1            9            1            1            9            2
8. Enlarged heart and congestive failure                0            10           1            3            17           13*
9. Enlarged heart (left ventricle)                      0            4            2            0            12           9
10. Enlarged heart (left atrium, mitral valve)          1            5            1            2            8            3
11. Normal heart (prominent aortic arch)                0            5            0            0            6            0
Total                                                   2            33           5            6            52           27

* An incidental finding of a bullet in the left mediastinum was reported by all subjects in free viewing but not in flash viewing.

RESULTS

Analysis of the Content of Verbal Responses

The verbal responses of the observers were grouped into four categories:

(a) Primary diagnostic statements (e.g., pneumonia or mitral valve disease)
(b) Basic descriptive statements (e.g., abnormal density or enlarged heart)
(c) Modifiers of a basic descriptive statement (e.g., homogeneous or left atrial enlargement)
(d) Secondary statements (e.g., another film or another look would be helpful).

The numbers of primary diagnostic, descriptive, and modifier statements made by the 10 observers for the films of pulmonary abnormalities are given in TABLE I, while those for the cardiac abnormalities are shown in TABLE II. Performance was better when unlimited viewing time was allowed; diagnostic statements were noticeably absent with flash viewing except in the case of pneumonia. Most of the verbal protocols for lung abnormalities could be characterized as basic descriptive statements which identified an abnormal shadow (e.g., density or infiltrate) and gave information about its location. Observers were not able to resolve cavity detail in Film 3 or the bilateral abnormality in Film 5, although they were able to identify more than one nodule in the right lung, which contained most of the nodules (TABLE I). Using flash viewing, subjects could not identify the solitary nodule in Film 1 or the multiple tiny calcified nodules in Film 2, calling the former "normal" and the latter either "normal" or "dark film." Furthermore, although findings such as obscuration of the right heart border, lung edge, and lung lucency were readily noted on Film 6 with flash viewing, the observers were unable to integrate this pattern to form the diagnosis of pneumothorax.

The verbal statements used by observers to categorize cardiovascular abnormalities under flash viewing displayed a pattern of deficiencies similar to those for the lung when compared with free viewing. Not only were more descriptive and diagnostic statements generated under free than flash viewing, but observers tended to use more modifiers in describing cardiovascular abnormalities; for example, in Films 9 and 10 observers under free viewing identified enlarged chambers in addition to heart size when describing these films, while under flash viewing they limited their statements primarily to descriptions of heart size. In addition, none of the observers identified the bullet in Film 8 under flash viewing, but all identified it under free viewing, suggesting that observers did not have sufficient time to deal with multiple abnormal signals with a single fixation. The bullet subtended a visual angle of approximately 1°; however, observers had no problems resolving target features larger than 1° under flash viewing, such as a prominent aortic arch. Congestive heart failure and pneumothorax seem to present similar problems for observers under flash viewing, since they offer too many signals to integrate in a single fixation.

Table III: Classification of Primary Statements Made by 10 Observers Viewing 10 Normal and 10 Abnormal Chest Films

                                              0.2-Second Flash                   Free Viewing
Film                                          TN*    TP     FN     FP     TP+FP  TN     TP     FN     FP     TP+FP
Abnormal Films
1. Right-upper-lobe nodule                    0      0      7      3      0      0      10     0      0      0
2. Histoplasmosis                             0      2      8      0      0      0      9      1      0      0
3. Right-upper-lobe pneumonia                 0      10     0      0      3      0      10     0      0      3
4. Left-lower-lobe mass                       0      10     0      0      0      0      10     0      0      0
5. Metastatic nodules                         0      10     0      0      0      0      10     0      0      1
6. Right pneumothorax                         0      9      1      0      0      0      10     0      0      2
7. Massive cardiac enlargement                0      10     0      0      0      0      10     0      0      0
8. Large heart and congestive heart failure   0      9      0      1      1      0      10     0      0      0
9. Left ventricular enlargement               0      5      3      2      0      0      10     0      0      1
10. Large heart (mitral valve)                0      5      3      2      2      0      8      2      0      1
Total (abnormal films)                        0      70     22     8      6      0      97     3      0      8
Total (normal films)                          66     0      0      34     0      73     0      0      27     0

* TN = true negative; TP = true positive; FN = false negative; FP = false positive.

Table IV: Schema for Characterization of Verbal Statements

                                                                       Examples
Category           Statement              Alternative Interpretations  Lung                                         Heart
Abnormal           Diagnostic             None to few                  Tuberculosis                                 Mitral stenosis
Abnormal           Descriptive            Few to many                  Fibrocalcific densities                      Cardiac enlargement
Probable abnormal  Usually descriptive    Many                         Questionable densities                       Possible cardiac enlargement
Probable normal    Qualified descriptive  Many to few                  Prominent markings but within normal limits  Upper limits of normal
Normal             Adiagnostic            None
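The totals in TABLE III can be reduced to a few summary rates. The sketch below is only an illustration of that arithmetic: the helper name `summarize` and the dictionary layout are hypothetical, and the choice to compute the false-positive rate from the normal films alone (ignoring the false-positive statements made on abnormal films) is an assumption of this example rather than a convention stated in the text. Each group of films contributes 10 observers x 10 films = 100 primary statements.

```python
# Totals from TABLE III as (TN, TP, FN, FP) for each viewing condition.
# Each group holds 10 observers x 10 films = 100 primary statements.
TABLE_III_TOTALS = {
    "flash": {"abnormal": (0, 70, 22, 8), "normal": (66, 0, 0, 34)},
    "free": {"abnormal": (0, 97, 3, 0), "normal": (73, 0, 0, 27)},
}


def summarize(condition):
    """Return summary rates for one viewing condition (hypothetical helper)."""
    tn_a, tp_a, fn_a, fp_a = TABLE_III_TOTALS[condition]["abnormal"]
    tn_n, tp_n, fn_n, fp_n = TABLE_III_TOTALS[condition]["normal"]
    n_abnormal = tn_a + tp_a + fn_a + fp_a
    n_normal = tn_n + tp_n + fn_n + fp_n
    return {
        # Fractions of statements about abnormal films.
        "true_positive_rate": tp_a / n_abnormal,
        "false_negative_rate": fn_a / n_abnormal,
        # Fractions of statements about normal films (assumed convention).
        "false_positive_rate": fp_n / n_normal,
        "true_negative_rate": tn_n / n_normal,
    }


if __name__ == "__main__":
    for condition in ("flash", "free"):
        rates = summarize(condition)
        print(condition, {name: round(value, 2) for name, value in rates.items()})
```

Under these assumptions the true-positive rates come out to 0.70 for flash viewing and 0.97 for free viewing, matching the percentages quoted in the abstract, while the false-positive rates on normal films are 0.34 and 0.27.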

Analysis of Film-Reader Error

The interpretations of the films were compared to the actual findings, as defined by the investigators and confirmed by an overwhelming majority of the 10 observers' responses in free search, and were divided into four categories: true negative (TN), true positive (TP), false negative (FN), and false positive (FP). In addition, a joint category of true positive, false positive (TP + FP) was also designated to cover cases in which, for example, an enlarged heart was reported when the heart was indeed enlarged (TP), but in addition a pulmonary infiltrate was reported when the lungs were actually clear (FP). A true positive was scored only if the subject reported the major abnormality on the film. For example, a particularly difficult situation was multiple bilateral pulmonary metastases: many subjects reported abnormal density in the lungs but could not discriminate individual nodules on flash viewing (TABLE I). Since this represented detection of an abnormality, it was scored as a true positive for the purpose of this study. The data obtained are given in TABLE III.

A breakdown of the false-positive statements for both normal and abnormal films indicated that with flash viewing, observers tended to incorrectly identify abnormalities in the right chest associated with either the right hilus or an infiltrate in the right lung. Half of all false positives occurring under these conditions were of this type, whereas less than 20% of those which occurred during free viewing were identified in the right chest.

Analysis of Film-Reader Confidence or Equivocation

Statements of confidence or uncertainty were usually implicit in the content of the verbal report. Five different degrees of equivocation or uncertainty were indicated (TABLE IV). Diagnostic statements (e.g., pneumonia, mitral valve disease) represent the lowest degree of equivocation with reference to abnormal films, because they refer to a single or limited number of pathological processes with known etiologies, so that the number of possible alternative interpretations is restricted (or, in terms of information theory, reduction of uncertainty is maximized). On the other side of the scale, there were statements of "normal" which were taken to represent the lowest degree of equivocation with reference to normal films because they were free from any further modifying statements which might suggest a tinge of abnormality. "Normal" might be considered an adiagnostic statement, because it refers to the absence of a pathological process, which severely limits possible alternative interpretations and as such is maximally informative.

In between these two extremes, there were (a) qualified descriptive statements of normal which contained possible references to abnormality, frequently using modifiers such as "prominent" or "upper limits," e.g., "hilus slightly prominent" or "heart size approaching upper limits," (b) descriptive statements of abnormal which contained references to abnormalities (e.g., nodule, mass, infiltrate) without noting the etiology, thus permitting many possible interpretations and again suggesting a set of alternative interpretations, increasing rather than resolving ambiguity, and (c) frankly equivocal statements which were usually descriptive and could refer to normal or abnormal but usually referred to abnormal, such as "didn't see anything," "dark film," or "I don't know."

By dividing the films into "normal" and "abnormal" and analyzing the responses in terms of these five categories, a 2 × 5 stimulus-response matrix could be developed for each of the two viewing conditions. This rating scale is similar to the one used by Goodenough et al. (1) and puts the data into a form amenable to plotting as a so-called relative operating characteristic or ROC curve (8). The observers' responses were scored independently by each of us, with good agreement. The mean values for the 10 observers are plotted in Figure 3 and are typical of ROC curves obtained in visual detection experiments in that they are not symmetrical about the negative diagonal but are skewed to the right. In addition, the shapes of the curves for the two viewing times are roughly similar, suggesting that similar decision-making criteria were used at both viewing times.
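The step from a 2 × 5 stimulus-response matrix to the points of an ROC curve can be sketched as follows. This is a minimal illustration of the usual rating-scale procedure, not a reconstruction of the authors' own computation: the function name `roc_points` is hypothetical, and the counts in the example are invented placeholders, not data from this study. Response categories are ordered from the most confident "abnormal" to "normal," and each successive category boundary is treated as a decision threshold at which the cumulative fractions of abnormal and normal films are computed.

```python
from itertools import accumulate

# Rating categories ordered from most confident "abnormal" to "normal"
# (labels follow TABLE IV).
CATEGORIES = (
    "abnormal (diagnostic)",
    "abnormal (descriptive)",
    "probable abnormal",
    "probable normal",
    "normal",
)


def roc_points(abnormal_counts, normal_counts):
    """Return (false-positive fraction, true-positive fraction) pairs.

    Each argument is one row of the 2 x 5 stimulus-response matrix: the
    number of responses in each rating category for abnormal or normal films.
    """
    if len(abnormal_counts) != len(normal_counts):
        raise ValueError("both rows must have the same number of categories")
    n_abnormal = sum(abnormal_counts)
    n_normal = sum(normal_counts)
    if n_abnormal == 0 or n_normal == 0:
        raise ValueError("each row needs at least one response")
    # Cumulative counts: responses rated at or above each threshold.
    cumulative_tp = accumulate(abnormal_counts)
    cumulative_fp = accumulate(normal_counts)
    points = [(0.0, 0.0)]
    points += [(fp / n_normal, tp / n_abnormal)
               for tp, fp in zip(cumulative_tp, cumulative_fp)]
    return points


if __name__ == "__main__":
    # Invented counts for illustration only; NOT data from the study.
    example_abnormal = [10, 50, 20, 10, 10]
    example_normal = [0, 5, 15, 30, 50]
    for fpf, tpf in roc_points(example_abnormal, example_normal):
        print(f"FPF={fpf:.2f}  TPF={tpf:.2f}")
```

The curve always starts at (0, 0) and ends at (1, 1); the intermediate points, one per category boundary, trace the shape that is compared between the two viewing conditions.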

DISCUSSION

The purpose of our experiment was to determine how much information is taken in with a single glance. Our results indicate that radiologists were able to use the information from a single fixation effectively to make some rather complex diagnostic decisions. That our observers were so successful in dealing with the limited amount of information contained in a split-second flash reflects both their high level of training and the use of films representing straightforward abnormalities. Even so, some errors were made under free viewing conditions. The false-positive rate was higher than one would normally expect, even using free search. This probably represents a bias introduced by the circumstances of the experiment, i.e., the observers were expecting to see abnormal films and were basically "overreading." Although observer performance with flash viewing was relatively good, it was significantly below that obtained with free viewing. There are two reasons for this:

[Fig. 3. ROC plot; legible labels: "Free Search", axis maximum 100.]
