Perspective

CADe for Early Detection of Breast Cancer—Current Status and Why We Need to Continue to Explore New Approaches

Robert M. Nishikawa, PhD, FAAPM, FSBI; David Gur, ScD

The authors describe where we stand in using computer-aided detection (CADe) systems during clinical mammographic interpretation, the issues we face, and why they believe that, despite disappointment over the verified added value of CADe for detecting soft tissue abnormalities, we need to continue to explore new approaches to improving CADe-alone performance and, perhaps more important, new approaches to the optimal communication of CADe-generated information.

Key Words: Computer-aided detection; computer-aided diagnosis; breast imaging; mammographic screening.

©AUR, 2014

Acad Radiol 2014; 21:1320–1321. From Imaging Research, Department of Radiology, University of Pittsburgh, 3362 Fifth Ave, Pittsburgh, PA 15213 (R.M.N., D.G.). Received April 24, 2014; accepted May 14, 2014. Funding: This work is supported in part by grant R01EB013680 to the University of Pittsburgh from the National Institute of Biomedical Imaging and Bioengineering, National Institutes of Health. Address correspondence to: R.M.N.; e-mail: [email protected]. http://dx.doi.org/10.1016/j.acra.2014.05.018

Computer-aided detection (CADe) has been an integral part of clinical practice during the review and interpretation of screening mammograms for two decades, particularly in the United States (1). Despite initial expectations that using CADe in clinical practice would demonstrate a significant improvement in performance and accuracy along with an increase in efficiency, we can all agree that CADe has not been as successful as originally envisioned, in particular for identifying soft tissue abnormalities (1). There may be a number of reasons for the disappointing findings, but they boil down to two primary classes: CADe-alone performance, the communication of the information that CADe can generate, or a combination of the two. CADe performs very well when it comes to highlighting microcalcification clusters, and radiologists like CADe because it increases operational efficiency by enabling them to read large volumes of screening examinations in one session (1). Unfortunately, when it comes to soft tissue abnormalities, the number of CADe false detections is at least an order of magnitude higher than that of radiologists. Because a large fraction of the marked regions must be discarded, many radiologists tend to discard or ignore all soft tissue marks; hence, the potential for increasing the detection of cancers that CADe actually marked but that they did not detect without CADe is lost (2).

Additionally, the computerized information we present could be less than optimal from a user perspective. Although investigators are attempting to improve CADe-alone performance, there has been little work on what information CADe should communicate to the user, and only recently have investigators turned to this issue (3).

Why do we still believe strongly that CADe should not be abandoned in this field, not yet anyway? We know that interpreting screening mammograms is a highly repetitive task in which, on average, the "yield" of detected cancers is quite low. Approximately 10% of screening examinations are typically interpreted as "incomplete"; that is, these cases are "recalled," or identified as requiring diagnostic workup to rule out a possible cancer, yet only 4%–6% of them are eventually verified as cancer. This process is highly subjective, resulting in large intra- and interinterpreter variability. Computers, on the other hand, are, if nothing else, consistent. Computers systematically analyze each region (pixel values) in the images included in an examination; hence, right or wrong, computers do not "miss" because of incomplete searches, "satisfaction of search," or any other human observer-related reason. In addition, computers do not "tire" and are not affected by the internal and external factors that can influence human behavior (eg, background noise, lighting, personal mood, the impact of recently seen cases). This may seem ridiculous to some, but in actual practice these factors can and do affect performance.
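
To make the point about consistency concrete, the sketch below scores every patch of an image with the same rule on every pass. It is a minimal illustration only: the patch size, step, scoring rule, and threshold are assumptions made for this example, not a description of any commercial CADe algorithm.

```python
# Minimal sketch of an exhaustive, deterministic region-by-region scan.
# All parameters and the toy scoring rule are illustrative assumptions.
import numpy as np

def score_region(patch: np.ndarray) -> float:
    """Toy suspicion score: contrast of the patch center against its
    surround. A real CADe system would use trained features or a
    neural network instead."""
    h, w = patch.shape
    center = patch[h // 4:3 * h // 4, w // 4:3 * w // 4]
    return float(center.mean() - patch.mean())

def scan_mammogram(image: np.ndarray, size: int = 64, step: int = 32,
                   threshold: float = 10.0):
    """Score every patch of the image and return all candidates above
    the threshold as (row, col, score) tuples."""
    candidates = []
    for y in range(0, image.shape[0] - size + 1, step):
        for x in range(0, image.shape[1] - size + 1, step):
            s = score_region(image[y:y + size, x:x + size])
            if s > threshold:
                candidates.append((y, x, s))
    return candidates
```

Because the loop visits every location and applies the same rule each time, two passes over the same examination always yield the same marks; there is no incomplete search, satisfaction of search, or fatigue.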

When the interpretive/diagnostic information generated by human observers, who are most efficient at perceiving global features and can use external information (eg, comparison with contralateral images and prior examinations, history, age, and so forth), is combined with that generated by a computerized system, an additional advantage arises, because these two partially correlated systems (the human observer and the computer) rarely generate fully correlated information. Hence, if the computerized system alone operates at an observer's level, the potential to improve decision making is, in principle, greater than that offered by double reading with two independent human observers. The reason we do not observe this theoretically plausible level of improvement is that the overall performance of computerized systems (CADe) is not yet as high as that of an experienced observer. In addition, we know that when an observer pays close attention to CADe results ("follows the CADe"), performance in detecting cancers not marked by CADe decreases substantially (4).

Current CADe-alone performance in detecting and marking verified cancers is a sensitivity of approximately 75% with 2.5 false detections per case, a statistically significantly higher sensitivity and lower specificity than radiologists reading the same cases, and, unlike for radiologists, breast density does not appear to affect cancer detection by CADe (5). Efforts are underway to improve CADe-alone performance (by reducing the false detection rate) for specific types of cases and abnormalities by changing the decision rules for what is, or is not, actually marked: discarding CADe marks that radiologists would presumably not interpret as positive even though CADe did, or, alternatively, discarding marks on regions that are so visually obvious that radiologists would presumably interpret them as positive even without a CADe mark. However, the impact, if any, of either approach on observer behavior needs validation in clinical practice. We know that CADe highlights at least some valuable information that, in principle, should improve observer performance, but what information is of true value to the interpreter, and how it may affect actual observer behavior, is not well understood.

Other approaches specifically designed to alter the interaction between CADe and the observer have been explored in preliminary form. For example, investigators have attempted several forms of interactive CADe, although, to date, none of the proposed approaches has demonstrated significant effects. Additionally, a recent preliminary report showed that initially displaying no regions of interest, but allowing observers to select (point and click) specific regions for which the CADe results, if any, are then presented, could affect observer behavior, because no false-positive marks appear on the originally presented images (3). Improving a radiologist's ability to recognize correct CADe prompts may be as simple as including the likelihood that a detected lesion is malignant, that is, presenting classification results (computer-aided diagnosis, CADx) for suspected regions together with the CADe marks.
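
As a rough illustration of the last two ideas, pruning marks with a decision threshold and attaching a CADx likelihood to whatever is displayed, the sketch below filters a hypothetical list of CADe candidates and annotates the survivors. The data structures, thresholds, and the cadx scoring function are assumptions made for this example, not part of any published or commercial system.

```python
# Hypothetical sketch: filter CADe candidates and attach a CADx likelihood
# before anything is shown to the reader. All names and cut-offs are
# illustrative assumptions, not a vendor's interface.
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class Mark:
    location: Tuple[int, int]   # (row, col) of the candidate region
    detection_score: float      # CADe detection output for the region
    malignancy: float           # CADx estimated likelihood of malignancy

def prepare_display(candidates: List[Tuple[Tuple[int, int], float]],
                    cadx: Callable[[Tuple[int, int]], float],
                    min_detection: float = 0.5,
                    min_malignancy: float = 0.1) -> List[Mark]:
    """Discard low-value CADe candidates and annotate the rest with a CADx
    likelihood, so each prompt shown to the reader carries more information
    than a bare 'look here' symbol."""
    marks = []
    for location, score in candidates:
        if score < min_detection:
            continue          # below the CADe decision threshold: never shown
        likelihood = cadx(location)
        if likelihood < min_malignancy:
            continue          # a mark the reader would presumably dismiss anyway
        marks.append(Mark(location, score, likelihood))
    return sorted(marks, key=lambda m: m.malignancy, reverse=True)
```

Whether such filtering and annotation actually changes reader behavior is, as argued above, an open question that would need validation in clinical practice.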

Another possible approach is to focus on marking cases (by case rather than by region), or abnormalities (by region), that are more likely to be missed (eg, detected later as an interval cancer and/or detected during a subsequent examination but clearly depicted on the examination in question), or highlighting only these cases (or regions) to the reader as some type of "warning." However, this type of approach has not been investigated to date.

We believe that CADe in some form is inevitable. Computers are becoming more powerful, computer vision and artificial intelligence techniques are becoming more sophisticated, and our understanding of human vision and perceptual psychology is improving. For a relatively simple task, such as an automated observer finding an object in an image, computers will one day outperform radiologists. We see CADe as a bridge from radiologists being the primary reader to computers being the primary reader. It is not a question of if this will happen; it is only a question of when. We also note that there are CADe systems for detecting lung and colon cancer, but these are not commonly used clinically, and a discussion of their merit and/or possible improvement is beyond the scope of this article. We note that we intentionally ignored all financial aspects of using CADe in clinical practice. CADe was initially approved for reimbursement via a legislative mechanism, and although reimbursement for using CADe has been declining in recent years, it remains a revenue generator with both a technical and a professional component.

In summary, after the initial enthusiasm leading to high expectations, we have come to realize that current CADe is not affecting observer performance and/or behavior as originally expected or envisioned. Knowing that the fundamental premise is solid, we simply have to refocus our development efforts on different approaches that result in better use of the information that can be generated by computerized analyses of mammographic images. We believe this type of effort is important, timely, and warranted.

REFERENCES

1. Gur D, Sumkin JH, Rockette HE, et al. Changes in breast cancer detection and mammography recall rates after the introduction of a computer-aided detection system. J Natl Cancer Inst 2004; 96(3):185–190.
2. Nishikawa RM, Schmidt RA, Linver MN, et al. Clinically missed cancer: how effectively can radiologists use computer-aided detection? AJR Am J Roentgenol 2012; 198(3):708–716.
3. Hupse R, Samulski M, Lobbes MB, et al. Computer-aided detection of masses at mammography: interactive decision support versus prompts. Radiology 2013; 266(1):123–129.
4. Zheng B, Ganott MA, Britton CA, et al. Soft-copy mammographic readings with different computer-assisted detection cuing environments: preliminary findings. Radiology 2001; 221(3):633–640.
5. Cole EB, Zhang Z, Marques HS, et al. Assessing the stand-alone sensitivity of computer-aided detection with cancer cases from the digital mammographic imaging screening trial. AJR Am J Roentgenol 2012; 199(3):W392–W401.
