Adherence to Thresholds: Overdiagnosis of Left Ventricular Noncompaction Cardiomyopathy

Vinay Kini, MD, Victor A. Ferrari, MD, Yuchi Han, MD, Saurabh Jha, MBBS

Thresholds derived from quantification in imaging are increasingly used to define disease. This derivation is not an exact science. When one uses a threshold to define a disease, one does not clearly demarcate disease from normality because the threshold includes overlapping spectra of mild disease and normality. Thus, use of the threshold will mislabel normal individuals with disease. In this perspective, we will describe how the threshold has been derived for left ventricular noncompaction cardiomyopathy, the statistical biases in the design of studies used to derive the threshold, and the dangers of overdiagnosis when the threshold is used to rule out left ventricular noncompaction cardiomyopathy.

Key Words: Overdiagnosis; cardiovascular imaging; ventricular noncompaction.

© AUR, 2015

Definitions are important to diagnose, prognosticate, and treat a disease. To increase consistency and reduce uncertainty, we increasingly ask for objective criteria to establish disease. For example, chronic bronchitis is defined by a productive cough on most days for at least 3 months in each of 2 consecutive years. It is neither always desirable nor feasible to obtain tissue for confirmation of disease, particularly cardiac disease, as biopsies carry morbidity and sampling error. Diagnosis therefore relies on imaging, and imaging is increasingly used to objectify the presence of disease. A threshold is the minimum value of a measurement required to fulfill disease status. Thresholds oversimplify the complexity of diagnosis by assuming a dichotomy between those with a particular disease and those without (1). In reality, there exists a spectrum of disease, as well as a spectrum of “nondisease.” The composition of groups can vary from one study and one clinical situation to another, which affects the generalizability of measurements made on any group. This leads to the establishment of diagnostic thresholds that are inaccurate when used in real-world clinical scenarios.

Left ventricular noncompaction cardiomyopathy (LVNC) is a rare, previously under-recognized disease characterized by a bilayered myocardium whose subendocardial layer is abnormally trabeculated, with prominent trabeculae and recesses (2). The clinical and phenotypic presentations are variable, and it is recognized that patients with a severe phenotype have a poor prognosis from progressive heart failure, embolic phenomena, and malignant arrhythmias. Diagnosis of LVNC is based on a threshold. However, proposed thresholds are controversial because ever since their implementation, there have been increasing rates of diagnosis, and likely overdiagnosis, of LVNC (3). LVNC is instructive in how thresholds in imaging are derived and the problems inherent in establishing a diagnosis based on fulfillment of a threshold. We will critically analyze the studies used to develop thresholds for LVNC on echocardiography and cardiac magnetic resonance (CMR). We will explain why these thresholds increase overdiagnosis.

Acad Radiol 2015; 22:1016–1019 From the Division of Cardiovascular Medicine, The Hospital of the University of Pennsylvania, 9021 Gates, 3400 Spruce Street, Philadelphia, PA 19104 (V.K., V.A.F., Y.H.); and Department of Radiology, The Hospital of the University of Pennsylvania, Philadelphia, Pennsylvania (S.J.). Received October 2, 2014; accepted November 5, 2014. Address correspondence to: V.K. e-mail: [email protected] © AUR, 2015 http://dx.doi.org/10.1016/j.acra.2014.11.016

ESTABLISHMENT OF A THRESHOLD

Although isolated reports of LVNC date back to the 1960s, the first major study of patients with LVNC was published in 1990 (4). The authors reported a series of eight patients who were thought to have a congenital abnormality of the myocardium characterized by “numerous, excessively prominent trabeculations and deep, intertrabecular recesses.” They reported a high rate of cardiovascular complications associated with this entity and attempted to provide a diagnostic tool using echocardiography. The authors developed a ratio between two distances, X and Y. X is the distance from the epicardial surface to the trough of the trabecular recess. Y is the distance from the epicardial surface to the peak of trabeculation. They compared the ratio to that of eight controls and noted that all patients with LVNC had an X/Y ratio that decreased to 0.5 at the apex. They therefore proposed an X/Y threshold of 0.5. Later echocardiographic criteria from Jenni et al. (7) were based instead on the ratio of noncompacted to compacted myocardial thickness (NC/C); they proposed a threshold of 2.0 for the diagnosis of LVNC. In their conclusion, they wrote, “classification of isolated ventricular noncompaction as a distinct cardiomyopathy would facilitate its diagnosis and most probably contribute to unmasking a much higher incidence of this disorder.” The authors also emphasized the long time frame from symptom onset to diagnosis of LVNC (8).

The threshold was widely adopted. Still, the limitations of echocardiography, including difficulty in assessing the left ventricular apex because of the near-field effect and dependence on good imaging windows, were widely recognized. CMR is not afflicted by these technical limitations and offered an alternative to echocardiography in the diagnosis of cardiomyopathies. The most widely used threshold on CMR was developed by Petersen et al. (9) in 2005. The authors compared the CMR findings of seven patients with a known diagnosis of LVNC to CMR findings from small cohorts of competitive athletes and of patients with dilated cardiomyopathy (DCM), hypertrophic cardiomyopathy, left ventricular hypertrophy (LVH), or aortic stenosis.
They found that the average diastolic NC/C ratio of the LVNC group was 3.0 (95% confidence interval, 1.5–4.5) and was significantly higher than that of the other groups. They calculated that an NC/C ratio of >2.3 would provide a diagnosis of LVNC with a sensitivity, specificity, positive predictive value, and negative predictive value of 86%, 99%, 75%, and 99%, respectively. The authors concluded that “the diastolic ratio of >2.3 showed high diagnostic accuracy for distinguishing pathologic LVNC from the degrees of noncompaction observed in healthy, dilated, and hypertrophied hearts.”
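Predictive values, unlike sensitivity and specificity, depend on how common the disease is in the group being tested. The sketch below applies Bayes' rule to the 86% sensitivity and 99% specificity quoted above. It assumes, purely for illustration, that both figures carry over unchanged to other populations; the function name and the list of prevalences are hypothetical choices, apart from 0.3%, which is the prevalence reported in the early LVNC studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a dichotomous test via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values quoted in the text for the CMR threshold of NC/C > 2.3.
SENSITIVITY = 0.86
SPECIFICITY = 0.99

# Illustrative prevalences (assumption); 0.003 is the early reported figure.
for prevalence in (0.003, 0.01, 0.05, 0.20):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:6.1%}  PPV {ppv:6.1%}  NPV {npv:6.1%}")
```

Under these assumptions, at a prevalence of 0.3% the positive predictive value falls to roughly 21%, so most subjects above the threshold would not have the disease even if the published specificity held.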

APPLICATION OF THRESHOLDS TO WIDER POPULATIONS

The aforementioned studies are methodologically reasonable, and the authors should be commended for providing diagnostic criteria for such a rare and recently discovered entity. Indeed, they are the best possible in the circumstances. However, this does not mean they are without significant flaws.

A threshold was derived for a very rare disease (a reported prevalence in these early studies of 0.3%) based on very small cohorts with poorly defined disease states. The imaging findings of these patients were then compared to small control groups of distinctly normal patients or to those with other distinct diseases. In effect, a diagnostic threshold was established with very small representations of normal or diseased states, when the phenotypes of both groups are in reality quite varied. It would be improbable that a group of 10 normal subjects or a group of 10 patients determined to have LVNC could provide an accurate representation of all the phenotypic manifestations of those groups.

The problems that arise from such assumptions become clear when these diagnostic thresholds are applied to larger populations. In 2008, Kohli et al. evaluated the echocardiographic criteria proposed by Chin et al., Jenni et al., and a third set of criteria previously proposed by Stollberger et al. (10). They applied the three thresholds for LVNC to the echocardiograms of 202 consecutive patients with left ventricular systolic dysfunction who were referred to a heart failure program at a tertiary hospital (11). They also applied the criteria to the echocardiograms of 60 normal healthy volunteers. They found that nearly 25% of the heart failure patients, as well as 8% of the healthy controls, fulfilled one or more of the criteria for LVNC. This was in stark contrast to the earlier reported prevalence of 0.3%. Kawel et al. (12) applied the CMR threshold proposed by Petersen et al. (NC/C ratio >2.3) for diagnosis of LVNC to a large cohort of patients participating in the Multi-Ethnic Study of Atherosclerosis (MESA). Of 323 patients without cardiac disease or hypertension, 140 (43%) had an NC/C ratio >2.3 in at least one segment. The maximum thickness of trabeculation was positively associated with Chinese and black race and with left ventricular end-diastolic volume (ie, the larger the ventricle, the more likely it was to have significant trabeculation).
Several questions arise from the two studies. The most obvious is how established, plausible thresholds with high specificity came to flag LVNC in so many normal subjects. Recall that the specificity from the study by Petersen et al. is 99%, and yet 43% of asymptomatic patients in MESA fulfilled the threshold of 2.3. Furthermore, there are larger questions regarding the nature of diagnosis and subsequent management. Should normal study participants be concerned that they meet the definition for LVNC and seek treatment with warfarin? If so many “normal” subjects have features of LVNC on cardiac imaging, is it possible that the disease is much more common than previously believed? Or were the initially developed thresholds too sensitive, overdiagnosing patients who in reality have no underlying pathology? From the standpoint of cardiac imagers, is it better to err on the side of overdiagnosis, potentially exposing normal patients to unnecessary testing and treatment? Or should thresholds be adjusted to err on the side of underdiagnosis, potentially missing patients with real pathology who would benefit from treatment? This is a choice familiar to imagers: fewer misses and more overcalls versus more misses and fewer overcalls.
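The size of the discrepancy between the published specificity and the MESA findings can be made concrete with a few lines of arithmetic. The sketch below uses only the figures quoted above (323 subjects, 140 above the threshold, specificity of 99%) and assumes, for illustration, that every one of the 323 subjects is truly free of LVNC; the variable names are hypothetical. Note that MESA counted a ratio >2.3 in at least one segment, so the comparison is indicative rather than exact.

```python
# Figures quoted in the text.
N_SUBJECTS = 323              # MESA subjects without cardiac disease or hypertension
N_ABOVE_THRESHOLD = 140       # subjects with NC/C > 2.3 in at least one segment
PUBLISHED_SPECIFICITY = 0.99  # from the original CMR threshold study

# If the published specificity held, about 1% of disease-free subjects
# would exceed the threshold (assumption: all 323 are disease-free).
expected_false_positives = N_SUBJECTS * (1.0 - PUBLISHED_SPECIFICITY)

# Specificity implied by what was actually observed in this cohort.
implied_specificity = 1.0 - N_ABOVE_THRESHOLD / N_SUBJECTS

print(f"expected above threshold: {expected_false_positives:.1f}")
print(f"observed above threshold: {N_ABOVE_THRESHOLD}")
print(f"implied specificity:      {implied_specificity:.1%}")
```

On these assumptions, about 3 subjects would be expected above the threshold, against 140 observed, which corresponds to an implied specificity of roughly 57% rather than 99%.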

BIASES IN STUDIES OF DIAGNOSTIC TEST ACCURACY

The fact that the thresholds for LVNC caused overdiagnosis when applied to larger cohorts is a result of two major biases in the statistical design of the original studies. The first is incorporation bias, which occurs when the test being studied is incorporated into the gold standard. The first two studies used echocardiography to establish the diagnosis of LVNC (ie, patients were determined to have LVNC based largely on echocardiographic characteristics) and then tested echocardiography to see whether it could establish the diagnosis. Obviously, if a test’s ability to detect a disease is being studied, and the disease is partly defined by that test, the test is likely to look good. Similarly, in the study by Petersen et al., patients with LVNC were identified “on the basis of either echocardiographic or CMR documentation of a distinct two-layered appearance of trabeculated and compacted myocardium,” and then CMR was tested to see how accurately it could establish the diagnosis (9). The effect of incorporation bias is that sensitivities and specificities are falsely elevated, and when the test is applied to broader populations (as in this case), its diagnostic performance declines.

The second is spectrum bias. For a diagnostic test to be accurate, it must be able to perform well in a real-world setting, where the spectrum of disease varies widely. Tests tend to perform well when they only have to distinguish between the obviously diseased and the obviously nondiseased. Spectrum bias occurred in these studies because the diseased and nondiseased spectrums were not reasonably represented. More specifically, the test was not tasked to distinguish between mild disease and the wilder shores of normality. Subjects in the LVNC cohorts were identified largely on the basis of very abnormal imaging findings and compared to subjects with imaging findings that did not at all resemble LVNC.
For example, the initial threshold for magnetic resonance imaging was derived from very small cohorts of patients in the United Kingdom. When the thresholds were applied to the MESA cohort (an intentionally diverse population), black and Chinese subjects tended to be more likely to meet the diagnostic criteria on the basis of what are most likely normal racial or ethnic differences. The effect of spectrum bias is to falsely elevate sensitivity when the diseased cohort is phenotypically severe and to falsely elevate specificity when the control cohort looks nothing like the diseased cohort. Both of these occurred in this case.
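The mechanism of spectrum bias can be sketched numerically. In the example below, every distribution is hypothetical and chosen only to illustrate the direction of the effect: NC/C ratios are modeled as normally distributed, with invented means and standard deviations for severe cases, mild cases, clearly distinct controls, and a broad normal population. None of these parameters comes from the studies discussed; only the threshold of 2.3 is taken from the text.

```python
from statistics import NormalDist

THRESHOLD = 2.3  # CMR NC/C threshold quoted in the text

# Hypothetical NC/C distributions (all parameters are illustrative assumptions).
severe_cases = NormalDist(mu=3.0, sigma=0.6)       # phenotypically severe LVNC
mild_cases = NormalDist(mu=2.4, sigma=0.6)         # milder end of the disease spectrum
distinct_controls = NormalDist(mu=1.2, sigma=0.4)  # controls that look nothing like LVNC
broad_normals = NormalDist(mu=2.0, sigma=0.8)      # diverse real-world normal population


def sensitivity(diseased, threshold=THRESHOLD):
    """Fraction of the diseased group whose ratio exceeds the threshold."""
    return 1.0 - diseased.cdf(threshold)


def specificity(nondiseased, threshold=THRESHOLD):
    """Fraction of the nondiseased group whose ratio is at or below the threshold."""
    return nondiseased.cdf(threshold)


sens_severe = sensitivity(severe_cases)
sens_mild = sensitivity(mild_cases)
spec_distinct = specificity(distinct_controls)
spec_broad = specificity(broad_normals)

print(f"sensitivity, severe cases only:      {sens_severe:.1%}")
print(f"sensitivity, mild cases:             {sens_mild:.1%}")
print(f"specificity, distinct controls only: {spec_distinct:.1%}")
print(f"specificity, broad normal spectrum:  {spec_broad:.1%}")
```

With the same threshold, the test looks far better when it is evaluated only on the extremes of each group than when the full spectrum is represented, which is the pattern described above.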


Academic Radiology, Vol 22, No 8, August 2015

OPTIMIZING THE USE OF IMAGING IN THE DIAGNOSIS OF LVNC

The dangers of using thresholds rigidly for the diagnosis of disease are evident, particularly when the derivation of thresholds is subject to significant bias. Once a patient is falsely labeled with disease, the disutility cascade commences. In the case of LVNC, overdiagnosis could lead to unnecessary treatment with drugs such as angiotensin-converting enzyme inhibitors, beta blockers, aldosterone antagonists, and warfarin. Perhaps even more dangerous is the low threshold many cardiologists have for placing implantable cardioverter defibrillators in patients with LVNC, given their increased risk of malignant arrhythmias. These treatments could have adverse, life-changing effects, particularly for young patients, on the ability to participate in sports, choice of career, and eligibility for life insurance. On the other hand, underdiagnosis can lead to significant morbidity and mortality that potentially could be alleviated, although the extent to which these treatments could prevent future adverse outcomes in asymptomatic LVNC patients is unknown.

Because the true prevalence of LVNC is not currently known, cardiac imagers must determine the optimal path forward. A critical recognition is that thresholds are not absolute, and the standard deviation of “normal” could vary based on the population studied. Thresholds should not be used to rule out LVNC. We reserve maximum caution for the patient with normal left ventricular ejection fraction who is found to have hypertrabeculation on echocardiogram and whom the cardiologist sends for CMR to rule out LVNC. This is the gateway to overdiagnosis, which can be forestalled by respecting the asymptomatic nature of the patient. The reader is asked to reflect a little on the absurdity of defining disease on a threshold that is precise to one decimal place.
Does it seem conceivable that a patient with an NC/C ratio on CMR of 2.4 has LVNC and one who has a ratio of 2.3 is normal? That a difference of 0.1 is the difference between heart failure, emboli and malignant arrhythmias, and normality? There is increasing evidence that LVNC is a phenotypic expression of a variety of genotypic disorders, contributing to its varied clinical presentation and appearance on imaging studies (13). There is currently no guideline or consensus statement from professional societies on the diagnosis of LVNC. As is often the case, best practice should be to use imaging wisely and in the appropriate clinical context. For LVNC, a clinical scenario that includes a positive family history, decreased left ventricular ejection fraction, embolic phenomena, neuromuscular disease, and atrial or ventricular arrhythmias must play a large role in establishing the diagnosis. An increased NC/C ratio can aid with the establishment of diagnosis, but ultimately the diagnosis should be made on the basis of supporting clinical evidence by the treating clinician.


REFERENCES

1. Newman TB, Kohn MA. Evidence-based diagnosis. New York, NY: Cambridge University Press, 2009; 5.
2. Quaife RA, Salcedo EE, Wolfel EE. Non-compaction cardiomyopathy: underdiagnosed or overdiagnosed? Curr Cardiovasc Imaging Rep 2013; 6:498–506.
3. Niemann M, Stork S, Weidemann F. Left ventricular noncompaction cardiomyopathy: an overdiagnosed disease. Circulation 2012; 126:e240–e243.
4. Chin TK, Perloff JK, Williams RG, et al. Isolated noncompaction of left ventricular myocardium: a study of eight cases. Circulation 1990; 82:507–513.
5. Ritter M, Oechslin E, Sutsch G, et al. Isolated noncompaction of the myocardium in adults. Mayo Clin Proc 1997; 72:26–31.
6. Oechslin EN, Attenhofer Jost CH, Rojas JR, et al. Long-term follow-up of 34 adults with isolated left ventricular noncompaction: a distinct cardiomyopathy with poor prognosis. J Am Coll Cardiol 2000; 36:493–500.
7. Jenni R, Oechslin E, Schneider J, et al. Echocardiographic and pathoanatomical characteristics of isolated left ventricular non-compaction: a step towards classification as a distinct cardiomyopathy. Heart 2001; 86:666–671.
8. Ichida F, Hamamichi Y, Miyawaki T, et al. Clinical features of isolated noncompaction of the ventricular myocardium: long-term clinical course, hemodynamic properties, and genetic background. J Am Coll Cardiol 1999; 34:233–240.
9. Petersen SE, Selvanayagam JB, Weismann F, et al. Left ventricular noncompaction: insights from cardiovascular magnetic resonance imaging. J Am Coll Cardiol 2005; 46:101–105.
10. Stollberger C, Finsterer J, Blazek G. Left ventricular hypertrabeculation/noncompaction and association with additional cardiac abnormalities and neuromuscular disorders. Am J Cardiol 2002; 90:899–902.
11. Kohli SK, Pantazis AA, Shah JS, et al. Diagnosis of left-ventricular noncompaction in patients with left-ventricular systolic dysfunction: time for a reappraisal of diagnostic criteria? Eur Heart J 2008; 29:89–95.
12. Kawel N, Nacif M, Arai A, et al. Trabeculated (noncompacted) and compact myocardium in adults: the Multi-Ethnic Study of Atherosclerosis. Circ Cardiovasc Imaging 2012; 5:357–366.
13. Oechslin E, Jenni R. Left ventricular non-compaction revisited: a distinct phenotype with genetic heterogeneity? Eur Heart J 2011; 32:1446–1456.
