Cephalometrics of anterior open bite: A receiver operating characteristic (ROC) analysis

David W. Wardlaw, DDS, MS,a Richard J. Smith, DMD, PhD,b David W. Hertweck, BS,c and Charles F. Hildebolt, DDS, PhDd
Little Rock, Ark., St. Louis, Mo., and Cincinnati, Ohio

A new approach is presented for the evaluation of cephalometric measurements in which a measurement is considered to be a diagnostic test for the presence or the absence of some component of malocclusion. This approach allows cephalometric measurements to be judged by the criteria that are generally used for clinical diagnostic tests, including determination of sensitivities, specificities, positive and negative predictive values, and, most important, receiver operating characteristic (ROC) curves. In this study, ROC curves are generated for the relationship between several skeletal cephalometric measurements and anterior dental open bite in a sample of 1541 orthodontic patients. The overbite depth indicator is found to be a better diagnostic criterion for the presence of dental open bite than any other commonly used skeletal cephalometric measurement or ratio. ROC curves are of substantial value for evaluating the diagnostic information of cephalometric measurements. (AM J ORTHOD DENTOFAC ORTHOP 1992;101:234-43.)

Lateral cephalometric radiographs are a routine component of the diagnostic records taken in clinical orthodontic examinations. They are also frequently used as a research tool for evaluating the effects of orthodontic treatment and for describing facial growth. Nevertheless, cephalometric radiographs have been strongly criticized as having limited clinical value and as lacking biologic validity for research on facial growth. 1-3 We believe that much of the criticism of cephalometric radiographs is misdirected. The deficiencies of cephalometric data for both clinical and research problems are not inherent features of what is, in fact, an objective and reproducible technique. Rather, much of the problem with cephalometrics concerns the manner in which specific measurements are selected and used. Thus, if cephalometry has not fulfilled its apparent potential in orthodontics, the fault is with ourselves and what we have chosen to do with this tool rather than with the technique itself. Although cephalometric measurements are often used in carefully controlled studies and are subjected to sophisticated statistical analyses, little attention has been directed to justifying, evaluating, and comparing the measurements themselves (i.e., the development of an objective basis for selecting

a In the private practice of orthodontics in Little Rock, Ark.
b Formerly Dean of the School of Dental Medicine and Professor and Chairman, Department of Orthodontics, Washington University.
c Department of Biology, University of Cincinnati.
d Department of Radiology, School of Medicine, Washington University.
8/1/26753


the particular cephalometric measurements that are used for clinical or research purposes). What is it that a particular cephalometric measurement is supposed to diagnose? How do we determine whether the measurement is a reliable and valid test for its intended purpose?

Thus far, the best method for determining the validity of a diagnostic measure is the receiver operating characteristic (ROC) analysis. 4 The ROC analysis is accepted and widely used in fields as diverse as weather forecasting, information retrieval, crime investigation, military monitoring, and sensory psychology. 4-6 During the past 10 years, more than 500 articles about ROC analyses have been published in the biomedical literature. 7

In this study, we use ROC curves to evaluate the ability of skeletal cephalometric measurements to identify patients with anterior dental vertical open bites. Our objective is not only to evaluate the relative value of skeletal cephalometric measurements for use in diagnosis of a dental condition (open bite) but also to demonstrate the use of ROC analysis in cephalometric studies.

MATERIALS AND METHODS

The 2 x 2 decision matrix

Cephalometric measurements may be brought into the mainstream of medical diagnostic techniques by considering a measurement to be a diagnostic test for a particular clinical condition. A diagnostic test generally has a cut-off value that is used to identify positive and negative results (i.e., a clinical condition is in one of two states: present or absent). For example, a clinician may consider a patient with a mandibular

ROC analysis of cephalometric measurements 235

Volume 101 Number 3

                          Condition
                          Present (C+)           Absent (C-)
Test   Positive (T+)      True positive (TP)     False positive (FP)
       Negative (T-)      False negative (FN)    True negative (TN)

Fig. 1. The 2 x 2 table (or decision matrix) is a standard method for evaluating efficacy of a diagnostic test. It is necessary for the diagnostic test to have a specific cut-off value, so that patients can be classified as having a positive or negative result on the test (that is, the condition is either present or absent in each patient). The frequencies of the different combinations of test results and true states shown in the table lead to the analyses described in the text.

plane to Frankfort horizontal angle of 40° or greater to be sensitive to vertical eruptive forces and therefore a person for whom one would generally avoid cervical headgear or vertical elastics. In more formal terms, the clinician has defined a cut-off value of 40° on the SN-MP measurement (his diagnostic test) to indicate the presence or the absence of the clinical condition "vertically sensitive." The clinician may or may not be correct in the conclusion that a particular patient with an SN-MP angle greater than 40° is, in fact, sensitive to vertical forces. This, then, leads to construction of a 2 x 2 table (also known as a decision matrix) relating positive and negative results on the test to the presence or the absence of the condition. This table is a useful and well-known tool for evaluating the efficacy of diagnostic tests (Fig. 1). 8-11

The data from a 2 x 2 table can be used to compute a variety of characteristics for a diagnostic test. Each cell (box) of the 2 x 2 table indicates the number (or percentage) of patients in a study who meet the two conditions that define that cell. A limitation to the use of the decision matrix is that the true status of the patient (with regard to the presence or the absence of a condition) must be known with more certainty than the diagnostic indicator being evaluated (i.e., the accuracy of truth must exceed that of the diagnostic system being evaluated). 6

Perhaps the most well-known characteristics that are calculated from a decision matrix are the sensitivity and specificity of a test. The sensitivity of a test is its ability to correctly identify a patient with a specific condition, whereas the specificity of a test is its ability to correctly identify a patient without a specific condition. 12 In medical decision theory, the presence of a specific condition is variously referred to as the presence of (1) a signal, (2) a disease, or (3) an abnormality.
The formulas for calculation of sensitivity and specificity from the 2 x 2 table are as follows:

\[ \text{Sensitivity} = \frac{TP}{TP + FN} = \frac{\text{True-positive patients}}{\text{Total patients with condition}} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} = \frac{\text{True-negative patients}}{\text{Total patients without condition}} \]

Positive and negative predictive values can also be calculated from a 2 x 2 table. 9,10 The positive predictive value is the probability that the condition is present when the test result is positive. The negative predictive value is the probability that the condition is not present when the test result is negative. These values are calculated as follows:

\[ \text{Positive predictive value} = \frac{TP}{TP + FP} \]

\[ \text{Negative predictive value} = \frac{TN}{TN + FN} \]

Another useful measure is the likelihood ratio for a positive result. 13,14 The likelihood ratio is determined by two probabilities: the probability of a given test result when the condition is present (the true-positive fraction) divided by the probability of the same test result when the condition is absent (the false-positive fraction).

\[ LR(+) = \frac{TP / C^{+}}{FP / C^{-}} = \frac{\text{Sensitivity}}{1 - \text{Specificity}} \]

A test result with a likelihood ratio of 1.0 implies that the test is of no value in changing the initial diagnostic certainty. Above 1.0, the likelihood ratio indicates the odds that a patient with that cut-off value has the condition, as opposed to not having the condition. Test results with higher likelihood ratios are better discriminators of the condition than those with low ratios.
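As an illustration only, the quantities defined above can be computed directly from the four cells of a decision matrix. The Python sketch below is not part of the original analysis; the class name `DecisionMatrix` and the cell counts are hypothetical. The counts were chosen to sum to the 68 open-bite and 1473 non-open-bite patients of this study, but the article does not report individual cell counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionMatrix:
    """Cell counts of a 2 x 2 table (test result versus true condition)."""
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def positive_predictive_value(self) -> float:
        return self.tp / (self.tp + self.fp)

    def negative_predictive_value(self) -> float:
        return self.tn / (self.tn + self.fn)

    def positive_likelihood_ratio(self) -> float:
        # Sensitivity / (1 - specificity), written with counts.
        # Raises ZeroDivisionError if there are no false positives.
        return (self.tp / (self.tp + self.fn)) / (self.fp / (self.fp + self.tn))


# Hypothetical counts (assumption, not reported in the article):
# 68 patients with the condition and 1473 without.
example = DecisionMatrix(tp=30, fp=187, fn=38, tn=1286)

print(f"sensitivity = {example.sensitivity():.1%}")
print(f"specificity = {example.specificity():.1%}")
print(f"PV(+)       = {example.positive_predictive_value():.1%}")
print(f"PV(-)       = {example.negative_predictive_value():.1%}")
print(f"LR(+)       = {example.positive_likelihood_ratio():.2f}")
```

With these assumed counts the sensitivity is low and the negative predictive value is high, which is the pattern expected when the condition is uncommon in the sample.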


Am. J. Orthod. Dentofac. Orthop. March 1992

[Figure 2: typical ROC curve. Ordinate: sensitivity, 0 to 1.0; abscissa: false-positive rate (1 - specificity), 0 to 1.0. Points for stricter cut-off values lie toward the origin.]

Fig. 2. As cut-off value of a diagnostic test is changed, number of patients classified as positive or negative for a condition changes. Each cut-off value changes the results of the 2 x 2 table. Plotting the sensitivity on the ordinate and 1 - specificity on the abscissa for each cut-off value produces an ROC curve. The less stringent cut-off values produce the points of the curve in the upper right area. As the cut-off value becomes more stringent, the points are closer to the origin. (Modified from Metz CE, Invest Radiol 1986;21:720-33.)

The ROC curve

For evaluating cephalometric measurements, there are significant limitations with any calculations derived from a single 2 x 2 table, in that the results depend on the specific cut-off value selected for the cephalometric measurement. For example, a finding that a 40° SN-MP angle is a poor test for "vertically sensitive" tells us little about how good a value of 45° might be. A 2 x 2 table does not test a variable as much as it tests the specific cut-off point selected. Measurements such as specificity and sensitivity change systematically in opposite directions as the cut-off point changes. This problem is resolved by use of the ROC curve, which characterizes a diagnostic test (i.e., cephalometric measurement) by a curve that is equivalent to a series of 2 x 2 tables with changing cut-off values.

The ROC curve was originally developed for use in electromagnetic (radar) signal detection theory to separate information from noise. It was introduced in clinical laboratory medicine and diagnostic radiology by Lusted 15,16 and became widely used in medical decision-making theory in the 1970s. 13,17,18 Medical radiologists have found the ROC curve particularly useful for evaluating the performance of observers and imaging systems. 17,19-22 In laboratory medicine the ROC curve is used for comparing different methods of medical diagnosis. 23-25

The ROC curve is a graphic plot of the true-positive rate (TPR = sensitivity) versus the false-positive rate (FPR = 1 - specificity) of a test as the decision threshold (cut-off point) of the test is varied systematically. Each point on the curve represents the 2 x 2 table at a given cut-off point;

[Figure 3: nonoverlapping ROC curves. Ordinate: sensitivity, 0 to 1.0; abscissa: false-positive rate (1 - specificity), 0 to 1.0. The diagonal is labeled "No diagnostic value."]

Fig. 3. For nonoverlapping ROC curves, curve that is closest to upper left corner of graph, and therefore occupies a larger percentage of entire area of graph, has higher sensitivities and lower false-positive rates at all cut-off points, and is considered a better test. (Modified from Gohagan et al., Invest Radiol 1984;19:587-92.)

therefore the ROC curve can be thought of as the series of 2 x 2 tables that result when the cut-off point is varied from the smallest to the largest value of a test result. The more stringent cut-off points result in TPRs (sensitivities) and FPRs near the origin of the graph, whereas relaxed cut-off points result in values in the upper right area of the graph (Fig. 2).

The comparison of ROC curves for different diagnostic tests is straightforward for nonoverlapping curves, in that the curve that is the highest and farthest to the left represents the most efficient test and the one with the greatest diagnostic value (Fig. 3). This test has a higher true-positive rate and a lower false-positive rate at every cut-off point selected. Methods for evaluating overlapping curves are given by Swets and Pickett 6 and Metz et al. 26

The diagnostic value of the ROC curves can be quantitatively expressed by calculating the area under the curve, 12,17,27 expressed as a proportion of the total possible area. An area of 1.0 indicates a perfect test (no false-positive or false-negative results), whereas an area of 0.5 indicates a test that performs no better than chance. An area of less than 0.5 would indicate a test that performs more poorly than chance. When different diagnostic tests with nonoverlapping curves are compared, the test with the best performance has the largest area under the ROC curve.
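The construction of an ROC curve as a series of 2 x 2 tables, and the area beneath it, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in this study: the function names are hypothetical, the angle values are invented, a value at or above the cut-off is taken as a positive test, and the area is obtained by the trapezoidal rule on straight-line segments rather than on a smoothed curve.

```python
from typing import List, Sequence, Tuple


def roc_points(positives: Sequence[float],
               negatives: Sequence[float],
               cutoffs: Sequence[float]) -> List[Tuple[float, float]]:
    """Return sorted (FPR, TPR) pairs, one 2 x 2 table per cut-off.

    A measurement >= cut-off counts as a positive test result.
    """
    points = [(0.0, 0.0), (1.0, 1.0)]  # end points of every ROC curve
    for c in cutoffs:
        tpr = sum(v >= c for v in positives) / len(positives)
        fpr = sum(v >= c for v in negatives) / len(negatives)
        points.append((fpr, tpr))
    return sorted(set(points))


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under ROC points joined by straight lines."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Invented angles in degrees (assumption, not data from this study).
with_condition = [36, 39, 41, 44, 47]
without_condition = [27, 30, 32, 33, 35, 38, 40]

curve = roc_points(with_condition, without_condition, range(25, 50))
auc = area_under_curve(curve)
print(f"area under the ROC curve = {auc:.3f}")
```

For these invented values the area is 32/35, about 0.914, which lies between the chance value of 0.5 and the perfect value of 1.0.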

Sample selection

For this study, more than 2000 pretreatment lateral cephalometric radiographs of white male and female patients between 11 and 20 years of age were obtained from one orthodontic practice in Little Rock, Ark. Radiographs were eliminated from the sample because of the presence of a mutilated


dentition, extensive dental restorations (generally patients with restorations replacing multiple missing permanent teeth), poor alignment of the patient's head in the cephalostat, or inadequate quality of the cephalometric radiograph. No anatomic variations were used to exclude patients from the sample. After the selection process was completed, the final sample consisted of 1541 patients.

Data collection

All radiographs were taken with a single cephalometric x-ray unit using a standard source-to-midsagittal plane distance of 60 inches. The office routine called for minimizing magnification by placing the film as close as possible to the patient's head. The subject-to-film distance (and magnification of midline structures) was therefore dependent on facial width. In a survey of 100 consecutive patients from the same office, the average subject-to-film distance was 8.6 cm, with a range of 7.8 cm to 10.6 cm. These values result in an average magnification of midline structures of 6%, with a range of 2% for the difference between largest and smallest facial widths in midline magnification. However, the only linear measurement used in this study is anterior overbite. The largest value observed (10.8 mm) might incorporate a magnification error of 0.2 mm. All other variables in the analysis are angles or ratios and are therefore unaffected by magnification.

Landmarks were digitized directly from radiographs with an IBM-AT computer and a Numonics 2400 digitizer with numeric coprocessor, in conjunction with the ORTHODIG digitizing program. 28 Data were analyzed on an IBM-AT computer with the SYSTAT package. 29

All cephalometric landmarks were identified and marked by one expert observer. Digitizing of marked radiographs was accomplished by three persons. An analysis was completed to determine intraoperator and interoperator errors. Landmarks were reidentified by the same person on 20 radiographs. The largest mean error in landmark identification for any variable was 0.85 mm. Of the reidentified landmarks, 90% were within 1.5 mm of the original position and 95% were within 2.0 mm. A similar analysis was used to test interoperator variation in digitizing landmarks. Errors were smaller than those associated with the identification of landmarks. There were no statistically significant differences between the digitized points of different operators for any variable, and no significant differences between cephalometric measurements computed from original or reidentified points. Thus intraoperator and interoperator error rates for landmark digitization were considered acceptable for this study.
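The magnification figures quoted in this section can be checked with simple projection geometry. The Python sketch below is an illustration, not the authors' calculation: it assumes that the reported subject-to-film distance is measured from the midsagittal plane to the film, so that magnification equals the source-to-film distance divided by the source-to-midsagittal distance. The function name is hypothetical.

```python
def midline_magnification(source_to_midsagittal_cm: float,
                          midsagittal_to_film_cm: float) -> float:
    """Fractional enlargement of midsagittal structures on the film."""
    source_to_film_cm = source_to_midsagittal_cm + midsagittal_to_film_cm
    return source_to_film_cm / source_to_midsagittal_cm - 1.0


SOURCE_TO_MIDSAGITTAL_CM = 60 * 2.54  # 60 inches, as stated in the text

mean_mag = midline_magnification(SOURCE_TO_MIDSAGITTAL_CM, 8.6)
min_mag = midline_magnification(SOURCE_TO_MIDSAGITTAL_CM, 7.8)
max_mag = midline_magnification(SOURCE_TO_MIDSAGITTAL_CM, 10.6)

# Spread in magnification between the narrowest and widest faces,
# applied to the largest overbite quoted in the text (10.8 mm).
spread = max_mag - min_mag
overbite_error_mm = 10.8 * spread

print(f"average magnification = {mean_mag:.1%}")
print(f"range of magnification = {spread:.1%}")
print(f"possible overbite error = {overbite_error_mm:.2f} mm")
```

Under this assumption the computed values are about 5.6%, 1.8%, and 0.20 mm, in line with the rounded figures of 6%, 2%, and 0.2 mm given above.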

Measurement selection

Several skeletal and dental cephalometric measurements have been associated with the presence of anterior open bite. 30-43 On the basis of this literature, the following measurements were calculated from the digitized point coordinates. Variables were defined as in Riolo et al. 44

1. Mandibular plane angle (SN-GoMe, with constructed gonion)
2. Gonial angle (Ar-Go-Me, with constructed gonion)
3. Y axis (S-Gn measured from S-N)
4. AFH ratio (N-ANS/ANS-Me measured perpendicular to S-N)
5. PFH/AFH (S-Go/N-Me with constructed gonion)
6. Symphysis angle (SN-BPg)
7. PP-MP angle
8. Interincisal angle
9. LFH ratio (Ar-Go/ANS-Me with constructed gonion)
10. OP-MP angle (OM angle) 32
11. Modified ODI (overbite depth indicator). Defined by Kim 45 as the angle of the A-B plane to the mandibular plane combined with the palatal plane to Frankfort horizontal plane angle. If the latter angle is positive, it is added to the former angle; if it is negative, it is subtracted from the former angle. Lower values of the ODI indicate an open bite tendency. In this study we measured a modified ODI, in that the palatal plane was measured to S-N instead of to FH to avoid a large measurement error in the identification of orbitale that was found in a preliminary analysis.

In a preliminary analysis, the anterior facial height ratio was calculated in two ways: once with N-ANS and ANS-Me measured perpendicular to the S-N line and once as the actual linear distance between point coordinates. The Y axis angle was measured as the angle of the S-Gn line to FH and to S-N. The results for the pairs of measurements were similar, and only one measurement of each pair is reported in this study.

Anterior open bite is defined as the dental relationship (on the cephalometric radiograph) for which the vertical overlap between the central incisors is less than 0.0 mm (that is, no vertical overlap of incisors), measured perpendicular to the palatal (ANS-PNS) plane. Of the final sample of 1541 patients, 68 had dental open bites. The objective of this study is to evaluate the ability of cephalometric measurements to distinguish the 68 patients with anterior open bites from the remaining 1473 patients without this condition.
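The ODI rule and the open-bite criterion described above can be written compactly. The Python sketch below is one reading of that description, not code used in this study: it treats "added if positive, subtracted if negative" as an algebraic sum of the two angles, the function names are hypothetical, and the example angles are invented.

```python
def overbite_depth_indicator(ab_to_mp_deg: float,
                             palatal_plane_deg: float) -> float:
    """Combine the A-B plane to mandibular plane angle with the signed
    palatal plane angle (to S-N in the modified ODI of this study).

    Assumption: adding a positive angle and subtracting the magnitude
    of a negative angle is the same as an algebraic sum.
    """
    return ab_to_mp_deg + palatal_plane_deg


def has_open_bite(overbite_mm: float) -> bool:
    """Open bite as defined in the text: vertical overlap below 0.0 mm."""
    return overbite_mm < 0.0


# Invented angles in degrees (assumption, not data from this study).
print(overbite_depth_indicator(70.0, 4.5))   # palatal plane angle positive
print(overbite_depth_indicator(70.0, -3.0))  # palatal plane angle negative

# Proportion of the sample with open bite, from the counts in the text.
prevalence = 68 / 1541
print(f"prevalence of open bite = {prevalence:.1%}")
```

The prevalence computed from the reported counts, about 4.4%, is the prior probability of open bite in this sample.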

Statistical analysis

Descriptive statistics were calculated for each cephalometric measurement for the total sample of 1541 patients. Two 2 x 2 tables were constructed for each measurement. The first table used the value one standard deviation above or below the mean as the cut-off point, in the direction of greater open bite tendency. The second table used a stricter criterion of two standard deviations from the mean as the cut-off point. For each table the sensitivity, specificity, positive predictive value, negative predictive value, and positive likelihood ratio were calculated.

The ROC curves were generated for each cephalometric measurement with intervals of 1° for angular measurements, 1.0 mm for linear measurements, and 0.01 for ratio measurements (that is, for each increment in measurement, a TPR and a FPR were calculated for use in creating ROC curves). The ROC curves were generated and smoothed with the SYGRAPH package 46 and the distance-weighted least squares method, which draws a curve through a set of points by means of a least squares fit. The curve is allowed to flex locally to best fit the data.
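The standard deviation cut-off points described above follow from the mean and standard deviation of each measurement. The short Python sketch below illustrates the arithmetic with the gonial angle and ODI statistics listed in Table I. The function name and the direction flag are hypothetical, and the direction used for each measurement (higher gonial angle, lower ODI) is taken from the text and Table II rather than from any stated rule.

```python
def sd_cutoff(mean: float, sd: float, n_sd: int,
              higher_means_open_bite: bool) -> float:
    """Cut-off placed n_sd standard deviations from the mean, in the
    direction of greater open-bite tendency."""
    offset = n_sd * sd
    return mean + offset if higher_means_open_bite else mean - offset


# Means and standard deviations as listed in Table I.
gonial_sd1 = sd_cutoff(121.42, 6.13, 1, higher_means_open_bite=True)
gonial_sd2 = sd_cutoff(121.42, 6.13, 2, higher_means_open_bite=True)
odi_sd1 = sd_cutoff(82.52, 7.18, 1, higher_means_open_bite=False)
odi_sd2 = sd_cutoff(82.52, 7.18, 2, higher_means_open_bite=False)

print(f"gonial angle: SD1 = {gonial_sd1:.2f}, SD2 = {gonial_sd2:.2f}")
print(f"ODI:          SD1 = {odi_sd1:.2f}, SD2 = {odi_sd2:.2f}")
```

The computed values (127.55° and 133.68° for the gonial angle, 75.34° and 68.16° for the ODI) are close to the whole-degree cut-offs listed in Table II; the rounding convention used for the table is not stated.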


Table I. Descriptive statistics

Measurement          Mean      Minimum   Maximum   SD
SN-MP                32.18     16.99     50.17     5.65
Gonial angle         121.42    99.04     139.35    6.13
Y-axis (SN)          67.13     55.47     80.20     3.73
AFH ratio            0.87      0.65      1.21      0.08
PFH/AFH ratio        0.65      0.52      0.80      0.05
Symphysis angle      83.96     53.65     106.39    7.98

*The measurement reported here is a modified ODI, in which palatal plane inclination was measured to SN rather than to FH. **Of 1541 patients, 68 exhibited open bite (4.4% of sample); negative value = open bite. Total sample size = 1541.

Table II. Sensitivity, specificity, positive predictive value (PV+), negative predictive value (PV-), and likelihood ratio (LR) at one and two standard deviations from the mean, in the direction of greater open bite tendency

Cut-off at one standard deviation (SD 1)

Measurement          Value at SD 1   Sensitivity (%)   Specificity (%)   PV+ (%)   PV- (%)   LR
SN-MP                38°             38                84                11        97        2.44
Gonial angle         128°            25                85                7         96        1.70
Y-axis (SN)          71°             37                86                11        97        2.58
AFH ratio            0.79            49                82                11        97        2.72
PFH/AFH ratio        0.61            34                78                7         96        1.55

Cut-off at two standard deviations (SD 2)

Measurement          Value at SD 2   Sensitivity (%)   Specificity (%)   PV+ (%)   PV- (%)   LR
SN-MP                44°             6                 98                13        96        3.27
Gonial angle         134°            4                 98                11        96        2.75
Y-axis (SN)          75°             7                 98                16        96        4.06
AFH ratio            0.70            10                99                26        96        7.36
PFH/AFH ratio        0.56            4                 98                9         96        2.20

The area under the ROC curves was measured on an IBM-AT with a Numonics 2400 digitizer and a numeric coprocessor. The area under the curve was digitized three times, and the average area value was reported for each ROC curve. The ROC curves can also be hand drawn, and areas beneath them calculated with (1) graphic methods; 47 (2) the Wilcoxon statistic; 48,49 (3) a formula after TPRs and FPRs are transformed to normal-deviate space; or (4) computer programs that have been published or are available from their developers. 6,26,50-53
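Of the alternatives listed above, the Wilcoxon statistic is the simplest to compute without drawing a curve: the area equals the proportion of all pairs (one patient with the condition, one without) in which the patient with the condition has the more extreme measurement, with ties counted as one half. The Python sketch below illustrates this equivalence. It is not the method used in this study, which digitized the area under the smoothed curves; the function name is hypothetical and the values are invented.

```python
from typing import Sequence


def wilcoxon_auc(with_condition: Sequence[float],
                 without_condition: Sequence[float],
                 higher_is_positive: bool = True) -> float:
    """Area under the empirical ROC curve by pairwise comparison."""
    score = 0.0
    for p in with_condition:
        for n in without_condition:
            if p == n:
                score += 0.5  # a tie counts as half a correct ranking
            elif (p > n) == higher_is_positive:
                score += 1.0  # the pair is ranked correctly
    return score / (len(with_condition) * len(without_condition))


# Invented angles in degrees (assumption, not data from this study).
open_bite = [36, 39, 41, 44, 47]
no_open_bite = [27, 30, 32, 33, 35, 38, 40]

area = wilcoxon_auc(open_bite, no_open_bite)
print(f"area under the ROC curve = {area:.3f}")
```

For a measurement such as the ODI, for which lower values indicate open bite tendency, the same function would be called with `higher_is_positive=False`.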

RESULTS

Descriptive statistics are presented in Table I. Of the 1541 patients, 68 (4.4%) exhibited anterior dental open bites and were classified as positive. Table II lists the sensitivity, specificity, positive predictive value, negative predictive value, and positive likelihood ratio for the 13 cephalometric variables, each of which was treated as a diagnostic test. These statistics were calculated according to cut-off points that were established at values one and two standard deviations from the mean, in the direction of greater open-bite tendency.


At values one standard deviation (SD1) from the mean, the sensitivities of all the cephalometric measurements were low, ranging from 25% to 49%. The anterior facial height ratio and the ODI were the most sensitive tests. The sensitivities of the gonial angle, the symphysis angle, and the anterior/posterior lower facial height ratio were the lowest. The specificities of all measurements were consistently higher, all being between 82% and 87%, except for the PFH/AFH ratio with a value of 78%. The positive predictive value was highest for the ODI, whereas the negative predictive value of all tests was uniformly high (96% to 97%). The positive likelihood ratio for the ODI was considerably higher than for all other measurements.

When a cut-off value two standard deviations (SD2) from the mean was used for each measurement, the sensitivities decreased to a range of from 1% to 13%. The anterior facial height ratio, the interincisal angle, and the ODI had the highest sensitivities, whereas the lower facial height ratio, the OP-MP angle, the PFH/AFH ratio, and the gonial angle had the lowest.


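Table I (continued). Descriptive statistics

Measurement          Mean      Minimum   Maximum   SD
Interincisal angle   127.71    91.04     173.12    10.79
LFH ratio            0.69      0.40      1.00      0.09
OP-MP                20.77     8.44      39.61     4.34
PP-MP                25.01     8.19      44.92     5.42
ODI*                 82.52     60.62     106.20    7.18
Overbite**           4.25      -5.28     10.85     2.36

Table II (continued). Cut-off at one standard deviation (SD 1)

Measurement          Value at SD 1   Sensitivity (%)   Specificity (%)   PV+ (%)   PV- (%)   LR
Symphysis angle      76°             28                84                8         96        1.80
PP-MP                30°             35                83                9         97        2.11
Interincisal angle   117°            35                86                11        97        2.61
LFH ratio            0.61            31                82                7         96        1.75
OP-MP                25°             40                82                9         97        2.18
ODI                  75°             44                87                14        97        3.47

Table II (continued). Cut-off at two standard deviations (SD 2)

Measurement          Value at SD 2   Sensitivity (%)   Specificity (%)   PV+ (%)   PV- (%)   LR
Symphysis angle      68°             6                 98                11        96        2.81
PP-MP                36°             7                 98                13        96        3.36
Interincisal angle   106°            10                99                37        96        12.88
LFH ratio            0.52            1                 98                4         96        0.88
OP-MP                29°             4                 99                17        96        4.40
ODI                  68°             13                98                26        96        7.76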

The specificity of each test was high (98% to 99%). The positive predictive values of the anterior facial height ratio, the interincisal angle, and the ODI were higher than those of the other tests, whereas the gonial angle, the PFH/AFH ratio, the symphysis angle, and the LFH ratio had the lowest values. The negative predictive value was equally high for all tests. The positive likelihood ratios of anterior facial height ratio, the interincisal angle, and the ODI were the highest of all tests. As the cut-off point was varied from SD1 to SD2 from the mean, the sensitivity for all tests decreased, and the specificity for all tests increased. The positive predictive value for all tests increased, and the negative predictive value generally stayed at the same level or slightly decreased. There was an increase in the positive likelihood ratios of all tests with the exception of the LFH ratio. The ROC curve for the ODI was the highest and the farthest to the left of all the curves for skeletal cephalometric measurements, thus suggesting that the


ODI was the best diagnostic test for open bite. Fig. 4 contains the ROC curve for each measurement plotted with the curve for the ODI. The ROC curves for the anterior facial height ratio and the interincisal angle most closely approximate the ODI curve, whereas the curves for the PFH/AFH ratio, the gonial angle, the LFH ratio, and the symphysis angle fall noticeably short of the ODI curve (farther down and to the right on the graph). The area under each curve is listed in Table III.

DISCUSSION

In this study, the presence or absence of an anterior open bite was directly visible on the cephalometric radiograph. The other cephalometric measurements therefore are not diagnostic tests in the traditional sense, for if we wish to know whether our patient has the condition in question, we can see it directly and do not need a "diagnostic test." Furthermore, the results of this study would have been identical if we had used some other "gold standard" to identify the patients with open bites,


[Figure 4: ten ROC plots, each showing the curve for the ODI together with the curve for one other measurement. Panels: SN-MP angle, Gonial angle, Y-axis angle, AFH ratio, PFH/AFH ratio, Symphyseal angle, PP-MP angle, Interincisal angle, LFH ratio, OP-MP angle.]

Fig. 4. Skeletal cephalometric measurement with largest area beneath its ROC curve and thus measurement with most diagnostic value was ODI. Each graph in this figure includes the ROC curve for the ODI and the ROC curve for the indicated measurement. This allows direct comparison of results for each variable with the most successful variable, the ODI.

Table III. Areas under the ROC curves

Measurement          Area (% of total)
ODI                  76.4
AFH ratio            71.8
Interincisal angle   70.0
PP-MP angle          69.6
SN-MP angle          69.5
OP-MP angle          67.7
Y-axis (SN) angle    66.0
PFH/AFH ratio        65.4
Gonial angle         63.7
LFH ratio            59.8
Symphysis angle      58.2

such as diagnostic casts or a clinical examination, because the presence or the absence of an open bite is determined with equal accuracy from the radiograph.

What, then, is the purpose of this analysis? What do we mean when we describe a cephalometric measurement as a diagnostic test? Orthodontists have long been concerned with skeletal variation in patients for whom we can clinically observe all occlusal anomalies. This is because we make a judgment as to the difficulty of treatment, the pattern of additional growth changes, and the cause of the occlusal anomaly, on the basis of the skeletal pattern. Previous studies that have evaluated relationships between cephalometric measurements and open bite have used statistical procedures such as correlation coefficients or, more frequently, t tests between groups of patients with and without open bites. 30-43 The ROC analysis presented here is simply a new conceptual and statistical approach for evaluating the relationship between the occlusal variation that we treat and the underlying "causal" skeletal variation.

The results suggest that the application of the ROC method to cephalometric measurements can provide useful insights, particularly in assessing the relative diagnostic value of different measurements. The visual simplicity of the ROC data display and the quantitative expression of each curve make this technique a flexible and easily interpreted form of data summary. Because an ROC curve displays data at continuously changing cut-off points, instead of at one arbitrarily defined value, it is superior to data analysis with single 2 x 2 tables.

The sensitivities of all diagnostic tests were quite low at the SD1 level and decreased further as the cut-off point was raised to the SD2 level (Table II). The number of open bite patients who were correctly identified by each test decreased considerably because of an increased number of false-negatives and a decreased number of true-positives as the cut-off point was raised (that is, as the strictness for diagnosing open bite was increased). The low sensitivities of the skeletal cephalometric measurements for the diagnosis of dental open bite are partially a result of the low number of open bites in orthodontic populations (68 of 1541). To better understand the interrelationships between sensitivities and specificities as cut-off values are varied, equal numbers of positive and negative cases are best. 6 Such a sample was not available for this study and would not be a realistic reflection of the prevalence of anterior open bite. The data do, however, demonstrate the relative trade-offs as diagnostic criteria are varied.


The specificity of all tests was high, meaning that each cephalometric measurement was relatively successful at correctly identifying patients without open bites. The specificity of each test increased as the cut-off point was changed from the SD1 level to the SD2 level. This increase in specificity was due to an increased number of true-negatives and a decreased number of false-positives as the cut-off point was raised. An inverse relationship between sensitivity and specificity as the cut-off point is changed is a standard feature of the relationship between these two measures. A strict criterion level leads to high specificity and low sensitivity, whereas a lax criterion results in low specificity and high sensitivity. 27 Of course, with a prevalence of 4.4%, a test that identified all patients as negative for open bite would appear to have a high specificity. The positive predictive value and the positive likelihood ratio are more useful in distinguishing among cephalometric measurements in these circumstances.

The positive predictive value of every test but the LFH ratio increased as the cut-off point was raised from SD1 to SD2. This improvement was due to a marked decrease in the number of false-positives. The positive predictive values were low for all tests. This is due to the low prior probability of anterior open bite in orthodontic patients. The important issue of prior probabilities and their relationship to the clinical performance of tests is reviewed in several papers. 9,14

The negative predictive values were high at the SD1 (96% to 97%) and SD2 (96%) levels and varied little between the two cut-off points or between measurements. This indicates that a negative test result for any of these tests has a high level of reliability associated with it. At the SD1 level, the positive likelihood ratios were all fairly low (1.70 to 3.47).
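The dependence of the positive predictive value on prior probability, noted above, can be made concrete with Bayes' theorem. The Python sketch below is an illustration rather than part of the original analysis. It uses the rounded sensitivity and specificity listed for the ODI at the SD1 level in Table II and compares the prevalence of this sample with a hypothetical balanced sample; the function name is hypothetical.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """Probability that the condition is present given a positive test."""
    true_positive_fraction = sensitivity * prevalence
    false_positive_fraction = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_fraction / (true_positive_fraction + false_positive_fraction)


SENSITIVITY = 0.44  # ODI at SD1, rounded value from Table II
SPECIFICITY = 0.87  # ODI at SD1, rounded value from Table II

ppv_sample = positive_predictive_value(SENSITIVITY, SPECIFICITY, 68 / 1541)
# Hypothetical sample with equal numbers of positive and negative cases.
ppv_balanced = positive_predictive_value(SENSITIVITY, SPECIFICITY, 0.5)

print(f"PV(+) at 4.4% prevalence = {ppv_sample:.1%}")
print(f"PV(+) at 50% prevalence  = {ppv_balanced:.1%}")
```

With identical sensitivity and specificity, the positive predictive value is roughly 14% at the prevalence of this sample and roughly 77% in the hypothetical balanced sample, which shows that the low values in Table II reflect the rarity of open bite as much as the quality of the test.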
The tests that were identified as having the greatest sensitivities and positive predictive values also had the highest positive likelihood ratios at both cut-off levels. The interincisal angle had a very large likelihood ratio at the SD2 level (12.88), which is difficult to explain. The increase in the likelihood ratio as the cut-off point was raised from SD1 to SD2 (with the exception of the LFH ratio) was due to a marked decrease in the false-positive rate (denominator of the likelihood ratio).

The sensitivity, specificity, positive predictive value, negative predictive value, and likelihood ratio are descriptions of the diagnostic ability of cephalometric measurements at arbitrary cut-off points. The standard deviation level cut-off points were selected in this study because they approximate values used in the literature to identify patients with excessive vertical


tendencies, and they also allow for an objective comparison between different variables at single values. The ROC curve represents a superior form of data summary because it represents the diagnostic performance of a measurement for the full range of possible cut-off points and the elimination of the bias resulting from selection of a single value. It also allows for selection of a cut-off point at a position along the curve that maximizes the objectives of the test. When evaluated with the ROC curves, the ODI, the anterior facial height ratio, and the interincisal angle are the best diagnostic tests for open bite, whereas the PFH/AFH ratio, the gonial angle, the LFH ratio, and the symphysis angle discriminate least between patients with and without open bite (Table III). The ODI had the highest diagnostic value. The ROC curves appearing in Fig. 3 provide a comparison of the relationship between the ROC curve for each cephalometric measurement and the ROC curve for the ODI. Although the mandibular plane angle is a measurement that many clinicians consider a critical feature in assessing open-bite tendency, the angular measurements (SN-MP, OP-MP, PP-MP) incorporating the mandibular plane in this study were no better than average diagnostic tests for open bite, which coincides with Bjrrk's 35 observation that the indications of backward mandibular rotation do not include the mandibular plane angulation. However, the features that Bjrrk has identified as related to backward rotation were of mixed diagnostic value in this study. As mentioned previously, the symphysis angle performed poorly, whereas the interincisal angle was relatively effective. Schudy 32 evaluated the OP-MP angle (which he defined as the OM angle) in patients with excessive vertical dimension and compared it to the SN-MP angle and the interincisal angle. 
We cannot conclude from the present study that the OP-MP angle is more important or reliable 33 than several other commonly used measures, but our findings agree with Schudy's conclusion that "the OM angle is significantly related to overbite." 32

The use of a PFH/AFH ratio for diagnosis of facial type has been recommended by Siriwat and Jarabak. 42 Although Skieller et al. 40 found one form of this ratio to have the highest correlation with mandibular growth rotation over a 6-year period, in this study the PFH/AFH ratio was one of the weaker measures of the presence of an open bite (Table III).

Our results support the work by Kim 45 on the relationship between the two-measurement index, which he defined as the overbite depth indicator, and the presence of anterior open bite. This is our second study (each used a different sample) in which we confirm the strength of this relationship. 43 It is noteworthy that the study by Kim 45 was based on a statistical analysis of cephalometric relationships to overbite rather than on clinical impression or tradition. In addition, as a combination of two features, the ODI may reflect both skeletal and dentoalveolar variation that cannot be evaluated by single measurements.

The ROC analysis reported in this study could be extended by selection of cut-off points for specific cephalometric measurements to optimize the identification of patients at risk of having the condition in question. The selection of a cut-off point requires an analysis of the costs and benefits associated with correct and incorrect diagnoses. 12,13,23 In the case of open-bite tendency, one might want to select a cut-off point that minimizes the number of false-negatives (patients with open-bite tendency who erroneously test negative), because it is relatively simple for the clinician to take precautions to avoid bite-opening mechanics. The avoidance of increased treatment time and difficulty in closing the bite is a substantial benefit when compared with the minimal costs of unnecessarily taking these same precautions in a patient with no open-bite tendency who is wrongly identified as having the tendency (a false-positive). The usefulness of this and numerous other orthodontic applications of the ROC method should be the subject of further study.

We thank Dr. Fay O. Wardlaw and his staff for their assistance in obtaining the records used in this study, Dr. Rebecca German for the helpful discussions concerning statistical analysis and study design, and the two anonymous reviewers for helpful suggestions. We also thank Rita Kuehler and Mary F. Wardlaw for preparing the drafts of the manuscript.
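The cost-benefit reasoning about cut-off selection in the discussion above can be illustrated with a minimal sketch. It is not the procedure used in this study. The ODI-like values, the open-bite labels, and the relative costs are all hypothetical, and the sketch assumes that low values count as a positive test. Each observed value is tried as a cut-off, and the one with the lowest average misclassification cost is kept.

```python
from dataclasses import dataclass


@dataclass
class CutoffResult:
    cutoff: float
    sensitivity: float
    specificity: float
    expected_cost: float


def confusion_counts(values, has_open_bite, cutoff):
    """Count outcomes when 'value <= cutoff' is read as a positive test."""
    tp = fp = tn = fn = 0
    for value, diseased in zip(values, has_open_bite):
        positive = value <= cutoff
        if positive and diseased:
            tp += 1
        elif positive:
            fp += 1
        elif diseased:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_cutoff(values, has_open_bite, cost_fn, cost_fp):
    """Scan every observed value as a cut-off and keep the cheapest one.

    Ties are resolved in favor of the lowest cut-off (assumption of this sketch).
    """
    best = None
    for cutoff in sorted(set(values)):
        tp, fp, tn, fn = confusion_counts(values, has_open_bite, cutoff)
        # Average cost per patient of the two kinds of wrong diagnosis.
        cost = (cost_fn * fn + cost_fp * fp) / len(values)
        if best is None or cost < best.expected_cost:
            best = CutoffResult(cutoff, tp / (tp + fn), tn / (tn + fp), cost)
    return best


# Hypothetical measurements and diagnoses, invented for illustration only.
odi = [62, 65, 66, 68, 70, 71, 72, 74, 75, 77, 78, 80]
open_bite = [True, True, True, False, True, False,
             True, False, False, False, False, False]

# A missed open-bite tendency is assumed to cost five times a false alarm.
weighted = best_cutoff(odi, open_bite, cost_fn=5, cost_fp=1)
# Both kinds of error are assumed to cost the same.
equal = best_cutoff(odi, open_bite, cost_fn=1, cost_fp=1)

print(weighted)
print(equal)
```

With these invented numbers, weighting false-negatives more heavily moves the chosen cut-off to a point with higher sensitivity and lower specificity, which is the trade-off described for open-bite tendency.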

REFERENCES
1. Moyers RE, Bookstein FL. The inappropriateness of conventional cephalometrics. AM J ORTHOD 1979;75:599-617.
2. Hixon EH. Cephalometrics: a perspective. Angle Orthod 1982;42:200-11.
3. Moss ML. Beyond roentgenographic cephalometry--what? AM J ORTHOD 1983;84:77-9.
4. Swets JA. Measuring the accuracy of diagnostic systems. Science 1988;240:1285-93.
5. Swets JA. The relative operating characteristic in psychology. Science 1973;182:990-1000.
6. Swets JA, Pickett RM. Evaluation of diagnostic systems. New York: Academic Press, 1982.
7. Hanley JA. Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagn Imaging 1989;29:307-35.
8. Weinstein MC, Fineberg HV. Clinical decision analysis. Philadelphia: WB Saunders, 1980.
9. Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and interpretation of diagnostic tests and procedures. Ann Intern Med 1981;94:553-92.
10. Robertson EA, Zweig MH, Van Steirteghem AC. Evaluating the clinical efficacy of laboratory tests. Am J Clin Pathol 1983;79:78-86.
11. Kantor ML, Zeichner SJ, Valachovic RW, Reiskin AB. Efficacy of dental radiographic practices: options for image receptors, examination selection, and patient selection. J Am Dent Assoc 1989;119:259-68.
12. Metz CE. Basic principles of ROC analysis. Semin Nucl Med 1978;8:283-98.
13. McNeil BJ, Keeler E, Adelstein SJ. Primer on certain elements of medical decision making. N Engl J Med 1975;293:211-5.
14. Radack KL, Rouan G, Hedges J. The likelihood ratio: an improved measure for reporting and evaluating diagnostic test results. Arch Pathol Lab Med 1986;110:689-93.
15. Lusted LB. Logical analysis in roentgen diagnosis. Radiology 1960;74:178-93.
16. Lusted LB. Introduction to medical decision making. Springfield, Ill.: Charles C Thomas, 1968.
17. Metz CE, Goodenough DJ, Rossmann K. Evaluation of receiver operating characteristic curve data in terms of information theory, with applications in radiography. Radiology 1973;109:297-303.
18. Kassirer JP. The principles of clinical decision making: an introduction to decision analysis. Yale J Biol Med 1976;49:149-64.
19. Swets JA. ROC analysis applied to the evaluation of medical imaging techniques. Invest Radiol 1979;14:109-21.
20. Simon TR, Neumann RL, Gorelick F, Riely C, Hoffer P, Gottschalk A. Scintigraphic diagnosis of cirrhosis: a receiver operating characteristic analysis of the common interpretive criteria. Radiology 1981;138:723-6.
21. Kelsey CA, Moseley RD Jr, Mettler FA Jr, Garcia JF, Parker TW, Briscoe DE. Comparison of nodule detection with 70-kVp and 120-kVp chest radiographs. Radiology 1982;143:609-11.
22. Gohagan JK, Spitznagel EL, McCrate MM, Frank TB. ROC analysis of mammography and palpation for breast screening. Invest Radiol 1984;19:587-92.
23. Lusted LB. General problems in medical decision making with comments on ROC analysis. Semin Nucl Med 1978;8:299-306.
24. Erdreich LS, Lee ET. Use of relative operating characteristic analysis in epidemiology. Am J Epidemiol 1981;114:649-62.
25. Carson JL, Eisenberg JM, Shaw LM, Kundel HL, Soper KA. Diagnostic accuracy of four assays of prostatic acid phosphatase: comparison using receiver operating characteristic curve analysis. JAMA 1985;253:665-9.
26. Metz CE, Wang D-L, Kronman HB. A new approach for testing the significance of differences between ROC curves measured from correlated data. In: Deconinck F, ed. Information processing in medical imaging. The Hague: Nijhoff, 1984:432-45.
27. Turner DA. An intuitive approach to receiver operating characteristic curve analysis. J Nuclear Med 1978;19:213-20.
28. Dunford-Shore B, German RZ. Orthodig: orthodontic digitizing environment. St. Louis: Washington University School of Dental Medicine, 1986.
29. Wilkinson L. Systat: the system for statistics. Evanston, Ill.: Systat, Inc., 1986.
30. Wylie WL. The relationship between ramus height, dental height and overbite. AM J ORTHOD ORAL SURG 1946;32:57-67.
31. Johnson EL. The Frankfort-mandibular plane angle and the facial pattern. AM J ORTHOD 1950;36:516-33.
32. Schudy FF. Cant of the occlusal plane and axial inclination of teeth. Angle Orthod 1963;33:69-82.
33. Schudy FF. Vertical growth versus anteroposterior growth as related to function and treatment. Angle Orthod 1964;34:75-93.
34. Subtelny JD, Sakuda M. Open bite: diagnosis and treatment. AM J ORTHOD 1964;50:337-58.
35. Björk A. Prediction of mandibular growth rotation. AM J ORTHOD 1969;55:585-99.
36. Isaacson JR, Isaacson RJ, Speidel TM, Worms FW. Extreme variation in vertical facial growth and associated variation in skeletal and dental relations. Angle Orthod 1971;41:219-29.
37. Nahoum HI. Anterior open bite: a cephalometric analysis and suggested treatment procedures. AM J ORTHOD 1975;67:513-21.
38. Schendel SA, Eisenfeld J, Bell WH, Epker BN, Mishelevich DJ. The long face syndrome: vertical maxillary excess. AM J ORTHOD 1976;70:398-408.
39. Opdebeeck H, Bell WH, Eisenfeld J, Mishelevich D. Comparative study between the SFS and LFS rotation as a possible morphogenic mechanism. AM J ORTHOD 1978;74:509-21.
40. Skieller V, Björk A, Linde-Hansen T. Prediction of mandibular growth rotation evaluated from a longitudinal implant sample. AM J ORTHOD 1984;86:359-70.
41. Cangialosi TJ. Skeletal morphologic features of anterior open bite. AM J ORTHOD 1984;85:28-36.
42. Siriwat PP, Jarabak JR. Malocclusion and facial morphology: is there a relationship? Angle Orthod 1985;55:127-38.
43. Dung DJ, Smith RJ. Cephalometric and clinical diagnoses of open bite tendency. AM J ORTHOD DENTOFAC ORTHOP 1988;94:484-90.
44. Riolo ML, Moyers RE, McNamara JA Jr, Hunter WS. An atlas of craniofacial growth. Monograph 2, Craniofacial Growth Series. Ann Arbor: Center for Human Growth and Development, University of Michigan, 1974.
45. Kim YH. Overbite depth indicator with particular reference to anterior open bite. AM J ORTHOD 1974;65:586-611.
46. Wilkinson L. Sygraph. Evanston, Ill.: Systat, Inc., 1986.
47. McNeil BJ, Hanley JA. Statistical approaches to the analysis of receiver operating characteristics (ROC) curves. Med Decis Making 1984;4:137-50.
48. Bamber D. The area above the ordinal dominance graph and the area below the receiver operating characteristic graph. J Math Psychol 1975;12:387-415.
49. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
50. Dorfman DD, Alf E Jr. Maximum likelihood estimation of parameters of signal detection theory and determination of confidence intervals rating method data. J Math Psychol 1969;6:487-96.
51. Dorfman DD, Berbaum KS. RSCORE-J: pooled rating-method data: a computer program for analyzing pooled ROC curves. Behav Res Methods, Instruments, Comput 1986;18:452-62.
52. Berbaum KS, Franken JR, Dorfman DD, Barloon TJ. Influence of clinical history upon detection of nodules and other lesions. Invest Radiol 1988;23:48-55.
53. Berbaum KS, Dorfman DD, Franken JR. Measuring observer performance by ROC analysis: indications and complications. Invest Radiol 1989;24:228-33.
54. Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986;21:720-33.

Reprint requests to: Dr. Richard J. Smith, Department of Anthropology, Washington University, St. Louis, MO 63130
