STATISTICS IN MEDICINE, VOL. 11, 1719-1729 (1992)

IMPACT OF MEASUREMENT ERROR AND TEMPORAL VARIABILITY ON THE ESTIMATION OF EVENT PROBABILITIES FOR RISK FACTOR INTERVENTION TRIALS

JAMES D. NEATON AND GLENN E. BARTSCH
Division of Biostatistics, School of Public Health, University of Minnesota, 2221 University Avenue SE, Suite 200, Minneapolis, MN 55414, U.S.A.

SUMMARY

The impact of measurement error and temporal variability of risk factors on estimates of disease probabilities based on the logistic function is discussed. Monte Carlo results and empirical findings from the Multiple Risk Factor Intervention Trial indicate that the degree of attenuation of logistic parameter estimates is well approximated by the reliability coefficient when the errors are assumed to be normal random variates and event probabilities are small. In the design of intervention studies, measurement error and temporal variability of risk factors do not usually influence estimates of the probability of developing the disease in the control group, but can result in mis-estimation of the probability of developing the disease in the experimental group, substantially reducing the statistical power of the clinical trial.

INTRODUCTION

Prospective studies such as the Framingham Heart Study¹ are a valuable data resource for the design of clinical trials, and have been used to design recent coronary heart disease (CHD) intervention studies. In the design of both the Multiple Risk Factor Intervention Trial (MRFIT) and the Lipid Research Clinics Coronary Primary Prevention Trial (CPPT), control ($p_c$) and experimental group ($p_e$) event rates were estimated using the logistic model applied to the age-sex eligible subjects free of pre-existing CHD in the Framingham Heart Study.²,³ The logistic model has also been proposed for the design of secondary prevention trials involving risk factor modification.⁴

The use of the logistic model with data sets such as Framingham is appealing, since the strength of the risk factor association with the disease endpoint can be measured and used to predict relative reductions in the incidence of disease corresponding to hypothetical risk factor changes. For many clinical trials, including the MRFIT and CPPT, the risk factors under study are subject to random errors due to measurement error and temporal variability (deviations from usual levels due to environmental or physiological factors). The purpose of this report is to consider the impact of this variability on estimates of control and experimental group event probabilities using the logistic model. Recent related work includes that by Carroll et al.,⁵ Prentice,⁶,⁷ Stefanski and Carroll⁸ and Rosner et al.,⁹ who considered the impact of measurement error on parameter estimates for the logistic and similar models.

0277-6715/92/131719-11$10.50
© 1992 by John Wiley & Sons, Ltd.

Received 1989
Revised May 1992

ESTIMATION OF EVENT RATES FOR MRFIT AND CPPT

The investigators who designed MRFIT and CPPT employed similar procedures for estimating $p_c$ and $p_e$. An outline of the steps taken is as follows:

1. Logistic regression coefficients corresponding to the risk factor(s) under investigation were estimated for the primary outcome of the study. For MRFIT this outcome was death from CHD within 6 years and the risk factors were diastolic blood pressure (BP), blood cholesterol and cigarette smoking; for CPPT the outcome was CHD death or non-fatal myocardial infarction within 7 years and the risk factor was blood cholesterol. The parameter estimates were obtained using a group of subjects in the Framingham Heart Study of the same sex, in the same age group, who had not previously experienced the event at the 'baseline' examination.

2. The estimated logistic regression coefficients were then used to calculate $p_c$ for subjects who met the eligibility criteria. To estimate $p_e$, each subject's risk factor level $Z_i$ was reduced by an amount thought achievable by the planned intervention, and the logistic model was reapplied. Values of $p_c$ and $p_e$ were then averaged over the Framingham subjects who met the risk factor eligibility criteria. For example, for the $n$ Framingham subjects meeting the CPPT risk factor eligibility criteria (cholesterol levels > 295 mg/dl) the following logistic functions were evaluated:³

$$p_c = \frac{1}{n}\sum_{i=1}^{n}\left(1 + \exp\left(-\hat\beta_0 - \hat\beta_1(0.96\,Z_i)\right)\right)^{-1} \qquad (1)$$

$$p_e = \frac{1}{n}\sum_{i=1}^{n}\left(1 + \exp\left(-\hat\beta_0 - \hat\beta_1(0.72\,Z_i)\right)\right)^{-1} \qquad (2)$$

assuming that cholesterol levels would be reduced by 4 per cent in patients given placebo and dietary advice (control group) and by 28 per cent in patients given the lipid lowering drug, cholestyramine, and dietary advice (experimental group).

3. These values of $p_c$ and $p_e$ were used for estimating sample size, but $p_e$ was first increased to allow for non-adherence to the experimental treatment and for the possibility that the intervention would not immediately reach its full impact on the disease process, using the model of Halperin et al.¹⁰

There are two problems with this approach to estimating $p_c$ and $p_e$. Firstly, logistic parameter estimates corresponding to blood cholesterol (in both studies) and diastolic BP (in MRFIT) are attenuated due to temporal variability and measurement error of the regressor variables, and consequently the expected impact of changes in these risk factors is underestimated. Secondly, hypothesized differences in risk factors due to treatment are stated in terms of 'true' values, but the computational procedure for estimating $p_e$ was based on reductions in observed screening levels. These observed levels may be spuriously high at initiation of intervention if high risk subjects are selected, owing to regression to the mean.¹¹

Changes in these risk factors prior to randomization for participants in the MRFIT and CPPT are shown in Table I. The impact of regression to the mean was reduced in CPPT by requiring lipid levels to be elevated at multiple screening visits prior to randomization. However, in MRFIT, where only a single visit was used to determine risk factor eligibility, regression to the mean was substantial. In the design of MRFIT it was assumed that diastolic BP would be reduced by 10 per cent as a result of the intervention.² From Table I it can be noted that much of this hypothesized decrease occurred prior to randomization.
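The averaging procedure in step 2 can be sketched in a few lines. This is only an illustration under stated assumptions: the coefficients `beta0` and `beta1` and the list of cholesterol levels are hypothetical placeholders, not the Framingham estimates, and the helper names are invented for the example. Only the 4 per cent and 28 per cent reductions come from the text.

```python
import math


def logistic(eta):
    """Inverse logit: probability corresponding to a log odds of eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def mean_event_probability(levels, beta0, beta1, fractional_reduction):
    """Average logistic risk after lowering every level by the same fraction."""
    risks = [
        logistic(beta0 + beta1 * (1.0 - fractional_reduction) * z)
        for z in levels
    ]
    return sum(risks) / len(risks)


# Hypothetical inputs for illustration only (not the Framingham estimates).
beta0, beta1 = -6.0, 0.010
eligible_cholesterol = [300.0, 310.0, 325.0, 340.0, 360.0]  # mg/dl, assumed

# 4 per cent reduction assumed for control, 28 per cent for experimental.
p_c = mean_event_probability(eligible_cholesterol, beta0, beta1, 0.04)
p_e = mean_event_probability(eligible_cholesterol, beta0, beta1, 0.28)
print(f"p_c = {p_c:.4f}, p_e = {p_e:.4f}, relative reduction = {1 - p_e / p_c:.1%}")
```

Because the same observed levels $Z_i$ and the same attenuated slope enter both averages, any error in the slope or in the assumed reduction carries through to $p_e$ but largely cancels in $p_c$, which is the point developed in the remainder of the paper.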

Table I. Risk factor levels at screening visits prior to randomization in the MRFIT and CPPT

                                     Screening visit prior to randomization
                                          1           2           3
MRFIT
  Diastolic BP (mm Hg)*                 99.2        92.9        92.0
  Blood cholesterol (mg/dl)†           253.6       240.4         NA
CPPT‡
  log blood cholesterol (mg/dl)        5.676       5.638         NA

NA = not available.
* Blood pressures shown are the average of two readings with a standard sphygmomanometer.
† Cholesterol was determined from serum at first screening and from plasma at second screen. Since plasma values are 3 per cent lower than serum values,¹² the drop in cholesterol for MRFIT participants between first and second screen due to regression to the mean is estimated as 6 mg/dl.
‡ Taken from Reference 13.

LOGISTIC REGRESSION WITH REGRESSOR VARIABLES MEASURED WITH ERROR

Consider a clinical trial of two treatments in which the endpoint is a binary variable, say the presence or absence of a disease event after a specified period of follow-up, such as death from CHD within 6 years. We assume that the experimental treatment involves the reduction of a single risk factor and that subjects with elevated levels of the risk factor are to be studied. Subjects are followed for a fixed period, and the probability of the $i$th subject experiencing an event during the follow-up period ($p_i$) will be ideally modelled using the linear logistic function:

$$\Pr(Y_i = 1 \mid X_i) = p_i = \left(1 + \exp\left(-\beta_0^T - \beta_1^T (X_i - \mu_X)\right)\right)^{-1}, \qquad (3)$$

where $Y_i = 1$ denotes the occurrence of an event, $X_i$ is the 'true' regressor variable for the $i$th subject and $\beta_1^T$ is the corresponding regression coefficient. $\beta_0^T$ is the regression coefficient corresponding to the intercept term.

When risk factors are subject to temporal variability and measurement error, $X$ may not be observable. Let $X$ be the subject's 'true' risk factor level, for example blood pressure; let $e$ be the error resulting from measurement of the risk factor and temporal variation; and let $Z$ be the subject's observed risk factor level at a single screening, $Z = X + e$. Subscripts are dropped to simplify notation. Assume a structural model as in Carroll et al.⁵ in which $X$ is sampled from a normal distribution with mean $\mu_X$ and variance $\sigma_X^2$. Also assume that $e \sim \text{normal}(0, \sigma_e^2)$ and that $X$ and $e$ are independent. Then $Z$ is distributed with mean $\mu_Z = \mu_X$ and variance $\sigma_Z^2 = \sigma_X^2 + \sigma_e^2$. Since $X$ may not be observable one usually assumes that $Y$ given $Z$ is linear logistic:

$$\Pr(Y = 1 \mid Z) = \left(1 + \exp\left(-\beta_0 - \beta_1 (Z - \mu_Z)\right)\right)^{-1}, \qquad (4)$$

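even though (3) and (4) are incompatible. Rosner et al.⁹ have shown that when disease incidence is low, $\beta_1$ is an approximate estimate of $\beta_1^T\gamma$, where

$$\gamma = \frac{\sigma_X^2}{\sigma_X^2 + \sigma_e^2} = \frac{\sigma_X^2}{\sigma_Z^2}. \qquad (5)$$

The term $\gamma$ has been called the coefficient of reliability or the coefficient of generalizability.¹⁴ Thus, the slope involving the true risk variable, $\beta_1^T$, in terms of the slope for the observed variable, $\beta_1$, is given by $\beta_1^T = \gamma^{-1}\beta_1$, so $\beta_1^T$ is greater than $\beta_1$ in absolute value. The relationship between $\beta_1^T$ and $\beta_1$ for the linear structural model in which $X$ is assumed random was considered by Wald¹⁵ and more recently reviewed by Cochran¹⁶ and Fuller.¹⁷

Estimates of $\beta_1^T$ are more variable than estimates of $\beta_1$ since $\mathrm{var}(\gamma^{-1}\hat\beta_1) = \gamma^{-2}\mathrm{var}(\hat\beta_1)$. The mean squared errors (MSEs) of $\gamma^{-1}\hat\beta_1$ and $\hat\beta_1$ about $\beta_1^T$ are

$$\gamma^{-2}\mathrm{var}(\hat\beta_1) + \left[\gamma^{-1}E(\hat\beta_1) - \beta_1^T\right]^2 \qquad (6)$$

and

$$\mathrm{var}(\hat\beta_1) + \left[E(\hat\beta_1) - \beta_1^T\right]^2, \qquad (7)$$

respectively. When $\gamma^{-1}$ perfectly corrects for the bias, so that $E(\hat\beta_1) = \gamma\beta_1^T$, $\mathrm{MSE}(\gamma^{-1}\hat\beta_1) < \mathrm{MSE}(\hat\beta_1)$ when

$$\mathrm{var}(\hat\beta_1) < \frac{\gamma^2 (1 - \gamma)}{1 + \gamma}\,(\beta_1^T)^2 \qquad (8)$$

or, equivalently,

$$\frac{\mathrm{var}(\hat\beta_1)}{[E(\hat\beta_1)]^2} < \frac{1 - \gamma}{1 + \gamma}. \qquad (9)$$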
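As a rough numerical check of the bias and variance trade-off just described, the sketch below evaluates the two mean squared errors for the $\gamma = 0.86$ row of Table II (mean slope 1.140, SD 0.213, true slope 1.34). The function names are invented for this example, and treating the tabulated mean and SD as the expectation and standard deviation of the estimator is an assumption.

```python
def mse_uncorrected(mean_b1, sd_b1, beta_true):
    """Variance plus squared bias of the usual (attenuated) slope estimate."""
    return sd_b1 ** 2 + (mean_b1 - beta_true) ** 2


def mse_corrected(mean_b1, sd_b1, beta_true, gamma):
    """Variance plus squared bias after dividing the slope by gamma."""
    return (sd_b1 / gamma) ** 2 + (mean_b1 / gamma - beta_true) ** 2


# Values taken from the gamma = 0.86 row of Table II.
gamma, beta_true = 0.86, 1.34
mean_b1, sd_b1 = 1.140, 0.213

mse_obs = mse_uncorrected(mean_b1, sd_b1, beta_true)
mse_adj = mse_corrected(mean_b1, sd_b1, beta_true, gamma)
print(f"MSE of usual estimate:     {mse_obs:.3f}")
print(f"MSE of corrected estimate: {mse_adj:.3f}")
```

The corrected estimate pays a variance penalty of $\gamma^{-2}$ but removes most of the bias, so in this example it has the smaller MSE. With a much smaller sample the variance term could dominate and reverse the ordering.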

To consider the impact of measurement error on the predicted probability of developing the disease endpoint ($p_c$), consider the expected values of the log odds of the disease endpoint based on true (T) and observed (O) risk factor levels, $E(\beta_0 + \gamma^{-1}\beta_1(X - \mu_X))$ and $E(\beta_0 + \beta_1(Z - \mu_Z))$, respectively. In a study in which subjects are chosen at random from the population, both of these have expected value equal to $\beta_0$ since $E(Z) = E(X) = \mu_Z$. The expectations $E(\beta_0 + \gamma^{-1}\beta_1(X - \mu_X) \mid Z > b)$ and $E(\beta_0 + \beta_1(Z - \mu_Z) \mid Z > b)$, corresponding to the selection of subjects with observed values of the risk factor $Z$ greater than $b$ (high risk subjects), are also identical since, for the normal structural model,

$$E(X - \mu_X \mid Z > b) = E\left[E(X - \mu_X \mid Z) \mid Z > b\right] = \gamma\,E(Z - \mu_Z \mid Z > b). \qquad (10)$$
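The identity in (10) is easy to check by simulation. The sketch below is a minimal illustration, assuming normal $X$ and $e$; the mean, variances and cut-off are illustrative choices (loosely patterned on diastolic BP), and the variable names are invented for the example.

```python
import random

rng = random.Random(1)

# Illustrative assumptions: population mean, variances and selection cut-off.
mu, var_x, var_e, cutoff = 84.0, 58.4, 31.2, 95.0
gamma = var_x / (var_x + var_e)  # reliability coefficient

sum_x_dev = 0.0
sum_z_dev = 0.0
n_selected = 0
for _ in range(400_000):
    x = rng.gauss(mu, var_x ** 0.5)        # 'true' level
    z = x + rng.gauss(0.0, var_e ** 0.5)   # observed level
    if z > cutoff:                         # high risk subjects only
        sum_x_dev += x - mu
        sum_z_dev += z - mu
        n_selected += 1

lhs = sum_x_dev / n_selected           # E(X - mu | Z > b)
rhs = gamma * sum_z_dev / n_selected   # gamma * E(Z - mu | Z > b)
print(f"selected: {n_selected}, lhs = {lhs:.2f}, rhs = {rhs:.2f}")
```

Among selected subjects the mean 'true' deviation is only a fraction $\gamma$ of the mean observed deviation, which is regression to the mean expressed in the notation of this section.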

Thus, both for subjects chosen at random from the population and for high risk subjects, the expected log odds of the disease endpoint is approximately the same whether the observed risk level $Z$ and regression coefficient $\beta_1$ are used, or whether the 'true' risk level $X$ and regression coefficient $\beta_1^T$ are used.

The experimental group event rates based on observed and true risk factor levels will be denoted by $p_{eO}$ and $p_{eT}$, respectively. The subscript O or T will denote whether the treatment effect is estimated based on observed (as in CPPT and MRFIT) or 'true' risk changes. Let $\delta$ denote the hypothesized treatment effect, for example, the lowering of cholesterol or BP as a result of the intervention. For a single risk factor, these estimates of $p_e$ for the $i$th subject can be written as

$$p_{eO} = \left(1 + \exp\left(-\beta_0 - \beta_1 (Z_i - \mu_Z - \delta_O)\right)\right)^{-1} \qquad (11)$$

$$p_{eT} = \left(1 + \exp\left(-\beta_0 - \gamma^{-1}\beta_1 (X_i - \mu_X - \delta_T)\right)\right)^{-1}. \qquad (12)$$

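Furthermore, from (10), (11) and (12), the expected log odds under the two formulations differ by

$$E\left[\operatorname{logit}(p_{eO})\right] - E\left[\operatorname{logit}(p_{eT})\right] = \beta_1\left(\gamma^{-1}\delta_T - \delta_O\right) \qquad (13)$$

approximately. Thus for fixed values of $\delta = \delta_O = \delta_T$, the use of observed risk levels underestimates the influence of risk factor change on disease risk. In some situations, however, $\delta_T$ will be less than $\delta_O$. If hypothesized risk factor changes are expressed as percentage reductions from screening rather than 'true' risk factor levels, this may result in $\delta_T < \delta_O$ as a result of regression to the mean. Specific examples are given below.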
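The two cases described above (equal treatment effects, and a 'true' effect smaller than the observed one) can be illustrated for a single hypothetical subject. The numbers below are assumptions chosen for illustration: an event probability of 8.7 per 1000 at a mean of 91 mm Hg, an observed-scale slope of 0.0316, a reliability of 0.62, and an observed level of 101 mm Hg. The 'true' level is replaced by its conditional expectation given $Z$, and the function names are invented for the example.

```python
import math


def logistic(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


# Illustrative assumptions (see lead-in).
beta0 = math.log(0.0087 / (1.0 - 0.0087))  # log odds at the population mean
beta1, gamma, mu = 0.0316, 0.62, 91.0
z = 101.0                                  # observed screening level
x = mu + gamma * (z - mu)                  # expected 'true' level given z


def p_e_observed(delta_o):
    """Experimental-group risk from observed level and attenuated slope."""
    return logistic(beta0 + beta1 * (z - mu - delta_o))


def p_e_true(delta_t):
    """Experimental-group risk from 'true' level and deattenuated slope."""
    return logistic(beta0 + (beta1 / gamma) * (x - mu - delta_t))


same_effect = (p_e_observed(10.0), p_e_true(10.0))     # delta_O = delta_T = 10
smaller_true = (p_e_observed(10.0), p_e_true(4.0))     # delta_T < delta_O
print("delta_O = delta_T = 10:", [round(p, 5) for p in same_effect])
print("delta_O = 10, delta_T = 4:", [round(p, 5) for p in smaller_true])
```

With equal effects the observed-scale calculation gives the larger experimental-group risk, so the benefit of treatment is understated. When the achievable 'true' reduction is sufficiently smaller, the ordering reverses and the observed-scale calculation overstates the benefit.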

Table II. Results of Monte Carlo study for the logistic regression model: $\mathrm{Prob}(Y_i = 1 \mid X_i) = 1/(1 + e^{1.40 - 1.34X_i})$, $X \sim N(0, 0.10)$

Reliability               Estimated logistic parameters (SD)       Inflated parameter estimate    Difference
coefficient  σ_e²         β̂_0                β̂_1                  γ⁻¹β̂_1 (SD)†                   γ⁻¹β̂_1 − β_1^T
0.1          0.900        -1.350 (0.073)     0.140 (0.072)         1.40 (0.72)                     0.06
0.2          0.400        -1.354 (0.072)     0.261 (0.105)         1.30 (0.53)                    -0.04
0.3          0.233        -1.365 (0.076)     0.385 (0.123)         1.28 (0.41)                    -0.06
0.4          0.150        -1.374 (0.072)     0.521 (0.137)         1.30 (0.34)                    -0.04
0.5          0.100        -1.378 (0.077)     0.666 (0.162)         1.33 (0.32)                    -0.01
0.6          0.067        -1.383 (0.080)     0.791 (0.182)         1.32 (0.30)                    -0.02
0.7          0.043        -1.396 (0.073)     0.930 (0.194)         1.33 (0.28)                    -0.01
0.8          0.025        -1.396 (0.073)     1.074 (0.218)         1.34 (0.27)                     0.00
0.86*        0.017        -1.396 (0.076)     1.140 (0.213)         1.33 (0.25)                    -0.01
0.9          0.011        -1.399 (0.077)     1.199 (0.230)         1.33 (0.26)                    -0.01
1.0          0.000        -1.407 (0.076)     1.346 (0.234)         1.35 (0.23)                     0.01

SD = standard deviation.
* Value for γ used by Carroll et al.⁵
† γ⁻¹ SD(β̂_1).

RESULTS

Influence of measurement error on the regression coefficient

To compare how well $\gamma$ approximates the attenuation of $\beta_1^T$ for the logistic model, we compared the observed attenuation for logistic parameter estimates with the estimated attenuation based on the reliability coefficient, using Monte Carlo methods and also using data from MRFIT.

For the Monte Carlo study of the logistic model we chose parameters similar to those used by Carroll et al.⁵ We let $\beta_0^T = -1.40$, $\beta_1^T = 1.34$ and $X \sim N(0, 0.10)$, and we varied $\gamma$ from 0.1 to 1.0 to obtain $\sigma_e^2 = \sigma_X^2(1 - \gamma)\gamma^{-1}$. A sample size of 1200 was used and for each value of $\gamma$ we generated 400 simulated data sets from which estimates of $\beta_0$ and $\beta_1$ were obtained, and then averaged. For each simulation, $X_i$ and $e_i$ ($i = 1, \ldots, 1200$) were generated using SAS RANNOR,¹⁸ $Z_i$ was determined as $X_i + e_i$, and the dependent variables $Y_i$ were generated by

$$Y_i = \begin{cases} 1 & \text{if } \left(1 + \exp(-\beta_0^T - \beta_1^T X_i)\right)^{-1} \ge U(0,1); \\ 0 & \text{otherwise.} \end{cases}$$
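The data-generating rule above, followed by a logistic fit of $Y$ on the observed $Z$, can be sketched as follows. This is not the original SAS program: it is a standard-library Python approximation, assuming a plain Newton-Raphson maximum likelihood fit, 100 replicates rather than 400 to keep the run short, and a single value $\gamma = 0.5$. The function names and the seed are choices made for this example.

```python
import math
import random


def fit_logistic(z, y, max_iter=25, tol=1e-10):
    """Newton-Raphson maximum likelihood fit of logit P(Y=1) = b0 + b1*z."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, yi in zip(z, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * zi)))
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * zi     # score for the slope
            h00 += w                # information matrix entries
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def simulate_mean_slope(gamma, n=1200, reps=100, beta0=-1.40, beta1=1.34,
                        var_x=0.10, seed=12345):
    """Average fitted slope on Z over `reps` simulated data sets."""
    rng = random.Random(seed)
    sd_x = math.sqrt(var_x)
    sd_e = math.sqrt(var_x * (1.0 - gamma) / gamma)  # error SD implied by gamma
    slopes = []
    for _rep in range(reps):
        z, y = [], []
        for _i in range(n):
            x = rng.gauss(0.0, sd_x)                         # 'true' level
            p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
            z.append(x + rng.gauss(0.0, sd_e))               # observed level
            y.append(1 if p >= rng.random() else 0)          # event indicator
        slopes.append(fit_logistic(z, y)[1])
    return sum(slopes) / reps


gamma = 0.5
mean_b1 = simulate_mean_slope(gamma)
print(f"mean slope on Z: {mean_b1:.3f}; deattenuated: {mean_b1 / gamma:.3f}")
```

Under these assumptions the average slope on $Z$ should fall near $\gamma\beta_1^T$, and dividing by $\gamma$ should bring it back close to 1.34, although the exact figures depend on the seed and number of replicates.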

$U(0, 1)$ denotes a uniform (0, 1) random number generated with SAS RANUNI.¹⁸ $\beta_0$ and $\beta_1$ were estimated with the SAS LOGIST procedure¹⁹ from the $Y_i$ and $Z_i$ generated.

The results of this Monte Carlo study are given in Table II. For all levels of the reliability coefficient $\gamma$, the deattenuated parameter estimate $\gamma^{-1}\hat\beta_1$ is similar (within 5 per cent) to the value of $\beta_1^T$ (1.34) set in the simulation. For $\gamma = 0.86$, the value used by Carroll et al., our findings are similar to those for the probit model.⁵ The logistic parameter $\beta_1^T$ was underestimated by the same amount, 0.20, as in Carroll et al.'s probit simulation. Carroll et al.'s error-in-variables estimate was 1.36 compared with 1.33 for $\gamma^{-1}\hat\beta_1$ in our simulation. Thus, for this particular example, the approximations of $\beta_1^T$ based on the reliability coefficient are very good.

The estimates which correct for the attenuation are more variable than the usual logistic estimates (Table II). For this particular example the bias dominates and $\mathrm{MSE}(\gamma^{-1}\hat\beta_1)$ is smaller than $\mathrm{MSE}(\hat\beta_1)$ for all values of $\gamma$. For $\gamma = 0.86$, the MSE estimates for $\gamma^{-1}\hat\beta_1$ and $\hat\beta_1$ are 0.061 and 0.085, respectively.

To assess how well $\gamma^{-1}\hat\beta_1$ approximated the true parameter value in a more realistic situation, an empirical investigation using data from MRFIT was also carried out. Using the mortality follow-up data for men in the control group of MRFIT who were not taking antihypertensive medication prior to randomization, the logistic regression of CHD death in 6 years on diastolic BP was estimated using baseline BP measurements of varying reliability. Two measurements of BP were made at each of two baseline visits 1-3 months apart. The reliability was varied by (1) using a single recording of BP; (2) averaging two BP readings at a single visit; (3) averaging two readings, one from each of the two visits; and (4) averaging all four readings for each participant. The increases in the logistic regression coefficient for diastolic BP were compared with those predicted using the estimated relative reliability of the four measures (Table III).

The reliability of each measure was determined using variance components for diastolic BP which were estimated from these data. The estimated variance components were $\sigma_X^2 = 58.4$, $\sigma_v^2 = 26.1$ and $\sigma_r^2 = 10.2$, where $\sigma_X^2$ is the between-subject component of variance, $\sigma_v^2$ is the between-visit component, and $\sigma_r^2$ is the component between readings at the same visit. In terms of these variance components the reliability coefficient for $N$ visits and $M$ readings per visit is given by

$$\gamma_{N,M} = \sigma_X^2\left(\sigma_X^2 + \sigma_v^2 N^{-1} + \sigma_r^2 (NM)^{-1}\right)^{-1}. \qquad (14)$$

Table III. Estimated logistic regression coefficients corresponding to diastolic blood pressure (mm Hg) for the endpoint CHD death in 6 years for men in MRFIT control group

No. of      No. of        Reliability    Estimated logistic          Adjusted regression
visits N    readings M    coefficient    regression coefficient      coefficient            Difference
1           1             0.62           0.0316 (0.0128)             -                      -
1           2             0.65           0.0325 (0.0134)             0.0331                 0.0006
2           1             0.76           0.0390 (0.0142)             0.0387                 0.0003
2           2             0.79           0.0419 (0.0146)             0.0403                 0.0016
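Equation (14) and the adjusted coefficients in Table III can be reproduced with a few lines of arithmetic. The sketch below assumes that the adjusted coefficient is the single-reading estimate (0.0316) multiplied by the ratio of reliabilities, with each reliability rounded to two decimals. That rounding is an inference from the tabulated values rather than something stated in the text, and the function name is invented for the example.

```python
def reliability(n_visits, m_readings, var_subject=58.4, var_visit=26.1,
                var_reading=10.2):
    """Reliability of the mean of M readings at each of N visits, eq. (14)."""
    total = var_subject + var_visit / n_visits + var_reading / (n_visits * m_readings)
    return var_subject / total


b_single = 0.0316                       # estimate from one reading at one visit
g_single = round(reliability(1, 1), 2)  # rounded as in Table III (assumption)

rows = []
for n_visits, m_readings in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    g = round(reliability(n_visits, m_readings), 2)
    adjusted = b_single * g / g_single  # predicted slope for this measure
    rows.append((n_visits, m_readings, g, round(adjusted, 4)))
    print(f"N={n_visits} M={m_readings} gamma={g:.2f} adjusted={adjusted:.4f}")
```

Averaging over visits raises the reliability more than averaging readings within a visit, because the between-visit component (26.1) is larger than the between-reading component (10.2).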

Table III shows that the estimated regression coefficient for diastolic BP increased when two readings taken at a single visit were averaged and when readings from two visits were averaged. The increases noted from that based on a single reading were similar to those expected based on the ratio of reliability coefficients ($\gamma_{N,M}/\gamma_{1,1}$) for the four sets of measurements used.

Influence of measurement error on $p_c$ and $p_e$

In Figure 1 the probability of CHD is plotted using estimated regression coefficients shown in Table III. Also shown is the logistic function assuming $\sigma_v^2 = \sigma_r^2 = 0$ for diastolic BP. The assumed normal distributions for diastolic BP are drawn beneath the abscissa. Note that the lines all pass through the same CHD death rate (8.7 per 1000) at $\mu_X = 91$ mm Hg, the average BP level at entry for these participants in MRFIT. The greater the reliability of the BP in the regression model, the steeper the curve. In examining these curves it is important to note that while the abscissa is labelled diastolic BP, the BP readings for each line have different reliabilities. Substitution of 'true' levels above the population mean into the observed model will underestimate risk while substitution of observed levels into the function based on 'true' measurements will overestimate risk. This point has also been noted by Wald,¹⁵ Carroll et al.,⁵ Stefanski and Carroll⁸ and Fuller.¹⁷

Figure 1. Probability of CHD death in 6 years for MRFIT usual care participants according to diastolic BP level (mm Hg): . . . . 'true' level ($\sigma_X^2$ = 58.4 (mm Hg)²); 2 readings at 2 visits ($\sigma_Z^2$ = 74 (mm Hg)²); - - - - 1 reading at 1 visit ($\sigma_Z^2$ = 94.7 (mm Hg)²)

In estimating event probabilities for the design of a clinical trial it is important therefore to substitute risk factor measurements into the logistic function which have the same reliability as the measurements which were used to estimate the parameters of the function. Although the logistic function based on observed risk factor measurements can be used to obtain unbiased estimates of $p_c$, estimates of $p_e$ based on hypothesized risk factor reductions should be based on a model which accounts for the measurement error and temporal variability of the risk factors.

The impact of measurement error on estimates of $p_e$ is evident from the regression coefficients in Table III. Using the estimate corresponding to a single BP reading (0.0316), the relative odds of CHD death in 6 years corresponding to a 10 mm Hg reduction in diastolic BP is $\exp[0.0316(-10)] = 0.73$. Since 6 year event probabilities are small, the relative odds approximates the relative risk, and one minus this odds ratio, multiplied by 100, gives an estimate of the percentage reduction in risk (27 per cent). For the estimate based on the average of four readings, two at each of two visits 1-3 months apart (0.0419, see Table III), the corresponding estimated percentage reduction in disease risk is 34 per cent. If the reliability coefficient is used to approximate the 'true' parameter estimate ($0.0316/0.62 = 0.051$), a 40 per cent reduction in CHD risk is estimated. Thus

the impact of BP lowering on disease risk can be substantially underestimated due to attenuation of the logistic parameter estimates.

Table IV. Monte Carlo results for hypothetical BP trial: estimates of $p_c$, $p_e$, sample size and power

                                                        Screening rules
                                                    A          B          C
Screen 1                                           ≥ 95       ≥ 95       ≥ 95
Screen 2                                            -         ≥ 95       ≥ 95
Screen 3                                            -          -         ≥ 95
$\bar{Z}$ (observed level at screen 1)             99.7      101.0      101.7
$\bar{X}$ ('true' level)                           94.1       97.3       98.9
$p_{cO}$                                         0.0584     0.0609     0.0624
$\delta_O$*                                        10.0       10.1       10.2
$\delta_T$*                                         4.2        6.9        8.4
$p_{eO}$*                                        0.0423     0.0440     0.0450
$p_{eT}$*                                        0.0469     0.0425     0.0402
Total sample size (2N) using $p_{cO}$ and $p_{eO}$  7742       7303       7044
Power using $p_{eT}$                               0.64       0.94       0.99

* $\delta_O$ and $\delta_T$ are average differences (mm Hg) between treatment and control groups used for estimates of $p_{eO}$ and $p_{eT}$. $\delta_O$ corresponds to a 10 per cent reduction in observed diastolic BP ($Z$); $\delta_T$ corresponds to a 10 per cent reduction in 'true' diastolic BP ($X$) if $X \ge 95$ mm Hg.

In the above illustration of the impact of measurement error on estimates of $p_e$ it was assumed that $\delta_O = \delta_T$. As noted previously, if $\delta_T < \delta_O$ the impact of the proposed risk factor intervention on disease risk may be overestimated and $p_{eO}$ may be less than $p_{eT}$. For example, it is common to assume that the treatment will lower the risk factor level by an amount which depends on the baseline level. This was done in both CPPT and MRFIT. In the CPPT $\delta_O = 0.28Z$, and in MRFIT for diastolic BP $\delta_O = 0.10Z$ for $Z \ge 95$ mm Hg and 0 otherwise.² In both studies $p_e$ was estimated using observed risk levels which were subject to regression to the mean as illustrated in Table I. If 'true' levels were available and used, it is clear that $\delta_T < \delta_O$, and a reduction in power would be expected.

In CPPT, $\delta_T$ is similar to $\delta_O$ since multiple screening visits were used to establish risk eligibility and all patients were given cholestyramine or placebo even if there was further regression of cholesterol levels. However, in MRFIT eligibility was based on screening BP levels but treatment for the intervention group was based on an average of BP readings recorded at visits following randomization into intervention or control groups. While 82 per cent of participants screened had screening levels of BP ≥ 95 mm Hg and were expected to be treated immediately by design, only 62 per cent of participants were actually treated at the end of 6 years of follow-up.²⁰ Thus $\delta_T$ would be considerably less than $\delta_O$.

To illustrate the impact of regression to the mean on estimates of $p_e$, a study was simulated. The association of BP (average of two readings at a single visit) with all cause mortality in 6 years was estimated for MRFIT data and estimates of $\hat\beta_0 = -6.169$ and $\hat\beta_1 = 0.0339$ were found.
The reliability of the BP measurements used to estimate $\beta_0$ and $\beta_1$ was assumed to be 0.65 (see Table III), and the 'true' levels ($X$) were assumed to have mean 84 and variance 58.4, the mean of 84 mm Hg corresponding to the average level for men screened in MRFIT.²¹ As in the previous section, the variance of BP readings between visits and between readings at the same visit were assumed to be 26.1 and 10.2 (mm Hg)², respectively.

This simulation study examined three different BP eligibility rules and results are given in Table IV. For each screening rule, risk levels for 10,000 eligible participants were simulated. The value of $p_c$ was determined using simulated observed risk factor levels for the initial screen:

$$p_{cO} = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\frac{1}{1 + e^{-\hat\beta_0 - \hat\beta_1 Z_i}}. \qquad (15)$$

Values of $p_e$ were estimated using both observed and 'true' risk levels. The estimate of $p_e$ based on observed risk levels was estimated as

$$p_{eO} = \frac{1}{10{,}000}\sum_{i=1}^{10{,}000}\frac{1}{1 + e^{-\hat\beta_0 - \hat\beta_1 (Z_i - 0.10 Z_i)}}. \qquad (16)$$
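The simulated study can be approximated for the single-visit screening rule (rule A of Table IV). The sketch below is not the authors' program: it assumes normal components, screening on the average of two readings at one visit (error variance 26.1 + 10.2/2), eligibility at an observed level of at least 95 mm Hg, and an arbitrary seed. Variable names are invented for the example. It evaluates the two averages defined in (15) and (16).

```python
import math
import random

rng = random.Random(7)

B0, B1 = -6.169, 0.0339                 # estimates quoted in the text
MU, VAR_X = 84.0, 58.4                  # mean and variance of 'true' BP
VAR_VISIT, VAR_READING = 26.1, 10.2     # within-subject variance components
SD_X = math.sqrt(VAR_X)
SD_ERR = math.sqrt(VAR_VISIT + VAR_READING / 2.0)  # mean of 2 readings, 1 visit


def risk(level):
    """Logistic event probability at a given BP level."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * level)))


observed, true_levels = [], []
while len(observed) < 10_000:           # keep drawing until 10,000 are eligible
    x = rng.gauss(MU, SD_X)
    z = x + rng.gauss(0.0, SD_ERR)
    if z >= 95.0:
        observed.append(z)
        true_levels.append(x)

n = len(observed)
z_bar = sum(observed) / n
x_bar = sum(true_levels) / n
p_co = sum(risk(z) for z in observed) / n            # eq. (15)
p_eo = sum(risk(z - 0.10 * z) for z in observed) / n  # eq. (16)
print(f"Z-bar = {z_bar:.1f}, X-bar = {x_bar:.1f}, p_cO = {p_co:.4f}, p_eO = {p_eo:.4f}")
```

Under these assumptions the results should land close to the rule A column of Table IV (mean observed level near 99.7, mean 'true' level near 94, $p_{cO}$ near 0.058, $p_{eO}$ near 0.042). The gap of several mm Hg between the observed and 'true' means is the regression to the mean that the treatment effect on the 'true' scale has to absorb.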

It was assumed for $p_{eO}$ that a 10 per cent reduction in the initial screen BP would be achieved with treatment. The estimate of $p_e$ based on 'true' BP level assumed that a 10 per cent reduction in 'true' BP would be obtained with treatment among participants who had 'true' levels at entry ≥ 95 mm Hg. For participants with simulated 'true' BP levels < 95 mm Hg, it was assumed that no reduction due to treatment would be obtained. The value of $p_{eT}$ was obtained by first estimating the average treatment difference ($\bar\delta_T$). Then the relative odds of death in 6 years corresponding to a reduction in diastolic BP equal to $\bar\delta_T$ was estimated as $\exp[\hat\beta_1\gamma^{-1}(-\bar\delta_T)]$; then

$$p_{eT} = p_{cO}\left\{\exp\left[\hat\beta_1\gamma^{-1}(-\bar\delta_T)\right]\right\}. \qquad (17)$$

With a single eligibility visit the percentage difference between $p_{cO}$ and $p_{eO}$ is 28 per cent while the percentage difference between $p_{cO}$ and $p_{eT}$ is 20 per cent. As the number of BP eligibility visits increases, 'higher risk' subjects are chosen, that is, subjects with screening levels closer to 'true' levels, and $p_{eT}$ is less than $p_{eO}$. The total sample size for two groups ($2N$) assuming equal allocation was determined using the following approximate formula:

$$2N = \frac{4\left(Z_{1-\alpha/2} + Z_{1-\beta}\right)^2\,\bar{p}(1 - \bar{p})}{(p_{cO} - p_{eO})^2}, \qquad (18)$$

where $Z_{1-\alpha/2}$ is a constant corresponding to the significance level, $Z_{1-\beta}$ is another constant corresponding to power, and $\bar{p} = (p_{cO} + p_{eO})/2$. This sample size with type I error = 0.05 (two-sided test) and power = 0.90 using $p_{cO}$ and $p_{eO}$ is similar for these three designs, ranging from 7044 to 7742; however, the power, determined by solving for $Z_{1-\beta}$ in the above formula, based on these fixed sample sizes and using $p_{eT}$ instead of $p_{eO}$, is considerably less than 0.90 for the first screening rule (A) and greater than 0.90 for rules B and C (see Table IV).

For the design based on a single eligibility visit (similar to that used for MRFIT), the sample size of 7742 subjects was based on a presumed treatment difference of 28 per cent and power of 0.90, but since the treatment protocol is based on 'true' rather than observed risk levels, a better estimate of the treatment difference is 20 per cent instead of 28 per cent, and power based on this difference and 7742 subjects is 0.64 instead of 0.90. Thus even though $\beta_1$ is underestimated by approximately 35 per cent, power is reduced due to regression to the mean. A similar Monte Carlo study was carried out for the MRFIT study in which risk eligibility, based on level of BP, serum cholesterol and cigarette smoking, was determined at a single screening visit; we estimate that power was reduced 10-15 per cent as a result of regression to the mean.
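The sample size and power figures for screening rule A can be checked with the sketch below. It assumes the usual normal-approximation formula for comparing two proportions with equal allocation, takes the tabulated probabilities from Table IV as inputs, and uses the deattenuated relative odds with reliability 0.65 and an average 'true' reduction of 4.2 mm Hg. The function names are invented for the example, and small discrepancies from Table IV are to be expected because the inputs are rounded.

```python
import math
from statistics import NormalDist

NORMAL = NormalDist()


def total_sample_size(p_c, p_e, alpha=0.05, power=0.90):
    """Total size (2N) for a two-sided comparison of two proportions."""
    z_alpha = NORMAL.inv_cdf(1.0 - alpha / 2.0)
    z_beta = NORMAL.inv_cdf(power)
    p_bar = (p_c + p_e) / 2.0
    return 4.0 * (z_alpha + z_beta) ** 2 * p_bar * (1.0 - p_bar) / (p_c - p_e) ** 2


def power_for(total_n, p_c, p_e, alpha=0.05):
    """Power obtained by solving the same formula for the power quantile."""
    z_alpha = NORMAL.inv_cdf(1.0 - alpha / 2.0)
    p_bar = (p_c + p_e) / 2.0
    z_beta = math.sqrt(total_n * (p_c - p_e) ** 2 / (4.0 * p_bar * (1.0 - p_bar))) - z_alpha
    return NORMAL.cdf(z_beta)


# Rule A inputs from Table IV and the text.
p_co, p_eo = 0.0584, 0.0423
b1, gamma, delta_t = 0.0339, 0.65, 4.2

p_et = p_co * math.exp(-(b1 / gamma) * delta_t)   # deattenuated relative odds
planned_n = total_sample_size(p_co, p_eo)
achieved_power = power_for(7742, p_co, p_et)
print(f"p_eT = {p_et:.4f}, 2N = {planned_n:.0f}, power with p_eT = {achieved_power:.2f}")
```

The planned size comes out within about 1 per cent of the tabulated 7742, and the power computed with $p_{eT}$ is in the low 0.60s, in line with the 0.64 reported for rule A.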

CONCLUSIONS

As a result of measurement error and temporal variability, associations of risk factors, such as blood pressure and serum cholesterol, with disease endpoints can be substantially underestimated. As illustrated here, using data from the MRFIT, this bias can be reduced by using the average of multiple readings of risk measurements to study relationships with disease endpoints. As recently illustrated by MacMahon et al.,²² this bias can also be reduced by classifying individuals into risk factor categories using one measurement and then using the average of a subsequent measurement within each of the categories defined by the initial measurement to quantify the disease/risk-factor association. In the absence of multiple measurements at different points in time, statistical models to account for the bias due to measurement error in binary regression models have been developed. In our simulations and empirical investigations we found that the inverse of the reliability coefficient was an excellent approximation to the degree of bias. Further work is necessary to evaluate the approximations considered here for highly skewed predictors and for very strong predictors.

The attenuation of regression coefficients due to measurement error and temporal variability can impact the estimation of event probabilities used to design intervention trials. In most situations the bias due to measurement error results in conservative estimates of $p_e$ since the impact of the risk factor intervention on the disease endpoint is underestimated; however, in intervention studies in which participants with elevated risk factor levels are selected, failure to take into account intra-subject variability of risk factor measurements and the resultant regression toward the mean can also result in an underestimation of $p_e$ and overestimation of power.

Estimates of $p_e$ should be based on 'true' risk levels. Unfortunately 'true' levels are not observable and in many studies only single measurements of risk factors are available for estimating logistic parameters and event rates. To overcome this problem we have estimated $p_e$ using Monte Carlo methods. This approach allows the trial designer easily to consider the influence of a number of parameters in addition to the size of the intervention effect $\delta$, for example, the distribution of the risk factor and the reliability coefficient.

Many other factors should also be considered in estimating event rates for intervention trials. The estimated regression coefficients are subject to sampling variability and, depending on the data set used, the confidence interval for the parameter estimates may be wide. For example, the coefficients used to design MRFIT were estimated from a group of Framingham men among whom only 40 coronary deaths had occurred in 6 years. In Reference 4, 95 per cent confidence intervals for sample size estimates are derived taking into account the sampling variability of the logistic parameter estimates.

In summary, measurement error and temporal variability of risk factors result in a deflation of estimated logistic regression coefficients. The magnitude of deflation appears to be well approximated by the reliability coefficient of the risk variable. However, error-in-variables estimates are associated with larger variance than estimates that do not account for measurement error,⁵ and caution should be exercised in inflating parameter estimates determined from small samples. Underestimation of the regression coefficient does not influence estimates of $p_c$ as long as the risk levels substituted into the logistic function have the same reliability as those that were used to estimate the parameters. Estimates of $p_e$ should be based on 'true' risk levels using regression coefficients adjusted for measurement error and using treatment effects defined in terms of 'true' levels.

ACKNOWLEDGEMENTS

The authors wish to thank Thomas A. Louis for his comments and discussions on many aspects of this paper, Deb Sampson and Margie Andrews for typing the manuscript, and two referees for valuable assistance. This work was supported, in part, by contract N01-HV-22971-17 from the National Heart, Lung, and Blood Institute.

REFERENCES

1. Gordon, T. and Kannel, W. B. 'The Framingham Study: introduction and general background to the Framingham Study', Sections 1 and 2, Framingham Monograph, National Heart and Lung Institute, Bethesda, Maryland, 1968.
2. Multiple Risk Factor Intervention Trial Group. 'Statistical design considerations in the NHLI Multiple Risk Factor Intervention Trial (MRFIT)', Journal of Chronic Diseases, 30, 261-275 (1977).
3. Lipid Research Clinics Program. 'The Lipid Research Clinics Coronary Primary Prevention Trial: design and implementation', Journal of Chronic Diseases, 32, 609-631 (1979).
4. Coronary Drug Project Research Group. 'Implications of findings in the coronary drug project for secondary prevention trials in coronary heart disease', Circulation, 63, 1342-1350 (1981).
5. Carroll, R. J., Spiegelman, C. H., Lan, K. K. G., Bailey, K. T. and Abbott, R. D. 'On errors-in-variables for binary regression models', Biometrika, 71, 19-25 (1984).
6. Prentice, R. L. 'Covariate measurement errors and parameter estimation in a failure time regression model', Biometrika, 69, 331-342 (1982).
7. Prentice, R. L. 'On the ability of blood pressure effects to explain the relation between oral contraceptives and cardiovascular disease', American Journal of Epidemiology, 127, 213-219 (1988).
8. Stefanski, L. A. and Carroll, R. J. 'Covariate measurement error in logistic regression', Annals of Statistics, 13, 1335-1351 (1985).
9. Rosner, B., Willett, W. C. and Spiegelman, D. 'Correction of logistic regression relative risk estimates and confidence intervals for systematic within-person measurement error', Statistics in Medicine, 8, 1051-1069 (1989).
10. Halperin, M., Rogot, E., Gurian, J. and Ederer, F. 'Sample sizes for medical trials with special reference to long term therapy', Journal of Chronic Diseases, 21, 13-24 (1968).
11. Gardner, M. J. and Heady, J. A. 'Some effects of within person variability in epidemiological studies', Journal of Chronic Diseases, 26, 781-793 (1973).
12. Widdowson, G. M., Kuehneman, M., DuChene, A. G., Hulley, S. B. and Cooper, G. R. 'Quality control of biochemical data in the Multiple Risk Factor Intervention Trial: Central Laboratory', Controlled Clinical Trials, 7, 17S-33S (1986).
13. Davis, C. E. 'The effect of regression to the mean in epidemiologic and clinical studies', American Journal of Epidemiology, 104, 493-498 (1976).
14. Cronbach, L. J. Essentials of Psychological Testing, Harper and Row, New York, 1970.
15. Wald, A. 'The fitting of straight lines if both variables are subject to error', Annals of Mathematical Statistics, 11, 284-300 (1940).
16. Cochran, W. G. 'Errors of measurement in statistics', Technometrics, 10, 637-666 (1968).
17. Fuller, W. A. Measurement Error Models, Wiley, New York, 1987.
18. SAS Institute Inc. SAS User's Guide: Basics, Version 5 Edition, SAS Institute Inc., Cary, NC, 1985.
19. SAS Institute Inc. SUGI Supplemental Library User's Guide, 1983 Edition, SAS Institute Inc., Cary, NC, 1983.
20. Grimm, R. H., Cohen, J. D., Smith, W. M., Falvo-Gerard, L. and Neaton, J. D. 'Hypertension management in the Multiple Risk Factor Intervention Trial (MRFIT): six year intervention results for special intervention and usual care men', Archives of Internal Medicine, 145, 1191-1199 (1985).
21. Neaton, J. D., Kuller, L. H., Wentworth, D. and Borhani, N. O. 'Total and cardiovascular mortality in relation to cigarette smoking, serum cholesterol concentration, and diastolic blood pressure among black and white males followed up for five years', American Heart Journal, 108, 759-769 (1984).
22. MacMahon, S., Peto, R., Cutler, J., Collins, R., Sorlie, P., Neaton, J., Abbott, R., Godwin, J., Dyer, A. and Stamler, J. 'Blood pressure, stroke and coronary heart disease. Part I: Effects of prolonged differences in blood pressure: evidence from nine prospective observational studies corrected for the regression dilution bias', Lancet, 335, 765-774 (1990).
