American Journal of Epidemiology
Copyright © 1991 by The Johns Hopkins University School of Hygiene and Public Health
All rights reserved

Vol. 133, No. 12
Printed in U.S.A.

LETTERS TO THE EDITOR

RE: "TOTAL ENERGY INTAKE: IMPLICATIONS FOR EPIDEMIOLOGIC ANALYSES"

Willett and Stampfer (1) emphasized the importance of adjusting for total energy intake when measuring the association between specific dietary factors and disease. The adjustment procedure they proposed, using nutrient residuals calculated by regressing nutrient intake on total energy intake, has been adopted widely in analyses of dietary data. It has been used in validation studies and to construct energy-adjusted nutrient intake covariates in models with disease outcome as the response. Willett and Stampfer's paper (1) also initiated a useful, more technical discussion about the different statistical methods used for energy adjustment. Subsequently, Willett in his book Nutritional Epidemiology (2) devoted a full chapter to the implications of total energy intake, in which he addressed several of the methodological issues that had been raised in this discussion.

However, a point that has not been raised previously is that the Willett and Stampfer adjustment procedure may introduce attenuation bias in the diet-disease association. The argument goes as follows: The adjustment procedure results in two variables, the nutrient residual and total energy intake, which by definition are uncorrelated with each other. This zero correlation may lead the investigator into thinking that total energy intake can no longer be a potential source of bias for the nutrient-disease relation, and as a result total energy intake is not considered as a covariate in the regression model. The argument is justified when a linear regression model is used. For logistic regression, however, attenuation bias is introduced in the parameter for the nutrient residual if total energy intake is not included as a covariate when it has an independent effect on disease outcome.

We raise this issue in the context of diet-disease associations for two principal reasons.
First, total energy intake is not mentioned as a potential covariate in the empirical diet-disease investigations where nutrient residuals have been used (e.g., refs. 3 and 4). This is likely to reflect the fact that this form of attenuation bias is not generally appreciated in this area of research. Second, it has been stated that "in regression analysis (either logistic or ordinary regression), when caloric intake is associated with disease, the standard error of the regression coefficient will be greater for the 'calorie-adjusted' nutrient value than for the nutrient value when caloric intake is entered as a separate term, although the regression coefficients will be similar" (5, p. 983). While increased efficiency may be a correct motivation for including total energy as a covariate in linear regression, the more important criterion of unbiasedness must be considered in logistic regression.

The attenuation bias arising from leaving out total energy as a covariate in a logistic model can, in realistic situations, be as large as 20 percent. A compelling example stems from a reanalysis of the data from the Ireland-Boston Diet-Heart Study (6). Table 1 presents logistic regression coefficients for coronary heart disease mortality regressed on dietary cholesterol and total energy intake. We find that the regression coefficient for the cholesterol residual is of smaller magnitude and has a lower statistical significance when total energy is excluded from the model (β = 0.060, p = 0.089) than if total energy is included (β = 0.078, p = 0.036). This attenuation bias occurs despite the fact that the cholesterol residual and total energy are by definition uncorrelated with each other, and is a result of omitting a covariate that is an independent predictor of the outcome. A similar effect can be noted in table 1 for total energy intake, which has a coefficient of smaller magnitude and a lower statistical significance when the cholesterol residual is omitted (β = -0.229, p = 0.049) than when it is included (β = -0.276, p = 0.023) in the model.

An estimate of the magnitude of the attenuation bias resulting from omission of relevant covariates in logistic regression is given by Zeger et al. (7) as:
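In LaTeX notation, a sketch of this approximation reads as follows. It is a reconstruction that assumes the standard form of the population-averaged approximation of Zeger et al.; the symbol \beta^{*}, standing for the coefficient obtained when Z is omitted, is introduced here for illustration and may differ from the authors' notation.

```latex
\beta^{*} \;\approx\; \beta \,\bigl[\, 1 + c^{2}\,\mathrm{var}(\gamma Z) \,\bigr]^{-1/2},
\qquad c = \frac{16\sqrt{3}}{15\pi} \approx 0.588
```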

where β is the logistic regression coefficient of the covariate of interest (in our example, the dietary cholesterol residual), Z is the omitted covariate (e.g., total energy intake), γ is the logistic regression coefficient for Z, and c is the constant c = 16√3/(15π). Note that the bias coefficient lies between 0 and 1; i.e., the bias will always reduce the magnitude of the regression coefficient. The attenuation bias is negligible when the variance of the omitted term γZ is small. In particular, the bias term vanishes when γ = 0, i.e., when there is no effect of the omitted covariate on the outcome variable. It is seemingly counterintuitive that bias is


TABLE 1. Logistic regression of coronary heart disease mortality on dietary cholesterol and total energy intake in 1,022 men: The Ireland-Boston Diet-Heart Study, 1959-1982*

Model†   Covariate                                  β ± SE§            p
1        Total energy intake (1,000 kcal)           -0.229 ± 0.117     0.049
2        Dietary cholesterol (100 mg)                0.012 ± 0.030     0.691
3        Dietary cholesterol residual‡ (100 mg)      0.060 ± 0.035     0.089
4        Dietary cholesterol residual‡ (100 mg)      0.078 ± 0.037     0.036
         Total energy intake (1,000 kcal)           -0.276 ± 0.121     0.023
5        Dietary cholesterol (100 mg)                0.078 ± 0.037     0.036
         Total energy intake (1,000 kcal)           -0.414 ± 0.151     0.006

* Participants in this study were recruited into three groups; see ref. 6 for details. The β coefficients represent the change in log odds for a change of 100 mg or 1,000 kcal in the respective covariates.
† The models include covariates for age and group.
‡ The dietary cholesterol residual was calculated according to the method of Willett and Stampfer (1).
§ β ± SE, estimated regression coefficient ± standard error.
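The pattern in table 1 can be imitated on simulated data. The following Python sketch is an illustration under stated assumptions, not a reanalysis of the study: the data are simulated, the effect sizes (0.5 for the residual and 2.0 for energy, chosen large so that the attenuation is easy to see), the sample size, and the helper `fit_logistic` (a plain Newton-Raphson fit) are all hypothetical choices. It fits analogues of models 3, 4, and 5 and compares their coefficients.

```python
import numpy as np

def fit_logistic(X, y, n_iter=30):
    """Fit a logistic regression by Newton-Raphson; returns the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

rng = np.random.default_rng(0)
n = 200_000
ones = np.ones(n)

# Hypothetical standardized exposures (not the Ireland-Boston data).
energy = rng.normal(0.0, 1.0, size=n)
nutrient = 0.8 * energy + rng.normal(0.0, 1.0, size=n)

# Step 1: nutrient residual from the linear regression of nutrient on energy.
coef = np.linalg.lstsq(np.column_stack([ones, energy]), nutrient, rcond=None)[0]
a0, slope = coef
residual = nutrient - (a0 + slope * energy)
corr = np.corrcoef(residual, energy)[0, 1]   # zero up to rounding error

# Step 2: simulate disease with independent effects of residual and energy.
beta_true, gamma_true = 0.5, 2.0             # assumed effect sizes
linear_predictor = -1.0 + beta_true * residual + gamma_true * energy
risk = 1.0 / (1.0 + np.exp(-linear_predictor))
disease = (rng.random(n) < risk).astype(float)

# Analogues of models 3, 4, and 5 in table 1.
m3 = fit_logistic(np.column_stack([ones, residual]), disease)
m4 = fit_logistic(np.column_stack([ones, residual, energy]), disease)
m5 = fit_logistic(np.column_stack([ones, nutrient, energy]), disease)

attenuation = m3[1] / m4[1]                  # ratio below 1 indicates attenuation
print("residual coefficient, energy omitted :", m3[1])
print("residual coefficient, energy included:", m4[1])
print("attenuation ratio                    :", attenuation)
print("nutrient coefficient, models 4 and 5 :", m4[1], m5[1])
print("energy coefficient, models 4 and 5   :", m4[2], m5[2])

# The same ratio computed from the published coefficients in table 1.
table_ratio = 0.060 / 0.078                  # about 0.77
print("table 1 ratio, model 3 vs. model 4   :", table_ratio)
```

Under these assumptions, the residual coefficient estimated without energy should come out clearly smaller than the one estimated with energy in the model, even though the residual and energy are uncorrelated by construction. The analogues of models 4 and 5 give the same nutrient coefficient, and their energy coefficients differ by the nutrient coefficient multiplied by the slope of nutrient intake on energy.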

introduced by omitting a covariate that is perfectly uncorrelated with the covariate of interest. This has been noticed fairly recently in at least two different contexts. Gail et al. (8) found that certain important nonlinear regression models lead to biased estimates of treatment effects, even in randomized experiments, if needed covariates are omitted. Logistic regression models belong to this class, whereas models based on the logarithmic transformation of the expected response do not. Thus, for very rare diseases, the attenuation bias would not constitute a problem. Zeger et al. (7) distinguished between "subject-specific" and "population-averaged" models for longitudinal binary data. With heterogeneity between subjects specified as a random individual term, averaging over subjects, i.e., ignoring the random effect, is conceptually similar to omitting balanced covariates.

This is a reminder that much of the intuition in statistical practice stems from the use of linear regression models. Although much of the simplicity and beauty of linear modeling carries over to a large class of more general nonlinear models, caution is needed in extrapolation of results based on linear models.

Having accepted that total energy should be included as a separate covariate in the regression model, one should consider the interpretation of the model parameters. As Pike et al. (9) noted, simultaneous inclusion of the nutrient residual and total energy intake as covariates in a regression model will result in the same model fit as inclusion of crude nutrient intake and total energy. Moreover, the nutrient coefficients and their standard errors will be identical in both models. This can be seen from table 1, in which the value of the logistic regression coefficient for dietary cholesterol is identical to that for the cholesterol residual when total energy intake is included as a covariate. The sole purpose of adopting the two-step procedure of computing nutrient residuals would be to induce a different interpretation for the coefficient for total energy. If nutrient residuals are used, the coefficient for total energy pertains to the full effect of this variable. If crude nutrient intake is used, the coefficient for total energy is interpreted as the effect of "other sources of energy," since the (macro) nutrient under investigation is adjusted for. These differences and similarities of the two models are a consequence of the specification of the covariate space, and they hold equally for both linear and logistic regression.

REFERENCES

1. Willett W, Stampfer MJ. Total energy intake: implications for epidemiologic analyses. Am J Epidemiol 1986;124:17-27.
2. Willett W. Nutritional epidemiology. New York: Oxford University Press, 1990.
3. Smith KR, Slattery M, French T. Risk of colon cancer, diet, and collinear nutrients: an application and evaluation of the Willett-Stampfer method. (Abstract). Am J Epidemiol 1987;126:737.
4. Katsouyanni K, Willett W, Trichopoulos D, et al. Risk of breast cancer among Greek women in relation to nutrient intake. Cancer 1988;61:181-5.
5. Willett WC, Stampfer MJ. The authors reply to "Re: 'Total energy intake: implications for epidemiologic analyses.'" (Reply to letter). Am J Epidemiol 1987;126:982-3.
6. Kushi LH, Lew RA, Stare FJ, et al. Diet and 20-year mortality from coronary heart disease: The Ireland-Boston Diet-Heart Study. N Engl J Med 1985;312:811-18.
7. Zeger SL, Liang KY, Albert PS. Models for longitudinal data: a generalized estimating equations approach. Biometrics 1988;44:1049-60.
8. Gail MH, Wieand S, Piantadosi S. Biased estimates of treatment effect in randomized experiments with nonlinear regressions and omitted covariates. Biometrika 1984;71:431-44.
9. Pike MC, Bernstein L, Peters RK. Re: "Total energy intake: implications for epidemiologic analyses." (Letter). Am J Epidemiol 1989;129:1312-13.

Juni Palmgren
National Public Health Institute
Kalliolinnantie 4
00140 Helsinki, Finland

Lawrence H. Kushi
Division of Human Development and Nutrition
School of Public Health
University of Minnesota
Minneapolis, MN 55455

THE AUTHORS REPLY

We thank Drs. Palmgren and Kushi for their observations (1). Palmgren and Kushi usefully point out an additional reason for using the model we have suggested (2, 3), specifically including as predictor variables total energy intake and the residual from the regression of nutrient intake on total energy (which reflects the nutrient composition of the diet). The biologic rationale for this model, analogous to that for conducting isocaloric experiments in controlled feeding studies to assess the effect of a specific nutrient, has recently been discussed in more detail elsewhere (4). As Palmgren and Kushi note, the need to include total energy intake in a nonlinear model, when it is an important predictor of outcome, is a specific application of a recently recognized general principle.

However, we disagree with a minor point made by Palmgren and Kushi that "the sole purpose of adopting the two-step procedure of computing nutrient residuals would be to induce a different interpretation for the coefficient for total energy" (1, p. 1292). Given the fundamental interest in the effect of nutrient intake independent of total energy (with which Palmgren and Kushi agree), an epidemiologist analyzing a study should have a solid appreciation of the distribution of this variable, which cannot be obtained by just entering both total energy intake and crude nutrient intake into a multivariate model. Thus, the computation and careful examination of nutrient residuals is an important part of an analysis. Moreover, as Palmgren and Kushi briefly mention, the nutrient residuals should be calculated and evaluated in validation studies when dietary composition independent of total energy intake is of ultimate epidemiologic interest. The point they make about the interpretation of total energy intake when it is included in a model with a crude nutrient intake can easily be overlooked and is important, because total energy intake no longer retains its biologic meaning in such a model.

REFERENCES

1. Palmgren J, Kushi LH. Re: "Total energy intake: implications for epidemiologic analyses." (Letter). Am J Epidemiol 1991;133:1291-3.
2. Willett WC, Stampfer MJ. The authors reply to "Re: 'Total energy intake: implications for epidemiologic analyses.'" (Reply to letter). Am J Epidemiol 1987;126:982-3.
3. Willett W. Nutritional epidemiology. New York: Oxford University Press, 1990.
4. Willett WC. Total energy intake and nutrient composition: dietary recommendations for epidemiologists. Int J Cancer 1990;46:770-1.

Walter C. Willett
Meir J. Stampfer
The Channing Laboratory
Department of Medicine
Harvard Medical School and Brigham and Women's Hospital
Boston, MA 02115

RE: "2,4-D, 2,4,5-T, AND 2,3,7,8-TCDD: AN OVERVIEW"

Having conducted some of the original human health research on phenoxy herbicides and dioxins (1-7) and having recently published our own critical review of the evidence of carcinogenicity of the phenoxy herbicides (8), we read the epidemiologic review by Lilienfeld and Gallo (9) with great interest. We were disappointed to find pervasive factual errors in their review, as well as conclusions that are unsupported by or inconsistent with the data presented.

The most serious factual errors involved the authors' failure to distinguish between very different exposure environments. Differences among exposures to 2,4-dichlorophenoxyacetic acid (2,4-D), 2,4,5-trichlorophenoxyacetic acid (2,4,5-T), other phenoxy herbicides, chemical precursors, and contaminants, particularly 2,3,7,8-tetrachlorodibenzo-p-dioxin (2,3,7,8-TCDD), were repeatedly ignored or obscured. Without citing any evidence in support, Lilienfeld and Gallo repeatedly imply that the herbicide 2,4-D was and is contaminated by 2,3,7,8-TCDD. Although the less toxic di- and trichlorodibenzo-p-dioxins have been detected at extremely low levels in some samples of 2,4-D produced in North America, 2,3,7,8-TCDD has never been detected with analytical methods sensitive to 1 part per billion (10). Furthermore, since 1989 the Environmental Protection Agency has required US manufacturers and suppliers to analyze 2,4-D for 2,3,7,8-TCDD at a limit of detection of 0.1 part per billion (11, 12). To the best
