
Structural equation models in medical research

PM Bentler and JA Stein

University of California, Los Angeles

Structural equation modelling (SEM) is a modern statistical method that allows one to evaluate causal hypotheses on a set of intercorrelated nonexperimental data. The sample variances and covariances, and possibly the means, are compared to those predicted by a theory-based hypothetical model after optimal estimation of the parameters of the model. The goodness-of-fit of the empirical data to the hypothesized model is evaluated statistically. This review describes the underlying statistical theory and rationale of SEM. Both confirmatory factor analysis and latent variable path models are discussed. The applicability of SEM to assessment of reliability and validity is noted. A detailed example is provided, and several examples from the medical literature are briefly reviewed. Cautions regarding the possible misuse or misinterpretation of the technique are also mentioned. Possible future directions for the use of SEM in medical research are suggested. Two appendices provide more technical details.

1 Introduction

Linear structural equation models (SEM) with latent variables, and their special case of confirmatory factor analysis (CFA), are based upon multivariate techniques developed in the fields of biometrics, econometrics, psychometrics, and sociometrics. In essence, these methods require the specification of a series of simultaneous hypotheses about the impact of certain variables, including ‘causal’ and control variables, on other variables. The consistency of such simultaneous hypotheses with a set of data drawn from a specified population can be evaluated by goodness-of-fit statistics. As such, these techniques allow one to evaluate causal hypotheses on correlational data. Because of the extensive use of nonexperimental data in the social sciences, SEM and CFA are often applied in the academic disciplines associated with the above statistical subspecialties. Surprisingly, in biometrics where the idea of SEM was first introduced (e.g. Wright), the method has been relatively ignored. For a historical overview of the field, see Bentler.2 For a current technical summary, see Bollen3 or Bentler4 (Ch. 10). Although methods for multivariate analysis have been around since the turn of the century, such methods have usually been used to explore data rather than test causal theories. The last decade, however, has seen the proliferation of SEM methods for the analysis of correlational data that allow substantially stronger causal conclusions than methods that have historically been considered appropriate for nonexperimental research. These methods, which substitute statistical control for experimental control, are especially useful if experimental data are not available and perhaps impossible to obtain. This may be the case if, for instance, it would be unethical to withhold treatment (e.g. prenatal care) or give a treatment (e.g. administering illicit drugs to adolescents), or if an experiment is logistically impossible, too expensive, or too time-consuming.
As we shall note below, SEM and CFA have been employed in medical research although not yet to the extent that they have been used in the social sciences. It seems to us that there are many areas within medical research in which these methods could be employed successfully and to good effect. This is especially the case when there are

Address for correspondence: Professor PM Bentler, Department of Psychology, Franz Hall, University of California, Los Angeles, 405 Hilgard Avenue, Los Angeles, CA 90024-1563, USA.

Downloaded from smm.sagepub.com at University of Manitoba Libraries on April 24, 2015


a wealth of variables available with correlations among themselves, and a technique is required to synthesize or summarize these variables, delineate their relationships, refine the measures employed, eliminate error in the individual measured variables, or model an underlying process or concept not readily amenable to direct measurement. In fact, it could be said that wherever a set of variables does not precisely measure the concepts or constructs of real interest in an investigation, i.e. when more than one measured variable could be used to represent a concept, then SEM with ‘latent’ variables is an appropriate methodology. Since errors in measurement apparently exist in typical medical research, the potential for SEM seems to be strong. Medical researchers from various disciplines have outlined the usefulness of SEMs with certain types of data. For instance, in a review of the methodology, Francis described the potential of SEM for advancing neuropsychological theory and practice.5 Since neuropsychologists operate with implicit models, he suggested that SEMs may bridge the gap between theory and research practice. Carr et al., in exploring causal links between changes in neuropeptide levels, cited SEMs as possibly fulfilling the goal of finding noninvasive means of identifying causal relationships through building models which fit empirical data.6 Buncher, Succop and Dietrich outlined the advantages of SEMs in environmental risk assessment, especially when measurements are taken over several time points.7 They suggested SEM as a solution to the dilemma that typical regression models are not well adapted to sequential prediction when several stages are involved in the model.
Although these SEM methods are sometimes known colloquially as ‘causal modelling’ techniques, it must be stressed at this point that imputation of ‘causality’ is in the eye and interpretation of the beholder and cannot be proved by the use of covariance structure modelling alone (see Baumrind8; Cliff9). The methodology can evaluate whether a causal hypothesis is consistent with empirical data. If not, the causal hypothesis can be rejected statistically; if yes, the causal hypothesis cannot be rejected though it cannot be ‘proven’ by the methodology. In practice, models may be inadequate, and the methodology may provide clues as to how to improve the theory to make it more consistent with the data. Before we turn to more details on the theory, methods, and applications, we want to be careful to report that the SEM methodology has not been immune from criticism and controversy. Freedman and others have engaged in a spirited debate over statistical issues and occasions of misuse associated with the technique.10 Freedman found fault with the careless use of regression-type models, including SEMs, in the social sciences, and pointed to the need to take seriously the basic assumptions that underlie such statistical models. Bentler’s rejoinder defended the usage of SEMs in the social sciences.11 He noted that, while certain technical improvements were clearly needed (e.g. a small sample statistical theory), neither Freedman nor any other critic had found any technical (mathematical or statistical) problems in the methodology. Inadequate theories, data, and modelling practice seem to be a more serious problem than inadequate statistics. In fact, Breckler recently criticized those who use SEM without sufficient understanding of the limitations of the method.12 He stated that the phrase ‘causal modelling’ is a misnomer and should not be casually used.
It is indeed the case that responsible researchers should always acknowledge that the plausibility of certain hypotheses can be tested and falsified, but cannot be definitively proved as true by SEMs. Nonetheless, if a causal theory is consistent with a set of data, the theory is confirmed to the extent that the proposed model is not rejected by the data. As Francis (p. 625) indicates, ’SEM facilitates drawing causal inferences by proposing a network of causal relationships among a


[set of] variables, and determining the extent to which variables would be related if, in fact, such a network were operating.’5 The organization of this review is as follows: We first give a brief overview of the statistical basis of SEM, discuss the use of CFA in estimation of reliability and validity of clinical measures, and provide a detailed example of the use of SEM in medical research. We then briefly review further examples of the recent use of SEM in medical research in a wide variety of contexts, and finally we suggest some possible future directions for the use of SEM. A more detailed description of the statistical theory underlying SEM may be found in Appendix A and Appendix B (adapted from Bentler13).



2 Structural equation modelling

Structural equations are akin to a system of regression equations used to specify, estimate, and test a process or theory under study. The equations relate given dependent variables to a set of predictor variables that, in turn, can be dependent variables in other equations. These equations and the associated variance and covariance specifications among nondependent variables are the SEM model. The implications of these specifications for the means, variances, and covariances of the measured variables are obtained by matrix algebra. In the context of covariance structure analysis, $\Sigma$ is the population covariance matrix and $\theta$ is a vector of basic parameters such as regression coefficients, and variances and covariances of the nondependent (‘independent’) variables. In the context of more general structural models, the means of the measured variables are also a function of basic parameters that can include the means or intercepts of latent, unmeasured variables. Given the parametric values for the unknown parameter vector ($\theta$), the population mean vector $\mu$ and covariance matrix $\Sigma$ are determined by these parameters, i.e. $\mu = \mu(\theta)$ and $\Sigma = \Sigma(\theta)$. This is the ‘structural’ hypothesis to be tested. In practice, the sample mean vector $\bar{x}$ and covariance matrix $S$ are used to estimate the unknown parameter vector $\theta$. If $\hat{\mu} = \mu(\hat{\theta})$ and $\hat{\Sigma} = \Sigma(\hat{\theta})$ are close to $\bar{x}$ and $S$, the SEM model is a plausible representation of the data. If the model-implied and sample matrices are far apart, in some appropriate metric, the model is rejected. In many models, the means are not structured as $\mu(\theta)$, so that $\hat{\mu} = \bar{x}$, and only the structure of the covariance matrix $\Sigma$ is of interest. Such models are called ‘covariance structure models’. A special type of covariance structure model that involves latent factors, as in factor analysis, is called ‘confirmatory factor analysis’.
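As a minimal sketch of the structural hypothesis that the covariance matrix is a function of the parameter vector, the following Python fragment builds the model-implied covariance matrix for a hypothetical one-factor model with three indicators and compares it with a sample matrix. The loadings, variances, sample matrix, and function names are illustrative assumptions, not values or routines taken from this review or from any SEM program.

```python
# Illustrative only: a one-factor model x_i = lambda_i * F + e_i.
# All numbers below are hypothetical, chosen to show the mechanics.

def implied_covariance(loadings, factor_var, error_vars):
    """Return Sigma(theta) = lambda * phi * lambda' + Psi as a nested list."""
    p = len(loadings)
    sigma = [[loadings[i] * factor_var * loadings[j] for j in range(p)]
             for i in range(p)]
    for i in range(p):
        sigma[i][i] += error_vars[i]  # unique (error) variance on the diagonal
    return sigma

def residuals(sample, implied):
    """Element-wise difference S - Sigma(theta)."""
    p = len(sample)
    return [[sample[i][j] - implied[i][j] for j in range(p)] for i in range(p)]

# theta: three loadings, one factor variance, three error variances
loadings = [0.8, 0.7, 0.6]
factor_var = 1.0
error_vars = [0.36, 0.51, 0.64]

sigma_theta = implied_covariance(loadings, factor_var, error_vars)

# Hypothetical sample covariance matrix S
S = [[1.00, 0.55, 0.50],
     [0.55, 1.00, 0.40],
     [0.50, 0.40, 1.00]]

res = residuals(S, sigma_theta)
largest = max(abs(r) for row in res for r in row)
print("largest absolute residual:", round(largest, 3))
```

Small residuals throughout would suggest that the hypothesized structure is a plausible representation of the sample matrix; an actual analysis would estimate the parameters by minimizing a discrepancy function rather than fixing them by hand as done here.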
The statistical theory underlying SEM is asymptotic, so that, in principle, relatively large sample sizes should be used to estimate and test models.14,15 Typically, sophisticated statistical computer programs such as LISREL16 or EQS4 are used since SEM techniques are computationally demanding. EQS has been described as particularly user-friendly due to the elimination of matrix algebra in model setups.5,17 EQS implements a general mathematical and statistical approach to the analysis of linear structural equation systems. The mathematical model (see Appendix A) subsumes a variety of covariance structure models, including multiple regression, path analysis, simultaneous equations, first- and higher-order confirmatory factor analysis, as well as regression and structural relations among latent variables.18 In the Bentler-Weeks model, the parameters of any linear structural model are the regression coefficients and the variances and covariances of the independent variables. The statistical theory allows for the estimation of parameters and testing of models using traditional multivariate normal theory but also the more


general elliptical and arbitrary distribution theories, based on a unified generalized least squares (GLS) or minimum chi-square approach. As noted above, the variances and covariances of a set of empirical data are compared to a hypothetical model. The extent to which the hypothetical model $\Sigma(\theta)$ represents the actual data relationships in $S$ determines the goodness-of-fit of the model. Goodness-of-fit is generally reported as a chi-square statistic with degrees of freedom (df) corresponding to the difference between the number of data elements in $S$ (or, $S$ and $\bar{x}$) and the number of free parameters in $\theta$ that are estimated. If the $\chi^2$ is small compared to df, the model reproduces the data well, i.e. $\Sigma(\hat{\theta})$ is close to $S$. If the $\chi^2$ is large compared to df, assuming that statistical modelling assumptions (such as independence of observations) are met, the model is rejected, that is, the null hypothesis $\Sigma = \Sigma(\theta)$ is rejected. Typically, the $\chi^2$ test statistic is the value of a function at its estimated minimum times a sample size multiplier. As a result, in increasingly large samples, the null hypothesis $\Sigma = \Sigma(\theta)$ is easily rejected due to higher power. In practice, since the chi-square is sensitive to sample size, models with a large number of subjects often do not fit the data well even with trivial differences between $S$ and $\hat{\Sigma}$.19,20 Thus, additional fit indices are usually provided for model evaluation. In EQS, a normed fit index,21 a non-normed fit index, and a comparative fit index are provided.22 These fit indices have a range from 0 to 1 and are based upon the improvement in fit of the hypothesized model over a model of complete independence or uncorrelatedness among the measured variables. The fit indices can provide a more useful indication of how well a model fits the data, since they reflect the ability to ‘explain’ covariances. Values over 0.9 are desirable, indicating that 90% or more of the covariation in the data can be reproduced by the hypothesized model.
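The degrees-of-freedom count and the three fit indices described above can be sketched as follows. This is a minimal illustration using the commonly cited formulas for the normed, non-normed, and comparative fit indices; the chi-square values, the model setup, and the function names are hypothetical assumptions, and the computations in a particular program may differ in detail.

```python
# Illustrative only: the chi-square values and model setup are hypothetical.

def degrees_of_freedom(n_vars, n_free_params, with_means=False):
    """Data elements in S (plus means, if modelled) minus free parameters."""
    elements = n_vars * (n_vars + 1) // 2
    if with_means:
        elements += n_vars
    return elements - n_free_params

def normed_fit_index(chi2_model, chi2_null):
    """Proportional reduction in chi-square relative to the independence model."""
    return (chi2_null - chi2_model) / chi2_null

def non_normed_fit_index(chi2_model, df_model, chi2_null, df_null):
    """Like the normed index, but adjusted for degrees of freedom."""
    ratio_null = chi2_null / df_null
    return (ratio_null - chi2_model / df_model) / (ratio_null - 1.0)

def comparative_fit_index(chi2_model, df_model, chi2_null, df_null):
    """Based on chi-square minus df for the model and the independence model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_null = max(chi2_null - df_null, d_model)
    return 1.0 - d_model / d_null if d_null > 0 else 1.0

# Hypothetical two-factor CFA of six measured variables:
# 6 loadings + 6 error variances + 1 factor correlation = 13 free parameters
df_model = degrees_of_freedom(6, 13)   # 21 data elements - 13 parameters
df_null = degrees_of_freedom(6, 6)     # independence model: 6 variances only
chi2_model, chi2_null = 12.0, 300.0    # made-up values

nfi = normed_fit_index(chi2_model, chi2_null)
nnfi = non_normed_fit_index(chi2_model, df_model, chi2_null, df_null)
cfi = comparative_fit_index(chi2_model, df_model, chi2_null, df_null)
print(df_model, df_null, round(nfi, 3), round(nnfi, 3), round(cfi, 3))
```

With these made-up inputs all three indices exceed 0.9. Note that the non-normed index, as computed here, is not strictly confined to the 0 to 1 range, which is one reason it is usually reported alongside the other two.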
Tests of hypotheses about particular parameters, say $\theta_i$, are also important in structural modelling. For example, a test of whether $\theta_i$ is zero in the population can be done by univariate large-sample normal z tests of the null hypothesis. The parameter estimates $\hat{\theta}_i$ for the structural coefficients are divided by their estimated standard errors to yield this statistic. These tests and related multivariate tests on sets of parameters are routinely available in SEM packages.

3 Latent variables

A latent variable is an underlying unobserved construct that may be indicated by two or more measured variables. The basic idea is that the observed variables are correlated only to the extent that they share this underlying construct, i.e. partialling out the latent variable would reduce the correlation to zero. Latent variables cannot be measured directly, but rather, measured variables serve as proxies or indicators of the construct of real theoretical interest. Equations in latent variable models express the measured variables in terms of the latent variables. Latent variables cannot be expressed as a linear combination of measured variables since the dimensionality of the space of latent variables exceeds the dimensionality of the space of measured variables.23 The existence of latent variables is usually substantiated through a confirmatory factor analysis (CFA). A CFA indicates how well observed or manifest variables serve as proposed indicators of one or more latent constructs. Although somewhat similar to the more familiar exploratory factor analysis, in a CFA the factors or constructs are hypothesized in advance of an analysis. In an exploratory factor analysis, one attempts to uncover how many factors are needed to explain the correlations among variables.
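The claim that partialling out the latent variable reduces the correlation between its indicators to zero can be checked with a small numerical sketch. It assumes a standardized one-factor model with uncorrelated errors, in which the correlation between two indicators equals the product of their loadings; the loadings and names below are hypothetical.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y after partialling out z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical standardized loadings of two indicators on one latent factor F
lam1, lam2 = 0.8, 0.6

# Under the one-factor model with uncorrelated errors:
r_x1_F = lam1          # correlation of indicator 1 with the factor
r_x2_F = lam2          # correlation of indicator 2 with the factor
r_x1_x2 = lam1 * lam2  # correlation implied between the two indicators

r_partial = partial_correlation(r_x1_x2, r_x1_F, r_x2_F)
print(r_partial)
```

The numerator vanishes because the observed correlation is entirely accounted for by the shared factor, which is the sense in which the indicators are correlated only through the latent construct.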


In contrast, the SEM approach to factor analysis involves explicit hypotheses on 1) the number of latent variables or factors, and, 2) which measured variables are good indicators of these latent variables or factors. In addition, of equal importance, many factor loadings between measured and latent variables are hypothesized in advance of the analysis to be zero. Chi-square tests and fit indices, briefly described earlier, indicate the plausibility of the general hypotheses. In CFA, the latent variables may be uncorrelated or correlated among themselves but are never regressed on other variables (i.e. predicted by other latent or manifest variables). In more general structural models, there may be more complex relationships among the latent variables. They may not only be hypothesized to correlate among themselves, but also may be hypothesized to affect each other in various ways. That is, factors may be hypothesized to cause or be the consequence of other factors. Again, the plausibility of these hypothesized relationships is tested through goodness-of-fit measures, and individual regression paths can be tested for significance as noted above.
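The significance test for an individual path or loading mentioned above, an estimate divided by its estimated standard error and referred to the normal distribution, can be sketched as follows. The estimate and standard error are hypothetical, and `z_test` is an illustrative name rather than a routine from any SEM package.

```python
import math

def z_test(estimate, std_error):
    """Large-sample z statistic and two-sided p-value for H0: parameter = 0."""
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # twice the upper normal tail
    return z, p_value

# Hypothetical path coefficient and its estimated standard error
z, p = z_test(0.30, 0.10)
print(round(z, 2), round(p, 4))
```

Because the underlying theory is asymptotic, such p-values are approximations whose quality depends on sample size.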

4 Model comparison

One of the most useful features of SEM is its capability to compare the fit of competing models. It is prudent to look at alternative models since other models may fit even better or provide more parsimonious solutions. In addition, the confidence that one has in a particular model may depend on whether other explanations have been tested and found wanting.24 One can fit alternative models and then compare chi-square values. Since the difference between two chi-square values for nested models is also distributed as chi-square, hierarchical or nested models can be compared by subtracting the chi-square values obtained and, correspondingly, their degrees of freedom. Either one model will fit significantly better than the other, or there will be no statistically significant difference between the models. As in most disciplines, parsimonious explanations are preferred.7,25 Another popular option is comparing multiple groups. Data are frequently gathered from individuals who belong to certain groups such as males and females, ethnic communities, and so on. The groups in question may represent multiple populations rather than a single population. Their means and covariance matrices may be described by parameters that are at one extreme completely different and unrelated, and at the other extreme by parameters that are identical for all groups (in which case, of course, the means and covariances are equal across groups). Hypotheses on various degrees of cross-sample equality can be tested by analysing parameters from each sample simultaneously to see which of several models reproduces the sample data of each group to a reasonable level of accuracy. As usual, a $\chi^2$ test can be used to describe the adequacy of a model. In practice, one may start with a relatively unrestricted model, and several increasingly restrictive hypotheses may be evaluated by constraining certain key parameters to equality across the groups.
The recommended orderly progression of increasing stringency is discussed by Alwin and Jackson26 and Bentler.4 An interesting feature of multiple-group models for means and covariances is that hypotheses on equalities of intercepts and means of latent variables can be tested, even though scores for the subjects on the latent variables are not actually computed.
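A chi-square difference test for two nested models, as described in this section, might be sketched as follows. The chi-square values are hypothetical, the function names are assumptions, and the upper-tail probability is computed with a small helper that uses closed-form series valid for integer degrees of freedom, so that no statistical library is needed.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (integer df >= 1)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= (x / 2.0) / k
            total += term
        return math.exp(-x / 2.0) * total
    # odd df: normal-tail term plus a finite series
    total = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * k + 1)
    return total

def chi2_difference_test(chi2_restricted, df_restricted, chi2_general, df_general):
    """Compare nested models; the restricted model has more degrees of freedom."""
    delta_chi2 = chi2_restricted - chi2_general
    delta_df = df_restricted - df_general
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)

# Hypothetical: two parameters constrained to equality across groups
d_chi2, d_df, p = chi2_difference_test(30.0, 20, 22.0, 18)
print(d_chi2, d_df, round(p, 4))
```

A small p-value here would indicate that the added equality constraints significantly worsen the fit, so the less restricted model would be retained; otherwise the more parsimonious model is preferred.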


5 Reliability and validity

Reliability and validity are areas of constant concern in scientific inquiry. Often medical researchers have a multiplicity of individual measures with varying amounts of error contained within the measures. Since latent variables are error-free measures of underlying processes or constructs, they can both organize and represent various measures if there is shared variance among them. These latent variables can subsequently be predictors, outcomes, or covariates of other hypothesized constructs. Both CFA and the more general latent variable (LV) path models enable a confirmatory assessment of construct validity, as well as convergent and discriminant validity. In latent variable models, the unwanted error components of the observed measures are excluded from the definition of the latent constructs and are modelled separately. Of course, manifest variable path models without latent variables can be employed in which measured variables are used as predictors and outcomes in a hypothesized system. However, these measured variables retain their errors of measurement (uniqueness), and hence estimates of coefficients or other parameters can be biased. One of the major reasons to use latent variable modelling is to partition error variance and develop ‘perfect’ or error-free measures of underlying processes, so that the relative magnitude of an influence is an indicator of the true influence and is not biased by differential precision of measurement. Although individual observed measurements may be faulty, the measurement model as hypothesized and tested in a CFA is inferred from the pattern of correlations between the manifest measures. The use of multiple indicators for each construct allows one to infer from their covariation the degree to which each reflects the underlying concept under study. The residual variance (uniqueness) not accounted for by the latent factor is assumed to be composed of specific and random error components. The parameters of the measurement model describe the measurement properties of the observed variables, from which an estimate of the internal consistency of reliabilities can be obtained.16 In LISREL, the reliabilities of the observed variables are reported as squared multiple correlations of the variables on the underlying factors. In EQS, the reliabilities are obtained by subtracting the squared error coefficient of the individual observed measures from 1 in the standardized solution. These estimates may or may not be appropriate indices of reliability, depending on the design of the study. Since these indices under CFA will include test specificity (reliable but unshared variance) as error variance, these indices should be considered indices of reliability only when the design permits placing all specific variance in the common factor space. This can be done, for example, with panel data (see below). CFA has been helpful in developing reliable and valid clinical instruments such as psychometric tests designed to assess intelligence and scholastic aptitude. Its usefulness has also been noted in clinical assessment research and test validation research when developing instruments to measure constructs or traits such as fear or anxiety.27,28 Morris et al. emphasize the utility of the hypothesis testing and theory validation aspects of CFA since they point out that the clinical assessment field has many instruments with interesting titles that reflect little or no construct validity (i.e. do they measure what they purport to measure?).28 They provide an example relating school fears and physical complaints, and demonstrate that practitioners can have more confidence in results obtained with CFAs than with traditional factor analytic procedures. In addition, they cite the advantage of providing information regarding structural relationships between


various constructs that are assessed with data that were not collected through random assignment techniques. Jöreskog29 presents examples of CFA models that assess the levels of psychometric equivalence of sets of tests as described in classical test theory,30 i.e. parallel, tau-equivalent, variable-length, and congeneric. Using nested models, one may test a series of increasingly stringent hypotheses about the psychometric properties of tests by constraining various parameters to equality. For instance, the parallel tests model requires equal factor loadings and equal error variances while the tau-equivalent model requires equal factor loadings only. Francis extrapolates this feature of CFA to the case in which one wishes to examine the equivalence of neuropsychological measures across multiple populations, or developmental stages.5 Millsap and Everson provide an extension of CFA measurement models for covariance structures to models that also impose a structure on the means of the measured variables.31 While CFA thus permits specifying and testing various assumptions underlying a reliability estimate, such estimates also may not make sense or be appropriate if the model on which the estimate is based is not consistent with the data.
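Two of the ideas in this section lend themselves to a short sketch: the reliability estimate obtained from a standardized solution as 1 minus the squared error coefficient, and the way equality constraints in the congeneric, tau-equivalent, and parallel models change the number of free parameters and hence the degrees of freedom. The error coefficients and names below are hypothetical, and the parameter counts assume a single factor whose variance is fixed at 1 for identification.

```python
# Illustrative only: hypothetical standardized error coefficients for
# three indicators of one factor (x_i = loading_i * F + error_coef_i * E_i).
error_coefficients = {"test_a": 0.60, "test_b": 0.70, "test_c": 0.80}

# Reliability as 1 minus the squared standardized error coefficient
reliabilities = {name: 1.0 - coef ** 2
                 for name, coef in error_coefficients.items()}

def one_factor_df(n_tests, model):
    """Degrees of freedom for one-factor test-theory models.

    Assumes the factor variance is fixed at 1 for identification.
    """
    data_elements = n_tests * (n_tests + 1) // 2
    free_params = {
        "congeneric": 2 * n_tests,      # free loadings and free error variances
        "tau-equivalent": 1 + n_tests,  # one common loading, free error variances
        "parallel": 2,                  # one common loading, one common error variance
    }[model]
    return data_elements - free_params

for name, value in reliabilities.items():
    print(name, round(value, 2))
for model in ("congeneric", "tau-equivalent", "parallel"):
    print(model, one_factor_df(4, model))
```

Because each model in the sequence is nested in the one before it, the increasingly stringent hypotheses can be compared by chi-square difference tests on the corresponding differences in degrees of freedom.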

6 Multiple-indicator longitudinal models

In addition to a theoretical advantage when using latent constructs rather than imperfectly measured single variables, there are further statistical advantages when using multiple indicators for theoretical constructs in order to obtain more information about both stability and reliability. In longitudinal single-indicator models there is no way to estimate the possible influence of correlated measurement error.32 Furthermore, as noted above, in cross-sectional data, with the usual CFA design it is not possible to estimate both specific and error variance. A good discussion of this problem is given by Millsap and Everson (p. 487).31 One way that components of the residual error may be separated is by the use of longitudinal (panel) CFA models. Panel data are highly similar or identical measurements obtained at two or more points in time from one sample of respondents. Wheaton et al. point out the necessity of incorporating issues of unreliability of measurement in any general model for the analysis of panel data and suggest strategies to improve the estimation of reliability and stability parameters.32 If test-specific error is treated as random error, the reliability of individual items will be underestimated. However, using SEM, panel models allow the decomposition of unique variance into specific and random measurement error.33 For instance, test-specific variance in the measurement of any indicator in original data can be correlated with the measurement of that same indicator at a later time, yielding the popular but misnamed ‘correlated errors’. CFA allows one to explicitly model these test-specific correlated errors to obtain estimates of method error.
Both random error and specific measurement variance can thus be estimated.32 Jöreskog provides an example of disaggregating test-specific and occasion-specific variance in a four-wave six-variable model.34 Panel data from a large growth study included six aptitude and achievement test scores in grades 5, 7, 9, and 11. He postulated a two-factor solution for each occasion with two correlated common factors, verbal ability and quantitative ability. In addition to the ability factors, test-specific factors across time - one such factor for each test - added considerable improvement in the fit of the model. In evaluating the variance components, he found that the ability factors accounted for the


largest proportion of the total variance, and that the test-specific variances were relatively small.
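To illustrate why treating test-specific variance as random error underestimates reliability, the following sketch decomposes the variance of a single measured variable into common, specific, and random error components. The variance components are hypothetical values chosen for arithmetic convenience and are not taken from the study described above.

```python
# Illustrative only: hypothetical variance components of one measured variable.
common_var = 0.50    # variance shared with the latent factor
specific_var = 0.20  # reliable variance unique to this test
random_var = 0.30    # random measurement error

total_var = common_var + specific_var + random_var

# Cross-sectional CFA: specific variance cannot be separated from error,
# so only the common part is counted as reliable.
reliability_cfa = common_var / total_var

# Panel model: specific variance is identified and counted as reliable.
reliability_panel = (common_var + specific_var) / total_var

print(round(reliability_cfa, 2), round(reliability_panel, 2))
```

The gap between the two figures is the share of reliable but unshared variance, which only a design with repeated measurements (or an equivalent source of information) allows one to recover.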

7 Multitrait-multimethod models

Longitudinal panel models, such as the example cited above, may be considered specializations of the multitrait-multimethod (MTMM) approach to discriminant, convergent, and construct validation.35 In MTMM studies a number of traits are measured with a number of different methods or measuring instruments. The MTMM approach is used to determine the extent of true relationship among traits in the presence of both method variance and random error. The CFA model has been described as the preferred method for analysing MTMM data.27,36 In cases where trait and method variance are controlled for in the research design, CFA can explicitly define trait and method factors.29 Also, CFA can structurally account for correlated error problems. Schmitt and Stults,36 and Widaman37 outline various paradigms and restrictions of CFA MTMM models which provide tests of convergent validity, discriminant validity, and method bias. Marsh suggests that method factors are sometimes better not modelled explicitly, but rather should be treated as correlated errors when method factors cannot be well defined.38 Figure 1 presents a schematic representation of a hypothetical MTMM model testing the convergent and discriminant validity of nine measures of three traits: Anxiety, Depression and Anger. Three method factors are explicitly modelled as well: Parent Report, Self-report, and Teacher Observation. Discriminant validity would be estimated by examination of the magnitude of the correlations between the trait factors. One would expect some degree of correlation among these psychological states (obliquity), but too great a relationship would argue against these tests’ ability to discriminate or distinguish among the traits. Convergent validity would be indicated by the loadings of the tests on their individual trait factors. The method factors would account for shared variance due to similar methodological characteristics of the individual tests.
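How trait and method factors jointly shape the correlations among measures in an MTMM model can be sketched numerically. The fragment below assumes standardized measures that each load on one trait and one method factor, with trait factors uncorrelated with method factors; all loadings and factor correlations are hypothetical, and only a subset of the traits and methods in the schematic model is used.

```python
# Illustrative only: hypothetical standardized loadings and factor correlations.
trait_corr = {("anxiety", "anxiety"): 1.0,
              ("depression", "depression"): 1.0,
              ("anxiety", "depression"): 0.4,
              ("depression", "anxiety"): 0.4}
method_corr = {("parent", "parent"): 1.0,
               ("self", "self"): 1.0,
               ("parent", "self"): 0.0,
               ("self", "parent"): 0.0}

# Each measure: its trait, its method, trait loading (t), method loading (m)
measures = {
    "anxiety_parent":    {"trait": "anxiety",    "method": "parent", "t": 0.7, "m": 0.4},
    "anxiety_self":      {"trait": "anxiety",    "method": "self",   "t": 0.7, "m": 0.4},
    "depression_parent": {"trait": "depression", "method": "parent", "t": 0.7, "m": 0.4},
}

def implied_correlation(a, b):
    """Correlation implied between two different measures.

    Assumes trait factors are uncorrelated with method factors.
    """
    x, y = measures[a], measures[b]
    trait_part = x["t"] * y["t"] * trait_corr[(x["trait"], y["trait"])]
    method_part = x["m"] * y["m"] * method_corr[(x["method"], y["method"])]
    return trait_part + method_part

same_trait = implied_correlation("anxiety_parent", "anxiety_self")
same_method = implied_correlation("anxiety_parent", "depression_parent")
neither = implied_correlation("anxiety_self", "depression_parent")
print(round(same_trait, 3), round(same_method, 3), round(neither, 3))
```

In this made-up case the correlation between different traits measured by the same method is inflated relative to the correlation between different traits measured by different methods; separating the method component is what allows the trait correlation to be read as evidence about discriminant validity.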
Cole reanalysed several examples of complete and incomplete MTMM designs in clinical research using CFA.27 He found that his results provided more rigorous support of the original authors’ positions and provided valuable supplemental findings as well. In his most complete clinical example, he reexamined data from a validation study of children’s measures of anger and depression.39 These two traits were measured with eight instruments which were composed of either self-reports, peer nominations, teacher nominations, or teacher ratings. A model which included two oblique trait factors, and three oblique method factors (self-report, nomination, and teacher-report) provided the best fit to the data. He reported evidence of discriminant validity since anger and depression did not correlate significantly. Convergent validity was partially confirmed, since most of the individual measures loaded significantly on their appropriate trait factors although one was considerably lower than the rest, and another one had an incorrect sign. In an extrapolation of the MTMM concept, Stein, Newcomb, and Bentler partitioned latent measures of frequency of drug use, quantity of drug use, disruptive drug use, and self-reported problem drug use from use of specific drugs (see Figure 2).40 The substance-specific factors were considered analogous to method factors, and the more substantive general drug use factors were conceptualized as the trait factors. Other illustrative examples of MTMM CFA studies include an assessment of the validity of self-reports of alcohol and other drug use by separating the influence of four data collection methods from three types of substance use,41 and separation of peer-specific and family-specific support from measures of loneliness, socially supportive relationships, and social resources.42

Figure 1 Schematic representation of a multitrait-multimethod CFA model. Measured variables are in rectangles; latent variables are in circles. There are three trait factors: anxiety, depression, and anger. There are three method factors: Parent Report, Self-report, and Teacher Observations. Residual errors in variables not shown.

Pitfalls

One major pitfall in using SEM for assessing reliability and validity would be disregarding the statistical and theoretical assumptions underlying the models. For instance, the maximum-likelihood techniques usually employed in SEM assume that the data have a reasonably multivariate-normal distribution. (See further discussion on this issue in Appendix B.) Furthermore, in the case of longitudinal panel models, sample attrition may affect the quality and psychometric properties of the data in terms of bias and representativeness. In addition, the sample size may be inadequate and the sample composition may be inappropriate for the purposes of developing and validating the instrument. Additionally, there may be too few indicators of the latent constructs, which may lead to under-identification of the models.


Cole encourages the judicious use of CFA in test validation research but points out that CFA estimates are only as good as the underlying data.27 Additionally, if the model itself is misspecified, the estimates may be quite inaccurate. Furthermore, the researcher should be wary of conclusions based upon post hoc model modifications. Cross-validation with another sample is the best method for ascertaining that one is not capitalizing on chance, especially when adding correlated error residuals.

Figure 2 Confirmatory factor model of MTMM assessment of drug usage. Residual errors in variables not shown. (From Stein, Newcomb and Bentler.40 Copyright 1988 by the American Psychological Association. Reprinted by permission.)


8

The modelling process

On a practical level, the following five steps are implemented when performing structural modelling43:

1 A ’path diagram’ is drawn that includes all the variables involved in the causal system. Variables that do not receive causal inputs from other variables are independent variables. Those variables that are predicted by other variables are the dependent variables. Conventionally, in path diagrams, measured variables are represented as rectangles, latent variables are represented as ovals or circles. Two-way arrows represent correlations or covariances between independent variables. One-way arrows, sometimes called path coefficients, represent the influences of one variable on another. The direction of the arrow indicates the direction of causation.

2 The diagram is translated into a series of multiple-regression equations. There are as many equations as variables to be explained; i.e. as many equations as dependent variables.

3 A computer program such as EQS or LISREL is employed to test the adequacy of the model.

4 The parameters of the model are estimated from a set of empirical data. Path coefficients, which are standardized partial regression coefficients,24 indicate the influence of latent or manifest variables on the dependent variables.

5 The adequacy of the model is evaluated through the use of statistical and nonstatistical means. Ideally, alternative models are compared for the same set of data. One way this may be accomplished is through the use of hierarchical, nested models and chi-square difference tests.
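The chi-square difference test mentioned in the last step can be sketched in a few lines of Python. This is a minimal illustration, not output from EQS or LISREL: the fit values in the example are hypothetical, the function names are introduced here, and the small table of .05 critical values covers only differences of one to five degrees of freedom.

```python
# Upper 5% critical values of the chi-square distribution for 1 to 5 df.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_difference(chi2_restricted, df_restricted, chi2_general, df_general):
    """Difference in chi-square and in df between two nested models.

    The restricted model fixes some parameters that the general model frees,
    so it has more degrees of freedom and a chi-square at least as large.
    """
    d_df = df_restricted - df_general
    if d_df <= 0:
        raise ValueError("the restricted model must have more degrees of freedom")
    return chi2_restricted - chi2_general, d_df


def restrictions_rejected(d_chi2, d_df):
    """True if the added restrictions significantly worsen fit at the .05 level."""
    return d_chi2 > CRITICAL_VALUES_05[d_df]


# Hypothetical example: a restricted model fixes three paths to zero.
d_chi2, d_df = chi_square_difference(75.3, 48, 60.1, 45)
print(round(d_chi2, 1), d_df, restrictions_rejected(d_chi2, d_df))
```

In this hypothetical comparison the difference exceeds the critical value for three degrees of freedom, so the three fixed paths would be judged necessary and the less restricted model retained.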


Illustrative application

We now present an example where modelling was used to represent both psychological and physical processes in cognitive functioning; thus, modelling simulated both structure and function. Hines, Chiu, McAdams, Bentler, and Lipcamon modelled the influence of the corpus callosum on verbal fluency, language lateralization, and visuospatial ability.44 These latter three constructs were conceptualized in turn as latent variables indicated by several measured variables (see Figure 3). For instance, verbal fluency was a psychological construct or latent variable indicated by scores on three manifest measures: a modified version of Thurstone’s Verbal Fluency test; a controlled associates test; and a sentence-formation task. Brain structure (the callosum) was also modelled through correlations between measures taken of the cross-sectional surface area of the posterior fifth (splenium), the posterior third minus fifth (isthmus), the anterior fourth (genu), and the midregion lying between the isthmus and the genu using magnetic resonance imaging. The relationships between the manifest variables in the rectangles and their associated latent variables comprise the measurement model portion of the hypothesized structure. A typical set of equations using the EQS program for the measurement section of the model would be as follows:

V5 (Thurstone fluency) = *F2 (Verbal Fluency) + E5 (Error residual)

V6 (controlled associates) = *F2 + E6

V7 (making sentences) = *F2 + E7


Note that the measured variables are explained by the factors and errors, not the reverse. The asterisks (*) indicate that the regression coefficients are to be estimated by the program. ’V’ is for manifest variables, ’F’ indicates factors, and ’Es’ are errors in variables. By convention, in text discussions, factor names are usually capitalized to distinguish them from measured variables.

The hypothesized relationships among the latent variables are indicated by connections among the large circles or ovals in Figure 3. In CFA, the circles would be connected only by two-way arrows, representing covariances or correlations not explained by the model. In more general models such as this one, the factors are connected by one-way arrows on the basis of hypothesized directions of causal influence. A typical regression equation for the causal modelling portion of the model would be as follows:

F2 (Verbal Fluency) = *V1 (splenium) + *F1 (Anterior Callosum) + *F3 (Language Lateralization) + D2 (Error associated with latent factor)
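To see how measurement equations of the kind written for Verbal Fluency translate into the variances and covariances that are actually fitted to data, the following Python sketch may help. It assumes a single factor with mutually independent errors. The loadings, factor variance and error variances are hypothetical values chosen for illustration, not estimates from Hines et al., and `implied_covariance` is a name introduced here.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Covariance matrix implied by V_i = loading_i * F + E_i.

    Off-diagonal: loading_i * loading_j * var(F).
    Diagonal:     loading_i**2 * var(F) + var(E_i).
    """
    p = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(p)]
             for i in range(p)]
    for i in range(p):
        sigma[i][i] += error_variances[i]
    return sigma


# Hypothetical standardized values for three indicators (V5, V6, V7) of one factor.
loadings = [0.8, 0.7, 0.6]
error_variances = [1 - b ** 2 for b in loadings]  # makes each variance equal to 1
sigma = implied_covariance(loadings, 1.0, error_variances)

for row in sigma:
    print([round(x, 2) for x in row])
```

Estimation then amounts to choosing the starred coefficients so that this implied matrix comes as close as possible to the sample covariance matrix.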

Figure 3 A path diagram illustrating the model of hypothesized relationships among subregions of the corpus callosum and cognitive traits. Rectangles represent measured variables and circles represent latent factors. Es represent error or residual variables associated with measured variables and Ds represent residual variables associated with latent factors. Numerical values associated with one-way arrows are beta weights. Numbers associated with two-way arrows are correlations. A variable having one or more one-way arrows aiming at it is a dependent variable in a regression, e.g. Verbal Fluency = 0.61 (splenium) + 0.03 (Anterior Callosum) + 0.11 (Language Lateralization) + D2. The model consists of many simultaneous regressions. Significant relationships between variables and factors within the model are indicated by asterisks. *p < .05; **p < .01. (From Hines et al.44 Copyright 1992 by the American Psychological Association. Reprinted by permission.)


The fit of this model to the data was analysed with the EQS program and found to be quite acceptable. The chi-square was 56.04, with 45 degrees of freedom, yielding a probability of 0.12, and the comparative fit index was 0.92. Although the model is acceptable, only certain parameters are statistically significant, as indicated in the figure by those regression coefficients that have asterisks. For instance, greater splenial area significantly predicted greater verbal fluency and lesser language lateralization. The authors pointed out that they found it advantageous to use SEM since sets of variables were selected a priori to measure specific constructs.
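The reported probability can be checked directly from the chi-square value and its degrees of freedom. The following Python sketch does so using only the standard library; `chi2_sf` is a helper written here for illustration (a series expansion of the regularized incomplete gamma function) and is not part of EQS.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with df degrees of freedom.

    Uses the series gamma(a, z) = z**a * exp(-z) * sum_n z**n / (a (a+1) ... (a+n)),
    with a = df / 2 and z = x / 2, divided by Gamma(a).
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    log_prefactor = a * math.log(z) - z - math.lgamma(a)
    return 1.0 - total * math.exp(log_prefactor)


# Values reported for the corpus callosum model.
p_value = chi2_sf(56.04, 45)
print(round(p_value, 3))
```

The computed tail probability should come out near the reported 0.12. Because it exceeds the conventional .05 level, the hypothesis that the model reproduces the population covariances is not rejected.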


9

Examples of use of SEM in medical research

The following examples are a brief annotation of a number of recent studies in the medical literature that employed SEMs. They exemplify the range of specialties within the medical community that have used this methodological technique. Some of the studies use SEM in an analogous manner to the way in which it is used in the social sciences (i.e. comparing groups, validating assessment instruments), but others have found unique and imaginative ways to extend the methodology, as Hines et al. did in the above example.

SEM already is used extensively in research areas that overlap between the social sciences and medicine. For instance, modelling is used frequently in health services research. A large proportion of the data that are available in health services research is of a nonexperimental nature.45 For instance, Newcomb and Bentler studied the impact of late adolescent substance use on young adult health status and utilization of health services using latent variables representing drug use and health problems.46 Although general drug use was not significantly related to health problems or health services utilization, general cigarette use in adolescence was predictive of a range of negative health outcomes as well as more health services usage. Early cannabis use predicted more health problems as well. Stein, Fox and Murata performed a CFA using a multiple-group comparison to explicate the factor structure of barriers to the use of mammography among three ethnic groups.47 Black, White, and Hispanic women evidenced a similar pattern of responses to indicators of fears of embarrassment, radiation, pain, and anxiety, and concern about cost. In subsequent pooled analyses, concern about cost was the strongest predictor of less mammography usage.







10

Confirmatory factor analyses

CFA has been used extensively in test development, validation, and refinement in the clinic setting. (See Cole for further discussion of this issue.27) For instance, Caetano48 contrasted the factor structure of the concept of alcohol dependence as proposed in two psychiatric classifications, the revision of the third edition of the Diagnostic and statistical manual of the American Psychiatric Association49 and the tenth Revision of the International classification of diseases.50 In another example, CFA was used to assess the discriminant validity of left- and right-hand sensorimotor measures from a comprehensive neuropsychological battery.51 Analysis and cross-validation samples were used. The authors found strong support for distinctions between measures of simple and complex skills but little evidence for the discriminant validity of left- and right-hand measures in their sample of primarily learning-disabled children.

CFA has also been used in the assessment of the reliability and validity of various measures of self-reported affective distress in patients with rheumatoid arthritis.52,53 Coulton et al. found factor invariance across three ethnic groups for the Arthritis Impact Measure Scales. Hagglund et al. assessed the convergent validity of several self-report measures commonly used with arthritis patients. They found adequate convergent validity among the scales but poor discriminant validity.

Confirmatory factor analysis was used to validate a multivariate conceptualization of adherence in childhood diabetes.54 This notion contrasts with the usual descriptor of a patient as either compliant or noncompliant. The authors found that adherence was not univariate, but rather consisted of four components (Exercise, Injection, Diet Type, and Eating/Testing Frequency) and two single-indicator constructs (total calories and sweets consumed). This model was originally developed through exploratory factor analysis and was confirmed in two separate samples through the use of CFA and multiple group comparisons.

Lobel and Dunkel-Schetter conceptualized stress as a multidimensional construct and verified this through a CFA.55 They found a two-factor, rather than a single-factor, model of stress to be more reasonable. Stress perception and emotion were part of one factor; environmental conditions as reflected in major life events represented a second and distinct component of stress. They suggested that improved models of stress may better enable researchers to identify the health effects of stress.

CFA was used to evaluate models of developmental integration in the laboratory rat, i.e. associations between characters created by interactions during ontogeny.56 Competing causal hypotheses of increasing complexity were modelled and then compared to observed correlations among osteometric measures. Models derived from conflicting hypotheses were not significantly different among themselves. The author points out that her results cast doubt upon the value of causal inference drawn from exploratory analysis, and demonstrate the advantages of a confirmatory approach in the analysis of developmental integration both statistically and conceptually.

11

Latent path models

Latent variable path models have also been compared between two or more groups. For instance, Ellison, Greisen, Foster, Petersen, and Friis-Hansen compared the relations between perinatal conditions and developmental outcomes operationalized as latent variables at age four years for two cohorts of children, one in the US and the other in Denmark.57 Through comparison of the path coefficients, several substantive differences in treatment were found between the two cohorts. Turk and Rudy tested the robustness and generalizability of empirically derived classification systems for chronic pain patients through group comparisons.58 They found that profiles based on patterns of inter-relationships among various assessment scales were remarkably similar.

In what was described as the first application of structural models to neuroscience, McIntosh and Gonzalez-Lima demonstrated how structural modelling can be used to determine the functional inter-relationships between brain structures that form the auditory system.59 Three groups of rats were studied in varying experimental conditions. In this case, the directional pathways (analogous to structural regression paths) were already known based on the anatomy. The interest here was the magnitude of influence for each path within each brain, and how this relationship changed between experimental conditions. Goodness-of-fit measures were used as relative indices of how much of the covariances could be accounted for by the major anatomical pathways of the auditory system. The authors described the results as revealing relationships that were not obvious through conventional data analysis.

In an audiological study, age-associated hearing loss was contrasted among men in their 30s, 50s, and 70s.60 Separate models were developed for each age group. A latent variable hearing factor was indicated by three objective measures of hearing capacity at different frequencies. The hearing factor was predicted by various physical, occupational, and psychosocial variables. The models revealed substantive differences among the age groups as well as some similarities.

A latent variable analysis was performed to assess possible specific verbal fluency deficits in a group of patients with multiple sclerosis contrasted with normal controls.61 Through assessing the statistical significance of a dummy variable of group membership, the authors found no clear specific fluency deficit as a function of multiple sclerosis status alone.

In a series of studies on human behavioural teratology, neurobehavioural effects of prenatal alcohol exposure were studied using a latent variable approach with a nonstandard estimation procedure. A general alcohol latent variable incorporating both binge and regular drinking patterns prior to and during pregnancy predicted a pattern of neurobehavioural deficits, poor integration and quality of responses, and an inflexible approach to problem solving among children.62 Similar results were reported for outcome measures related to performance and behaviour in school age children.63

Others have also recommended and used SEMs or path analyses to predict birth outcomes from various prenatal influences.64-66 Poland, Ager, Olson, and Sokol assessed the effects of selected social, behavioural, and biologic factors on birth weight.67 They used path analysis to model hypotheses about the inter-relationships among these variables. The key finding was the influence that quality of prenatal care had on birth weight, and the potential for improving birth outcomes by addressing negative effects of underlying social factors. In another study, the effect of low-level fetal lead exposure on neurobehavioural development in early infancy was assessed using SEMs.68 The authors found ample evidence for both direct and indirect effects of fetal lead exposure on early neurobehavioural development at both three and six months of age. Boys and infants from the poorest families appeared to be most sensitive to psychoteratogenic influences; in utero exposure to lead may also have exerted additional effects through decrements in fetal growth and maturation.

12

Genetics and SEM

SEM and special variants have been utilized extensively in genetics, especially in twin studies.69-74 For instance, Bodurtha et al. analysed various genetic and environmental contributions to the variance of anthropometric measurements in children during early adolescence.75 LISREL was used to obtain maximum likelihood estimates of model parameters and a chi-square goodness-of-fit statistic. Models with different numbers of parameters were compared by chi-square difference tests. An adequate, parsimonious explanatory model was found which included only additive genetic effects and environmental factors unique to the individual. Another recent study showed how behaviour-genetic modelling can be accomplished especially easily using the EQS program.76

Miner, Marks and Collins performed a genetic analysis of nicotine-induced seizures and hippocampal nicotinic receptors in two different strains of mice.77 They used genetic correlations, which are of interest because they give an indication of the extent to which two characters, such as seizure sensitivity and nicotinic receptor concentration, are controlled by the same genes. To estimate genetic correlations, they fit three data matrices to their expectations simultaneously using LISREL. Chi-square statistics indicated the adequacy of the models used to describe the data. They found different sensitivity to nicotine-induced seizures between the two different strains of mice.

13

Future directions

It is our view that there are many exciting and creative opportunities for structural modelling in the medical arena. If more biostatisticians were familiar with the technique, SEM could be applied routinely when appropriate. We present two brief examples.

First, there is presently a program sponsored by the Agency for Health Care Policy and Research (AHCPR) to evaluate and improve the quality of medical care. In a recent report to Congress, AHCPR described their Medical Treatment Effectiveness Program (MEDTEP) which, as part of its mission, supports outcomes research.78 Outcomes are assessed for treatment of medical conditions such as back pain, total knee replacements, acute myocardial infarction, cataracts, prostate cancer, ischaemic heart disease, and management of diabetes. This report further cites an article in the New England Journal of Medicine in which William Roper, then Administrator of the Health Care Financing Administration (HCFA), suggested that the quality of medical care could be improved by using population-based databases to understand which practices are most effective.79 They also called for research focusing on the measurement of medical outcomes with particular attention to quality-of-life measures. Quality-of-life measures have been particularly amenable to development using SEM methodology since they are usually constructs with multiple indicators. To illustrate, a beginning was made by Allen and Bentler, in which they studied the World Health Organization’s definition of health using data on 47 objective and subjective health measures from the RAND Health Insurance Experiment.80

As a second example, in an entirely different context, an exciting application of SEM has been suggested by McManus and Bryden to test Geschwind’s theory of cerebral lateralization.81,82 This theory hypothesizes that fetal testosterone levels affect many aspects of cerebral lateralization and its association with learning disorders, giftedness, and immune deficits; anatomical asymmetries are hypothesized to underlie functional asymmetries. If this theory is correct, it would apparently require a radical rethinking of many areas of biology and medicine. However, McManus and Bryden point out that the theory is so complex and general that there has not been any serious attempt to test and evaluate the entire theory. Their key point is that the strongest evidence in favour of the entirety of the model consists of correlations between widely separated items such as handedness and immune disorders. They have thus developed 30 postulates from Geschwind’s theory that could be tested with causal modelling. These postulates, which are logically predicted from the theory, are testable and falsifiable using SEM techniques.


Recently, a committee of the National Research Council investigated various methodological techniques that could be used to evaluate AIDS prevention programmes.83 The committee cited several advantages of structural modelling. For instance, the approach forces the analyst to be explicit in articulating theory, and facilitates an analytic focus upon how well the theory fits the data. In addition, the methodology is an intuitively appealing way to represent complex effects and influences. Furthermore, SEM allows explicit comparisons of competing models or theories, and successful models provide some basis for predicting outcomes that may occur as conditions change. Although they eventually decided that causal inferences derived from SEM did not provide the level of confidence of a well-executed randomized experiment, they further acknowledged that such models will surely have a role to play, especially as an adjunct to randomized experiments. (And, of course, randomized experiments are not always possible.) They concluded by stating that as experience accrues in situations where modelling is done in tandem with experiments, theory and data may be developed that will allow modelling approaches to substitute for some experiments in the future.

In conclusion, it thus appears to be inevitable that various applications of SEM will become more common in medical research in the future. A basic familiarity with the statistical theory and assumptions underlying the techniques is highly recommended since misuse occurs most often when users naively apply these powerful methods without sufficient background in the methodology. A good source of additional references can be found in the bibliographies of Bentler’s pragmatic overview and in Bollen’s more technical reviews.4,3

Acknowledgements

Support for this review was provided by National Institute on Drug Abuse Grants DA01070 and DA00017. The secretarial assistance of Anntoinella Wilkie is gratefully acknowledged.

Appendix A


Bentler-Weeks model

As noted in the text, the family of EQS programs (available on PCs, Macintoshes, Unix workstations, and various mainframes) utilizes a simple, nonmatrix methodology for specifying structural models. Internally, however, the programs utilize a more precise mathematical language. The EQS input for any model consists of equations, variances and covariances, and some control information. The program transforms such material into the matrix specifications of the Bentler-Weeks model.18 The EQS manual provides many detailed examples that describe this process in full.4 Here we can only outline the process. The basic equation of the Bentler-Weeks model is

$$\eta = \beta\eta + \gamma\xi$$

where $\eta$ is the vector of dependent variables, $\xi$ is the vector of independent variables, $\beta$ contains the regression coefficients of dependent on dependent variables, and $\gamma$ contains the regression coefficients of dependent on independent variables. Zeros are used in the matrices where no effects are expected. The Bentler-Weeks model also contains a matrix $\Phi$, the covariance matrix of the independent variables. Information from the variance section will go into the diagonal, and covariance information into the off-diagonal, of this matrix.

In covariance structure analysis, the estimation method is based on covariances. The Bentler-Weeks model generates the p by p model population covariance matrix $\Sigma$ from the above matrix equation. This is given as

$$\Sigma = G(I - B)^{-1}\Gamma\Phi\Gamma'[(I - B)^{-1}]'G'$$

where $B$ is a supermatrix containing rows $(\beta, 0)$ and $(0, 0)$, $\Gamma$ is a partitioned matrix containing $(\gamma', I)'$, and $G$ is a known selection matrix that picks out the measured variables from all the variables. Note that once estimates of $\beta$, $\gamma$, and $\Phi$ are available, an estimate of $\Sigma$ is also available. The statistical task is to obtain estimates that are close to the data, the sample covariance matrix S, as noted in Appendix B. In models with structured means, the intercepts and means of independent variables are also parameters, so sample means are analysed as well.
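The matrix expression for the model covariance matrix can be made concrete with a small Python sketch. Everything here is illustrative: the two-equation model, its coefficient values and the function names are hypothetical, and the inverse of (I - B) is computed as a finite series, a shortcut that is valid only for recursive models without feedback loops and is not claimed to be how EQS proceeds internally.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def inverse_i_minus_b(b):
    """(I - B)^{-1} = I + B + B^2 + ..., a finite sum when B is nilpotent."""
    n = len(b)
    total, power = identity(n), identity(n)
    for _ in range(n):
        power = matmul(power, b)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    if any(abs(x) > 1e-12 for row in power for x in row):
        raise ValueError("B is not nilpotent; a general matrix inverse is needed")
    return total


def bentler_weeks_sigma(beta, gamma, phi, measured):
    """Sigma = G (I-B)^{-1} Gamma Phi Gamma' [(I-B)^{-1}]' G'."""
    m, n = len(gamma), len(gamma[0])          # m dependent, n independent variables
    size = m + n
    B = [[beta[i][j] if i < m and j < m else 0.0 for j in range(size)]
         for i in range(size)]
    Gamma = [list(row) for row in gamma] + identity(n)
    A = matmul(inverse_i_minus_b(B), Gamma)
    full = matmul(matmul(A, phi), transpose(A))   # covariances of all variables
    return [[full[i][j] for j in measured] for i in measured]  # role of G


# Hypothetical model: V1 = 0.5*V3 + E1 ;  V2 = 0.4*V1 + E2
beta = [[0.0, 0.0], [0.4, 0.0]]                   # dependent on dependent
gamma = [[0.5, 1.0, 0.0], [0.0, 0.0, 1.0]]        # dependent on (V3, E1, E2)
phi = [[1.0, 0.0, 0.0], [0.0, 0.75, 0.0], [0.0, 0.0, 0.84]]
sigma = bentler_weeks_sigma(beta, gamma, phi, measured=[0, 1, 2])  # V1, V2, V3

for row in sigma:
    print([round(x, 3) for x in row])
```

In this sketch the `measured` index list plays the role of the selection matrix G, dropping the rows and columns that belong to the unmeasured error variables.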

Appendix B


Statistical estimation in structural models

Classically, the maximum likelihood (ML) method of estimation and testing based on an assumed multivariate normal distribution of variables has been used in structural modelling, but in recent years more general methods permitting elliptical and arbitrary distributions of variables have been developed as well. In covariance structure analysis, for the ML estimation procedure the fitting function to be minimized is

$$F = \ln|\Sigma| - \ln|S| + \mathrm{tr}(S\Sigma^{-1}) - p$$

where $\Sigma$, S, and p were defined previously. This function measures discrepancies between S and $\Sigma$ in two ways. The determinants are such that $\det(\Sigma)$ approaches $\det(S)$ as $\Sigma$ approaches S. At the same time, $S\Sigma^{-1}$ approaches the identity matrix, and the sum (trace) of diagonal elements, $\mathrm{tr}(S\Sigma^{-1})$, approaches p. Thus, $\hat{\theta}$ is chosen to make $F(\theta)$ based on $\Sigma = \Sigma(\theta)$ as small as possible. The goodness-of-fit chi-square test statistic is given by $nF(\hat{\theta})$, where $n = (N - 1)$ and N is the sample size. The degrees of freedom are $p^* - q$, where $p^* = p(p + 1)/2$ and q is the number of free parameters estimated. For generalized least squares (GLS) estimation, the fitting function is

$$Q = (s - \sigma(\theta))'W(s - \sigma(\theta))$$

where $s$ and $\sigma$ are vectors containing the elements of the lower triangular part of the sample covariance matrix, S, and the model covariance matrix, $\Sigma$, respectively. Q is a weighted sum of squares of discrepancies between elements $s_{ij}$ and $\sigma_{ij}$ of the sample and model covariance matrices. The elements $\sigma$ are functions of the parameters, that is, $\sigma = \sigma(\theta)$. W is a $p^* \times p^*$ matrix. The definition of W, the weight matrix, is based on the assumed distribution theory for the variables. Minimizing Q with $W = I$ leads to a type of least squares analysis which, unfortunately, does not readily yield a statistical chi-square variate. To obtain such a variate, $W^{-1}$ will generally represent the sampling covariance matrix between sample covariances (e.g. between $s_{ij}$ and $s_{kl}$). The general form of the elements of such an optimal $W^{-1}$ is

$$[W^{-1}]_{ij,kl} = \sigma_{ijkl} - \sigma_{ij}\sigma_{kl}$$

where $\sigma_{ijkl}$ is a fourth-order product moment about the means $\mu_i$ of the variables $z_i$, namely,

$$\sigma_{ijkl} = E(z_i - \mu_i)(z_j - \mu_j)(z_k - \mu_k)(z_l - \mu_l).$$
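The ML and GLS fitting functions can be illustrated numerically with a short Python sketch. It is restricted to two measured variables so that determinants and inverses can be written out by hand. The sample matrix, the sample size of 101 and the function names are hypothetical, and the GLS example uses the identity weight matrix, which as noted above does not by itself yield a chi-square variate.

```python
import math


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def f_ml(S, Sigma):
    """ML fitting function ln|Sigma| - ln|S| + tr(S Sigma^{-1}) - p for p = 2."""
    p = 2
    Si = inv2(Sigma)
    trace = sum(S[i][k] * Si[k][i] for i in range(p) for k in range(p))
    return math.log(det2(Sigma)) - math.log(det2(S)) + trace - p


def vech(m):
    """Lower-triangular elements of a symmetric matrix, row by row."""
    return [m[i][j] for i in range(len(m)) for j in range(i + 1)]


def q_gls(S, Sigma, W):
    """Quadratic form (s - sigma)' W (s - sigma)."""
    d = [a - b for a, b in zip(vech(S), vech(Sigma))]
    return sum(d[i] * W[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))


def degrees_of_freedom(p, q):
    return p * (p + 1) // 2 - q


# Hypothetical sample covariance matrix and an independence model for it.
S = [[1.0, 0.5], [0.5, 1.0]]
Sigma = [[1.0, 0.0], [0.0, 1.0]]       # two free variances, covariance fixed at 0
F = f_ml(S, Sigma)
chi_square = (101 - 1) * F              # n F with n = N - 1, N = 101
df = degrees_of_freedom(p=2, q=2)
W_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Q = q_gls(S, Sigma, W_identity)

print(round(F, 4), round(chi_square, 2), df, Q)
```

The function F is zero when the model matrix reproduces S exactly and grows as the two matrices diverge, which is the behaviour described in the text.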

Estimators based on this general formula are called arbitrary distribution GLS or AGLS estimators in EQS, highlighting the main features of the method, namely the permitted arbitrary distribution of the variables and the GLS estimation method. Browne calls this estimator asymptotically distribution-free (ADF),15,84 while Chamberlain calls it a minimum distance estimator.85 This estimator is hard to compute because $\sigma_{ijkl}$ is expensive to obtain. Large samples are also needed to estimate fourth-order moments with precision. Recent research by Hu, Bentler and Kano has determined that the AGLS estimator provides the most reliable test of model fit under a wide variety of non-normal data conditions, but only in extremely large sample sizes.86 In medium to small samples, a scaling correction of the ML statistic provided by Satorra and Bentler far outperformed the AGLS and all other estimators considered.87

The general AGLS function specializes to a simpler form when variables are elliptically or normally distributed. In that case,

$$\sigma_{ijkl} = (\kappa + 1)(\sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk})$$

so that fourth-order moments are no longer needed.15,88 As a consequence, the large weight matrix W can be reduced from dimension $p^* \times p^*$ to $p \times p$ in a specialized version of $Q(\theta)$, making these methods computationally much simpler. In such a case, only $\kappa$ and the covariances are needed. In elliptical distributions, $\kappa$ represents the common kurtosis of the variables, i.e., the extent to which variables have lighter (negative $\kappa$) or heavier tails (positive $\kappa$) than the normal distribution. Multivariate normal distributions have $\kappa = 0$, so that the formula simplifies further. More details regarding the weight matrix W, estimation methods for $\kappa$, and the specialized elliptical and normal theory GLS functions can be found in Bentler4 (Chapter 10). A new class of methods that are relevant when the variables have heterogeneous kurtoses, but in which computational simplicity is still available, was described by Kano, Berkane, and Bentler.89 Under the null hypothesis, the test statistic in GLS, $nQ(\hat{\theta})$, has a chi-square distribution in large samples, when the relevant distributional assumption is correct.

Because of the relative computational simplicity of the normal theory ML method, it would be desirable to use ML estimation whenever possible. ML estimates can be trusted (i.e. are consistent) even when the distribution of variables is not normal, but the chi-square statistic and standard error estimates may be off substantially. There are conditions, for example independence of error terms and independence between factors and errors, under which normal theory tests remain correct even when the distributional assumptions are violated.90-92 Unfortunately, practical methods to evaluate conditions for applicability of the theory are not yet available in popular computer programs, and Hu et al. showed that ML fit statistics could be wide of the mark when asymptotic robustness is assumed but is in fact not true.

The statistical theory summarized above holds for continuous variables. It is not theoretically appropriate for binary variables or ordered categorical variables having few response options (when such variables have many categories, the degree of violation of the assumption of continuity will be minor, and the problem may be ignored in practice). Special methods have been developed for the analysis of ordered category data with latent variable models. These involve using a probit-type regression to relate the categorical variables to latent continuous variables, and permitting the latent variables to relate to each other linearly via models of the sort summarized in this paper. See e.g. Muthen93 and Lee, Poon and Bentler94 for methods that are appropriate in this case.

References

1 Wright S. On the nature of size factors. Genetics 1918; 3: 367-74.
2 Bentler PM. Structural modeling and psychometrika: an historical perspective on growth and achievements. Psychometrika 1986; 51: 35-51.
3 Bollen KA. Structural equations with latent variables. New York: Wiley, 1989.
4 Bentler PM. EQS structural equations program manual. Los Angeles: BMDP Statistical Software, 1989.
5 Francis DJ. An introduction to structural equation models. Journal of Clinical and Experimental Neuropsychology 1988; 10: 623-39.
6 Carr DB, Jones KJ, Berland RM, Hamilton A, Kasting NW, Fisher JE, Martin JB. Causal links between plasma and CSF endorphin levels in stress: Vector-ARMA analysis. Peptides 1985; 6: 5-10.
7 Buncher CR, Succop PA, Dietrich KN. Structural equation modeling in environmental risk assessment. Environmental Health Perspectives 1991; 90: 209-13.
8 Baumrind D. Specious causal attributions in the social sciences: The formulated stepping-stone theory of heroin use as exemplar. Journal of Personality and Social Psychology 1983; 45: 1289-98.
9 Cliff N. Some cautions concerning the application of causal modeling methods. Multivariate Behavioral Research 1983; 18: 115-26.
10 Freedman DA. Statistics and the scientific method. In Mason W, Fienberg S eds, Cohort analysis in social research. New York: Springer, 1985.
11 Bentler PM. Structural modeling and the scientific method: Comments on Freedman's critique. Journal of Educational Statistics 1987; 12: 151-57.
12 Breckler SJ. Applications of covariance structure modeling in psychology: cause for concern? Psychological Bulletin 1990; 107: 260-73.
13 Bentler PM. Latent variable structural models for separating specific from general effects. In Sechrest L, Perrin E, Bunker J eds, Research methodology: strengthening causal interpretations of nonexperimental data (DDS Pub No. 90-3454). Rockville, MD: Department of Health and Human Services, 1990.
14 Bentler PM, Dijkstra T. Efficient estimation via linearization in structural models. In Krishnaiah PR ed, Multivariate analysis VI. Amsterdam: North-Holland, 1985.
15 Browne MW. Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology 1984; 37: 62-83.
16 Jöreskog KG, Sörbom D. Lisrel 7, a guide to the program and applications. Chicago: SPSS, 1988.
17 Morrow GR, Black PM, Dudgeon DJ. Advances in data assessment. Application to the etiology of nausea reported during chemotherapy, concerns about significance testing, and opportunities in clinical trials. Cancer 1991; 67: 780-87.
18 Bentler PM, Weeks DG. Linear structural equations with latent variables. Psychometrika 1980; 45: 289-308.
19 Anderson JC, Gerbing DW. Structural equation modeling in practice: a review and recommended two-step approach. Psychological Bulletin 1988; 103: 411-23.
20 Marsh HW, Balla JR, McDonald RP. Goodness-of-fit indexes in confirmatory factor analysis: the effect of sample size. Psychological Bulletin 1988; 103: 391-410.
21 Bentler PM, Bonett DG. Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin 1980; 88: 588-606.
22 Bentler PM. Comparative fit indexes in structural models. Psychological Bulletin 1990; 107: 238-46.
23 Bentler PM. Linear systems with multiple levels and types of latent variables. In Jöreskog KG, Wold H eds, Systems under indirect observation: causality, structure, prediction. Amsterdam: North-Holland, 1982: 101-30.
24 Loehlin JC. Latent variable models. An introduction to factor, path, and structural analysis. Hillsdale, NJ: Lawrence Erlbaum Associates, 1987.
25 Bentler PM, Mooijaart A. Choice of structural model via parsimony: A rationale based on precision. Psychological Bulletin 1989; 106: 315-17.
26 Alwin DF, Jackson DJ. Applications of simultaneous factor analysis to issues of factorial invariance. In Jackson D, Borgatta E eds, Factor analysis and measurement in sociological research: a multi-dimensional perspective. Beverly Hills: Sage, 1981.
27 Cole DA. Utility of confirmatory factor analysis in test validation research. Journal of Consulting and Clinical Psychology 1987; 55: 584-94.
28 Morris RJ, Bergan JR, Fulginiti JV. Structural equation modeling in clinical assessment research with children. Journal of Consulting and Clinical Psychology 1991; 59: 371-79.
29 Jöreskog KG. Analyzing psychological data by structural analysis of covariance matrices. In Magidson J ed, Advances in factor analysis and structural equation models. Lanham, MD: University Press of America, 1979: 45-100.
30 Lord FM, Novick ME. Statistical theories of mental test scores. Reading, MA: Addison-Wesley, 1968.
31 Millsap RE, Everson H. Confirmatory measurement model comparisons using latent means. Multivariate Behavioral Research 1991; 26: 479-97.
32 Wheaton B, Muthén B, Alwin DF, Summers GF. Assessing reliability and stability in panel models. In Heise DR ed, Sociological methodology. San Francisco: Jossey-Bass, 1977: 84-136.
33 Raffalovich LE, Bohrnstedt GW. Common, specific, and error variance components of factor models: Estimation with longitudinal data. Sociological Methods & Research 1987; 15: 385-405.
34 Jöreskog KG. Statistical models and methods for analysis of longitudinal data. In Magidson J ed, Advances in factor analysis and structural equation models. Lanham, MD: University Press of America, 1979: 129-69.
35 Campbell DT, Fiske DW. Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin 1959; 56: 81-105.
36 Schmitt N, Stults DM. Methodology review: Analysis of multitrait-multimethod matrices. Applied Psychological Measurement 1986; 10: 1-22.
37 Widaman K. Hierarchically nested covariance structure models for multitrait-multimethod data. Applied Psychological Measurement 1985; 9: 1-26.
38 Marsh HW. Confirmatory factor analyses of multitrait-multimethod data: Many problems and a few solutions. Applied Psychological Measurement 1989; 13: 335-61.
39 Saylor CF, Finch AJ, Baskin CH, Furey W, Kelly MM. Construct validity for measures of childhood depression: Application of multitrait-multimethod methodology. Journal of Consulting and Clinical Psychology 1984; 52: 977-85.
40 Stein JA, Newcomb MD, Bentler PM. Structure of drug use behaviors and consequences among young adults: Multitrait-multimethod assessment of frequency, quantity, work site, and problem substance use. Journal of Applied Psychology 1988; 73: 595-605.
41 Stacy AW, Widaman KF, Hays R, DiMatteo MR. Validity of self-reports of alcohol and other drug use: a multitrait-multimethod assessment. Journal of Personality and Social Psychology 1985; 49: 219-32.
42 Newcomb MD, Bentler PM. Loneliness and social support: A confirmatory hierarchical analysis. Personality and Social Psychology Bulletin 1986; 12: 520-35.
43 Bentler PM, Newcomb MD. Personality, sexual behavior and drug use revealed through latent variable methods. Clinical Psychology Review 1986; 6: 363-85.
44 Hines M, Chiu L, McAdams LA, Bentler PM, Lipcamon J. Cognition and the corpus callosum: Verbal fluency, visuospatial ability and language lateralization related to midsagittal surface areas of callosal subregions. Behavioral Neuroscience 1992; 106: 3-14.
45 Sechrest L, Hannah M. The critical importance of nonexperimental data. In Sechrest L, Perrin E, Bunker J eds, Research methodology: Strengthening causal interpretations of nonexperimental data (pp. 1-8) (DDS Pub. No. 90-3454). Rockville, MD: Department of Health and Human Services, 1990.
46 Newcomb MD, Bentler PM. The impact of late adolescent substance use on young adult health status and utilization of health services: A structural-equation model over four years. Social Science and Medicine 1987; 24: 71-82.
47 Stein JA, Fox SA, Murata PJ. The influence of ethnicity, socioeconomic status, and psychological barriers on use of mammography. Journal of Health and Social Behavior 1991; 32: 101-13.
48 Caetano R. The factor structure of the DSM-III-R and ICD-10 concepts of alcohol dependence. Alcohol & Alcoholism 1990; 25: 303-18.
49 American Psychiatric Association. Diagnostic and statistical manual, third edition. Washington DC: American Psychiatric Association, 1987.
50 World Health Organization. International classification of diseases, 10th revision. Geneva: WHO, 1987.
51 Francis DJ, Fletcher JM, Rourke BP. Discriminant validity of lateral sensorimotor tests in children. Journal of Clinical and Experimental Neuropsychology 1988; 10: 779-99.
52 Coulton CJ, Hyduk CM, Chow JC. An assessment of the arthritis impact measurement scales in 3 ethnic groups. Journal of Rheumatology 1989; 16: 1110-5.
53 Hagglund KJ, Roth DL, Haley WE, Alarcon GS. Discriminant and convergent validity of self-report measures of affective distress in patients with rheumatoid arthritis. Journal of Rheumatology 1989; 16: 1428-32.
54 Johnson SB, Tomer A, Cunningham WR, Henretta JC. Adherence in childhood diabetes: results of a confirmatory factor analysis. Health Psychology 1990; 9: 493-501.
55 Lobel M, Dunkel-Schetter C. Conceptualizing stress to study effects on health: Environmental, perceptual, and emotional components. Anxiety Research 1990; 3: 213-30.
56 Zelditch ML. Evaluating models of developmental integration in the laboratory rat using confirmatory factor analysis. Systematic Zoology 1987; 36: 368-80.
57 Ellison PH, Greisen G, Foster M, Petersen MB, Friis-Hansen B. The relation between perinatal conditions and developmental outcome in low birthweight infants. Comparison of two cohorts. Acta Paediatr Scand 1991; 80: 28-35.
58 Turk DC, Rudy TE. The robustness of an empirically derived taxonomy of chronic pain patients. Pain 1990; 43: 27-35.
59 McIntosh AR, Gonzalez-Lima F. Structural modeling of functional neural pathways mapped with 2-deoxyglucose: Effects of acoustic startle habituation on the auditory system. Brain Research 1991; 547: 295-302.
60 Era P, Jokela J, Qvarnberg Y, Heikkinen E. Pure-tone thresholds, speech understanding, and their correlates in samples of men of different ages. Audiology 1986; 25: 338-52.
61 van den Burg W, van Zomeren AH, Minderhoud JM, Prange AJA, Meijer NSA. Cognitive impairment in patients with multiple sclerosis and mild physical disability. Arch Neurol 1987; 44: 494-501.
62 Streissguth AP, Bookstein FL, Sampson PD, Barr HM. Neurobehavioral effects of prenatal alcohol: Part III. PLS analyses of neuropsychologic tests. Neurotoxicology and Teratology 1989; 11: 493-507.
63 Sampson PD, Streissguth AP, Barr HM, Bookstein FL. Neurobehavioral effects of prenatal alcohol: Part II. Partial least squares analysis. Neurotoxicology and Teratology 1989; 11: 477-91.
64 Lobel M. Prenatal contributors to adverse birth outcomes: Applying a biopsychosocial model. Unpublished doctoral dissertation. Los Angeles: University of California, Los Angeles, 1989.
65 Lobel M, Dunkel-Schetter C, Scrimshaw SCM. Prenatal maternal stress and prematurity: A prospective study of socioeconomically disadvantaged women. Health Psychology 1992; 11: 32-40.
66 Pearson DT, Dietrich KN. The behavioral toxicology and teratology of childhood: Models, methods, and implications for intervention. Neurotoxicology 1985; 6: 165-82.
67 Poland ML, Ager JW, Olson KL, Sokol RJ. Quality of prenatal care; selected social, behavioral, and biomedical factors; and birth weight. Obstetrics and Gynecology 1990; 75: 607-11.
68 Dietrich KN, Krafft KM, Bornschein RL, Hammond PB, Berger O, Succop PA, Bier M. Low-level fetal lead exposure effect on neurobehavioral development in early infancy. Pediatrics 1987; 80: 721-30.
69 Boomsma DI, Molenaar PCM. Constrained maximum likelihood analysis of familial resemblance of twins and their parents. Acta Genet Med Gemellol 1987; 36: 29-39.
70 Dolan CV, Molenaar PCM, Boomsma DI. Simultaneous genetic analysis of longitudinal means and covariance structure in the simplex model using twin data. Behavior Genetics 1991; 21: 49-65.
71 Duffy DL, Martin NG, Battistutta D, Hopper JL, Mathews JD. Genetics of asthma and hay fever in Australian twins. American Review of Respiratory Disease 1990; 142: 1351-58.
72 Fischbein S, Molenaar PCM, Boomsma DI. Simultaneous genetic analysis of longitudinal means and covariance structure using the simplex model: Application to repeatedly measured weight in a sample of 164 female twins. Acta Genet Med Gemellol 1990; 39: 165-72.
73 Schieken RM, Eaves LJ, Hewitt JK, Mosteller M, Bodurtha JN, Moskowitz WB, Nance WE. Univariate genetic analysis of blood pressure in children (The Medical College of Virginia twin study). The American Journal of Cardiology 1989; 64: 1333-37.
74 Stunkard AJ, Harris JR, Pedersen NL, McClearn GE. The body-mass index of twins who have been reared apart. N Engl J Med 1990; 322: 1483-87.
75 Bodurtha JN, Mosteller M, Hewitt JK, Nance WE, Eaves LJ, Moskowitz WB, Katz S, Schieken RM. Genetic analysis of anthropometric measures in 11-year-old twins: The Medical College of Virginia twin study. Pediatric Research 1990; 28: 1-4.
76 Loehlin JC. Using EQS for a simple analysis of the Colorado Adoption Project data on height and intelligence. Behavior Genetics, in press.
77 Miner LL, Marks MJ, Collins AC. Genetic analysis of nicotine-induced seizures and hippocampal nicotinic receptors in the mouse. Pharmacology and Experimental Therapeutics 1986; 239: 853-60.
78 USDHHS. Report to Congress: Progress of research on outcomes of health care services and procedures. AHCPR Pub. No. 91-0004, 1991.
79 Roper WL, Winkenwerder W, Hackbarth GM, Krakauer H. Effectiveness in health care: an initiative to evaluate and improve medical practice. N Engl J Med 1988; 319: 1197-202.
80 Allen HA, Bentler PM. Explaining the structure of health: a multivariate approach. Under review.
81 McManus IC, Bryden MP. Geschwind's theory of cerebral lateralization: Developing a formal, causal model. Psychological Bulletin 1991; 110: 237-53.
82 Geschwind N, Galaburda AS. Cerebral lateralization. Cambridge, MA: MIT Press, 1987.
83 Coyle SL, Boruch RF, Turner CF. Evaluating AIDS prevention programs. Washington, DC: National Academy Press, 1991.
84 Browne MW. Covariance structures. In Hawkins DM ed, Topics in applied multivariate analysis. London: Cambridge University Press, 1982: 72-141.
85 Chamberlain G. Multivariate regression models for panel data. Journal of Econometrics 1982; 18: 5-46.
86 Hu L, Bentler PM, Kano Y. Can test statistics in covariance structure analysis be trusted? Psychological Bulletin (in press).
87 Satorra A, Bentler PM. Scaling corrections for chi-square statistics in covariance structure analysis. Proceedings of the Business and Economic Statistics Section of the American Statistical Association. Alexandria, VA: American Statistical Association, 1988: 308-13.
88 Bentler PM. Some contributions to efficient statistics in structural models: Specification and estimation of moment structures. Psychometrika 1983; 48: 493-517.
89 Kano Y, Berkane M, Bentler PM. Covariance structure analysis with heterogeneous kurtosis parameters. Biometrika 1990; 77: 575-85.
90 Amemiya Y, Anderson TW. Asymptotic chi-square tests for a large class of factor analysis models. The Annals of Statistics 1990; 18: 1453-63.
91 Mooijaart A, Bentler PM. Robustness of normal theory statistics in structural equation models. Statistica Neerlandica 1991; 45: 157-91.
92 Satorra A, Bentler PM. Model conditions for asymptotic robustness in the analysis of linear relations. Computational Statistics & Data Analysis 1990; 10: 235-49.
93 Muthén B. A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika 1984; 49: 115-32.
94 Lee S-Y, Poon W-Y, Bentler PM. A three-stage estimation procedure for structural equation models with polytomous variables. Psychometrika 1990; 55: 45-51.
