This article was downloaded by: [Washington University in St Louis] On: 12 October 2014, At: 10:04 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Multivariate Behavioral Research Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmbr20

Confirmatory Factor Analysis and Profile Analysis via Multidimensional Scaling

Se-Kang Kim (a), Mark L. Davison (b) & Craig L. Frisby (c)

(a) Fordham University, Bronx
(b) University of Minnesota, Twin Cities
(c) University of Missouri, Columbia

Published online: 05 Dec 2007.

To cite this article: Se-Kang Kim , Mark L. Davison & Craig L. Frisby (2007) Confirmatory Factor Analysis and Profile Analysis via Multidimensional Scaling, Multivariate Behavioral Research, 42:1, 1-32, DOI: 10.1080/00273170701328973 To link to this article: http://dx.doi.org/10.1080/00273170701328973


Downloaded by [Washington University in St Louis] at 10:04 12 October 2014

MULTIVARIATE BEHAVIORAL RESEARCH, 42(1), 1–32 Copyright © 2007, Lawrence Erlbaum Associates, Inc.

Confirmatory Factor Analysis and Profile Analysis via Multidimensional Scaling

Se-Kang Kim
Fordham University, Bronx

Mark L. Davison
University of Minnesota, Twin Cities

Craig L. Frisby
University of Missouri, Columbia

This paper describes the Confirmatory Factor Analysis (CFA) parameterization of the Profile Analysis via Multidimensional Scaling (PAMS) model to demonstrate validation of profile pattern hypotheses derived from multidimensional scaling (MDS). PAMS is an exploratory method for identifying major profiles in a multi-subtest test battery. Major profile patterns are represented as dimensions extracted from an MDS analysis. PAMS represents an individual observed score as a linear combination of dimensions, where the dimensions are the most typical profile patterns present in a population. While the PAMS approach was initially developed for exploratory purposes, its results can later be confirmed in a different sample by CFA. Since CFA is often used to verify results from an exploratory factor analysis, the present paper makes the connection between a factor model and the PAMS model, and then illustrates CFA with both a simulated example (generated by the PAMS model) and a real example. The real example demonstrates confirmation of PAMS exploratory results by using a different sample. Fit indexes can be used to indicate whether the CFA reparameterization as a confirmatory approach works for the PAMS exploratory results.

Correspondence concerning this article should be addressed to Se-Kang Kim, Department of Psychology, Fordham University, Bronx, NY 10458. E-mail: [email protected]


KIM, DAVISON, FRISBY

Profile analysis refers to a family of methods designed to distinguish between groups of test takers based on their unique pattern, or profile, of subtest scores (Stanton & Reynolds, 2000). The profile approach is especially useful in understanding the relative strengths and weaknesses of test takers across their subtest scores, and this profile information is used clinically to make differential diagnoses and to design appropriate interventions based on an individual’s profile pattern. For example, an examinee’s performance on a cognitive test containing multiple subtests (e.g., Wechsler Preschool and Primary Scale of Intelligence–Third Edition: WPPSI-III, Wechsler, 2002) can be evaluated as a pattern of composite and subtest scaled scores. Profiles can be understood from both intraindividual and interindividual perspectives by comparing the examinee’s score patterns across subtests, or by comparing the examinee’s score patterns to the appropriate normative reference group. These ability comparisons can help a practitioner identify potentially meaningful patterns of strengths and weaknesses, which is important in describing functional impairment and in designing and preparing educational plans (Sattler, 2001).

For readers unfamiliar with Profile Analysis via Multidimensional Scaling (PAMS, Davison, 1996), this paper begins by describing and illustrating the use of Multidimensional Scaling (MDS) for exploratory study of profile patterns. It then describes the link between the PAMS model and the common factor model. These models provide the link between exploratory PAMS and Confirmatory Factor Analysis (CFA) that utilizes structural equation modeling. Then we describe and illustrate the application of CFA to confirmation of profile patterns found in an earlier sample through PAMS. PAMS and CFA constitute complementary methods for exploratory and confirmatory analysis of profile patterns.

AN OVERVIEW OF PROFILE ANALYSIS METHODS

Ipsatized scores (obtained by subtracting an examinee’s mean subtest score from each individual subtest score) have been used to extract profiles from multi-subtest intelligence scales. The individual’s ipsatized scores yield a profile of the examinee’s relative strengths and weaknesses across subtests. However, McDermott and his colleagues have identified several disadvantages of using ipsatized scores for profile analysis of cognitive ability tests. Two of those disadvantages are that ipsatized scores are inappropriate for parametric statistical procedures and are inefficient at predicting either clinical utility or academic success (McDermott, Fantuzzo, & Glutting, 1990; McDermott, Fantuzzo, Glutting, Watkins, & Baggaley, 1992; McDermott & Glutting, 1997). In addition, an examinee’s unique profile only includes information about the specific individual and does not provide any information about how similar or dissimilar the examinee’s profile pattern is when compared to the normative profile pattern of a representative group. Hence, researchers have directed their attention to the identification of “core” profiles, or normative profiles reflective of the major profile patterns in a data set (McDermott et al., 1990). In the past, cluster analysis (Aldenderfer & Blashfield, 1984; Konald, Glutting, McDermott, Kush, & Watkins, 1999) and Q-factor analysis (Cattell, 1967) have been used to identify the major profile patterns in a population.

Cluster Analysis

The essence of a cluster analysis is to classify objects into meaningful clusters or groups, where the objects within each cluster are more similar to one another than to the objects in other clusters, and the clusters are assumed to be mutually exclusive. Since our goal is to identify the most typical or major profile patterns among people, the objects within a cluster are people. If subtest scores are not on the same scale, the scores are standardized to have the same metric across subtests. Since clusters are assumed mutually exclusive, a person can fit in only one cluster. The mean subtest scores of the individuals in a cluster are used to describe the profile characteristics of the cluster. Proximities (e.g., squared Euclidean distances) of subtest scores among individuals are used as input data for clustering or separating the individuals. The proximities form a square persons × persons matrix, and when the sample size is large, the resultant matrix may be too unwieldy for analysis.

Q-factor Analysis

Q-factor analysis also uses a square matrix of person proximity data to define major profiles, but the data matrix is not the Euclidean distance matrix that is typically used in cluster analysis. The data matrix used in Q-factor analysis is a correlation (or covariance) matrix with one row and one column for each person.
While ordinary factor analysis utilizes a subtest correlation (or covariance) matrix to identify latent structures, the Q-factor approach analyzes correlations (or covariances) among people to define major profile patterns among people. In that sense, Q-factor analysis is an inverted ordinary factor analysis. Since the Q-factor approach adopts the ordinary factor analysis algorithm, which is different from cluster analysis, individuals’ mean subtest scores are irrelevant information in determining characteristics of major profiles; rather, measures of relationship (i.e., correlation) among individuals’ subtest scores are used to estimate the profile characteristics. In computing correlations among people (using their subtest scores), the size of the correlation matrix grows rapidly as the sample size increases. For instance, in our proposed study, the matrix would have 328 rows and columns because the sample size is 328.
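To make the size contrast concrete, the following minimal sketch builds a hypothetical 328 × 6 score matrix and compares the person-level proximity matrices used by Q-factor and cluster analysis with a subtest-level distance matrix. The random numbers only stand in for real subtest scores, and all variable names are illustrative assumptions rather than part of any published procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
P, T = 328, 6                      # persons and subtests, as in the example above
scores = rng.normal(size=(P, T))   # hypothetical data, NOT real subtest scores

# Q-factor analysis: correlations among persons (each row is one observed profile).
person_corr = np.corrcoef(scores)              # shape (P, P)

# Cluster analysis: squared Euclidean distances among persons.
diff_p = scores[:, None, :] - scores[None, :, :]
person_dist2 = (diff_p ** 2).sum(axis=-1)      # shape (P, P)

# Subtest-level alternative: squared Euclidean distances among subtests (columns).
diff_t = scores.T[:, None, :] - scores.T[None, :, :]
subtest_dist2 = (diff_t ** 2).sum(axis=-1)     # shape (T, T)

print(person_corr.shape, person_dist2.shape, subtest_dist2.shape)
```

The person-level matrices grow with the square of the sample size, whereas the subtest-level matrix stays at T × T regardless of how many respondents are included.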


In past research, the sample size was sometimes so large that not all people could be included in a single analysis. In that case, the sample was divided into two or more groups, each group was analyzed separately, and the results from the several groups were combined in some ad hoc fashion in order to make generalizations for the sample as a whole (Moses & Pritchard, 1995). However, Heiser and Meulman (1983a) caution against combining separate configurations from subgroups to characterize the whole sample. While future hardware and software advances will enable analysis of larger and larger matrices, it remains to be seen whether such advances effectively remove the sample size limits on a single Q-factor or cluster analysis of proximity measures defined over all possible pairs of persons. Moreover, since cluster and Q-factor analyses use inter-person proximity data, neither method is feasible when only inter-item proximities, such as inter-item correlations or distances, are available as input data.

Q-factor analysis and cluster analysis are, in a sense, bottom-up approaches. To identify major profiles, these methods start with arrays of individuals’ subtest scores, which are in fact the observed profiles of the individuals. In our example, since there are 6 subtests (or variables) with a sample size of 328, the data matrix contains 328 rows for persons and 6 columns for subtests. Each row in the matrix is an individual’s array of 6 observed subtest scores, which is the person’s observed profile. From the original data matrix, the 328 × 328 proximity data among subtest scores of people (e.g., inter-person correlations for Q-factor analysis or inter-person squared Euclidean distances for cluster analysis) are computed, and then the proximity data are analyzed to estimate major profiles.
Therefore, the analysis process proceeds from the bottom (analysis of the person proximity data to identify clusters or factors of persons) to the top (estimation of major profiles). However, the PAMS approach defines the major profiles (among people) first, using proximities among subtests rather than proximities among people, and then estimates the relationship between the patterns of major and observed profiles. This PAMS process has a top (identification of major profiles) down (estimation of the matching statistics of observed profiles) orientation. This top-down procedure does not require the potentially arduous work involved in estimating major profiles from large proximity matrices using the cluster or Q-factor methods, and at the same time it avoids Heiser and Meulman’s (1983a) objection to combining separate configurations estimated from subsets of a large data set in order to generalize the results.

A Simple MDS with PAMS

The first step in a PAMS procedure is to conduct a simple multidimensional scaling (MDS) analysis on proximity data. An MDS analysis can be illustrated with a simple example. Suppose a researcher identified 20 major American cities and obtained the distance in miles between every possible pair of cities. These distances serve as the proximity data. The end result of an MDS analysis on these data would be a two-dimensional scatter plot (i.e., an “east-west” dimension and a “north-south” dimension) on which the locations of the 20 cities can be plotted. Each city would be located in the two-dimensional space as it appears geographically (e.g., Miami would appear in the southeast quadrant, and Seattle would appear in the northwest quadrant). Thus, each city can be represented by two coordinate values: one coordinate that reflects its position on the “east-west” axis, and one coordinate that reflects its position on the “north-south” axis.

In the current study, proximity data are squared Euclidean distances between every possible pair of the 6 subtests of General Occupational Themes (GOT) in the Strong Interest Inventory (Hansen & Campbell, 1985). In general, PAMS uses a nonmetric scaling procedure (e.g., ALSCAL, or alternating least squares scaling, Takane, Young, & de Leeuw, 1977) to estimate scale-values (or dimension coordinates) as the first step. The original persons × subtests score matrix (e.g., 328 persons × 6 subtests in the current study) is used to compute squared Euclidean distances among subtests in the first stage of the PAMS procedure. ALSCAL performs a simple MDS on this squared Euclidean distance matrix to compute scale-values (dimension coordinates) for the observed variables (e.g., the 6 subscales of the GOT) on all extracted dimensions. Here, each of the 6 GOT subscale variables would have a corresponding coordinate for each dimension extracted from the MDS analysis, and the coordinates of the subscales on a dimension make up one major profile.

The PAMS model is a variation of the more traditional factor model.
Both are special cases of a general linear latent variable model. Hence, that linear latent variable model can serve as an overarching conceptual framework that links research on traditional trait/ability factors to research on profile patterns defined over measures of several traits/abilities. The typical cluster analysis is not based on an explicit model of variables that can conceptually link results from the cluster analysis to results from studies employing a more traditional factor approach. Moreover, Q-factor and cluster analysis are mainly used for exploratory purposes. Compared to the Q-factor and cluster analysis approaches, the PAMS approach outlined below has several potential advantages. There is virtually no limit to the number of respondents who can be included in a single analysis. Also, because the PAMS model is both a variant of the linear latent variable model and an analysis of a subtests × subtests matrix, for which extensive confirmatory factor analysis employing structural equation modeling has been proposed, the model leads to both exploratory and confirmatory analyses of data.
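The map example can be sketched in a few lines. The code below is a minimal illustration that substitutes classical (Torgerson) metric scaling for the nonmetric ALSCAL procedure named above, so it is a stand-in technique and not the algorithm used in PAMS. The six planar points, the helper names `classical_mds` and `squared_distances`, and all numeric values are hypothetical.

```python
import numpy as np

def classical_mds(dist2, n_dims):
    """Torgerson (classical) scaling of a squared-distance matrix."""
    n = dist2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double centering turns squared distances into scalar products.
    scalar_products = -0.5 * centering @ dist2 @ centering
    eigvals, eigvecs = np.linalg.eigh(scalar_products)
    order = np.argsort(eigvals)[::-1][:n_dims]      # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

def squared_distances(points):
    """Squared Euclidean distances between all pairs of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return (diff ** 2).sum(axis=-1)

# Hypothetical "map": 6 objects with known 2-D locations (illustrative values).
true_xy = np.array([[0.0, 0.0], [3.0, 1.0], [1.0, 4.0],
                    [-2.0, 2.0], [-1.0, -3.0], [2.0, -2.0]])
dist2 = squared_distances(true_xy)

coords = classical_mds(dist2, n_dims=2)
# The recovered configuration reproduces the input distances; its orientation
# is arbitrary (rotation/reflection), and each dimension is centered at zero.
print(np.allclose(squared_distances(coords), dist2))
print(np.allclose(coords.mean(axis=0), 0.0))
```

In a PAMS analysis the input would instead be the subtests × subtests squared-distance matrix, and each column of `coords` would be read as the scale-values of one major profile.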


THE PAMS MODEL

The PAMS model begins with the following equation:

$$m_{pt} = c_p + \sum_{k=1}^{K} \omega_{pk}\, x_{tk} + \varepsilon_{pt} \tag{1}$$

where

• $m_{pt}$ = the measure (observed score) of person $p$ on subtest $t$ ($t = 1, \ldots, T$) in a profile data matrix where each row represents a person $p$ and each column represents a subtest $t$;
• $c_p$ = the level parameter that indexes the overall height of person $p$’s observed profile; it is obtained by calculating the average of all tests for person $p$ (i.e., $c_p = \bar{m}_{p\cdot} = \sum_{t=1}^{T} m_{pt}/T$);
• $k$ ($= 1, \ldots, K$) = the index of the $K$ dimensions or major profiles extracted from a simple MDS performed on the original profile data matrix; any major profile $k$ includes $T$ scale-values/coordinates;
• $\omega_{pk}$ = a weight for person $p$ on major profile $k$, which indexes the degree of correspondence between the observed (scores) profile of person $p$ and the major profile $k$;
• $x_{tk}$ = the test parameter, which equals the scale-value/coordinate of test $t$ on major profile $k$; and
• $\varepsilon_{pt}$ = the error term, representing residuals from the model.

Equation 1, expressed in vector form, assuming two dimensions ($K = 2$) with six subscales ($T = 6$), is as follows:

$$
\begin{aligned}
\underset{6 \times 1}{\mathbf{M}_p} = [m_{p1}, \ldots, m_{p6}]^{t}
&= [c_p, \ldots, c_p]^{t} + \big[\text{Dim 1: } \mathbf{x}_{*1},\ \text{Dim 2: } \mathbf{x}_{*2}\big]\,[\omega_{p1}, \omega_{p2}]^{t} + [\varepsilon_{p1}, \ldots, \varepsilon_{p6}]^{t} \\
&= c_p\, \underset{6 \times 1}{\mathbf{1}} + \underset{6 \times 2}{\mathbf{X}}\ \underset{2 \times 1}{\mathbf{W}} + \underset{6 \times 1}{\mathbf{E}}
\end{aligned}
\tag{2}
$$
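A small numerical instance of Equation 2 may help fix the notation. In the sketch below, the scale-values in `X`, the level `c_p`, and the weights `W` are made-up numbers chosen only for illustration (the columns of `X` are chosen to sum to zero), and the error vector is set to zero so that the decomposition into level and pattern is exact.

```python
import numpy as np

# Hypothetical scale-values for T = 6 subtests on K = 2 major profiles.
# Each column sums to zero, so the profiles carry pattern but no level.
X = np.array([[ 1.0,  0.5],
              [ 0.5, -1.0],
              [-0.5,  1.0],
              [-1.0, -0.5],
              [ 0.8,  0.6],
              [-0.8, -0.6]])
c_p = 50.0                      # level: mean of person p's six scores
W = np.array([2.0, -1.0])       # person p's weights on the two profiles
E = np.zeros(6)                 # errors set to zero for clarity

M_p = c_p * np.ones(6) + X @ W + E      # Equation 2

level = M_p.mean()                      # recovers c_p
pattern = M_p - level                   # person p's ipsatized scores
print(level, np.allclose(pattern, X @ W))
```

With nonzero errors the same two lines still separate level from pattern, but the pattern then equals `X @ W` only up to the residuals.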

As shown in Equation 2, the PAMS model is based on a decomposition of an individual’s observed test score into two parts. The first part of the model reflects a profile level. The level parameter, $c_p = E\langle m_{pt} \mid p \rangle$, is defined as an expectation of the scores on $T$ tests, given person $p$, and determines the height of person $p$’s observed profile. The PAMS model uses this level parameter to identify individual differences in observed profile heights. The second part of the model reflects the profile pattern that is defined as deviations from the profile level (or the mean value); i.e., a deviation $T$-length vector $= [m_{p1} - c_p,\ m_{p2} - c_p,\ \ldots,\ m_{pT} - c_p]^{t}$. For example, supposing 6 subtest measurements, person $p$’s actual profile pattern is the array of ipsatized scores, $(m_{p1} - c_p), \ldots, (m_{p6} - c_p)$. Person $p$’s ipsatized scores (that describe the person’s profile pattern) are represented by the summed product of two model parameters, $\sum_{k=1}^{K} \omega_{pk}\, x_{tk}$ in Equation 1, where the person weight ($\omega_{pk}$) indexes how well a person’s observed profile matches major profile $k$. Because the scale-values ($x_{tk}$) are constant across different people, the estimates of the person weights characterize individual differences.

Assumptions and Restrictions

Some assumptions and restrictions must be added to uniquely define the solution in the PAMS model.

$$E(x_{tk}) = 0.0 \quad \text{for all } k \tag{3}$$

$$E(\omega_{pk}^{2}) = 1.0 \quad \text{for all } k \tag{4}$$

$$E(\omega_{pk}\, \omega_{pk'}) = 0.0 \quad \text{for all } k,\ k' \neq k \tag{5}$$

$$E(\varepsilon_{pt}) = 0.0 \quad \text{for all } t \tag{6}$$

$$\mathrm{Var}(\varepsilon_{pt}) = \sigma^{2} \quad \text{for all } t \tag{7}$$

$$E(\omega_{pk}\, \varepsilon_{pt}) = E(\varepsilon_{pt}\, \varepsilon_{pt'}) = 0.0 \quad \text{for all } (k, t) \text{ and } (t, t'),\ t \neq t' \tag{8}$$

Equations 3–8 are standard assumptions and restrictions for the PAMS model. Equation 3 implies that each dimension (or major profile) is ipsative, so that the sum of scale-values in each major profile equals zero. Consequently, major profiles will reproduce observed score profile patterns, but not the level of an observed score profile, which is accounted for by the level parameter ($c_p$). Equation 4 states that the mean of the squared person weights is assumed to be one. Equation 5 states that the sum of cross products between the person weights ($\hat{\omega}_{pk}, \hat{\omega}_{pk'}$) on two different dimensions equals zero. Equation 7 implies that the error variances are homogeneous across all tests. This is the most restrictive of these assumptions and is not essential to the model itself, but the homogeneous variance assumption seems necessary to justify the most common scaling analyses (Ramsay, 1977; Takane et al., 1977; Kruskal, 1964a, 1964b; and Shepard, 1962a, 1962b) available in existing statistical packages. Equation 8 states that the errors are orthogonal to the person weights and to each other.

As stated in Equation 5 above, dimension weights are assumed to be orthogonal across dimensions in order to uniquely identify the MDS solution. However, independence need not be assumed between the level parameter and dimension weights. Consequently, in translating MDS results into a CFA model below, a latent variable corresponding to the MDS level parameter will be allowed to correlate with other latent variables, but the latent variables corresponding to MDS dimensions will be assumed uncorrelated, consistent with the orthogonality assumption of Equation 5.

From Equations 2–8, one can arrive at the fundamental result on which the analysis is based:

$$\delta_{tt'}^{2} = (1/P) \sum_{p=1}^{P} (m_{pt} - m_{pt'})^{2} \tag{9}$$

$$
\begin{aligned}
\delta_{tt'}^{2} &= \sum_{k=1}^{K} (x_{tk} - x_{t'k})^{2} + 2\sigma^{2} \\
&= d_{tt'}^{2} + 2\sigma^{2}
\end{aligned}
\tag{10}
$$
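The step from Equation 9 to Equation 10 can be sketched as follows, reading the average over persons in Equation 9 as an expectation. This is a reconstruction from Equations 1 and 4–8 offered as a reading aid, not a derivation quoted from the original source.

```latex
\begin{aligned}
m_{pt} - m_{pt'}
  &= \sum_{k=1}^{K} \omega_{pk}\,(x_{tk} - x_{t'k}) + (\varepsilon_{pt} - \varepsilon_{pt'})
  && \text{(Equation 1; } c_p \text{ cancels)} \\[4pt]
E\big[(m_{pt} - m_{pt'})^{2}\big]
  &= \sum_{k=1}^{K}\sum_{k'=1}^{K} (x_{tk} - x_{t'k})(x_{tk'} - x_{t'k'})\, E(\omega_{pk}\,\omega_{pk'}) \\
  &\quad + 2\sum_{k=1}^{K} (x_{tk} - x_{t'k})\, E\big[\omega_{pk}(\varepsilon_{pt} - \varepsilon_{pt'})\big]
         + E\big[(\varepsilon_{pt} - \varepsilon_{pt'})^{2}\big] \\[4pt]
  &= \sum_{k=1}^{K} (x_{tk} - x_{t'k})^{2} + 0 + 2\sigma^{2}
   = d_{tt'}^{2} + 2\sigma^{2}
\end{aligned}
```

The first term simplifies because Equations 4 and 5 make $E(\omega_{pk}\,\omega_{pk'})$ equal to one when $k = k'$ and zero otherwise; the middle term vanishes by Equation 8; and the last term equals $\sigma^{2} + \sigma^{2} - 0$ by Equations 6, 7, and 8.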

Equations 9 and 10 state that the squared Euclidean distance proximity measures computed from the raw data over pairs of tests ($t$ and $t'$) are within an additive constant of the squared Euclidean distances ($d_{tt'}^{2}$) expressed in terms of the model parameters ($x_{tk}$ and $x_{t'k}$). That is, the proximity measures will satisfy the fundamental assumption of nonmetric multidimensional scaling, which is that proximity measures defined from the raw data ($m_{pt}$) are monotonically related to distances computed from the test parameters ($x_{tk}$). If the data satisfy the PAMS model, there will be one dimension for each major profile, and the scale-values along that dimension will describe the major profile pattern corresponding to that dimension.

This leads to an analysis that has three steps. In the first step, squared Euclidean proximity measures are computed for all possible pairs of variables. In the second step, the squared Euclidean distances are submitted to a multidimensional scaling analysis to determine dimensionality based on a badness-of-fit index (i.e., STRESS) and then to estimate the dimension (major profile) scale-values (coordinates) given the dimensionality. In the third step, each person’s vector of observed scores is regressed onto the dimensions to estimate the $K$ (total number of dimensions) weights ($\omega_{pk}$) for each person. From this regression, the squared multiple correlation between the ($T$-length vector of) raw scores of person $p$ and the $K$ dimensions can be estimated as a measure of how well the model fits the data of person $p$.

The scale-values ($x_{tk}$), on the right side of the PAMS model in Equation 1, are scores for tests and not people. The parameters associated with people are person weights ($\omega_{pk}$), which reflect the match between person $p$’s observed score profile and major profile $k$. Because the $x_{tk}$ scores locate tests, the PAMS major profiles spatially represent test variation. Therefore, the PAMS model is test-oriented in the sense that the fundamental element, scale-values (or coordinates of major profiles), locates tests in a $K$-dimensional space.

The major profiles in PAMS are, in fact, dimensions in MDS. However, PAMS emphasizes interpretation of dimension patterns and explains arrays (or profiles) of individuals’ observed scores in terms of the dimensions’ patterns. Since PAMS emphasizes pattern interpretations, PAMS major profiles are often interpreted in terms of two separate constructs. That is, the label of the dimension profile often will include two contrasting constructs, although the dimension profile may still be considered one latent variable. To illustrate, the pattern interpretation may refer to the contrast between (a) an individual’s relative strengths and weaknesses on cognitive variables (e.g., verbal vs. performance ability); (b) competing priorities for the same individual (e.g., practical vs. ethical considerations); or (c) competing tendencies within an individual (e.g., internal vs. external locus of control). The term versus is often used in the label that describes the contrast defining the profile pattern. So, although one dimension (or latent variable) appears, its pattern is often interpreted in terms of two constructs.
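The third step of the analysis (regressing each person’s scores onto the dimensions) can be sketched as below. The scale-values in `X` are treated as already estimated and are hypothetical, the data are simulated from Equation 1 with made-up levels, weights, and error standard deviation, and including an intercept to absorb the level parameter is an assumption of this sketch rather than a detail stated in the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical scale-values (T = 6 subtests, K = 2 dimensions), columns centered.
X = np.array([[ 1.0,  0.5], [ 0.5, -1.0], [-0.5,  1.0],
              [-1.0, -0.5], [ 0.8,  0.6], [-0.8, -0.6]])
T, K = X.shape
P = 200

# Simulate scores from Equation 1 with made-up levels, weights, and error SD.
levels = rng.normal(50.0, 10.0, size=P)
weights = rng.normal(size=(P, K))
errors = rng.normal(scale=0.3, size=(P, T))
scores = levels[:, None] + weights @ X.T + errors

# Step 3: regress each person's T scores on an intercept plus the K dimensions.
design = np.column_stack([np.ones(T), X])                 # shape (T, 1 + K)
coef, *_ = np.linalg.lstsq(design, scores.T, rcond=None)  # shape (1 + K, P)
level_hat, weight_hat = coef[0], coef[1:].T

# Squared multiple correlation per person: how well the K profiles reproduce
# the variation of that person's scores about the person's own mean.
fitted = (design @ coef).T
ss_res = ((scores - fitted) ** 2).sum(axis=1)
ss_tot = ((scores - scores.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)
r_squared = 1.0 - ss_res / ss_tot

print(np.corrcoef(weights[:, 0], weight_hat[:, 0])[0, 1], np.median(r_squared))
```

Because the columns of `X` sum to zero, the fitted intercept for each person coincides with that person’s mean score, so level and pattern are estimated separately.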

THE NEED FOR CONFIRMATORY METHODS

Profile analyses, including PAMS, Q-factor analysis, and cluster analysis, have been used primarily for exploratory purposes in identifying major profiles. Since the results are exploratory, there is no evidence about whether or not the identified profiles are statistically significant. In other words, the exploratory results are not confirmed with any statistical tests. Here, “confirmed” means that a priori hypothetical constructs drawn from previous research or exploratory analyses (e.g., factor analysis or MDS) are validated by determining the systematic and random properties of the constructs in the context of the Confirmatory Factor Analysis (CFA) approach. There are several ways to confirm exploratory profile analysis results (e.g., viewing correlations between the present exploratory results and the theory-based constructs, or examining congruence between sample results and parameter values). Our purpose here is to develop the link between exploratory PAMS analyses and CFA. Just as CFA can be used to confirm findings from prior exploratory factor analyses, it can also be used to confirm prior exploratory PAMS analyses. Considering PAMS major profiles as latent variables, the PAMS model can be viewed as a linear latent variable model for test scores, which parallels the factor model in several respects. However, the PAMS model is a reparameterization of the linear latent variable model so as to address research questions about profile patterns.


THE FACTOR MODEL VERSUS THE PAMS MODEL

The factor model is described by:

$$m_{pt} = \mu_t + \sum_{k=1}^{K} \lambda_{tk}\, f_{pk} + \varepsilon_{pt} \tag{11}$$

where

• $m_{pt}$ = the measure (observed score) of person $p$ on test $t$;
• $\mu_t$ = the mean of test $t$ over $P$ participants ($\mu_t = \bar{m}_{\cdot t} = \sum_{p=1}^{P} m_{pt}/P$), and if $m_{pt}$ is standardized to have mean 0 and variance 1 over $P$ participants, then $\mu_t$ is equal to 0;
• $\lambda_{tk}$ ($k = 1, \ldots, K$) = the loading of test $t$ on factor $k$;
• $f_{pk}$ = the score of person $p$ on factor $k$; and
• $\varepsilon_{pt}$ = the uniqueness associated with specific test variance and random deviations from the model.

To identify the solution, three constraints about factor scores are often added: (1) the mean of factor scores for each factor is zero; (2) the variance of factor scores is one; and (3) the intercorrelations of factor scores equal zero. The third constraint applies only to uncorrelated factor solutions, which is required to compare uncorrelated factors with dimensions. Deviations ($\varepsilon_{pt}$) are assumed to have mean zero and to be uncorrelated with each other and factor scores.

In contrast to the PAMS model (see Equation 1), the scores ($f_{pk}$) in the model are associated with people, not subtests, but the parameters ($\lambda_{tk}$) associated with subtests are weights on the person factor scores. The factor model is person-oriented in the sense that the fundamental element, the factor score, locates individuals at different points along a continuum. When the factors are uncorrelated (as are dimensions) and measures ($m_{pt}$) are standardized as in many factor models, then these weights determine the covariance between measures $t$ and $t'$:

$$\sigma_{tt'} = \sum_{k=1}^{K} \lambda_{tk}\, \lambda_{t'k} \approx (1/P) \sum_{p=1}^{P} m_{pt}\, m_{pt'} \tag{12}$$

where $P$ is the number of participants and “$\approx$” denotes approximation.

Since classical MDS (Torgerson, 1958) relies on a components analysis, various authors have studied the relationship between a classical MDS analysis of subtest data and a traditional principal components analysis. While a classical MDS would begin with proximity data computed as in Equation 9, ultimately the solution would be obtained by taking principal components of what Torgerson calls scalar products. Various authors (Gower, 1966; Heiser & Meulman, 1983a, 1983b; Schonemann, 1970) have implied that the elements of the scalar product matrix have the form shown in Equation 13:

$$\delta_{tt'} = (1/P) \sum_{p} m^{*}_{pt}\, m^{*}_{pt'} \tag{13}$$

where $m^{*}_{pt} = m_{pt} - \bar{m}_{p\cdot}$ and $\bar{m}_{p\cdot}$ = the mean score for person $p$ taken over all $T$ measures. Note that $\bar{m}_{p\cdot}$ is the level parameter’s value for person $p$ ($= c_p = \sum_{t=1}^{T} m_{pt}/T$) in the PAMS model and $m^{*}_{pt}$ is a deviation from the person’s mean, which is an ipsatized score. Therefore, the elements $\delta_{tt'}$ constitute a dot product (or inner product) between scores expressed as deviations about the mean score of person $p$ ($\bar{m}_{p\cdot}$).

The ordinary components analysis yields the principal components of covariances between observed scores, $m_{pt}$ and $m_{pt'}$. Ultimately, a classical scaling would yield the components of dot products between the ipsative scores, as shown in Equation 13, $\delta_{tt'} = (1/P) \sum_{p} m^{*}_{pt}\, m^{*}_{pt'}$. Hence, the difference between the results of the two analyses will be the difference between components of the covariance matrix $\boldsymbol{\Sigma}$ (or correlation matrix $\mathbf{R}$ if the observed scores are standardized to have mean zero and variance one) and the components of the scalar product (or dot product or inner product) matrix, $\boldsymbol{\Delta}$, with elements $\delta_{tt'}$. The covariance matrix for components (or factor) analysis includes each individual’s level information ($= c_p$ or $\bar{m}_{p\cdot}$), whereas the scalar product matrix for scaling analysis does not include the level information because the scaling analysis is based on the ipsative scores, from which the level information has been removed.

Davison (1985) examined the relation between the components of the correlations, $\mathbf{R}$, and the scalar products, $\boldsymbol{\Delta}$ (also see Gollob, 1968a, 1968b; Heiser & Meulman, 1983a, 1983b) and concluded that the effect of subtracting person means ($\bar{m}_{p\cdot}$) from the observed scores ($m_{pt}$) is to remove at most one component from the data. If applied to the PAMS model, the removed factor is the level component ($c_p$), and the removed portion is comparable to the general factor in factor analysis. In data with a general component, the average congruence coefficient between the components solution where the first component was removed and the classical scaling solution was .99 (see Davison, 1985 for details). Consequently, if the scaling contains $K$ dimensions, the components analysis of $\mathbf{R}$ will produce at most $1 + K$ components.

In summary, the components solution will have at most one more axis than will the scaling solution, and the $K$-dimensional scaling solution will constitute a subspace of the components solution. Often, removing the general component from the components solution allows components analysis to be nearly equivalent to a classical metric scaling analysis. However, the origins of components and PAMS solutions are set differently. In a components solution, the origin is set at the centroid of the people in what is often a $1 + K$ space (i.e., components scores have mean zero), but in the PAMS model, the origin falls at the centroid of the tests in a $K$ space (i.e., test parameter values or dimension coordinates have mean zero along each dimension as specified in Equation 3). In short, the $K$-dimensional PAMS solution based on Equation 1 is often a subspace of a $1 + K$ components solution, but with the origin set differently in the PAMS and components solutions. This relationship between the PAMS and components solutions provides the basis for specifying dimensions as latent variables in CFA, much as factors are specified as latent variables in CFA.
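The “at most one more component” relationship can be checked numerically under idealized conditions. The sketch below simulates error-free scores from Equation 1 with hypothetical scale-values, levels, and weights, and then counts the non-negligible eigenvalues of the covariance matrix of raw scores and of the scalar product matrix of ipsatized scores. The helper `n_components` and its tolerance are illustrative choices, and unstandardized covariances are used here in place of correlations for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical scale-values (T = 6 tests, K = 2 dimensions); columns sum to zero.
X = np.array([[ 1.0,  0.5], [ 0.5, -1.0], [-0.5,  1.0],
              [-1.0, -0.5], [ 0.8,  0.6], [-0.8, -0.6]])
T, K = X.shape
P = 500

levels = rng.normal(50.0, 10.0, size=P)        # general (level) component
weights = rng.normal(size=(P, K))
scores = levels[:, None] + weights @ X.T       # Equation 1 without error

# Components analysis: eigenvalues of the T x T covariance matrix of raw scores.
cov = np.cov(scores, rowvar=False)

# Scaling: eigenvalues of scalar products of person-centered (ipsatized) scores.
ipsatized = scores - scores.mean(axis=1, keepdims=True)
scalar_products = ipsatized.T @ ipsatized / P

def n_components(matrix, tol=1e-8):
    """Count eigenvalues that are non-negligible relative to the largest one."""
    eigvals = np.linalg.eigvalsh(matrix)
    return int((eigvals > tol * eigvals.max()).sum())

print(n_components(cov), n_components(scalar_products))
```

With error added to the simulated scores, every eigenvalue becomes nonzero and the count would have to rely on a substantive cutoff (for example a scree-type judgment) rather than a numerical tolerance.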

CONFIRMATORY PAMS

The PAMS analysis has been used as an exploratory approach for identifying major profiles (e.g., Davison et al., 1996; Davison, Kuang, & Kim, 1999; Kim, Frisby, & Davison, 2004). Here we describe a CFA approach, utilizing Structural Equation Modeling (SEM), which parallels an exploratory PAMS. Just as CFA can be used to confirm an exploratory factor analysis, it can also be used to confirm an exploratory PAMS analysis. A line of research by Rounds and Tracey (1993) and Tracey and Rounds (1995) used structural equation modeling to validate and augment Holland’s (1973) circumplex of six personality types (Realistic, Investigative, Artistic, Social, Enterprising, and Conventional: RIASEC) by suggesting different representations of factor structures. Rounds and Tracey (1993) used both MDS and SEM techniques to verify Prediger’s (1982) three-factor solution (one general factor and two bipolar factors) in representing Holland’s RIASEC circumplex. However, Rounds and Tracey used neither the MDS analysis to explore and identify new factor/dimension structures, nor the SEM (for a confirmatory factor analysis as used in the current study) to validate the extracted dimensions from an exploratory MDS analysis. Our approach differs from the Rounds and Tracey (1993) approach by using an MDS tool (in PAMS) to explore how many meaningful major profiles (or dimensions) can be identified with a satisfactory fit index (e.g., a STRESS value less than 0.05) and then examining whether or not the PAMS exploratory results are sufficiently confirmed by CFA. Furthermore, while the Rounds and Tracey approach is described solely in the context of Holland’s theory (a circumplex hypothesis), the approach described here is more general and applicable in any
domain in which subtest patterns are of interest. Our approach is not limited to the testing of circumplex hypotheses, and involves specifying and testing hypotheses about score patterns, not a circumplex hypothesis. Whether or not the PAMS results are validated will be determined by CFA fit indexes. The CFA fit indexes include the Expected Cross-Validation Index (ECVI; Browne & Cudeck, 1993), the Tucker Lewis Index (TLI; Tucker & Lewis, 1973) as a reliability measure, the Akaike Information Criterion (AIC) measure (Akaike, 1987), and Steiger’s (1990) Root Mean Square Error of Approximation (RMSEA). RMSEA is used as “a measure of discrepancy per degree of freedom” (Jöreskog & Sörbom, 1993, p. 124). Browne and Cudeck (1993) and MacCallum, Browne, and Sugawara (1996) suggest that an RMSEA value equal to or less than 0.05 indicates a close model fit, values ranging from 0.05 to 0.08 represent adequate fit, values from 0.08 to 0.10 indicate mediocre fit, and values greater than 0.10 indicate poor fit. To specify the PAMS model with K major profiles in CFA, one must specify 1 + K latent variables. The first latent variable represents the general factor in the factor model and corresponds to the level parameter in the PAMS model. The last K latent variables, corresponding to K major profiles, refer to group factors. The first latent variable accounts for individual differences in profile level and is constrained to have equal loadings for all tests so that its loadings approximate those of a general factor. The first latent variable need not be uncorrelated with the other K latent variables corresponding to MDS dimensions, since this first factor corresponds to the PAMS level parameter and the level parameter need not be uncorrelated with weights along PAMS dimensions.
Consistent with the constraints of the PAMS model (Equation 3), all test parameters on the remaining K latent variables were constrained so that the test parameters sum to zero on each latent variable (e.g., $\sum_{t=1}^{T} x_{tk} = 0.0$). In addition, along the remaining K dimensions, parameters were set in accordance with MDS results from a prior sample.
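
The RMSEA cutoffs quoted above can be applied mechanically. The sketch below assumes one common sample formula for RMSEA, sqrt(max(chi-square - df, 0) / (df (N - 1))); some programs divide by N instead of N - 1, so this may not match a given package exactly. How the shared boundary values (0.08 and 0.10) are assigned is also an assumption, and the inputs are hypothetical.

```python
import math

def rmsea(chi2, df, n):
    """Sample RMSEA from a model chi-square, its degrees of freedom, and sample size."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def describe_fit(value):
    """Map an RMSEA value onto the verbal labels quoted in the text."""
    if value <= 0.05:
        return "close"
    if value <= 0.08:        # assumption: boundary value 0.08 counted as adequate
        return "adequate"
    if value <= 0.10:        # assumption: boundary value 0.10 counted as mediocre
        return "mediocre"
    return "poor"

example = rmsea(chi2=25.0, df=10, n=500)   # hypothetical inputs
print(round(example, 3), describe_fit(example))
```

Whenever the chi-square does not exceed its degrees of freedom, the formula returns exactly zero.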

EXAMPLE OF EXPLORATORY PAMS

Data and Sample

In this section, we present results from the exploratory analysis in the prior sample. Our goal is to illustrate an exploratory analysis that can lead to confirmatory specification of a CFA in a second sample. The respondents in this study were adult clients in the Minnesota Vocational Assessment Clinic at the University of Minnesota. The total sample contained 328 respondents with complete data: 184 males and 144 females. Sackett (1993) contains a more complete description of the sample.

Clients completed several questionnaires, but only the General Occupational Theme (GOT) scales from the Strong Interest Inventory (Hansen & Campbell, 1985) were used in this analysis. The GOT scales consist of the Realistic (REA), Investigative (INV), Artistic (ART), Social (SOC), Enterprising (ENT), and Conventional (CON) interest scales. Each scale was standardized to have mean equal to 0.0 and standard deviation equal to 1.0. The input data for exploratory PAMS involved the squared Euclidean distance matrix computed from the standardized GOT scales.
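
The input matrix described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: distances are summed over persons (rather than averaged), the population standard deviation is used for standardizing, and the data are random placeholders. Under those assumptions each squared distance equals 2P(1 - r), where r is the correlation between the two scales.

```python
import numpy as np

def squared_distances(scores):
    """Return the scales x scales matrix of squared Euclidean distances.

    scores: persons x scales array of raw scale scores.
    """
    z = (scores - scores.mean(axis=0)) / scores.std(axis=0)   # mean 0, SD 1 per scale
    diff = z[:, :, None] - z[:, None, :]                      # persons x scales x scales
    return (diff ** 2).sum(axis=0)

rng = np.random.default_rng(1)
fake_scores = rng.normal(size=(328, 6))     # placeholder data, NOT the clinic sample
D2 = squared_distances(fake_scores)
```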

The PAMS Solution

To identify major profiles, ALSCAL (Takane et al., 1977) with a nonmetric scaling option was used. To determine dimensionality, one- and two-dimensional solutions were compared to each other with STRESS values. The STRESS values were 0.21 and 0.00 for one- and two-dimensional solutions, respectively. The one-dimensional solution was not appropriate since the STRESS value was large. The STRESS value for the two-dimensional solution was satisfactory (i.e., less than 0.05). The two-dimensional solution is also consistent with Prediger’s (1982) model of two group factors. Table 1 shows scale-values from the two-dimensional nonmetric MDS solution.
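
The dimensionality comparison can be imitated with a simpler, swapped-in technique. The sketch below uses classical (Torgerson) metric scaling rather than ALSCAL's nonmetric algorithm, and computes Kruskal's stress formula 1 with the observed distances standing in for disparities. The six-point hexagon is hypothetical and serves only to exercise the functions.

```python
import numpy as np

def classical_scaling(D2, k):
    """Torgerson scaling: k-dimensional coordinates from squared distances D2."""
    n = D2.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ D2 @ J                        # double-centered scalar products
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:k]             # indices of the k largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

def stress1(D2, coords):
    """Kruskal's stress formula 1, using the observed distances as disparities."""
    iu = np.triu_indices(D2.shape[0], k=1)
    d_obs = np.sqrt(D2[iu])
    diff = coords[:, None, :] - coords[None, :, :]
    d_fit = np.sqrt((diff ** 2).sum(axis=-1))[iu]
    return np.sqrt(((d_fit - d_obs) ** 2).sum() / (d_fit ** 2).sum())

# Hypothetical six-point hexagon used only to exercise the functions.
angles = np.deg2rad(np.arange(0, 360, 60))
points = np.column_stack([np.cos(angles), np.sin(angles)])
D2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)

stress_1d = stress1(D2, classical_scaling(D2, 1))
stress_2d = stress1(D2, classical_scaling(D2, 2))
```

For a planar configuration like this one, the one-dimensional stress is large and the two-dimensional stress is essentially zero, which mirrors the pattern the article reports for its own data.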

Interpretation of the MDS Solution

Holland (1973) proposed a two-dimensional hexagonal model for the GOT (subscales) variables. Although Holland did not interpret axes through his two-dimensional space, other researchers (Hogan & Blake, 1996; Prediger, 1982) have. Hogan and Blake interpreted their axes in terms of underlying personality

TABLE 1
Scale Values From the Exploratory PAMS Analysis of Holland’s General Occupational Theme (GOT) Scales on the Strong Interest Inventory (1973)

GOT Scale        Dimension 1    Dimension 2
Realistic        0.43           1.09^a
Investigative    0.28           0.83^a
Artistic         2.21^a         0.04
Social           0.34           0.87^a
Enterprising     0.80           0.97^a
Conventional     1.59^a         0.12

^a Distinctive marker variable coordinates.

dimensions. The first is an Introversion-Extroversion dimension marked by the Realistic and Investigative scales at one end and the Social and Enterprising scales at the other. The second is a Conformity dimension marked by the Conventional scale at one end and the Artistic scale at the other. Prediger proposed a three-factor model to account for Holland’s hexagon. Prediger’s factor model includes a general factor and two bipolar factors that are Things vs. People and Ideas vs. Data. Prediger interpreted the general factor as a response bias. The Hogan-Blake and Prediger dimensions are axes in the same plane that lie within a 30° rotation of each other. We focus on the Hogan-Blake dimensions since our MDS solution corresponds more closely to theirs. The scale-values for Dimension 1 are plotted as diamonds in Figure 1 to show the profile pattern associated with that dimension. In the dimensional solution of PAMS, the Artistic (ART) scale and the Conventional (CON) scale appeared to be the most distinctive variables. The ART scale appeared on the negative end of Dimension 1, whereas the CON scale appeared on the positive end of the same dimension. ART and CON scales are the indicator variables at the opposite ends of Hogan and Blake’s (1996) Conformity dimension. Therefore, we have called Dimension 1 the Conformity dimension. The scale-values for Dimension 2 are plotted as circles in Figure 1 to show the profile pattern associated with that dimension. As can be seen in this figure, the two highest scores in the associated profile pattern are for the Realistic (REA) and Investigative (INV) scales. Both REA and INV variables fall on the Introversion end of Hogan and Blake’s (1996) Introversion–Extroversion dimension. The two lowest scale values in the pattern are for the Social (SOC)

Note that values of only distinctive marker variable coordinates were included. FIGURE 1 Major profile patterns: Dimension 1 (Conformity) Profile and Dimension 2 (Introversion vs. Extroversion) Profile.

and Enterprising (ENT) scales. Both Social and Enterprising scales fall at the Extroversion end of Hogan and Blake’s Introversion–Extroversion dimension. Because the marker variables at opposite ends of this dimension are those at opposite ends of Hogan and Blake’s Introversion–Extroversion dimension, we call Dimension 2 the Introversion vs. Extroversion dimension. The Dimensions 1 and 2 scale-values not only sketch the two vocational interest patterns in Figure 1, but they are also directly connected to Holland’s (1973) hexagonal model. If the Dimensions 1 and 2 coordinates are graphed along the horizontal and vertical axes of a scatter diagram, the six points corresponding to the variables form the vertices of a somewhat misshapen hexagon, which is displayed in Figure 2. The points fall along that hexagon in the order specified by Holland’s theory. The points only approximate a hexagon, however, in that the distances between adjacent points along the hexagon are not equal. In summary, the exploratory MDS solution yielded two dimensions whose marker variables at opposite ends of the dimensions are those at the opposite ends of Hogan and Blake’s Conformity dimension and their Introversion–Extroversion

Note that Dimensions 1 and 2 marker variable coordinates were bolded. FIGURE 2 Holland’s RIASEC circumplex.

dimension. These two dimensions provided the basis for specifying CFA latent variables in a second sample as described later.

Estimation and Interpretation of Person Weights, $\hat{\omega}_{pk}$

In the next step, the scale values in Table 1 were used to estimate person weights that index matching between major profiles and observed profiles (or arrays of participants’ subscale scores). The two person weights for each respondent were calculated by successively regressing each individual’s vector of six subscale scores onto coordinates of the two major profiles (two MDS dimensions). In each of these individual regressions, the person’s six observed scores served as the criterion variable and the six scale values along the two dimensions served as the two predictor variables. Since two major profiles were identified from 328 respondents, each respondent was assigned two person weights, one for each dimension profile. If the person weight on Dimension 1 is substantial, the person’s observed profile is expected to be similar to the Dimension (or Major) 1 profile, whereas if the person weight on Dimension 2 is substantial, the observed profile of the person will be similar to the Dimension 2 profile. However, if a person has a substantial negative weight on either Dimension 1 or Dimension 2, then the person’s observed profile is a mirror image of the corresponding dimension. Some respondents could have substantial weights on both dimensions, and in that case their observed profiles may resemble a linear combination of the Dimension 1 and Dimension 2 profiles. Figures 3, 4, and 5 display the cases described above.
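
The person-by-person regressions described above can be sketched with ordinary least squares. In this sketch the function name, the coordinate matrix, and the example profile are all hypothetical; an intercept column is included on the assumption that it plays the role of the level parameter, which equals the person's mean score when the coordinates sum to zero on each dimension.

```python
import numpy as np

def person_parameters(scores, X):
    """Regress each person's profile on the dimension coordinates.

    scores: persons x tests array; X: tests x K array of scale values.
    Returns (level, weights, r_squared) with one entry or row per person.
    """
    T = X.shape[0]
    design = np.column_stack([np.ones(T), X])             # intercept column estimates c_p
    coef, *_ = np.linalg.lstsq(design, scores.T, rcond=None)
    level, weights = coef[0], coef[1:].T
    resid = scores - (design @ coef).T
    centered = scores - scores.mean(axis=1, keepdims=True)
    r_squared = 1.0 - (resid ** 2).sum(axis=1) / (centered ** 2).sum(axis=1)
    return level, weights, r_squared

# Hypothetical coordinates (each column sums to zero) and one noise-free profile.
X = np.array([[ 1.0,  0.0], [ 0.5,  1.0], [-0.5,  1.0],
              [-1.0,  0.0], [-0.5, -1.0], [ 0.5, -1.0]])
profile = 0.4 + 0.5 * X[:, 0] + 0.1 * X[:, 1]
level, weights, r2 = person_parameters(profile[None, :], X)
```

Because the example profile is built without error, the regression recovers the level (0.4) and the two weights (0.5 and 0.1) exactly, with an R-squared of 1. A completely flat profile would make the R-squared undefined and would need separate handling.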

FIGURE 3 The Respondent #3 observed profile superimposed on Dimension 1 Profile.

Respondent #3 ($\hat{\omega}_{3(1)} = 0.51$; $\hat{\omega}_{3(2)} = 0.09$) had a substantial positive weight on Dimension 1, but a trivial weight on Dimension 2. The person’s observed profile, Figure 3, displayed a pattern that was similar to the Dimension 1 Profile. The explained variance ($R^2$) for Respondent #3 (by two dimensions) is 0.60 and the person’s observed profile was a little elevated (above average) as reflected by the value of the level parameter, $\hat{c}_{3} = 0.41$.

FIGURE 4 The Respondent #98 observed profile superimposed on Dimension 2 Profile.

Respondent #98 ($\hat{\omega}_{98(1)} = .13$; $\hat{\omega}_{98(2)} = 1.51$) had a trivial weight on Dimension 1, but a considerably high positive weight on Dimension 2, and this person’s profile, Figure 4, was similar to the pattern of the Dimension 2 Profile. The proportion of explained variance (by two dimensions) for the profile of Respondent #98 was 0.83 and the observed profile was a little depressed (below average) as reflected by the level parameter value, $\hat{c}_{98} = -0.24$. Notice that because the scores were standardized, the mean for the level parameter values was equal to zero and the range of the level values was from -1.61 to 1.78. Similarly, the mean for the person weights was zero and the range of the weights was from -0.85 to 1.01 for Dimension 1 and from -1.59 to 1.30 for Dimension 2. Although means for person and level parameters are automatically set to be zero when standardized data are analyzed (as in the current study), their ranges will vary depending on the data in analysis. Respondent #42 ($\hat{\omega}_{42(1)} = -0.64$; $\hat{\omega}_{42(2)} = -0.84$) had substantially large (in absolute value), but negative weights on both dimensions. Therefore, this person’s pattern is similar to that obtained when the two dimensions are linearly combined (Dimension 1 + Dimension 2) and then reversed: -(Dimension 1 + Dimension 2). This linearly combined, but reversed dimension pattern is compared to the data of Respondent #42 in Figure 5. The proportion

FIGURE 5 The Respondent #42 observed profile vs. mirror image of linearly combined dimensions.

of variance accounted for was 0.90 (or $R^2 = 0.90$) and the profile was a little depressed (below average) as reflected by its level parameter value, $\hat{c}_{42} = -0.17$. The overall accounted variance (by two major profiles) for observed profiles of 323 respondents was 0.55 ($R^2 = 0.55$). In other words, on average, 55% of the within-person observed profile variance was explained by the two major profiles.

EXAMPLE OF CONFIRMATORY PAMS

To demonstrate the confirmatory approach for PAMS, two data files were used: one was a simulated data set and the other was a real data set. For the simulation example, data were simulated by the PAMS model. For the real data example, the Strong Interest Inventory standardization sample (Harmon, Hansen, Borgen, & Hammer, 1994) was used.

Simulation Data Analysis

Five hundred observations (or persons) were generated by Equation 1, and each observation consisted of six observed variables (Var1 to Var6). For simulation, the level parameter value ($c_p$) was a randomly generated intercept for each person with N(0, 1); two dimension coordinates (or scale-values) were set as described below; and two person weights ($\omega_{pk}$, $\omega_{pk'}$) were randomly generated for each person along Dimensions 1 and 2, both with N(0, 1). For each data point, error was added by randomly drawing a normal deviate, N(0, 0.25).
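
A data set of this kind can be generated along the following lines. This is a sketch under stated assumptions rather than the authors' procedure: the coordinate magnitudes are those the article reports for the true scale-values, but their signs are assumed here so that each column sums to zero and the two columns have a zero cross product; 0.25 is read as the error variance (not the standard deviation); and the random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2007)           # arbitrary seed
P = 500                                      # simulated persons

# Magnitudes from the article's true scale-values; SIGNS ARE ASSUMED.
X = np.array([[ 1.50,  0.00],
              [ 1.00,  1.12],
              [-1.00,  1.12],
              [-1.50,  0.00],
              [-1.00, -1.12],
              [ 1.00, -1.12]])

c = rng.normal(0.0, 1.0, size=(P, 1))               # level parameters, N(0, 1)
W = rng.normal(0.0, 1.0, size=(P, 2))               # person weights, N(0, 1)
E = rng.normal(0.0, np.sqrt(0.25), size=(P, 6))     # ASSUMPTION: 0.25 is a variance
M = c + W @ X.T + E                                  # persons x variables (Var1 to Var6)

R = np.corrcoef(M, rowvar=False)                     # 6 x 6 correlation matrix for a CFA
```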

True scale-values. With 6 observed variables and two dimensions, the column vector of length 6, $(1.50, 1.00, 1.00, 1.50, 1.00, 1.00)^{t}$, was assigned for Dimension 1 true coordinates. A column vector of the same length, $(0.00, 1.12, 1.12, 0.00, 1.12, 1.12)^{t}$, was assigned for Dimension 2 true coordinates. These true coordinates outline a hexagonal structure as proposed by Holland (1973). As required to meet the conditions of Equation 3, the coordinates along a dimension sum to zero. The cross product of the two coordinate column vectors is zero.

PAMS of simulation data. The simulated data were analyzed by PAMS. Parameters and PAMS estimates were closely matched: Corr(True Dim 1, PAMS Dim 1) = 0.99; Corr(True Dim 2, PAMS Dim 2) = 0.98; Corr(True $c_p$, PAMS $\hat{c}_p$) = 1.00; Corr(True $\omega_1$, PAMS $\hat{\omega}_1$) = 0.99; Corr(True $\omega_2$, PAMS $\hat{\omega}_2$) = 0.97; and the two PAMS dimensions explained 95% of the variation in the simulated data. In other words, the PAMS solution accounted for 95% of the variation within the observed profiles on average. The results indicate that PAMS replicated the parameter values quite well.

CFA of simulation data. The 6 × 6 correlation matrices were entered as input data into CFA to confirm the two dimensions (or major profiles) that were identified a priori for simulation. However, a three-factor model, comprising Factor 1 (Level) and Factors 2 and 3 (Dimensions 1 and 2), was hypothesized. To be consistent with the PAMS model, the loadings for the Level Factor were fixed at 1.00. All six variables served as indicators of the Dimension 1 Factor since all true scale-values (or coordinates) of Dimension 1 were equal to or larger than 1.00, whereas Var2, Var3, Var5, and Var6 served as indicators of the Dimension 2 Factor, because those variable scale-values were equal to or larger than 1.00 along Dimension 2.
Goodness of model fit statistics indicate close fit: $\chi^2$(df = 10, N = 500) = 4.61 and p-value = 0.92; RMSEA = 0.00; Reliability (TLI) = 1.00; Expected Cross-Validation Index (ECVI) = 0.07 (0.08 for the saturated model); and AIC = 30.61 (42 for saturated AIC). In addition, correlations between true dimension coordinates and confirmatory dimension-factor loadings were examined: Corr(True Dim 1, Dim 1 Factor) = 1.00 and Corr(True Dim 2, Dim 2 Factor) = 1.00. The results show that reparameterization of the PAMS model in CFA, including the level factor, was confirmed.

Real Data Analysis

With real data, a previous PAMS finding from a prior sample (Sackett, 1993) was evaluated to validate the finding in a different sample. The Strong Interest Inventory standardization sample (Harmon, Hansen, Borgen, & Hammer, 1994) was used for confirmatory PAMS. The data involve inter-scale correlation
matrices (one for females and the other for males) obtained from the Strong Interest Inventory Applications and Technical Guide (Harmon et al., 1994). The Strong Interest Inventory includes the same General Occupational Theme (GOT) scales as used in the exploratory PAMS. The 6 × 6 correlation matrices were entered as input data into CFA to confirm the two major profiles that were identified by the exploratory PAMS. The intercorrelations of the 6 interest scales are shown in Table 2. The respondents in the study were from the national standardization sample and consisted of 9,467 women and 9,484 men. Will the structure recovered in the exploratory PAMS analysis account for the Male and Female GOT scale intercorrelations in the standardization sample? To confirm or disconfirm the exploratory PAMS results, a confirmatory analysis based on the subscale intercorrelations computed from both the Female and Male groups was performed with LISREL 8 (Jöreskog & Sörbom, 1996) on the six subscales of the General Occupational Themes (GOT) in the Strong Interest Inventory (Harmon et al., 1994). Consistent with the simulation data analysis, a three-factor model of GOT, comprising Factor 1 (Level) and Factors 2 and 3 (Dimensions 1 and 2), was hypothesized. Factor 1 corresponds to the level parameter in the PAMS model, and Factors 2 and 3 correspond to the first and the second dimensions (or major patterns) recovered with MDS, respectively. All 6 observed variables (REA, INV, ART, SOC, ENT, and CON) serve as indicators of Factor 1, a general factor. Two observed variables, ART and CON, which appeared as distinguishing marker variables on Dimension 1, serve as indicators of Factor 2. The four observed variables, REA, INV, SOC, and ENT, which appeared on Dimension 2 as the characteristic marker variables, serve as indicators of Factor 3.
Consistent with the orthogonality of weights assumption in PAMS (Equation 5), the last two factors corresponding with MDS dimensions were constrained to be uncorrelated with one another. Although the first (level) factor is

TABLE 2
Intercorrelations Between the General Occupational Themes (GOT)

GOT Scale        Realistic   Investigative   Artistic   Social   Enterprising   Conventional
Realistic        1.00        0.57            0.23       0.08     0.08           0.23
Investigative    0.51        1.00            0.30       0.09     0.05           0.20
Artistic         0.06        0.30            1.00       0.26     0.25           0.14
Social           0.13        0.19            0.36       1.00     0.40           0.22
Enterprising     0.19        0.04            0.21       0.43     1.00           0.36
Conventional     0.27        0.31            0.02       0.31     0.47           1.00

Note. Correlation coefficients above the diagonal are based on 9,467 women; those below the diagonal are based on 9,484 men; and these coefficients were obtained from the Strong Interest Inventory Applications and Technical Guide (Harmon et al., 1994).

not necessarily constrained to be uncorrelated with the other two factors (since there are no orthogonality assumptions about the level parameter in the PAMS analysis), especially for Model 2 (in the section on Model Estimation below), the first factor was constrained to be uncorrelated with the other factors because the preliminary analysis indicated no correlation between the first and the other group factors. Moreover, an additional model that did not include the level factor was tested to examine the impact of the level factor on the model fit.

Model Specification

Factor 1: Profile level. Despite the fact that the Holland (1973) theory is two dimensional, three interest factors were proposed. It is our contention that the Holland dimensions are sufficient to explain individual variation in profile pattern, but that an additional factor is needed to account for individual differences in profile level, which is missing in the Holland dimensions. Consistent with this conjecture, prior researchers have concluded that accounting for the intercorrelations among these six interest variables requires an additional factor, a “general” factor along which all six interest variables have approximately equal loadings (Davison, 1985; Rounds & Tracey, 1993). Therefore, to explain individual differences in vocational interest profile level (that approximates a “general” factor), an additional factor was included, but the additional factor loadings were either constrained to be equal, consistent with other researchers’ speculation (Davison, 1985; Rounds & Tracey, 1993), or freely estimated so as to examine the effect of various level factor specifications on fit.

Factor 2: Conformity. Factor 2 was the second dimension profile for the interest variables. It was specified to approximate the horizontal axis in Figure 2. Figure 2 suggests a profile pattern with a positive coordinate for the Conventional (CON) variable and a negative coordinate for the Artistic (ART) variable.
These two marker variables were included for Conformity in the confirmatory analysis (or CFA), whereas the other four variable loadings (REA, INV, SOC, and ENT) were set to zero in CFA because their scale-values were considered trivial (0.43, 0.28, 0.34, and 0.80, respectively). Whether ENT, with a scale-value of 0.80, should have been included in CFA as a marker variable is arguable, but we decided not to include it on theoretical grounds. Furthermore, the scale-value of 0.80 is half the size (or less) of the scale-values for the variables used as markers of this dimension.

Factor 3: Introversion vs. Extroversion. Factor 3 was specified to approximate the vertical axis in Figure 2, the depiction of the Holland (1973) hexagon, and this vertical axis represents Dimension 2 of the exploratory PAMS solution. This axis suggests a dimension with positive coordinates for the Realistic (REA)

CONFIRMATORY PAMS

23

Downloaded by [Washington University in St Louis] at 10:04 12 October 2014

and Investigative (INV) variables and negative coordinates for the Social (SOC) and Enterprising (ENT) variables. The other two variables, Artistic (ART) and Conventional (CON), had trivial scale-values (0.04 and 0.12, respectively). For the confirmatory analysis, the loadings of the two trivial variables (ART and CON) on Factor 3 were set to zero, whereas REA, INV, SOC, and ENT were included as indicators of Factor 3 (Dimension 2). Again, the variables excluded from the marker set had scale-values less than half the size of those included.

Model Estimation

For the confirmatory PAMS analysis, three models were proposed and tested for both females and males. The unweighted least-squares method and maximum likelihood estimation were used to test the models. Because the model (Model 1 below) that did not constrain the level factor loadings to be equal did not yield proper estimates (e.g., an undefined component in estimating the covariance matrix of latent variables) with the maximum likelihood approach, we had to use the least-squares procedure. Consistent with the PAMS assumption of Equation 3, along Factors 2 and 3, the sum of loadings was constrained equal to 0. The proposed models were: (1) Model 1 containing a general factor with unequal loadings; (2) Model 2 containing a general factor with equal loadings; and (3) Model 3 without a general factor. All three models constrained the Factor 2/Factor 3 correlation to zero, as PAMS does not allow major profile correlation. As illustrated below, the hypothesized models that included the level parameter factor (Models 1 and 2) showed better fit than the model without it (Model 3). Figures 6, 7, and 8 depict the paths of Models 1, 2, and 3 for females since paths for males are similar to those of females. The models are presented in the figures where circles represent latent variables and rectangles represent observed variables (or subscales).
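
The loading structure of the three models can be written down as a pattern of estimated versus fixed-at-zero loadings, together with a check of the PAMS constraints. The sketch below is only an illustration of that bookkeeping: the function names are hypothetical, the example loadings are made up, and nothing here performs the estimation that the article carried out in LISREL.

```python
import numpy as np

SCALES = ["REA", "INV", "ART", "SOC", "ENT", "CON"]
MARKERS = {
    "Conformity": ["ART", "CON"],                                # Factor 2 (Dimension 1)
    "Introversion-Extroversion": ["REA", "INV", "SOC", "ENT"],   # Factor 3 (Dimension 2)
}

def loading_pattern(include_level):
    """Boolean scales x factors matrix: True where a loading is estimated."""
    names, columns = [], []
    if include_level:                       # Models 1 and 2 include the level factor
        names.append("Level")
        columns.append([True] * len(SCALES))
    for factor, marks in MARKERS.items():
        names.append(factor)
        columns.append([scale in marks for scale in SCALES])
    return names, np.array(columns).T

def meets_pams_constraints(loadings, names, pattern, equal_level=True):
    """Check zero non-marker loadings, sum-to-zero group factors, and equal level loadings."""
    loadings = np.asarray(loadings, dtype=float)
    group = [i for i, name in enumerate(names) if name != "Level"]
    zeros_ok = np.allclose(loadings[~pattern], 0.0)
    sums_ok = np.allclose(loadings[:, group].sum(axis=0), 0.0)
    level_ok = True
    if equal_level and "Level" in names:
        column = loadings[:, names.index("Level")]
        level_ok = np.allclose(column, column[0])
    return bool(zeros_ok and sums_ok and level_ok)

names, pattern = loading_pattern(include_level=True)
hypothetical = np.array([[0.5,  0.0,  0.4],     # made-up loadings, not estimates
                         [0.5,  0.0,  0.6],
                         [0.5,  0.7,  0.0],
                         [0.5,  0.0, -0.3],
                         [0.5,  0.0, -0.7],
                         [0.5, -0.7,  0.0]])
ok = meets_pams_constraints(hypothetical, names, pattern, equal_level=True)
```

Calling `loading_pattern(include_level=False)` gives the two-factor pattern corresponding to Model 3, and passing `equal_level=False` relaxes the equal-loading check in the way Model 1 does.
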
As the goodness of fit statistics in Table 3 indicate, our hypothesized model, either Model 1 or Model 2, which included the level parameter as a separate factor, appeared to fit significantly better than Model 3, which did not include the level parameter factor, as indicated by the smaller values of $\chi^2$, RMSEA, ECVI, and AIC and by the higher values of TLI compared to Model 3. The same models were applied to males, and, consistent with the results for females, the models that included the profile level as a general factor showed better fit than the model that did not include the level factor. Although fit indices for the male group were not as good as those of females, fit indices for Model 1 (that allowed unequal loadings for the level factor) were satisfactory, whereas those for Model 2 (that constrained the level factor loadings to be equal) were not. The fit indices for the male group were as follows: RMSEA = 0.05 (0.13); TLI = 1.00 (0.78); ECVI = 0.01 (0.19); and AIC = 126.23 (1759.55), where the values in parentheses indicate fit indices from Model 2 that included equal loadings for

FIGURE 6 Female group: Model 1 included level parameter factor with unequal loadings.

TABLE 3
Goodness-of-Model-Fit Statistics for Confirmatory Factor Analysis of Women GOT Data (Harmon et al., 1994)

Fit Statistics                  Model 1: Unequal Loading   Model 2: Equal Loading    Model 3: No Level Factor
$\chi^2$ (df, N) & p-value      $\chi^2$ (3, 9467) & 0.03  $\chi^2$ (10, 9467) & 0.00  $\chi^2$ (11, 9467) & 0.00
RMSEA                           0.02                       0.08                      0.34
Reliability (TLI)               1.00                       0.91                      0.35
ECVI                            0.01                       0.07                      1.27
AIC                             49.00                      644.79                    10206.59

Note. RMSEA = Root Mean Square Error of Approximation; TLI = Tucker Lewis Index; ECVI = Expected Cross Validation Index; AIC = Akaike Information Criterion.

FIGURE 7 Female group: Model 2 included level parameter factor with equal loadings.

FIGURE 8 Female group: Model 3 did not include level parameter factor.

the level factor. Fit indices of Model 3 for males were not included here because they were all as poor as those for females.

SEM Parameter Estimates

Model 1 included two marker variables, ART and CON, on the Dimension 1 factor (Factor 2), and four variables, REA, INV, SOC, and ENT, were entered as marker variables for the Dimension 2 factor (Factor 3). Model 1 included the correlations of the level factor with Factors 2 and 3, Corr(Level, Dim 1) = 0.15 and Corr(Level, Dim 2) = 0.55. The results indicate that there is a substantial relation between the Level factor and the Dimension 2 factor. If the level factor is considered an individual’s response bias as suggested by Hanson et al. (1977) and Prediger (1982), there is a significant relation between the response bias and the Dimension 2 profile pattern. People with high levels of response bias tend to have interest profiles highest on Realistic and Investigative scales and lowest on Social and Enterprising scales. For Model 2, the correlations of Factor 1 with Factors 2 and 3 were virtually zero, Corr(Level, Dim 1) = 0.03 and Corr(Level, Dim 2) = 0.01, and we did not include them in the final CFA model. Both models included the level parameter as a general factor, whereas Model 3 included only the Dimension factors. The factor loadings for only Models 1 and 2 are shown in Table 4 because these two models are our hypothesized models that represent the PAMS model adequately and their fit statistics appear statistically acceptable. Although the scale-values in PAMS were different from the factor weights in CFA since a different parameterization was used for each approach, the same pattern appears in both. Along Dimension 1 in PAMS and Factor 2 (or the

TABLE 4
Confirmatory Factor Analysis Loadings of Models 1 and 2 (in Parentheses) Estimated From the Sample of Women From the General Occupational Theme Standardized Data (Harmon et al., 1994) and Dimension Profile Coordinates Estimated From the Sample From Sackett (1993)

Subscales        Factor 1       Factor 2       Dimension 1   Factor 3       Dimension 2
Realistic        0.27 (0.52)    --             0.43          0.46 (0.37)    1.09
Investigative    0.19 (0.52)    --             0.28          0.70 (0.58)    0.83
Artistic         0.56 (0.52)    0.61 (0.63)    2.21          --             0.04
Social           0.57 (0.52)    --             0.34          0.30 (0.35)    0.87
Enterprising     0.99 (0.52)    --             0.80          0.86 (0.60)    0.97
Conventional     0.38 (0.52)    0.61 (0.63)    1.59          --             0.12

Note. Signs of Dimension 1 coordinates were reversed to match those of factor loadings because changing signs of the coordinates does not affect generality of dimension interpretation.

Dimension 1 factor) in the CFA, ART appears most distinctively at one end and CON appears most distinctively at the other end, just as would be predicted by the Hogan and Blake representation of Holland’s hexagon. Along Dimension 2 in the MDS and Factor 3 (or the Dimension 2 factor) in the CFA, both representing Introversion vs. Extroversion, REA and INV appear at one end and SOC and ENT appear at the other as would be predicted by the Hogan and Blake representation of Holland’s hexagon. The dimensions recovered in the PAMS analysis of one data set (Sackett, 1993) were confirmed by the CFA analysis of the other data set (Harmon et al., 1994). One aspect of the PAMS model in the CFA of real data was not confirmed. Whereas the PAMS model assumes an equal intercept parameter effect for all variables, CFA model fit was substantially improved by allowing the loadings to vary along the first factor representing the intercept. The PAMS equal intercept effect assumption may hold in the first sample on which the exploratory analysis was done. Alternatively, MDS may be robust with respect to violations of the equal intercept assumption and therefore capable of recovering the major profile patterns even when that assumption is violated. In any case, the CFA confirmed the profile patterns of the MDS dimensions but not the intercept assumption on which the MDS analysis is based.

DISCUSSION

Based on an exploratory MDS, one can specify a loading structure for factors beyond the first, but how does one do so? Along CFA factors, the marker variables are those at the extremes of the MDS dimensions. Our suggestion is to include as marker variables for a dimension any variable with a squared scale value above the average squared scale value in the solution, that is, above 1.00 in the case of ALSCAL. Based on theory and trial and error, one may also include some variables with scale values at the extreme end of a dimension but with squared scale values below the average squared scale value. In our example, we included all variables with squared scale values above the average squared scale value as CFA marker variables, but based on theoretical considerations we included two additional variables along the Introversion/Extroversion factor. Loadings for non-marker variables should be set to zero. Along each factor, loadings for marker variables should be freely estimated subject to the constraint that they sum to zero, an imposition of the PAMS constraint in Equation 3 above. The PAMS model also suggests constraining all loadings equal for the first factor.

In short, empirical MDS scale values can be used to pick marker variables for a subsequent CFA. The PAMS model suggests constraining all variable loadings equal along Factor 1 and constraining all variable loadings to sum to zero along factors beyond the first. Imposition of the former constraint was not supported in our data, however. It may well be that not all variables are equally saturated with the construct represented by the PAMS model intercept term, in which case loadings along the first factor would not be equal.

Profile Analysis via Multidimensional Scaling (PAMS) is designed as an exploratory analysis to identify the most typical (major) profile patterns from multivariate data. While PAMS is fundamentally exploratory, the profile patterns identified in PAMS can be subsequently examined in a confirmatory fashion with a second sample. In our example, the Confirmatory Factor Analysis (CFA) approach was used to validate the PAMS major profiles. As illustrated above, when K major profiles are identified, 1 + K latent variables need to be included in the CFA. The additional factor, which represents individual differences in profile level, corresponds to the general factor and can be interpreted as the “g” factor in cognitive tests or as a response-bias factor in Holland’s RIASEC scales (Hanson et al., 1977; Prediger, 1982). When this general factor was not included in the model, the PAMS major profiles were not properly confirmed in CFA. Therefore, anyone who wants to replicate PAMS results using CFA should include in the model being tested one additional factor that corresponds to the PAMS level parameter.

Rounds and Tracey (1993) used the Structural Equation Modeling (SEM) paradigm and MDS to provide converging evidence for Prediger’s dimensional representation (1982) of Holland’s circumplex. There is more of an affinity between the SEM and MDS methods employed by Rounds and Tracey (1993) than may be immediately apparent from their research. SEM models (or CFA, which is a special case of SEM) can be formulated so as to form a confirmatory complement to exploratory MDS, because SEM and MDS are based on very similar models: the PAMS model, in the case of their MDS, and the CFA model, in the case of their SEM.
Both analyses are based on linear latent variable models for the original data. This similarity suggests how CFA can be used to confirm MDS findings from prior samples by including a general factor to represent individual differences in profile level plus several pattern factors, one for each dimension in the original MDS. The pattern factors (or MDS dimensions) represent within-person variation and the structure of ipsatized scores. This similarity of the models connects to research questions involving profile patterns that have been more commonly studied using exploratory techniques, such as cluster analysis or Q-factor analysis. The general factor is a correspondent of individual differences in profile level discussed in the profile literature. Likewise, the group factors, if properly specified, are correspondents of patterns that represent the structure of ipsative scores discussed in the profile literature. Through these connections, CFA can be extended to research questions involving profile patterns, an area in which confirmatory approaches have been lacking.
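
To make the distinction between profile level and profile pattern concrete, the following minimal Python sketch splits a score profile into its level (the person's mean across scales) and its ipsatized pattern (deviations from that mean). The scores and the function name `split_level_and_pattern` are hypothetical and are not taken from the article; the sketch illustrates the decomposition only, not PAMS or CFA estimation.

```python
def split_level_and_pattern(profile):
    """Split one person's scores into level (mean) and ipsatized pattern."""
    level = sum(profile) / len(profile)
    # Ipsatized scores: deviations from the person's own mean, which sum to zero.
    pattern = [score - level for score in profile]
    return level, pattern


# Hypothetical scores on six scales (illustrative numbers only).
person_a = [50.0, 55.0, 60.0, 45.0, 40.0, 50.0]
# Same shape as person_a, shifted upward by a constant of 10.
person_b = [60.0, 65.0, 70.0, 55.0, 50.0, 60.0]

level_a, pattern_a = split_level_and_pattern(person_a)
level_b, pattern_b = split_level_and_pattern(person_b)

print(level_a, level_b)        # the two levels differ
print(pattern_a == pattern_b)  # the two patterns coincide
```

Two people whose profiles differ only by a constant share the same pattern and differ only in level, which is the kind of variation the general factor is meant to absorb.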


Although Rounds and Tracey included the general factor in their SEM model, the MDS model they used (i.e., the individual differences model) and most other MDS models (except PAMS) do not generally include the general factor. Rounds and Tracey included the general factor in their SEM model to match Prediger’s conjecture of the three-factor solution, rather than because of a theory-driven requirement. The current analysis, in contrast, included the general factor in the SEM model not on the basis of Prediger’s a priori factor structure paradigm, but on the basis of the structure of the PAMS model, which generically includes a correspondent of the general factor, the level parameter (c_p). Moreover, our SEM approach is appropriately called “confirmatory factor analysis” (CFA) since exploratory PAMS results (in which identified profiles correspond to latent factors) were replicated in the context of structural equation modeling.

The connection between the PAMS model and the common factor (or, specifically, components analysis) model clarifies why the additional factor is needed: to account for individual differences in profile level (see the section The Factor Model Versus the PAMS Model). The PAMS model does so in the language of profile research and thereby aids in the translation of profile research questions into the CFA model form. Therefore, anyone who utilizes PAMS for exploratory profile analysis and wants to validate the results in the context of CFA has to include one factor in addition to the dimension factors identified by PAMS, irrespective of whether the a priori hypothesis includes the general factor as an additional factor (as in Prediger’s case).

Our specification of the CFA latent variables beyond the first factor differs from that of Rounds and Tracey (1993) in another respect. Holland’s hexagonal configuration, not the dimensions, seemed to be paramount in their thinking. Thus, Rounds and Tracey specified the SEM factors beyond the first factor so as to represent various forms of that hexagonal hypothesis. Not all research on profile patterns involves a hexagonal structure, and such research therefore could not be guided by the conceptualization that seemed to be at the forefront of Rounds and Tracey’s thinking. In our specification of the CFA model, the exploratory dimensions underlying the hexagon served as the basis for specifying the model. This latter approach is not limited to the situation in which the observed variables have a hexagonal structure.

The Rounds and Tracey MDS analysis was not based on an explicit model of the observed data. Without an explicit model that links the latent dimensions to the observed interest variables, there can be no interpretation of the latent MDS dimensions in terms of those observed variables. For more details on this interpretational issue, readers are referred to MacCallum’s (1974) criticism of earlier MDS approaches. The explicit PAMS model clarifies the connection between the observed interest variables and the latent dimensions in a way that enhances the interpretation of the latent dimensions as major profile patterns.


It also enhances the utility of these dimensions as tools for studying profile pattern research questions beyond the vocational interest domain.

We used one data set (Sackett, 1993) as our reference to explore the major profile patterns and then confirmed the exploratory profile patterns in the other data set (Harmon et al., 1994) with the CFA approach. In our case, the two samples were not random samples from the same population. In combination, an exploratory MDS and a subsequent CFA can be used to address two different questions, depending on whether or not the two samples are drawn from the same population. If the samples are drawn from two different populations, as in our example, then the question addressed in the CFA study is whether the major profile patterns found in the first population generalize to the second. If the samples are drawn from the same population, the question addressed is whether the major profile patterns found in the first sample also account for the second sample. The first is a question of generalization; the second is a question of replication.
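
The step that links the exploratory MDS to the subsequent CFA is the choice of marker variables. As a minimal sketch of the rule suggested at the start of the Discussion (a variable is a marker for a dimension when its squared scale value exceeds the average squared scale value in the solution), the Python below applies that rule to the Dimension 1 and Dimension 2 coordinate magnitudes printed in Table 4. Only squared values enter the rule, so signs play no role. The function name `select_markers` and the data layout are illustrative assumptions, not the authors' code.

```python
SCALES = ["Realistic", "Investigative", "Artistic",
          "Social", "Enterprising", "Conventional"]

# Coordinate magnitudes as printed in Table 4 (signs are irrelevant once squared).
DIMENSIONS = {
    "Dimension 1": [0.43, 0.28, 2.21, 0.34, 0.80, 1.59],
    "Dimension 2": [1.09, 0.83, 0.04, 0.87, 0.97, 0.12],
}


def select_markers(dimensions, scales):
    """Flag variables whose squared scale value exceeds the solution average."""
    squared = {name: [v * v for v in coords] for name, coords in dimensions.items()}
    # Average squared scale value over all variables and all dimensions.
    all_squares = [s for row in squared.values() for s in row]
    threshold = sum(all_squares) / len(all_squares)
    markers = {
        name: [scale for scale, s in zip(scales, row) if s > threshold]
        for name, row in squared.items()
    }
    return threshold, markers


threshold, markers = select_markers(DIMENSIONS, SCALES)
print(round(threshold, 2))
for name, chosen in markers.items():
    print(name, chosen)
```

With these coordinates the average squared scale value comes out close to 1.00, the ALSCAL value mentioned in the Discussion; Artistic and Conventional are flagged on Dimension 1 and Realistic on Dimension 2. Markers added on theoretical grounds, as the authors did for the Introversion/Extroversion factor, fall outside this mechanical rule and would have to be appended by hand.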

ACKNOWLEDGMENTS

We give our thanks to Roger Millsap and three anonymous reviewers for their helpful comments to enhance the quality of this study.

REFERENCES

Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Beverly Hills, CA: Sage.
Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.
Browne, M. W., & Cudeck, R. (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136–162). Newbury Park, CA: Sage Publications.
Cattell, R. B. (1967). The three basic factor analysis research designs: Their interrelations and derivatives. In D. N. Jackson & S. Messick (Eds.), Problems in human assessment (pp. 300–304). New York: McGraw-Hill.
Davison, M. L. (1985). Multidimensional scaling versus components analysis of test intercorrelations. Psychological Bulletin, 97, 94–105.
Davison, M. L. (1996). Multidimensional scaling interest and aptitude profiles: Idiographic dimensions, Nomothetic factors. Presidential address to Division 5, American Psychological Association, Toronto (August 10).
Davison, M. L., Gasser, M., & Ding, S. (1996). Identifying major profile patterns in a population: An exploratory study of WAIS and GATB pattern. Psychological Assessment, 8, 26–31.
Davison, M. L., Kuang, H., & Kim, S.-K. (1999). The structure of ability profiles patterns: A multidimensional scaling perspective on the structure of intellect. In P. L. Ackerman, P. C. Kyllonen, & R. D. Roberts (Eds.), Learning and Individual Differences: Process, trait, and content determinants (pp. 187–207). Washington, DC: American Psychological Association.
Gollob, H. F. (1968a). Confounding sources of variation in factor-analysis techniques. Psychological Bulletin, 70, 330–344.


Gollob, H. F. (1968b). A statistical model which combines features of factor-analysis and analysis of variance techniques. Psychometrika, 33, 73–116.
Gower, J. C. (1966). Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, 53, 325–338.
Hansen, J. C., & Campbell, D. P. (1985). Manual for the Strong Interest Inventory (4th ed.). Palo Alto, CA: Consulting Psychologists Press.
Hanson, G. R., Prediger, D. J., & Schussel, R. H. (1977). Development and validation of sex-balanced interest inventory scales (ACT Research Report No. 78). Iowa City, IA: American College Testing Program.
Harmon, L. W., Hansen, J. C., Borgen, F. H., & Hammer, A. L. (1994). Strong Interest Inventory Applications and Technical Guide. Palo Alto, CA: Consulting Psychologists Press.
Heiser, W. J., & Meulman, J. (1983a). Constrained multidimensional scaling, including confirmation. Applied Psychological Measurement, 7, 373–514.
Heiser, W. J., & Meulman, J. (1983b). Analyzing rectangular tables by joint and constrained multidimensional scaling. Journal of Econometrics, 22, 139–167.
Hogan, R., & Blake, R. J. (1996). Vocational interests: Matching self-concept with the work environment. In K. R. Murphy (Ed.), Individual differences and behavior in organizations (pp. 89–144). San Francisco: Jossey-Bass.
Holland, J. L. (1973). Making vocational choices: A theory of careers. Englewood Cliffs, NJ: Prentice Hall.
Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8: User’s reference guide. Chicago: Scientific Software International.
Kim, S.-K., Frisby, C. L., & Davison, M. L. (2004). Estimating cognitive profiles using Profile Analysis via Multidimensional Scaling (PAMS). Multivariate Behavioral Research, 39(4), 595–624.
Konald, T. R., Glutting, J. J., McDermott, P. A., Kush, J. C., & Watkins, M. M. (1999). Structure and diagnostic benefits of a normative subtest taxonomy developed from the WISC-III standardization sample. Journal of School Psychology, 37, 29–48.
Kruskal, J. B. (1964a). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1–27.
Kruskal, J. B. (1964b). Nonmetric multidimensional scaling: A numerical method. Psychometrika, 29, 115–129.
MacCallum, R. C. (1974). Relations between factor analysis and multidimensional scaling. Psychological Bulletin, 81, 505–516.
MacCallum, R. C., Browne, M., & Sugawara, H. M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 2(1), 130–149.
McDermott, P. A., Fantuzzo, J. W., & Glutting, J. J. (1990). Just say no to subtest analysis: A critique on Wechsler theory and practice. Journal of Psychoeducational Assessment, 8, 290–302.
McDermott, P., Fantuzzo, J., Glutting, J., Watkins, M., & Baggaley, A. (1992). Illusions of meaning in the ipsative assessment of children’s ability. Journal of Special Education, 25(4), 504–526.
McDermott, P., & Glutting, J. (1997). Informing stylistic learning behavior, disposition, and achievement through ability subtests—or, more illusions of meaning? School Psychology Review, 26(2), 163–175.
Moses, J. A., Jr., & Pritchard, D. A. (1995). Modal profiles for the Wechsler Adult Intelligence Scale-Revised. Archives of Clinical Neuropsychology, 11, 61–68.
Prediger, D. J. (1982). Dimensions underlying Holland’s hexagon: Missing link between interests and occupations? Journal of Vocational Behavior, 21, 259–287.
Ramsay, J. O. (1977). Maximum likelihood estimation in multidimensional scaling. Psychometrika, 42, 241–266.
Rounds, J., & Tracey, T. J. (1993). Prediger’s dimensional representation of Holland’s RIASEC circumplex. Journal of Applied Psychology, 78, 875–889.


Sackett, S. (1993). Predicting the vocational outcome of college freshmen with undifferentiated profiles on the Strong Interest Inventory: A longitudinal study. Unpublished doctoral dissertation, University of Minnesota Twin Cities, Minneapolis, Minnesota.
Sattler, J. M. (2001). Assessment of children: Cognitive applications (4th ed.). San Diego, CA: Author.
Schonemann, P. H. (1970). On metric multidimensional unfolding. Psychometrika, 35, 349–366.
Shepard, R. N. (1962a). The analysis of proximities: multidimensional scaling with an unknown distance function I. Psychometrika, 27, 125–140.
Shepard, R. N. (1962b). The analysis of proximities: multidimensional scaling with an unknown distance function II. Psychometrika, 27, 219–246.
Stanton, H. C., & Reynolds, C. R. (2000). Configural frequency analysis as a method of determining Wechsler profile types. School Psychology Quarterly, 15, 434–448.
Steiger, J. H. (1990). Structural model evaluation and modification: An interval estimation approach. Multivariate Behavioral Research, 25(2), 173–180.
Takane, Y., Young, F. W., & de Leeuw, J. (1977). Nonmetric individual differences multidimensional scaling: An alternating rotation least squares method with optimal scaling features. Psychometrika, 42, 267–276.
Torgerson, W. S. (1958). Theory and methods of scaling. New York: Wiley.
Tracey, T. J., & Rounds, J. B. (1995). The arbitrary nature of Holland’s RIASEC types: A concentric-circles structure. Journal of Counseling Psychology, 42, 431–439.
Tucker, L. R., & Lewis, C. (1973). A reliability coefficient for maximum likelihood factor analysis. Psychometrika, 38(1), 1–10.
Wechsler, D. (2002). Wechsler Preschool and Primary Scale of Intelligence—Third Edition. San Antonio, TX: The Psychological Corporation.
