This article was downloaded by: [Moskow State Univ Bibliote] On: 04 January 2014, At: 02:11 Publisher: Routledge Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Multivariate Behavioral Research Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/hmbr20
Factors Influencing Four Rules For Determining The Number Of Components To Retain William R. Zwick & Wayne F. Velicer Published online: 10 Jun 2010.
To cite this article: William R. Zwick & Wayne F. Velicer (1982) Factors Influencing Four Rules For Determining The Number Of Components To Retain, Multivariate Behavioral Research, 17:2, 253-269
To link to this article: http://dx.doi.org/10.1207/s15327906mbr1702_5
Multivariate Behavioral Research, 1982, 17, 253-269
FACTORS INFLUENCING FOUR RULES FOR DETERMINING THE NUMBER OF COMPONENTS TO RETAIN
WILLIAM R. ZWICK and WAYNE F. VELICER
University of Rhode Island
ABSTRACT
The performance of four rules for determining the number of components to retain (Kaiser's eigenvalue greater than unity, Cattell's SCREE, Bartlett's test, and Velicer's MAP) was investigated across four systematically varied factors (sample size, number of variables, number of components, and component saturation). Ten sample correlation matrices were generated from each of 48 known population correlation matrices representing the combinations of conditions. The performance of the SCREE and MAP rules was generally the best across all situations. Bartlett's test was generally adequate except when the number of variables was close to the sample size. Kaiser's rule tended to severely overestimate the number of components.
A common problem in the behavioral sciences is to represent a large set of observed variables (p) by some smaller set (m) which still preserves the essential original information. Horst (1965) and Van de Geer (1971) discuss principal component analysis (PCA) as one method of dealing with this problem. Principal component analysis is the most commonly employed method of a wide class of data reduction procedures typically called component analysis. A second class of procedures, typically called factor analysis, has been employed for the same problem. Factor analysis approaches typically assume that the number of factors is known a priori. Sometimes the maximum likelihood test is employed to determine whether the assumed number of factors is correct. The maximum likelihood test is analogous to the Bartlett test for component analysis, which is employed in the present study (Horn and Engstrom, 1979).

Component analysis is the more widely employed of these two general approaches (Kaiser, 1970; Glass and Taylor, 1966). Velicer (1974, 1976a, 1977) has shown that the two approaches result in essentially equivalent solutions. Recently, attention has been focused on an issue that casts some doubt on the appropriateness of the factor analysis model, the factor indeterminacy issue (Guttman, 1955; Schonemann and Wang, 1972; Steiger and Schonemann, 1978). In view of the indeterminacy issue, the comparability of the maximum likelihood and Bartlett tests, the widespread usage of principal component analysis, and the comparability of results across the two methods, it was decided to focus on principal component analysis in the present study.

A common problem in the use of principal component analysis is the determination of the number of components to retain (Velicer, 1976b; Gorsuch, 1974). This problem is particularly acute when some method of rotation is performed upon the retained components. That is, if an unrotated solution is accepted, the structure of the retained components is not affected by the number of components retained. However, in rotated solutions, the retention of a few more or a few less components may drastically change the rotated structure (Comrey, 1978; Kaiser, 1961).

A variety of methods for determining the "appropriate" number of components to retain have been suggested (Bartlett, 1950, 1951; Cattell, 1966; Humphreys and Montanelli, 1975; Joreskog, 1962; Kaiser, 1960; Velicer, 1976b). It has been found that the suggested decision methods do not result in the retention of the same number of components (Cattell and Vogelmann, 1977; Linn, 1968).

The purpose of the present study is to examine the accuracy of four proposed methods for determining the number of components to retain. Simulated data were employed to permit the systematic variation of several factors which can potentially affect the accuracy of the methods. The rules were compared on an uncomplicated underlying structure containing a varying number of well-defined equal common components and equal error components. These simple well-defined structures were employed for two reasons. First, all structures employed meet the assumptions of all four rules. Second, a poor performance by any of the rules on these data would lead to the expectation of an equivalently bad performance on more complex data sets. "Real" or "classic" data sets would not have served the purposes of this study because such data sets preclude the manipulation of the factors of interest and lack a known criterion for comparison purposes.

The most commonly employed rule for determining the number of components is to retain those components with eigenvalues greater than 1.0. This rule was derived from Guttman's (1954) classic work concerning three lower bounds for the number of image components. Kaiser (1960) and Kaiser and Caffrey (1965) have extended the rationale of this rule, which will be referred to as K1. Gorsuch (1974) noted that many users employ the K1 rule to determine the number of components rather than as a lower bound. Difficulties associated with this use are noted by Mote (1970) and Humphreys (1964), who argued that rotation of a greater number of components resulted in more meaningful solutions. They implied that the relatively blind use of the K1 rule, therefore, may sometimes lead to the retention of too few components. Linn (1968) and others (Browne, 1968; Cattell and Jaspers, 1967), however, have found that the number of components retained by this method often overestimates the known underlying component structure. Gorsuch (1974) reported that the number of components retained by K1 is commonly between one third and one fifth of the number of variables included in the correlation matrix. The K1 rule, therefore, although commonly used, is believed by some critics to sometimes underestimate and by others to sometimes grossly overestimate the number of components.

A second approach, proposed by Cattell (1966), involves an examination of the eigenvalues and is typically referred to as the "scree" test. This approach involves plotting the eigenvalues; those falling above a straight line fit through the smaller values are retained. Complications which can occur include: (a) more than one break point in the line; (b) more than one suitable line that may be drawn through the low values; and (c) too gradual a slope from lower to higher eigenvalues to identify a break point in the line. A number of researchers (Cattell and Jaspers, 1967; Cattell and Vogelmann, 1977; Tucker, Koopman, and Linn, 1969) have found the test to be accurate in a majority of the cases investigated. Cliff and Pennell (1967) found more definite breaks with larger (N = 400 vs. N = 100) sample sizes, and Linn (1968) concurred in this conclusion. The use of the scree test always involves issues of interrater reliability. Cattell and Vogelmann (1977) have shown high interrater reliability among both naive and expert judges. However, Crawford and Koopman (1979) have reported extremely low interrater reliabilities.
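The K1 rule described above amounts to counting the eigenvalues of the correlation matrix that exceed 1.0. A minimal Python sketch follows; the function name `k1_count` and the small four-variable matrix are illustrative assumptions and do not come from the study.

```python
import numpy as np

def k1_count(R):
    """Count the eigenvalues of a correlation matrix R that exceed 1.0 (K1 rule)."""
    # R is symmetric, so eigvalsh (real eigenvalues, ascending order) applies.
    eigenvalues = np.linalg.eigvalsh(R)
    return int(np.sum(eigenvalues > 1.0))

# Hypothetical example: two uncorrelated pairs of variables,
# with a correlation of .6 inside each pair.
R = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

# The eigenvalues are 1.6, 1.6, 0.4 and 0.4, so two components are retained.
print(k1_count(R))  # prints 2
```

The same eigenvalues, sorted from largest to smallest, are the values a rater would plot for the scree test; the scree judgment itself is visual and is not automated here.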
Horn and Engstrom (1979) have noted the underlying similarity of Bartlett's test and the scree method. Both tests are based on an analysis (one statistical, the other visual) of the essential equality of the "remaining" eigenvalues. The scree test appears most effective when strong components are present with little confounding due to error or unique components.

Bartlett (1950, 1951) suggested a statistical test of the null hypothesis that the (p - m) remaining eigenvalues are equal. Starting with the first component, each is excluded from the test in turn until the null hypothesis fails to be rejected. The first m excluded components prior to the retention of the null hypothesis are the retained components. The test statistic has an approximate chi-square distribution.

Bartlett's method appears sensitive to the number of subjects employed. Gorsuch (1973, 1974) argued that as the sample size increases, the tests of significance become more powerful and, therefore, less and less substantial differences between eigenvalues are found to be significant. There is usually a set of small, poorly defined, unequal components in any empirical study. Given the presence of these components, the increased power of Bartlett's test can lead to the retention of these small components as the sample size increases. The sample size at which this occurs is a function of both the significance level chosen (Horn and Engstrom, 1979) and the relative size of the smaller components. This can lead to the retention of more components as a function of the number of subjects, other things being equal. It must also be recalled, however, that as the sample size increases the estimates of equal population eigenvalues will become increasingly accurate. This increased accuracy leads to smaller differences between the estimated eigenvalues. If the smaller population eigenvalues are essentially equal, it may be the case, with reasonable ranges of sample size, that this increasingly accurate estimation offsets the increased power of the Bartlett test.

A more recently proposed method (Velicer, 1976b) is based on the matrix of partial correlations. The average of the squared partial correlations is calculated after each of the first m components is partialed out. The minimum average of the squared partial correlations indicates the stopping point for this method. That is, when the average squared partial correlation reaches a minimum, the number of components partialed out is the number of components to be retained.
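The partialing procedure just described can be sketched in Python as follows. This is a hedged illustration rather than the authors' program: the function name `map_count`, the inclusion of the m = 0 baseline (the average squared correlation before anything is partialed out), the upper limit of p - 2 components, and the simulated data are all assumptions made here for the example.

```python
import numpy as np

def map_count(R):
    """Minimum average partial sketch: return the number of components m at
    which the average squared partial correlation is smallest."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    eigenvalues, eigenvectors = np.linalg.eigh(R)      # ascending order
    order = np.argsort(eigenvalues)[::-1]              # largest first
    # Component loadings: eigenvector columns scaled by sqrt(eigenvalue).
    loadings = eigenvectors[:, order] * np.sqrt(np.clip(eigenvalues[order], 0.0, None))
    off_diagonal = ~np.eye(p, dtype=bool)

    # m = 0: nothing partialed out (assumed baseline).
    averages = [np.mean(R[off_diagonal] ** 2)]
    for m in range(1, p - 1):                          # m = 1 .. p - 2
        A = loadings[:, :m]
        C = R - A @ A.T                                # partial covariance matrix
        d = np.sqrt(np.diag(C))
        partial = C / np.outer(d, d)                   # partial correlation matrix
        averages.append(np.mean(partial[off_diagonal] ** 2))
    return int(np.argmin(averages))

# Hypothetical data: 18 variables, 3 components, loadings of .80, N = 2000.
rng = np.random.default_rng(0)
pattern = np.zeros((18, 3))
for j in range(3):
    pattern[6 * j:6 * (j + 1), j] = 0.8
factors = rng.standard_normal((2000, 3))
errors = rng.standard_normal((2000, 18)) * np.sqrt(1 - 0.8 ** 2)
scores = factors @ pattern.T + errors
R_sample = np.corrcoef(scores, rowvar=False)

retained = map_count(R_sample)
print(retained)
```

The sketch is applied to a sample matrix on purpose. On an exact population matrix with tied eigenvalues, the order in which tied components are removed is arbitrary, and the partial correlations can become numerically undefined once a variable is fully accounted for.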
Velicer (1976b) demonstrated that the average of squared partials will continue to decrease until a unique component is partialed out. At that point, the average squared partial will increase. Therefore, the m retained components will contain no unique components. This method, to be referred to as the Minimum Average Partial Method (MAP), is congruent with the factor analytic concept of "common" factors. Velicer (1976b) points out that the method is exact, can be applied with any covariance matrix, and is logically related to the concept of factors representing more than one variable. It is expected that MAP will often produce fewer components than will the K1 method, particularly when the number of variables is large. A relatively recently introduced method, MAP has not been examined systematically to date.

This study's examination of the effectiveness of various decision methods will include the four methods described above. The K1 method was included because of its widespread use and the MAP method because of its unambiguous solution and its relation to "common factor" concepts. Bartlett's statistical method was included in this study because it is the only statistical method appropriate for PCA solutions. The scree test was included because of its apparent simplicity and its reported validity. Each of these methods may be differentially affected by several different variables including sample size, the number of variables, the degree of component identification, and component saturation. The robustness of the four rules in question across these variables is a central focus of this study and may prove to be a useful criterion in choosing among the methods.
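For reference, the Bartlett statistic discussed above is often written in the following form. The article itself does not print the formula, so the expression below is an assumption taken from the commonly cited version of the test; here the lambda_i are the eigenvalues of the sample correlation matrix R in descending order, N is the sample size, p the number of variables, and m the number of components already excluded.

```latex
\chi^2_m = -\left[\, N - 1 - \frac{2p + 5}{6} - \frac{2m}{3} \,\right] \ln R_{p-m},
\qquad
R_{p-m} = \frac{\det(R)}
               {\left( \prod_{i=1}^{m} \lambda_i \right)
                \left[ \dfrac{p - \sum_{i=1}^{m} \lambda_i}{p - m} \right]^{p-m}},
\qquad
df = \frac{(p - m - 1)(p - m + 2)}{2}.
```

Under this form the test is applied for m = 0, 1, 2, ... in turn, and m is taken as the number of components at the first value for which the statistic is not significant at the chosen alpha level.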
Four factors which may affect the various decision rules were investigated: sample size, number of variables, number of components, and component saturation. Only "ideal" patterns were employed, i.e., variables loaded substantially on one and only one component, each component was identified by the same number of variables, and all nonzero loadings were equal. It is important to note that under these conditions, the number of variables per component is directly dependent on the combination of the number of variables and the number of components. Three levels of sample size, three levels of the number of variables, three levels of the number of components, and two levels of component loadings were investigated across each of four decision methods (K1, MAP, BART, SCREE).

The levels within each factor were chosen to represent applied research conditions. Small sample sizes are often a practical necessity. Large samples lead to much greater accuracy of estimation. However, the impact of increasing the sample size decreases as the sample becomes larger and larger. The range of 75 to 450 was chosen to represent small and moderately large samples. An intermediate value of 150 was also included.

Data reduction techniques are typically not required on small data sets. The use of PCA in test construction, on the other hand, often involves 100 or more variables. Data sets of 150 or more variables are no longer rare. To represent this range, 36 variables were chosen as the smallest data set, 72 variables as a moderate set, and 144 variables as the largest set. This range appears representative of applied use and wide enough to impact any rule sensitive to the number of variables.

Two components are required for rotation and three or more are often assumed present. Few applied usages call for more than 15 or 20 orthogonal components. Given the range of variables indicated above and the desire to have an equal number of variables identifying each component, m = 3, 6, and 12 were chosen as the three levels for the number of components factor.

Loadings below .30 or .40 are typically ignored in applied uses of PCA. Loadings of over .85 are rarely found. The lower level of component saturation was set at .50 to avoid trivial loadings while still representing variables which shared only a moderate proportion of their variance with a component. The upper level was set at .80 to represent high loading variables. Linn (1968) has found a similar range (.40 to .80) broad enough to differentially affect decision rules.

One population correlation matrix was generated for each combination of the 3 x 3 x 2 factors outlined above. Each population correlation matrix was determined as follows: A population pattern matrix (C) was created in accordance with the level of the "number of variables factor," the level of the "saturation factor," and the level of the "number of components factor" under consideration. Postmultiplying by its transpose resulted in a covariance matrix R* (CC' = R*). Substitution of ones into the diagonal of R* produced a population correlation matrix (R). The introduction of ones in the diagonal of R raised it to full rank, thereby allowing subsequent analysis.
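The construction just described (pattern matrix C, product CC', ones substituted into the diagonal) can be sketched in Python as below. The function name `population_correlation` and its argument names are illustrative assumptions; the cell shown (p = 36, m = 3, saturation .50) is one of the combinations listed above.

```python
import numpy as np

def population_correlation(p, m, loading):
    """Build R from an 'ideal' pattern: each variable loads on exactly one
    component, every component has p / m variables, all loadings are equal."""
    if p % m != 0:
        raise ValueError("p must be divisible by m for equal-sized components")
    per_component = p // m
    C = np.zeros((p, m))                       # population pattern matrix
    for j in range(m):
        C[j * per_component:(j + 1) * per_component, j] = loading
    R = C @ C.T                                # R* = CC'
    np.fill_diagonal(R, 1.0)                   # ones in the diagonal give R
    return R

R = population_correlation(p=36, m=3, loading=0.5)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
print(np.round(eigenvalues[:4], 2))
```

With 12 variables per component and loadings of .50, each of the three large eigenvalues equals 1 + 11(.25) = 3.75 and the remaining 33 eigenvalues equal 1 - .25 = .75, so exactly three eigenvalues exceed 1.0 in the population.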
Ten sample correlation matrices based upon the population correlation matrix and employing the predetermined sample size were generated using a computer program developed by Montanelli (1975), except for those cells where the number of variables was 144 and the sample size was 75. It should be noted that all four rules would correctly identify the number of components to be retained for the population correlation matrix.

A principal component analysis was then performed on each of the resulting 480 sample correlation matrices. At the time this analysis was performed, the number of components to be retained was determined using each of the three calculable rules (K1, MAP, and BART). Bartlett's test was performed at an alpha level of .05 in all cases. Plots of the eigenvalues for each analysis were obtained. These plots were examined by raters briefly trained in the scree method but uninformed as to the purpose of the study. The two raters were graduate students in psychology. The graphs were presented to each rater independently, and the raters were unaware of each other's responses. The graphs themselves were computer generated on 8 1/2" x 14" sheets of paper and contained no indication of what they represented. The raters achieved high interrater reliability (a = .94) across the 480 samples examined. An experienced expert judge, uninformed as to the purpose of the experiment but familiar with the use of the scree test, rated one sample from each of the 48 cells. The correlation between the judge's ratings and the mean rating of the two raters on the same graph was also high (r = .86).
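The sampling step described above can be illustrated with the following Python sketch. It does not reproduce Montanelli's (1975) program; drawing multivariate normal scores through a Cholesky factor is an assumption made here, and the small six-variable population matrix, the seed, and the function name `sample_correlation` are hypothetical.

```python
import numpy as np

def sample_correlation(R_pop, n, rng):
    """Draw n multivariate normal observations whose population correlation
    matrix is R_pop and return their sample correlation matrix."""
    cholesky_factor = np.linalg.cholesky(R_pop)
    scores = rng.standard_normal((n, R_pop.shape[0])) @ cholesky_factor.T
    return np.corrcoef(scores, rowvar=False)

# Hypothetical small population: 6 variables, 2 components, loadings of .80.
C = np.zeros((6, 2))
C[:3, 0] = 0.8
C[3:, 1] = 0.8
R_pop = C @ C.T
np.fill_diagonal(R_pop, 1.0)

# Ten samples of N = 150 from the same population matrix, as in one design cell.
rng = np.random.default_rng(1982)
samples = [sample_correlation(R_pop, n=150, rng=rng) for _ in range(10)]

# Apply one calculable rule (K1) to every sample and average the counts.
k1_counts = [int(np.sum(np.linalg.eigvalsh(R) > 1.0)) for R in samples]
print(k1_counts, np.mean(k1_counts))
```

Averaging a rule's counts over the ten samples of a cell gives the kind of cell mean that the study compares with the population criterion.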
The mean number of components retained by each rule in each cell of the overall design was computed. This mean was then subtracted from the population criterion. Positive mean difference (d) scores, therefore, indicate underestimation of the population value, while negative mean difference scores indicate overestimation.

Table 1 presents a summary of these results for those cells in which the component saturation was equal to .50 and the sample size was equal to 75. Since Tables 2 through 6 follow the same format as Table 1, a detailed description will be given only for Table 1. The first row of Table 1 examines the performance of the four decision rules (MAP, K1, BART, SCREE) under the condition of three population components (m = 3) and 36 variables (p = 36) in the correlation matrix. Under this condition MAP retained an average of 3.0 components and, thus, had a mean difference of 0.0 from the criterion. K1 retained 12 components for a difference score of -9.0, an overestimation. BART retained an average of 2.0 components for a mean difference of 1.0, an underestimation. SCREE retained an average of 3.1 components for a mean difference of -.1, an overestimation. The second row represents the same values for each decision rule when the three components were based on 72 variables (p = 72). Under these conditions, MAP overestimated slightly (d = -.5), K1 and BART overestimated greatly (-20.8 and -22.2, respectively), and SCREE overestimated slightly (d = -.1). Similarly, examination of the third and fourth rows indicates that, when the number of components is six (m = 6) and the number of variables is 36 (p = 36), MAP and BART somewhat underestimated the number of components, SCREE slightly overestimated, and K1 greatly overestimated the number of components. An examination of the last two rows (m = 12) indicates that, across both levels of the number of variables factor, MAP greatly underestimated, SCREE moderately underestimated, while K1 and BART greatly overestimated the population value.

Table 1. Mean number of components retained and mean difference from population value. (Sample size = 75. Component saturation = .50.)
Column groups: MAP, K1, BART, SCREE.

Table 2 has the same form as Table 1. The sample size has been increased to 150, however, so p = 144 matrices have been included. Under these conditions the performance of MAP, BART, and SCREE improved, but K1 continued to overestimate greatly. Across the three levels of number of components and number of variables, SCREE slightly underestimated, MAP and BART moderately underestimated, while K1 greatly overestimated. It is important to note that much of the underestimation by MAP, BART, and SCREE occurred when the population criterion was 12 and the number of variables was 36. K1's overestimation appeared to be closely linked to the number of variables. For each increase in the number of variables at any one level of the number of components, there appears to be a related increase in the number of components retained by K1.

Table 2. Mean number of components retained and mean difference from population value. (Sample size = 150. Component saturation = .50.)
Column groups: MAP, K1, BART, SCREE.

Table 3 is presented in the same way as Tables 1 and 2. In this case the sample size is 450. BART and SCREE performed well across the three levels of both the number of components (m = 3, 6, 12) and the number of variables (p = 36, 72, 144). SCREE performed perfectly, while MAP underestimated greatly under only one condition. BART slightly over- or underestimated in most cases. K1 consistently overestimated the population criterion. The overestimation again appears to be related to the number of variables.

Table 3. Mean number of components retained and mean difference from population value. (Sample size = 450. Component saturation = .50.)
Columns: m, p; X and d for each of MAP, K1, BART, and SCREE.

Tables 4, 5, and 6 parallel Tables 1, 2, and 3 with an increase in component saturation from .50 to .80. Table 4 indicates that BART again greatly overestimated when the number of variables was equal to 72 and the sample size was 75. K1 continued to overestimate the population criterion but to a much lesser degree. MAP and SCREE very closely approximated the population criterion in all cells.

Table 4. Mean number of components retained and mean difference from population value. (Sample size = 75. Component saturation = .80.)
Column groups: MAP, K1, BART, SCREE.

Table 5 summarizes the results when the sample size was increased to 150. There was some underestimation by BART when the number of variables was 144. SCREE and MAP performed very well. K1 continued to overestimate, but this error was essentially restricted to the case of p = 144.

Table 5. Mean number of components retained and mean difference from population value. (Sample size = 150. Component saturation = .80.)
Column groups: MAP, K1, BART, SCREE.

Table 6 represents what theoretically should have been the "best case" for all the rules employed. The component saturation was high (.80) and the sample size was large (N = 450). MAP performed perfectly with no mean over- or underestimations. SCREE closely followed with only a few small over- and underestimations. BART, though still slightly underestimating when p = 144, had only a slight underestimation overall. K1 continued to overestimate, but only in the case of p = 144 when the number of components was equal to 12.
Table 6. Mean number of components retained and mean difference from population value. (Sample size = 450. Component saturation = .80.)
Columns: m, p; X and d for each of MAP, K1, BART, and SCREE.
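The sign convention for the difference scores in Tables 1 through 6 (population criterion minus mean number retained) can be checked with a short Python sketch. The means used are the ones quoted in the text for the first row of Table 1 (m = 3, p = 36, N = 75, saturation .50); the variable names are illustrative assumptions.

```python
# Population criterion and cell means quoted in the text for Table 1, row 1.
criterion = 3
mean_retained = {"MAP": 3.0, "K1": 12.0, "BART": 2.0, "SCREE": 3.1}

# d = criterion - mean retained: positive means underestimation,
# negative means overestimation.
d = {rule: round(criterion - x, 1) for rule, x in mean_retained.items()}
print(d)  # {'MAP': 0.0, 'K1': -9.0, 'BART': 1.0, 'SCREE': -0.1}
```

Read this way, the -9.0 for K1 is an overestimate of nine components and the 1.0 for BART is an underestimate of one.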
The performance of four decision rules for determining the number of components to retain was examined. Ten samples were drawn from each of 48 known population correlation matrices. These 48 matrices represented four systematically varying parameters. The four parameters were the number of components (m = 3, 6, 12), the number of variables (p = 36, 72, 144), the sample size (N = 75, 150, 450), and the component saturation (CS = .50, .80). The four decision rules employed were K1, SCREE, MAP, and Bartlett's test. All four rules would retain the appropriate number of components when applied to the population correlation matrix.

The study employed very simple population correlation matrices. That is, no unique or complex variables were included. These matrices all involved only large common components and error. This situation conformed exactly to the assumptions of all four rules. This antiseptic condition was seen as a good first comparative testing ground for these decision rules. Weak performance by any decision rule under these conditions would appear to be strong evidence against the continued applied use of that rule.

The K1 rule was found to have consistently overestimated the number of components. This finding parallels those of Linn (1968), Cattell and Jaspers (1967), and Browne (1968). At low saturation, the number retained often fell in the 1/3 to 1/5 p range discussed by Gorsuch (1974). As the number of variables increased, so did the number of components retained. Particularly at .50 loadings, the number of components retained appeared more related to the number of variables than to the number of components in the structure. This finding is clearly contrary to Mote's (1970) and Humphreys' (1964) suggestion that the K1 rule may retain too few components. Their work involved actual test data, and their conclusions about the appropriate number of components were based upon a priori decisions or subjective interpretability of the component pattern. The probable presence of poorly defined components on the one hand and the absence of a verifiable criterion on the other may explain their opposing results. Given the apparent functional relationship of the number of components retained to the number of variables in this study, it is difficult to recommend the continued use of the K1 rule.

The BART rule performed well in most cases except when the number of variables closely approached the sample size. It had been expected that BART would overestimate the number of components for large sample sizes (Gorsuch, 1974). This did not occur. The range of sample size in this study was chosen to reflect applied situations. The absence of small unequal components in these data sets may have masked the effect of increased power relative to sample size.
The difficulty of choosing an "appropriate" significance level in practice is an applied limitation of the BART rule (Horn and Engstrom, 1979) which was not addressed in this study. Bartlett's test was most inaccurate at the smallest sample sizes when the number of variables was close to the sample size (i.e., p = 72, N = 75). A maximum likelihood test (Joreskog, 1962) similar to the Bartlett test involves a specific correction factor for situations of this type. No such correction exists for Bartlett's test. Until such a correction is developed, the Bartlett test should not be employed in such situations.
As the first comparative test of the MAP rule, this study demonstrated the procedure's general accuracy. MAP consistently underestimated the number of components in one situation. This occurred when the number of components was large relative to the number of variables and the component saturation was low (i.e., m = 12, p = 36 and 72, and CS = .50). This may have occurred because the common component in such a case is nearly as poorly defined as the unique part of a single variable. An investigator who wished to retain such components would be advised to employ an alternate procedure. This study indicates that MAP is a viable decision rule worthy of future consideration. Its unequivocal stopping point and its relation to the concept of "common factors" argue for further examination.

The SCREE test generally performed best across all settings. These results expand on those of Cattell and Vogelmann (1977) concerning the SCREE's accuracy. High interrater reliability was attained. This is contrary to the extremely low interrater reliabilities reported by Crawford and Koopman (1979). The absence of minor and/or unequal components in the sampled data sets may have made the raters' decisions simpler and, thus, more reliable. The accuracy of this rule relative to the three others across the 48 situations examined is perhaps the strongest support available for the use of the SCREE procedure. Certainly, future research in this area must now include this test.

It appears, from these observations, that component saturation had the greatest impact upon the accuracy of three of the decision rules. Each rule, except BART, performed better at the higher (.80) level of saturation than at the lower (.50) level. It should be recalled that components made up of .50 loadings are typically not considered weakly defined. Increases in sample size generally improved rule performance.
Increases in the number of variables examined had a dramatically detrimental effect upon K1 but did not appear to negatively affect any other rule. Such increases, in fact, appeared to aid MAP and SCREE. All rules had difficulty identifying 12 components under conditions of small sample size, .50 saturation, and 36 or 72 variables. This effect may have been caused by the smaller number of variables identifying these components.

Although previous studies have examined subsets of these rules under some of the conditions examined, the present study presents comparisons across a wider variety of situations involving the controlled manipulation of more variables than any previous investigation. In addition, each of the 48 population structures involved ten samples. In those areas where the simulated situations were similar, the results of Linn (1968) and Cattell and Vogelmann (1977) were confirmed and expanded. The present study compared simultaneously four rules representing different approaches to the number of components problem, one of which, MAP, had not been systematically examined before.

Further examination of these rules, particularly SCREE and MAP, under more complex conditions is called for. Applied data sets often include complex variables which load above .50 on more than one component. The impact of such complex variables upon these decision rules has not been systematically examined to date. Further, it is not uncommon to find a single variable loading substantially on a component identified by no other variable. These unique variables may lead to a component being retained by some rules but not by others. Finally, it is common to find a different number of variables identifying various components within any data set. This inequality may differentially affect decision rules. For instance, in a p = 40, m = 5 case, two sets of 11 variables could identify each of the first two components and three sets of six variables could identify each of the next three components. Such a multiple break structure may pose problems for rules such as SCREE and BART, which search the smaller eigenvalues for equality. Variables such as the three described above may drastically influence the performance of these rules. An examination of at least these factors is needed before a final decision can be reached concerning which of these rules is generally most appropriate.

Based upon the results of this study, which included samples drawn from simple, well defined population correlation matrices, we can conclude there is no evidence supporting the continued use of the K1 rule.
The BART procedure was found to generally perform quite accurately except when the number of variables was close to the sample size. In the latter situations, BART greatly overestimated the number of components and, therefore, should not be employed under such conditions. A correction such as that employed by Jöreskog (1962) in a maximum likelihood application might remedy this problem. The MAP procedure was generally accurate except when the component saturation was low and few variables defined a component. If an investigator wishes to retain components of that type, other procedures may be more appropriate. The SCREE was generally the most accurate rule across all cases.
REFERENCES
Bartlett, M. S. Tests of significance in factor analysis. The British Journal of Psychology, 1950, 3, 77-85.
Bartlett, M. S. A further note on tests of significance in factor analysis. The British Journal of Psychology, 1951, 4, 1-2.
Browne, M. W. A note on lower bounds for the number of common factors. Psychometrika, 1968, 33, 2, 233.
Cattell, R. B. The scree test for the number of factors. Multivariate Behavioral Research, 1966, 1, 245-276.
Cattell, R. B., & Jaspers, J. A general plasmode for factor analytic exercises and research. Multivariate Behavioral Research Monographs, 1967, 212 pp.
Cattell, R. B., & Vogelman, S. A comprehensive trial of the scree and KG criteria for determining the number of factors. Multivariate Behavioral Research, 1977, 12, 289-325.
Cliff, N., & Pennell, R. The influence of communality, factor strength and loading size on the sample characteristics of factor loadings. Psychometrika, 1967, 32, 309-326.
Comrey, A. L. Common methodological problems in factor analysis. Journal of Consulting and Clinical Psychology, 1978, 46, 648-659.
Crawford, C. B., & Koopman, P. Note: Interrater reliability of scree test and mean square ratio test of number of factors. Perceptual and Motor Skills, 1979, 49, 223-226.
Glass, G. V., & Taylor, P. A. Factor analytic methodology. Review of Educational Research, 1966, 36, 566-587.
Gorsuch, R. L. Using Bartlett's significance test to determine the number of factors to extract. Educational and Psychological Measurement, 1973, 33, 361-364.
Gorsuch, R. L. Factor Analysis. Philadelphia: Saunders, 1974.
Guttman, L. Some necessary conditions for common factor analysis. Psychometrika, 1954, 19, 149-162.
Guttman, L. The determinacy of factor score matrices with implications for five other basic problems of common-factor theory. British Journal of Statistical Psychology, 1955, 8, 65-81.
Horn, J. L., & Engstrom, R. E. Cattell's scree test in relation to Bartlett's chi-square test and other observations on the number of factors problem. Multivariate Behavioral Research, 1979, 14, 283-300.
Horst, P. Factor Analysis of Data Matrices. New York: Holt, Rinehart and Winston, 1965.
Humphreys, L. G. Number of cases and number of factors: An example where N is very large. Educational and Psychological Measurement, 1964, 24, 457.
Humphreys, L. G., & Montanelli, R. G. An investigation of the parallel analysis criterion for determining the number of common factors. Multivariate Behavioral Research, 1975, 10, 193-205.
Jöreskog, K. G. On the statistical treatment of residuals in factor analysis. Psychometrika, 1962, 27, 335-354.
Kaiser, H. F. The application of electronic computers to factor analysis. Educational and Psychological Measurement, 1960, 20, 141-151.
Kaiser, H. F. A note on Guttman's lower bound for the number of common factors. British Journal of Statistical Psychology, 1961, 14, 1-2.
Kaiser, H. F. A second-generation little jiffy. Psychometrika, 1970, 35, 401-415.
Kaiser, H. F., & Caffrey, J. Alpha factor analysis. Psychometrika, 1965, 30, 1-14.
Linn, R. L. A Monte Carlo approach to the number of factors problem. Psychometrika, 1968, 33, 37-71.
Montanelli, R. G., Jr. A computer program to generate sample correlation and covariance matrices. Educational and Psychological Measurement, 1975, 35, 195-197.
Mote, T. A. An artifact of the rotation of too many factors: Study orientation vs. trait anxiety. Revista Interamericana de Psicologia, 1970, 4, 171.
Schönemann, P. H., & Wang, M. Some new results on factor indeterminacy. Psychometrika, 1972, 37, 61-91.
Steiger, J. H., & Schönemann, P. H. A history of factor indeterminacy. In Theory Construction and Data Analysis in the Behavioral Sciences. (S. Shye, Ed.) San Francisco: Jossey-Bass, 1978.
Tucker, L. R., Koopman, R. F., & Linn, R. L. Evaluation of factor analytic research procedures by means of simulated correlation matrices. Psychometrika, 1969, 34, 421.
Van de Geer, J. P. Introduction to Multivariate Analysis for the Social Sciences. San Francisco: Freeman, 1971.
Velicer, W. F. A comparison of the stability of factor analysis, principal component analysis and rescaled image analysis. Educational and Psychological Measurement, 1974, 34, 563-572.
Velicer, W. F. The relation between factor score estimates, image scores, and principal component scores. Educational and Psychological Measurement, 1976, 36, 149-159. (a)
Velicer, W. F. Determining the number of components from the matrix of partial correlations. Psychometrika, 1976, 41, 3, 321-327. (b)
Velicer, W. F. An empirical comparison of the similarity of principal component, image, and factor patterns. Multivariate Behavioral Research, 1977, 12, 3-22.