Accuracy of Short Forms of the Dutch Wechsler Preschool and Primary Scale of Intelligence: Third Edition

Assessment 1–10. © The Author(s) 2015. Reprints and permissions: sagepub.com/journalsPermissions.nav. DOI: 10.1177/1073191115577189. asm.sagepub.com

Petra Hurks (1), Jos Hendriksen (2, 3), Joelle Dek (4), and Andress Kooij (4)

Abstract

This article investigated the accuracy of six short forms of the Dutch Wechsler Preschool and Primary Scale of Intelligence–Third edition (WPPSI-III-NL) in estimating intelligence quotient (IQ) scores in healthy children aged 4 to 7 years (N = 1,037). First, the overall accuracy of each short form was studied by comparing the IQ equivalents based on the short forms with the original WPPSI-III-NL Full Scale IQ (FSIQ) scores. Next, the sample was divided into three groups (children performing below average, average, or above average, based on the FSIQ estimates of the original long form) to study the accuracy of the WPPSI-III-NL short forms at the tails of the FSIQ distribution. In the entire sample, all IQ estimates of the WPPSI-III-NL short forms correlated highly with the FSIQ estimates of the original long form (all rs ≥ .83). Correlations decreased significantly when only the tails of the IQ distribution were studied (rs varied between .55 and .83). Furthermore, IQ estimates deviated significantly from the FSIQ score of the original long form when they were based on short forms containing only two subtests. In contrast to the other short forms containing two to four subtests, the Wechsler Abbreviated Scale of Intelligence short form (containing the subtests Vocabulary, Similarities, Block Design, and Matrix Reasoning) and the General Ability Index short form (containing the subtests Vocabulary, Similarities, Comprehension, Block Design, Matrix Reasoning, and Picture Concepts) showed less variation when compared with the original FSIQ score.

Keywords

short forms, intelligence, young children

Intelligence tests are frequently included in assessments of children and adults (Watkins, Glutting, & Lei, 2007), either in clinical practice or for research purposes (Hurks, Hendriksen, Dek, & Kooij, 2013).
The extensive empirical literature on intelligence testing in older children and adults contrasts with the sparse reports on young children (4–7 years; Baron & Leonberger, 2012). The Wechsler Preschool and Primary Scale of Intelligence–Third edition (original English version: WPPSI-III; Wechsler, 2002b; Dutch version: WPPSI-III-NL; Wechsler, 2010) is one of the few well-validated and widely used tests available for assessing the intelligence of these young children (Ford & Dahinten, 2005; Sattler, 2008). Here, Wechsler introduced subtests that, according to him, highlight several cognitive aspects of intelligence, such as abstract reasoning, perceptual organization, verbal comprehension, quantitative reasoning, memory, and processing speed (Hurks et al., 2013; Wechsler, 2010). Based on these subtests, an “original” WPPSI-III composite score (the Full Scale Intelligence Quotient [FSIQ] score) can be calculated, which represents the child’s intellectual abilities. The FSIQ is based on the summation of “points earned” across these WPPSI-III subtests, with each subtest being accorded the same weight as the others (Hurks et al., 2013). In both standardization samples and clinical samples, this composite score has repeatedly been identified as an early and robust predictor of important life outcomes in terms of education, work, and social interactions (Flanagan & Harrison, 2012; Gottfredson, 1998; Watkins, Glutting, et al., 2007; Watkins, Lei, & Canivez, 2007; Wechsler, 2002a, 2002b, 2010; Yang, Jong, Hsu, & Lung, 2011). Administering the WPPSI-III subtests required to calculate the FSIQ score to young children can take more than 90 minutes (Wechsler, 2010). However, it may be

(1) Maastricht University, Maastricht, Netherlands
(2) Kempenhaeghe Center for Neurological Learning Disabilities, Heeze, Netherlands
(3) University Hospital Maastricht, Maastricht, Netherlands
(4) Pearson Test Publishers, Amsterdam, Netherlands

Corresponding Author: Petra Hurks, Department of Neuropsychology and Psychopharmacology, Maastricht University, P.O. Box 616, 6200 MD, Maastricht, Netherlands. Email: [email protected]

Downloaded from asm.sagepub.com at UNIVERSITE DE MONTREAL on September 13, 2015

impractical to administer all WPPSI-III subtests, for instance, when an intelligence estimate is needed as a covariate in empirical research or when working with children with short attention spans due to neurological conditions and/or learning disorders (Hrabok, Brooks, Fay-McClymont, & Sherman, 2014). In these cases, more time needs to be devoted to the assessment of other domains of functioning that may be more essential in answering the research or clinical referral questions (Reid-Arndt, Allen, & Schopp, 2011; Sattler, 2008). Therefore, there is often a need for a short form of the WPPSI-III (or a shortened test version) to estimate the individual’s IQ in evaluations of research and/or patient populations. Hence, there is reason to examine whether an abbreviated version of the WPPSI-III-NL yields a reliable and valid estimate of a child’s intelligence.

In the literature, several short forms of the Wechsler scales for older children (Wechsler Intelligence Scale for Children–Fourth edition [WISC-IV]) and adults (Wechsler Adult Intelligence Scale–Third edition [WAIS-III]) have been examined. Previous research using the WISC-IV or the WAIS-III has found high correlations (in general ≥ .90) between the IQ estimates of the short forms and the FSIQ estimates of the original long forms (Axelrod, 2002; Axelrod, Ryan, & Ward, 2001; Hrabok et al., 2014; Reid-Arndt et al., 2011). The strength of this relationship is positively correlated with the number of subtests in the short forms, with the most accurate estimates being obtained after administering seven subtests (clinical accuracy rates of 78% to 94%; Hrabok et al., 2014). However, according to Miller, Streiner, and Goldberg (1996), all combinations of four or more subtests provide a (reasonably) accurate IQ estimate in adults and older children.
To our knowledge, the number of studies that have examined the psychometric properties of short forms of intelligence tests that assess IQ in young children is extremely limited, and for the WPPSI-III in particular such studies are nonexistent. The Stanford–Binet Intelligence Scales–Fifth edition (Roid, 2003), a test used to estimate the FSIQ of individuals aged 2 to 85 years, includes a short form (Abbreviated IQ) that is based on two subtests: a nonverbal subtest (Object Series/Matrices) and a verbal subtest (Vocabulary; Zwaigenbaum et al., 2012). Coolican, Bryson, and Zwaigenbaum (2008) found that in children with autism spectrum disorders, the Abbreviated IQ was highly correlated (.95) with the FSIQ estimates of the original long version of the Stanford–Binet Intelligence Scale. However, as mentioned above, in older children and adults, the accuracy of the IQ estimates of short forms increases when more than three subtests are included in the short form. Data on whether and how the accuracy of short forms varies as a function of the number of subtests in young children are not yet available. Therefore, the present study investigated the accuracy of six short forms of the Dutch WPPSI-III-NL to determine which short forms yield the most accurate IQ estimate in healthy children who are between 4 and 7 years 11 months of age. We anticipated finding (a) high positive correlations between the IQ estimates of the short forms and the FSIQ estimates of the original long form and (b) positive correlations between the number of subtests included in the short forms and the strength of the relationship between the IQ estimates of the short forms and the FSIQ estimates of the original long form.

Method

Participants

A total of 1,037 healthy children aged 4 to 7 years 11 months were administered the 14 subtests of the Dutch WPPSI-III-NL test battery (Hurks et al., 2013; Wechsler, 2010). Children who had a clinical condition known to affect cognition (such as epilepsy or attention-deficit/hyperactivity disorder) were excluded from the sample. All children who participated in the study were native Dutch speakers, and all parents (or caregivers) gave consent for their child to participate in the study. Furthermore, participants were stratified, based on Bureau of Census data (Centraal Bureau voor Statistiek, 2007; Nationaal Instituut voor de Statistiek, 2007; Vlaams ministerie van Onderwijs en Vorming, 2007), on (geographic) regional distribution, parental educational level, and ethnic group. Participants were obtained from all regions of the Netherlands and Flanders, which is the Dutch-speaking part of Belgium (Hurks et al., 2013).

Instruments

The Dutch WPPSI-III-NL consists of 14 subtests, including 7 core subtests that are used to calculate the FSIQ scores and 7 supplemental subtests (Wechsler, 2010). For a description of these subtests, see Hurks et al. (2013). Subtest scaled scores have a possible range from 1 to 19, a mean of 10, and a standard deviation of 3. FSIQ scores range between 55 and 145, with a mean of 100 and a standard deviation of 15. Reliability coefficients for the Dutch WPPSI-III-NL FSIQ score range between 0.84 (test–retest reliability) and 0.92 (internal consistency) for children aged 4 to 7 years. Six short forms of the WPPSI-III-NL were developed using different combinations of the WPPSI-III-NL subtests. The number of subtests included in the short forms ranged from two to six. In particular, we used subtests that have been utilized in the short forms of the WISC-IV and the WAIS-III (see Table 1 for a description of the short forms used in our study). Next, the sums of the standard scores were transformed using Cureton and Tukey’s (1951) smoothing method in equating (as described in Angoff, 1984) to obtain estimated IQ scores (M = 100, SD = 15, range: 55-145). Table 2 shows the

Table 1. An Overview of the Different WPPSI-III Short Form Versions Examined in the Study.

Name of shortened version | WPPSI-III subtests included | References
FSIQ | Vocabulary, Information, Word Reasoning, Block Design, Matrix Reasoning, Picture Concepts, Coding | Wechsler (2010)
SF2a_VcBd | Vocabulary, Block Design | Sattler (1992), Sattler and Dumont (2004)
SF2b_VcMr | Vocabulary, Matrix Reasoning | Wechsler (1999)
SF4a_VcSiMrSs | Vocabulary, Similarities, Matrix Reasoning, Symbol Search | Hrabok et al. (2014)
SF4b_VcMrSsCd | Vocabulary, Matrix Reasoning, Symbol Search, Coding | Hrabok et al. (2014)
SF4c_VcSiBdMr | Vocabulary, Similarities, Block Design, Matrix Reasoning | This combination is also known as the Wechsler Abbreviated Scale of Intelligence (Wechsler, 1999)
SF6_GAI | Vocabulary, Similarities, Comprehension, Block Design, Matrix Reasoning, Picture Concepts | GAI (Saklofske, Gorsuch, Weiss, Zhu, & Patterson, 2005; Saklofske, Prifitera, Rolfhus, Zhu, & Weiss, 2005)

Note. WPPSI-III = Wechsler Preschool and Primary Scale of Intelligence–Third edition; SF = short form version; FSIQ = Full Scale Intelligence Quotient; Vc = Vocabulary; Bd = Block Design; Mr = Matrix Reasoning; Si = Similarities; Ss = Symbol Search; Cd = Coding; GAI = General Abilities Index.

estimated IQ equivalents for the sum of scaled scores of each shortened version of the Dutch WPPSI-III-NL.
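
As a rough illustration of the scoring step described above, the sketch below converts sums of scaled scores into IQ-style equivalents with a plain linear transformation (M = 100, SD = 15, limited to the range 55 to 145). This is not the Cureton and Tukey (1951) smoothing procedure used in the study; it is a simpler stand-in that only shows the target metric. The function name `linear_iq_equivalents` and the sample sums are hypothetical.

```python
from statistics import mean, pstdev


def linear_iq_equivalents(sums, target_mean=100.0, target_sd=15.0, lo=55, hi=145):
    """Map sums of scaled scores to IQ-style scores by linear rescaling.

    Assumption: a simple z-score transformation, NOT the smoothing and
    equating method reported in the article.
    """
    m = mean(sums)
    s = pstdev(sums)  # population SD of the observed sums
    if s == 0:
        raise ValueError("sums of scaled scores have zero variance")
    equivalents = []
    for x in sums:
        iq = target_mean + target_sd * (x - m) / s
        # Limit to the reporting range used for the IQ estimates.
        equivalents.append(int(min(hi, max(lo, round(iq)))))
    return equivalents


# Made-up sums of scaled scores for five children (illustration only).
sample_sums = [12, 16, 20, 24, 28]
iq_estimates = linear_iq_equivalents(sample_sums)
```

A linear mapping of this kind preserves the rank order of the sums but does not smooth irregularities in the observed distribution, which is one reason an equating method may be preferred when building a conversion table.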

Results

Entire Sample

The FSIQ estimates of the original long WPPSI-III-NL test were considered to be the referent measures, in line with Spinks et al. (2009). Means, standard deviations, and minimum and maximum scores for (a) the WPPSI-III-NL FSIQ estimates and (b) the IQ estimates of all six WPPSI-III-NL short forms are reported in Table 3. General Linear Model (GLM) repeated measures analyses and post hoc (least significant difference [LSD]) comparisons examined the statistical differences between the FSIQ estimates of the original long form and the IQ estimates of each short form included. We found that the IQ estimates differed significantly from each other (Pillai’s trace: F(6, 1010) = 5.79, p < .001, ηp² = 0.03). The IQ estimates based on the short forms “SF2a_VcBd (short form 2a_Vocabulary and Block Design)” and “SF2b_VcMr (short form 2b_Vocabulary and Matrix Reasoning)” were significantly higher than all other IQ estimates (including the FSIQ estimates). None of the other comparisons were significant.

Next, Pearson correlations were calculated as a measure of agreement between the FSIQ estimates of the original long WPPSI-III-NL form and the IQ estimates of each short form included. Also, intraclass correlations were calculated to estimate the 95% confidence intervals (CI) of these correlations. Results of these correlations and 95% CIs are displayed in Table 4. In the entire sample, all IQ estimates of the short forms correlated highly with the FSIQ estimates of the original long form, all rs ≥ .83.

Furthermore, the percentage of agreement (defined as ±5 IQ points) between the FSIQ estimates of the original long form and the IQ estimates of each short form was also calculated. In a relatively large share of cases, the IQ estimates of the short forms deviated more than ±5 points from the FSIQ estimates of the original long form; that is, percentages varied between 30.83% and 53.38% (see Table 3). GLM repeated measures analyses and post hoc (LSD) comparisons examined the statistical differences between the FSIQ estimates of the original long form and the IQ estimates of each short form in terms of ±5 point deviations. We found that the short forms differed significantly from each other in how much their IQ estimates deviated from the FSIQ estimates of the original long form (Pillai’s trace: F(5, 1011) = 34.53, p < .001, ηp² = 0.15). Compared with all other short forms, IQ estimates of the short forms “SF4c_VcSiBdMr (Vocabulary, Similarities, Block Design, and Matrix Reasoning)” and “GAI (Vocabulary, Similarities, Comprehension, Block Design, Matrix Reasoning, and Picture Concepts)” deviated significantly less from the FSIQ estimates of the original long form. None of the other comparisons were significant.

Finally, all aforementioned analyses were repeated while studying the effects of the participants’ age on the accuracy of the IQ estimates of the short forms. First, the GLM multivariate analyses comparing the FSIQ estimates of the original long form with the IQ estimates of each short form included were repeated, with “age group” entered in the model as an independent factor. Age group was defined as a categorical factor with four categories (age [years] “4,” “5,” “6,” and “7”). All aforementioned main effects for IQ estimates remained significant. In contrast, the interaction of age by IQ estimates was not significant (Pillai’s trace: F(18, 3027) = 1.10, p = .350, ηp² = 0.01). Also, Pearson correlations were calculated for

Table 2. Estimated IQ Equivalents for the Sum of Scaled Scores of All Six Short Versions of the WPPSI-III-NL.

Columns: Standardized scores (55 to 145) | SF2a_VcBd | SF2b_VcMr | SF4a_VcSiMrSs | SF4b_VcMrSsCd | SF4c_VcSiBdMr | SF6_GAI

[Table body: for each standardized score, the corresponding sum (or range of sums) of scaled scores per short form. The cell alignment of this conversion table was lost in extraction.]

Note. IQ = intelligence quotient; WPPSI-III-NL = Dutch version of Wechsler Preschool and Primary Scale of Intelligence–Third edition; SF = short form version; Vc = Vocabulary; Bd = Block Design; Mr = Matrix Reasoning; Si = Similarities; Ss = Symbol Search; Cd = Coding; GAI = General Abilities Index.

each “age group” separately. Again, no age differences were found in terms of Pearson correlations, that is, all rs ≥ .80 (see Table 4). Furthermore, the percentage of agreement (defined as ±5 IQ points) between the FSIQ estimates of the original long form and the IQ estimates of each short form was calculated per “age group” (see Table 3). Again, GLM repeated measures analyses and post hoc (LSD) comparisons were conducted to examine the statistical difference between the FSIQ estimates of the original long form and the IQ estimates of each short form in terms of ±5 point deviations. No age differences were found for the percentages of agreement between the FSIQ estimates of the original long form and the IQ estimates of each short form, that is, in terms of ±5 point deviations (Pillai’s trace: F(15, 3030) = 1.26, p = .221, ηp² = 0.01). All other results discussed above remained the same after including the variable “age group” in the analyses.
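
The three accuracy indices used in this section (the Pearson correlation, the percentage of cases within ±5 IQ points, and the absolute mean difference) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names `pearson_r` and `agreement_summary` are hypothetical, the scores are made up, and "within ±5 points" is taken to include a difference of exactly 5.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def agreement_summary(fsiq, short_iq, tolerance=5):
    """Summarize how closely short-form IQ estimates track the long-form FSIQ."""
    diffs = [abs(a - b) for a, b in zip(fsiq, short_iq)]
    return {
        "r": pearson_r(fsiq, short_iq),
        # Share of children whose short-form estimate is within the tolerance.
        "pct_within": 100 * sum(d <= tolerance for d in diffs) / len(diffs),
        "abs_mean_diff": sum(diffs) / len(diffs),
    }


# Made-up scores for four children (illustration only).
fsiq = [85, 100, 115, 130]
short_form = [88, 94, 117, 128]
summary = agreement_summary(fsiq, short_form)
```

The example shows why the indices are reported side by side: the made-up scores correlate very highly, yet one of the four estimates still falls more than 5 points from the long-form score.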

Table 3. Means, Standard Deviations, Minimum and Maximum Scores for the WPPSI-III-NL FSIQ, and All Six Short Versions of the WPPSI-III-NL.

Entire sample

Variables | M | SD | Minimum | Maximum | % within 5 points FSIQ | Abs. mean difference | SD (abs. difference)
FSIQ | 100.50 | 14.84 | 55 | 145 | n/a | n/a | n/a
SF2a_VcBd | 101.60 | 15.00 | 55 | 145 | 50 | 6.67 | 5.13
SF2b_VcMr | 101.55 | 15.01 | 55 | 145 | 46.62 | 7.03 | 5.30
SF4a_VcSiMrSs | 100.91 | 15.02 | 57 | 145 | 47.44 | 6.55 | 4.56
SF4b_VcMrSsCd | 100.96 | 15.01 | 55 | 145 | 48.54 | 6.80 | 5.14
SF4c_VcSiBdMr | 100.85 | 14.93 | 55 | 145 | 58.24 | 5.24 | 3.81
SF6_GAI | 100.60 | 14.93 | 55 | 145 | 69.17 | 4.18 | 3.16

% within 5 points FSIQ, per age group

Variables | Age 4 years | Age 5 years | Age 6 years | Age 7 years
FSIQ | n/a | n/a | n/a | n/a
SF2a_VcBd | 50.0 | 49.8 | 51.0 | 49.0
SF2b_VcMr | 45.8 | 42.1 | 49.8 | 48.6
SF4a_VcSiMrSs | 50.4 | 43.6 | 46.3 | 45.6
SF4b_VcMrSsCd | 50.4 | 44.4 | 50.2 | 47.1
SF4c_VcSiBdMr | 52.3 | 60.2 | 59.1 | 58.7
SF6_GAI | 68.5 | 71.8 | 66.4 | 66.8

Note. IQ = intelligence quotient; WPPSI-III-NL = Dutch version of Wechsler Preschool and Primary Scale of Intelligence–Third edition; SF = short form version; FSIQ = Full Scale Intelligence Quotient; Vc = Vocabulary; Bd = Block Design; Mr = Matrix Reasoning; Si = Similarities; Ss = Symbol Search; Cd = Coding; GAI = General Abilities Index.
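
The range quoted in the Results (30.83% to 53.38% of cases deviating by more than ±5 points) is the complement of the "% within 5 points FSIQ" values for the entire sample in Table 3. The short check below makes that arithmetic explicit; the dictionary simply restates the Table 3 values, and the variable names are chosen here for illustration.

```python
# "% within 5 points FSIQ" for the entire sample, as listed in Table 3.
pct_within = {
    "SF2a_VcBd": 50.00,
    "SF2b_VcMr": 46.62,
    "SF4a_VcSiMrSs": 47.44,
    "SF4b_VcMrSsCd": 48.54,
    "SF4c_VcSiBdMr": 58.24,
    "SF6_GAI": 69.17,
}

# Percentage of cases deviating by more than 5 points is the complement.
pct_outside = {name: round(100 - pct, 2) for name, pct in pct_within.items()}

most_accurate = min(pct_outside, key=pct_outside.get)   # fewest large deviations
least_accurate = max(pct_outside, key=pct_outside.get)  # most large deviations
```

Under this reading, the lower bound of the reported range belongs to the six-subtest GAI short form and the upper bound to the two-subtest Vocabulary and Matrix Reasoning short form.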

Table 4. Intraclass Correlations Between the FSIQ Scores and Equivalents Based on All Six Short Versions of the WPPSI-III-NL.

Cell entries: r (95% CI lower bound, upper bound)

Variables | Entire sample | Age 4 years | Age 5 years | Age 6 years | Age 7 years
SF2a_VcBd | .84 (0.83, 0.86) | .82 (0.77, 0.85) | .83 (0.79, 0.87) | .88 (0.85, 0.90) | .84 (0.81, 0.88)
SF2b_VcMr | .83 (0.81, 0.85) | .82 (0.77, 0.85) | .80 (0.75, 0.84) | .86 (0.82, 0.89) | .83 (0.79, 0.87)
SF4a_VcSiMrSs | .86 (0.84, 0.87) | .88 (0.85, 0.91) | .83 (0.79, 0.87) | .85 (0.82, 0.88) | .85 (0.81, 0.88)
SF4b_VcMrSsCd | .84 (0.82, 0.85) | .87 (0.84, 0.90) | .80 (0.75, 0.84) | .83 (0.78, 0.86) | .83 (0.79, 0.87)
SF4c_VcSiBdMr | .91 (0.89, 0.92) | .90 (0.87, 0.92) | .91 (0.88, 0.93) | .92 (0.90, 0.94) | .90 (0.87, 0.92)
SF6_GAI | .94 (0.93, 0.95) | .94 (0.93, 0.95) | .94 (0.92, 0.95) | .94 (0.93, 0.96) | .93 (0.91, 0.95)

Note. IQ = intelligence quotient; WPPSI-III-NL = Dutch version of Wechsler Preschool and Primary Scale of Intelligence–Third edition; SF = short form version; FSIQ = Full Scale Intelligence Quotient; Vc = Vocabulary; Bd = Block Design; Mr = Matrix Reasoning; Si = Similarities; Ss = Symbol Search; Cd = Coding; GAI = General Abilities Index; CI = confidence interval.

Accuracy as a Function of Ability Levels

To examine the relationship between the FSIQ estimates of the original long form and the IQ estimates of each short form at the tails of the IQ score distribution, the sample was divided into three groups. Individuals with an FSIQ estimate of the original long form ≥115 were classified as “above average” (i.e., 16.2% of the entire sample). In this “above-average” group, the actual FSIQ estimates of the original long form ranged from 115 to 145. The “average-ability” group had FSIQ estimates of the original long form ranging from 85 to 114 (i.e., 71.1% of the sample). The “below-average” individuals were those children with an FSIQ estimate of the original long form
