Psychological Reports, 1990, 67, 531-541. © Psychological Reports 1990

REPORT ON SHANGHAI NORMS FOR THE CHINESE TRANSLATION OF THE WECHSLER INTELLIGENCE SCALE FOR CHILDREN-REVISED¹

LI DAN AND JIN YU
East China Normal University

STEVEN G. VANDENBERG
Institute for Behavioral Genetics, University of Colorado, Boulder

ZHU W E M E I AND TANG CAIHONG
Shanghai Sixth Hospital

Summary.-The Chinese translation of the Wechsler Intelligence Scale for Children-Revised (WISC-RC) was administered to 660 children (ages 6 through 16 yr.) in the city of Shanghai. The obtained norms represent children's intelligence levels in big cities where economic and cultural development is advanced. The norms are reported as "Scaled Score Equivalents of Raw Scores" for each age group and as "IQ Equivalents of Sums of Scaled Scores." The reliability and validity of the norms indicate that the WISC-R is suitable for use with school-age children in China. The difference between the results for our Shanghai sample (WISC-RC) and a USA sample (WISC-R) is also discussed.

The Wechsler Intelligence Scale for Children-Revised is a widely used intelligence scale for children. By the end of 1974, 12 translated editions had been recognized by the authors and publishers of the WISC-R. From October 1980 to October 1981 we carried out a trial study in the city of Shanghai in which 352 randomly selected children from 6 to 16 yr. of age were tested. Analysis indicated that the scale is a reliable and valid tool for measuring children's intelligence. It offers information for evaluating the level and analyzing the developmental structure of children's intelligence. In recent years, it has been applied in a small part of the Shanghai area and was welcomed by teachers, educational research workers, and clinicians. It is necessary to develop "formal norms" to provide a scientific intelligence scale for school children from ages 6 through 16 yr.

Although we are participating in the work of constructing national norms for the WISC-R, we consider it necessary to reflect the developmental level of children's intelligence in the large city of Shanghai, because Shanghai is an economic and cultural center in China and has its own unique characteristics. In addition, there is a large population of children ages 6 through 16 yr. in Shanghai (about 600,000), so it is meaningful to formulate norms for Shanghai. Finally, after the national norms have been constructed, they can be compared with the norms for Shanghai.

¹Requests for reprints and correspondence are to be sent to Jin Yu, Institute for Behavioral Genetics, University of Colorado, Boulder, CO 80309-0447.


For these reasons we did a second study in Shanghai from October to December 1984. This article contains a summary of that work.

During the initial testing in 1980-81, we found that the application of the WISC-RC, especially scoring, needed qualified examiners. It was necessary to hold training classes to make examiners familiar with test administration and scoring procedures.

In selecting subjects, four variables were taken into account: district, school, age, and sex. The subjects were selected randomly in the following manner.

District.-All subjects were selected from the six districts of Jinan, Huangpu, Xuhui, Luwan, Hongkou, and Zhabai.

School.-Because the ages of 6 through 16 yr. constitute the school age range and the percentage of school attendance is very high in Shanghai (except for the severely mentally retarded), we decided to include only school children in our sample. To make the normative sample represent all school-age children in Shanghai, most of the subjects came from ordinary elementary and high schools (excluding those in special schools, such as professional schools and art schools). According to the statistical data from the Shanghai Educational Bureau in 1983, the admission rate of students into Key Schools compared to that for ordinary schools was 1:3.6. Therefore, we selected 12 subjects from Key Schools in each age group (total 132) and 48 subjects from ordinary schools (total 528). The actual ratio was 1:4. Nineteen elementary and high schools were represented.

Age.-The sample included 11 age groups from 6 through 16 yr. old. Each group had 60 subjects and the total number was 660.

Sex.-In each age group, the numbers of boys and girls were equal.

Testing

It is noted that this study did not adopt the revised manual available when we tested in 1980; instead we used the items for formulating national norms for the WISC-R. One reason was that the revised manual was developed at the 1981 Qinhuang Island cooperation meeting, and it was based in part on the results of the trials in Shanghai. The differences between the two manuals were very few.
The other reason was that it would be easy to compare the norms for Shanghai with those for the nation if the same manual was adopted; furthermore, the ranking procedures could be compared after testing. For the "Vocabulary" subtest of the national manual, there were no examples given and it was difficult for the examiners to score the items objectively. Having referred to the scoring standard for the subtests developed in 1980-1981, we added examples at the end of the manual as a reference for examiners.

Formulating the Norms

The main purpose of the study was the formulation of norms. From November to December of 1984, the examiners gave the tests in the 19 schools and collected the data. After analysis of the data, Shanghai norms were formulated. These consisted of two parts: "Scaled Score Equivalents of Raw Scores" were constructed for each age group, as well as "IQ Equivalents of Sums of Scaled Scores." Detailed information about these norms is given in the Results section below.

1. Raw Means and Standard Deviations of 12 Subtests for 11 Age Groups and Transforming Raw Scores Into Scaled Scores

The raw score means and standard deviations are represented in Table 1 and Table 2. From these two tables we can see that almost all raw score means increase with age before 11 yr. of age.

TABLE 1

MEANS AND STANDARD DEVIATIONS OF RAW SCORES ON VERBAL SUBTESTS FOR 11 AGE GROUPS IN THIS SAMPLE

Age Group    Information     Similarities    Arithmetic
(Yr.)         M      SD       M      SD       M      SD
 6.5          7.4    2.1      7.0    3.5      7.7    1.5
 7.5         10.2    2.1     10.2    2.5     11.4    1.7
 8.5         12.3    2.4     11.4    3.9     13.8    1.4
 9.5         14.3    2.3     12.9    3.8     14.9    1.5
10.5         17.1    3.3     16.3    4.2     16.0    1.5
11.5         19.0    3.6     17.4    3.5     17.6    1.4
12.5         20.2    3.0     17.4    3.9     17.6    1.2
13.5         21.7    3.0     18.3    4.0     17.5    1.4
14.5         23.3    3.5     19.1    3.1     17.6    1.3
15.5         24.7    2.9     20.6    3.2     18.1    1.0
16.5         24.8    2.6     21.3    3.3     18.0    1.1

Age Group    Vocabulary      Comprehension   Digit Span
(Yr.)         M      SD       M      SD       M      SD
 6.5         17.8    7.1     12.3    4.2     12.3    3.3
 7.5         26.0    8.2     15.8    3.7     15.0    3.2
 8.5         36.6    9.4     17.4    3.8     16.1    3.9
 9.5         41.5    8.6     19.7    3.7     17.4    3.9
10.5         50.9    7.3     23.1    3.9     19.3    4.4
11.5         53.5    5.6     25.0    4.0     19.5    3.9
12.5         54.5    5.9     25.0    4.2     19.7    3.5
13.5         55.8    5.5     26.4    4.0     19.8    4.2
14.5         56.4    5.8     27.7    4.0     21.0    3.8

TABLE 2

Age Group (Yr.)

Picture Completion M SD

Picture Arrangement M SD

Object Assembly 10.2 13.3 14.8 15.8 18.4 19.1 19.8 20.0 21.6 21.4 22.3

4.6 5.3 5.6 4.9 5.4 5.0 5.3 4.5 4.0 4.4 4.1

Block Design M SD

Mazes

Coding 33.1 40.8 40.9 44.3 50.8 56.4 60.7 65.6 70.7 73.9 75.9

8.7 6.9 8.3 6.9 9.4 12.2 11.4 10.4 9.5 9.8 10.7

12.7 16.5 18.1 19.0 22.1 22.8 23.5 23.4 22.9 24.4 25.5

4.4 5.2 4.7 4.1 4.2 3.4 3.4 3.7 3.3 3.0 3.2

Raw scores are taken directly from the 12 subtests. The scoring methods for these 12 subtests differ, and so do the meanings of the scores. Wechsler regarded intelligence as a person's global capacity to understand and deal with the world. Intelligence is better regarded as a composite entity, not a single unique characteristic. Wechsler therefore did not advocate using any single subtest score to measure IQ, but advocated using five verbal subtest scores to measure Verbal IQ, five performance subtest scores to measure Performance IQ, and, lastly, the sum of the 10 subtest scores to obtain a Full Scale IQ.

To add scores on the 10 subtests, we first transformed raw scores into scaled scores. Wechsler's scaled score is a 1 to 19 standard score, with a mean of 10 and a standard deviation of 3; the whole range corresponds to Z scores of -3 to +3. We found that the distribution of raw scores was very nearly normal, so in transforming raw scores into scaled scores we treated them as normally distributed and used the formula "SS (scaled score) = 10 + 3Z" directly.

Generally speaking, the raw scores of each subtest should increase with age. When we found that several subtest scores did not show a regular increase, we smoothed them by using moving averages. We thereby obtained the equivalents for the different age groups and constructed the scaled-score tables, giving the scaled-score equivalents of raw scores, for the middle period of each age group (4 mo., 0 days to 7 mo., 30 days). We added two tables of transformation for the first four months and last four months of each year, so we had 33 tables (omitted here).

2. Means and Standard Deviations by Age Groups

The means and standard deviations of sums of scaled scores for the Verbal Scale, Performance Scale, and Full Scale for different age groups were calculated. The three IQs were derived from the scaled scores. We transformed all subjects' raw scores into scaled scores according to the above transformational tables; see Table 3. Then we transformed these into IQs.

TABLE 3
MEANS AND STANDARD DEVIATIONS OF SUMS OF SCALED SCORES ON VERBAL, PERFORMANCE, AND FULL SCALES BY CHRONOLOGICAL AGE (YEARS)

Age Group (Yr.)    Verbal Scale M SD    Performance Scale M SD    Full Scale M SD

According to the WISC-R's characteristics, (1) IQ is calculated from deviations, not from the MA/CA ratio. (2) Wechsler did not advocate use of every subtest separately but rather the combination of Verbal scaled scores, Performance scaled scores, and those for all of the subscales as the Full Scale to give three measures of IQ, i.e., Verbal IQ, Performance IQ, and Full Scale IQ. (3) Because sums of scaled scores for different age groups are very close, we can make the transformational tables for different ages on the basis of the data from the whole sample.

The following is a detailed description of our procedure. We used the means and standard deviations of the whole sample (the mean of the Verbal scaled scores is 50, standard deviation of 11; the mean of the Performance scaled scores is 50, standard deviation of 9; the mean of the Full Scale scaled scores is 100, standard deviation of 17). The formula DIQ = 100 + 15Z is used to make the mean of summed scaled scores equal to an IQ of 100, and +1 standard deviation equal to +15 IQ points.
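The two linear transformations described in Sections 1 and 2 (SS = 10 + 3Z within an age group, DIQ = 100 + 15Z against the whole sample), together with the moving-average smoothing of age-group means, can be sketched in Python. This is only an illustration under stated assumptions, not the authors' actual procedure: the function names, the three-point smoothing window, the rounding to whole numbers, and the clipping of scaled scores to 1-19 are all assumptions. Only the two formulas, the whole-sample means and standard deviations, and the Table 1 values used in the example come from the text.

```python
def smooth_means(means):
    """Three-point moving average over successive age-group means.

    The window size is an assumption; the first and last age groups
    are left unchanged because they have only one neighbour.
    """
    smoothed = list(means)
    for i in range(1, len(means) - 1):
        smoothed[i] = (means[i - 1] + means[i] + means[i + 1]) / 3
    return smoothed


def scaled_score(raw, age_mean, age_sd):
    """SS = 10 + 3Z, with Z computed within the child's own age group."""
    z = (raw - age_mean) / age_sd
    ss = round(10 + 3 * z)
    return max(1, min(19, ss))  # clipping to the 1-19 range is an assumption


def deviation_iq(sum_scaled, sample_mean, sample_sd):
    """DIQ = 100 + 15Z, with Z computed against the whole normative sample."""
    z = (sum_scaled - sample_mean) / sample_sd
    return round(100 + 15 * z)


# Whole-sample (mean, SD) of summed scaled scores, as reported in the text.
NORMS = {"verbal": (50, 11), "performance": (50, 9), "full": (100, 17)}

# Information, age 6.5: M = 7.4, SD = 2.1 (Table 1); a raw score one SD above the mean.
example_ss = scaled_score(9.5, 7.4, 2.1)

# A Full Scale sum of scaled scores one SD above the whole-sample mean.
example_iq = deviation_iq(117, *NORMS["full"])

print(example_ss, example_iq)
```

Because both transformations are linear in Z, a child one standard deviation above the relevant mean receives a scaled score of 13 and a deviation IQ of 115 regardless of which subtest or scale is involved.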

3. The Distribution of 660 Children's Full Scale IQs

The Full Scale IQs of the 660 children were obtained with the above normal model. These results are represented in Table 4 and Fig. 1. From these, it can be seen that for our sample of 660 children the Full Scale IQs are normally distributed.

TABLE 4
INTELLIGENCE CLASSIFICATIONS

                                          Number of    Percent Included
                                          Subjects     Theoretical     Trial     Actual
IQ               Classification                        Normal Curve    Sample    Sample
130 and above    Excellent                   15            2.2          ...       ...
120-129          Good                        51            6.7          ...       ...
110-119          High average               108           16.1          ...       ...
90-109           Average                    325           50.0          ...       ...
80-89            Low average                 96           16.1          ...       ...
70-79            Borderline (critical)       50            6.7          ...       ...
69 and below     Mentally deficient          15            2.2          ...       ...
Total                                       660          100.0

Here we add some data and ratios not shown in the tables and figures. Of those children whose IQs were 130 or above, there were 10 (1.5%) with IQs from 130 to 134 and five (0.76%) with IQs from 135 to 145. Of those whose IQs were 69 or below, there were nine (1.4%) with IQs from 65 to 69 and six (0.9%) with IQs from 55 to 64. From these data we see that subjects with IQs above 135 or below 64 are rare. The above results conformed to the actual conditions of our sample and also demonstrated that this group is representative to some extent.

FIG. 1. The distribution of 660 children's Full Scale IQs
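The comparison between the observed distribution and the theoretical normal curve in Table 4 can be checked with a few lines of Python. This is a minimal sketch: the counts and theoretical percentages are those printed in Table 4, while the variable and function names are illustrative assumptions.

```python
# IQ bands and counts of subjects from Table 4 (N = 660).
BANDS = ["130 and above", "120-129", "110-119", "90-109",
         "80-89", "70-79", "69 and below"]
COUNTS = [15, 51, 108, 325, 96, 50, 15]

# Theoretical normal-curve percentages as printed in Table 4.
THEORETICAL = [2.2, 6.7, 16.1, 50.0, 16.1, 6.7, 2.2]


def observed_percentages(counts):
    """Convert band counts into percentages of the total sample."""
    total = sum(counts)
    return [100 * c / total for c in counts]


observed = observed_percentages(COUNTS)
for band, obs, theo in zip(BANDS, observed, THEORETICAL):
    # A positive difference means the band is over-represented
    # relative to the theoretical normal curve.
    print(f"{band:>14}: observed {obs:5.1f}%  "
          f"theoretical {theo:5.1f}%  diff {obs - theo:+5.1f}")
```

The printed differences show, band by band, how closely the observed percentages follow the theoretical ones.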


4. Reliability Testing

We used the odd-even split method and selected three age groups of 9-, 12-, and 15-yr.-olds (180 subjects) to estimate the split-half correlation for 10 subtests (omitting Digit Span and Coding). In addition, we corrected the results using the Spearman-Brown formula (i.e., recovering the original length). These results are represented in Table 5, from which we see that the reliabilities for the 10 subtests and the three kinds of scaled scores lie between 0.68 and 0.98.

TABLE 5

                                   Split-half     Reliability
                                   Correlation    Coefficients (Corrected)
Verbal Scales
  Information                        0.79           0.89
  Similarities                       0.68           0.81
  Arithmetic                         0.70           0.82
  Vocabulary                         0.83           0.90
  Comprehension                      0.70           0.82
Performance Scales
  Picture Completion                 0.69           0.82
  Picture Arrangement                0.52           0.68
  Block Design                       0.69           0.81
  Object Assembly                    0.53           0.69
  Mazes                              0.54           0.70
Verbal Scale (5 subtests)                           0.98
Performance Scale (5 subtests)                      0.90
Full Scale (10 subtests)                            0.95
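The odd-even split and the Spearman-Brown correction described above can be sketched as follows. This is a hedged illustration rather than the authors' computation: the function names and the layout of the item scores (one list of item scores per subject) are assumptions. The correction itself is the standard Spearman-Brown formula for doubling test length, r_full = 2r / (1 + r).

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_r(item_scores):
    """Correlate odd-item and even-item half scores across subjects.

    item_scores holds one list of item scores per subject (assumed layout).
    """
    odd_half = [sum(items[0::2]) for items in item_scores]   # items 1, 3, 5, ...
    even_half = [sum(items[1::2]) for items in item_scores]  # items 2, 4, 6, ...
    return pearson_r(odd_half, even_half)


def spearman_brown(r_half):
    """Estimate full-length reliability from a half-test correlation."""
    return 2 * r_half / (1 + r_half)


# Split-half correlations for Similarities and Picture Arrangement (Table 5).
for name, r in [("Similarities", 0.68), ("Picture Arrangement", 0.52)]:
    print(name, round(spearman_brown(r), 2))
```

Applied to the two split-half correlations above, the correction gives 0.81 and 0.68, which agree with the corrected coefficients listed for those subtests in Table 5.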

5. Validity Testing

During the trial study we had adopted many methods to test validity, and the evidence obtained was considerable. In this study, therefore, we used only the correlation between IQs and school achievement in the two kinds of schools (key schools and ordinary schools), because in China the evaluations of school achievement are not standardized at present and our subjects came from different schools, so we cannot obtain a unified index. From an over-all point of view, those students who are selected to enter key schools after examinations are ones whose school achievement is good, and those in ordinary schools are the ones whose school achievement is average. So, we estimate the school achievement in key and ordinary schools as above average and average, respectively. According to the theory of correlation between intelligence and school achievement, the achievements of these two groups of students have two corresponding intelligence levels. We used a 2 x 2 matrix (see Table 6) to estimate the correlation between Full Scale IQs and school achievement.

TABLE 6
2 x 2 MATRIX OF IQ AND ACADEMIC ACHIEVEMENT

                   Academic Achievement
IQ                 Key Schools    Ordinary Schools    Total
110 and above        73 (a)          101 (b)           174
Below 110            59 (c)          427 (d)           486
Total               132              528             N = 660

Note.-Rate of Coincidence = (a + d)/(a + b + c + d) = 0.76.
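The rate of coincidence in the note to Table 6 is the proportion of children who fall on the agreeing diagonal of the 2 x 2 matrix. A minimal Python sketch, using the cell counts from Table 6 (the function name is an illustrative assumption):

```python
def coincidence_rate(a, b, c, d):
    """Proportion of cases in the agreeing cells of a 2 x 2 table: (a + d) / N."""
    return (a + d) / (a + b + c + d)


# Cell counts from Table 6.
a, b = 73, 101   # IQ 110 and above: key schools, ordinary schools
c, d = 59, 427   # IQ below 110:     key schools, ordinary schools

rate = coincidence_rate(a, b, c, d)
print(round(rate, 2))
```

With these counts, 500 of the 660 children fall in cells a and d, so the rate rounds to the 0.76 reported in the note.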

Comparison of USA and Chinese Groups

The results of the comparison of the mean raw scores of our sample with the USA sample (WISC-R) and the trial sample in 1980-81 are represented in Table 7 and Table 8. From these data it can be seen that the results of the two studies in Shanghai (except for several subtests in the Performance Scale) are very similar.

TABLE 7
COMPARING RAW SCORE MEANS OF DIFFERENT AGE CHILDREN'S SUBTESTS OF VERBAL SCALE AMONG THIS SAMPLE (S1), TRIAL SAMPLE (S2), AND WISC-R SAMPLE (W)

Age (yr.-mo.)    Information (S1, S2, W)    Similarities (S1, S2, W)    Comprehension    Arithmetic    Digit Span

Comprehension
12 16 17-18 20 23 25 25 26 27-28 28-29 29
13 15 16 17 20 21 24 25 26 26 26
8-9 11 13 15 17 20 22 23 25 25 27

Digit Span
12 15 16 17 19 19-20 20 20 21 22 22
13 14 17 16 17 18-19 19 19 18 18
19

TABLE 8
RAW SCORE MEANS FOR SUBTESTS OF PERFORMANCE SCALE FOR CHILDREN OF DIFFERENT AGES: PRESENT SAMPLE (S1), TRIAL SAMPLE (S2), AND WISC-R SAMPLE (W)

Age (yr.-mo.)    Picture Completion    Picture Arrangement    Block Design    Object Assembly    Coding    Mazes

In terms of the comparison with the USA sample (WISC-R), there are still some differences. In our sample, the means of the six verbal subtests are higher than those in the USA sample (WISC-R). For example, the raw scores on Arithmetic and Digit Span were advanced almost five years in mean age over those for the USA sample. On the Performance subtests, only Object Assembly was lower than that in the USA sample (by 2 to 3 yr.); Block Design was higher by 4 to 5 yr., Coding was higher by 2 to 3 yr., and the others were similar. To show these differences vividly, we transformed these data according to the WISC-R norm and drew profiles of the means for the different age groups. Here we take the group of 12-yr.-olds as an example (see Fig. 2).

Analyzing the bases for these differences, we think they may be as follows. (1) The children in our study are from a very large city with high economic and cultural levels, but those in the USA sample (WISC-R) are from all over America, so this is an asymmetric comparison. (2) Different economies, cultures, and living habits and customs may produce such differences as these. For instance, the scores for Digit Span and Arithmetic in our sample are obviously higher than those in the USA sample (WISC-R). Has such a result something to do with the pronunciation of numbers in the two different languages? This is worthy of further study. (3) For some subtests (such as Vocabulary) in the Verbal Scale a few items were changed according to the conditions of China, and one may ask whether that decreased the difficulty of these subtests so that the means of the raw scores are higher than those in the USA sample (WISC-R). This is also considered a relevant factor. (4) With regard to the comparison between our sample and the Chinese trial sample, the higher scores on most of the Performance Scale subtests for the Chinese children may reflect more attention being paid to training students in various kinds of skills, especially performance abilities. This is also worthy of further study.

FIG. 2. Profile of means of subtest scaled scores for the 12-yr. age group in Shanghai proper (in terms of the norm of WISC-R)
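Statements such as "higher by 2 to 3 yr." express a raw-score difference as an age lag: the age at which the reference sample reaches the mean that the Shanghai sample shows at a given age. The sketch below illustrates one way to compute such a lag by linear interpolation. It rests on several assumptions that are not confirmed by the text: that the first and third Comprehension rows of Table 7 are the present sample (S1) and the WISC-R sample (W) respectively, that the 11 entries correspond to the age groups 6.5 through 16.5 of Table 1, and that a range such as "17-18" may be replaced by its midpoint. The function name and the interpolation rule are likewise illustrative, not the authors' method.

```python
# Age-group midpoints from Table 1.
AGES = [6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5]

# Comprehension raw-score means from Table 7 (row attribution is an assumption;
# ranges such as "17-18" are replaced by their midpoints).
SHANGHAI = [12, 16, 17.5, 20, 23, 25, 25, 26, 27.5, 28.5, 29]
WISC_R = [8.5, 11, 13, 15, 17, 20, 22, 23, 25, 25, 27]


def age_equivalent(target, ages, ref_means):
    """Age at which the reference means first reach `target` (linear interpolation).

    Returns None when the target lies outside the range of the reference means.
    """
    if target < ref_means[0]:
        return None
    if target == ref_means[0]:
        return ages[0]
    for i in range(1, len(ages)):
        if ref_means[i] >= target:
            lo, hi = ref_means[i - 1], ref_means[i]  # lo < target <= hi here
            frac = (target - lo) / (hi - lo)
            return ages[i - 1] + frac * (ages[i] - ages[i - 1])
    return None


for age, mean in zip(AGES, SHANGHAI):
    eq = age_equivalent(mean, AGES, WISC_R)
    lag = None if eq is None else eq - age
    print(age, mean, eq, lag)
```

A result of None for the oldest groups means the Shanghai mean exceeds every reference mean in the table, so no age equivalent can be read off.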

The Range of Application of the WISC-RC

According to foreign evaluations, the WISC-RC is not a suitable test for those who score very high or very low. During our trial study we also found that some subtests are not long enough to assess extremely high IQ. When we revised the subtests for Digit Span and Arithmetic we added one item to the original scales. But from Table 7 and Table 8, we can see that this problem has not been solved. From 10 yr. of age, it becomes evident that the lengths of subtests such as Vocabulary and Arithmetic are unsuitable. At 12 yr. of age, this problem also occurs on the subtests of Comprehension, Block Design, Mazes, and Digit Span. This will affect the observed intelligence levels of those in the older age groups who have high intelligence. Besides this, given the range represented in our normative sample, the use of the WISC-RC in Shanghai is proper for those whose IQs range from 55 to 145. For those whose Full Scale IQs are above 145 or below 55, we suggest the use of other scales in evaluating intelligence.

Conclusion

Although the WISC-R was developed in the USA, it is also a good tool for measuring the intelligence of children in other countries, once trial studies and formal revisions have been carried out and current reliabilities and validities have been estimated. The Shanghai norms formulated in our study provide a criterion for educational and psychological experimental studies and clinical diagnoses.

Accepted August 20, 1990.
