Perceptual and Motor Skills, 1991, 73, 227-234. © Perceptual and Motor Skills 1991

NATIVE AND CROSS-LANGUAGE SPEECH SOUNDS: SOME PERCEPTUAL PROCESSES¹

MINOLA A. PINARD

Concordia University

Summary.-Using a developmental approach, two aspects of debate in the speech perception literature were tested, (a) the nature of adult speech processing, the dichotomy being along nonlinguistic versus linguistic lines, and (b) the nature of speech processing by children of different ages, the hypotheses here implying in infancy detector-like processes and at age four "adult-like" speech perception reorganizations. Children ranging in age from 4 up to 18 years discriminated native and foreign speech contrasts. Results confirm the hypotheses for adults. It is clear that different processes are operating at different ages; however, more complex processes may come into play around the ages of 6 to 10 years; boys may use different strategies than girls, and with age, a multiplicity of processes may be concurrently active.

The present study was done to examine development of native and cross-language speech perception to gain further insight into language and general intellectual functioning from age 6 to 18 years. Models of adult speech perception generally stress two levels of processing, an auditory or acoustic level and a phonetic or linguistic level. More specifically, the model most often adopted, although still subject to debate, is that postulated by Liberman and his colleagues (Liberman, 1970, 1981; Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967; Repp, 1981; Studdert-Kennedy, 1976; Studdert-Kennedy, Liberman, Harris, & Cooper, 1970), in which the auditory aspect is measured by frequency, amplitude, etc., whereas linguistic measures involve a complex of cues with linguistic significance, that is, context-dependent cues, generally associated with speech production. Theories of development involve a similar dichotomy, namely, a different pattern of development for stimuli with attributes that are easy to define, such as physical ones, as opposed to stimuli with characteristics that are more difficult to define (complex, abstract). Color is an example of a stimulus defined by physical attributes; faces are an example of a stimulus defined by abstract attributes (Bornstein, Kessen, & Weiskopf, 1976; Mervis, Catlin, & Rosch, 1975; Carey-Block, 1978; Carey, Diamond, & Woods, 1980; Gibson, 1969). With respect to speech, however, two issues still remain: (a) Are Liberman's claims for adults valid? For opposing views see, for example, Stevens and Blumstein (1981). (b) Developmentally, what can we conclude

¹Address correspondence to Dr. Minola Pinard, Concordia University, Department of Psychology, 1455 de Maisonneuve Boulevard West, Montreal, Quebec H3G 1M8, Canada.


M. A. PINARD

about the physical versus abstract dichotomy from the findings on speech perception? Our study examined both questions. (a) With respect to the presumed dichotomy between auditory and linguistic processes, we expect to find evidence of that dichotomy if stimuli purported to belong to one or the other category diverge in their developmental patterns, and more specifically (though not exclusively) if they diverge along the lines expected for "physical" stimuli (early maturation) versus "abstract" stimuli (later maturation). (b) With respect to what can be concluded from developmental studies on speech perception, i.e., what processes are operating at given ages, we hope to find the ages at which one (auditory) or another (linguistic) process is operative (or dominant) by selecting stimuli representative of one or the other processing category for presentation.

With respect to studies of adults, the categorical perception paradigm has been frequently used. A peak in the discrimination function has corresponded to an abrupt change in the identification function for speech sounds (Studdert-Kennedy, et al., 1970). This has been claimed to be unique to speech and to reflect special processes; see Pastore (1981), Pisoni (1978), and Pisoni and Lazarus (1974) for contrary views.

In the developmental literature, only the discrimination function of the paradigm has been employed. Here, native speech sounds are well discriminated early (e.g., Morse, 1972; Eimas, 1975; Miller, Morse, & Dorman, 1975). Foreign speech sounds are also well discriminated early (Lasky, Syrdal-Lasky, & Klein, 1975; Eimas, 1985) but show a dip in performance at ages 1 to 4 (Werker, Gilbert, Humphrey, & Tees, 1981; Werker & Tees, 1983, 1984a). The conclusion here has been that the bases for mature perceptual processes can be seen to occur early (Eimas, 1985) and that important speech perceptual reorganizations occur from 1 to 4 years (Werker & Tees, 1983, 1984b).
The present study explored the meaning of the above patterns, using native and foreign consonant contrasts. The paradigm resembled the discrimination portion of the categorical perception paradigm described above. In terms of expected results, developmentally, a pattern different for native versus foreign sounds would indicate that speech is processed differently from nonspeech sounds (here foreign = nonspeech). The pattern of difference in development may indicate the type of process used at different ages (e.g., infancy: detector-type mechanisms; ages 1 to 4: native contrast perceived as different, i.e., part of my repertoire, and foreign contrast perceived as the same, i.e., not part of my repertoire; ages 10 onwards: native contrast perceived as different and also foreign contrast perceived as different, i.e., both representing, more or less, examples of use of phonological categories as the basis of judgment, e.g., /p/ versus 'something else').

NATIVE, CROSS-LANGUAGE SPEECH SOUNDS


METHOD

Subjects

All subjects were unilingual francophones from upper- and upper-middle-class areas of Montreal. They had normal auditory histories and came from public and private schools. There were 24 subjects within each age group, with equal numbers of boys and girls. The four age groups were 4 to 6 years (girls' mean: 5 yr., 6 mo., boys' mean: 5 yr., 7 mo.); 6 to 8 years (girls' mean: 6 yr., 8 mo., boys' mean: 6 yr., 11 mo.); 10 to 12 years (girls' mean: 10 yr., 9 mo., boys' mean: 11 yr., 1 mo.); and 16 to 18 years (girls' mean: 17 yr., 0 mo., boys' mean: 16 yr., 8 mo.).

Stimuli

The stimuli were selected to represent four contrasts, three native and one foreign. The three native contrasts were voicing (e.g., p/b), place of articulation (e.g., p/k), and degree of obstruction (e.g., p/f), all phonemic in French and English. The fourth contrast, duration, is phonemic only in other languages, such as Hungarian. The contrasts of main interest, the Voicing and Duration (Foreign) contrasts, are both physically cued (but recall the theory of encodedness) by duration, i.e., short versus long acoustic features (for voicing: Ling, 1976; for Hungarian: Benko & Imre, 1972). The other two contrasts, place and obstruction, were added for general comparative purposes.

In the preparation of the tapes, the aim was to present to each prospective subject eight natural tokens of each of the four contrasts, plus a similar number of same pairs. The consonants were to be presented embedded in a VCV utterance, the V always being an /a/. Thus, the voicing contrast was represented by, for example, the pair apa/aba; duration by apa/apa; place by apa/aka; and obstruction by apa/afa. To form the contrasting pairs, the VCVs were spliced together with one second occurring between the words belonging to a pair and five seconds between pairs.

Procedure

The experiment was conducted in a school. The subject sat at a table facing an assistant.
The subjects were told that the session would consist of hearing a pair of invented words through the earphones attached to the tape recorder and that they would have to indicate whether the pair was similar or not (4- to 6-yr.-olds said yes or no; the other subjects wrote down 0 or X). All subjects practiced with other word pairs before testing. The assistant, who also heard the stimuli, monitored subjects' progress. Subjects were eliminated if they did not wait for the pair to appear before giving an answer or if they waited longer than five seconds before giving an answer.

Design

The d' scores rather than the raw scores were used to separate the ability of the subject to differentiate between classes of events from motivational effects or response biases (Sorkin, 1962). The subjects' response was "same" or "not the same." The d' score was computed for each contrast level on the basis of the mean hit score for "same" and mean false alarm for "different" tokens of the contrast in question. The data were first analyzed by a three-way analysis of variance with repeated measures on one factor. The factors were age, sex, and subjects nested within each age-sex combination. The repeated factor was contrast, for which there were four levels, three native (voicing, place, obstruction) and one foreign (duration). Pearson correlation coefficients were then calculated between the contrasts voicing and duration at ages 6 to 8 years and 10 to 12 years, followed by tests of significance of the correlation.
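As an illustration only, the two computations named in the Design paragraph can be sketched in Python. The article does not state the formula used for d', so this sketch assumes the common yes/no form, d' = z(hit rate) - z(false-alarm rate); same-different tasks are often scored with other detection models (see Sorkin, 1962), and those would give different values. The log-linear correction for rates of 0 or 1, the function names, and the example counts are all assumptions made for the sketch and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def d_prime(hits, n_signal, false_alarms, n_noise):
    """Yes/no d' = z(hit rate) - z(false-alarm rate).

    Assumption: counts are adjusted by the log-linear correction so that
    rates of exactly 0 or 1 do not produce infinite z scores.
    """
    hit_rate = (hits + 0.5) / (n_signal + 1)
    fa_rate = (false_alarms + 0.5) / (n_noise + 1)
    return STD_NORMAL.inv_cdf(hit_rate) - STD_NORMAL.inv_cdf(fa_rate)


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_for_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero.

    Undefined when r is exactly +1 or -1.
    """
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical subject: 8 trials of each type for one contrast.
print(round(d_prime(hits=7, n_signal=8, false_alarms=2, n_noise=8), 2))

# Hypothetical voicing and duration d' scores for five subjects.
voicing = [1.2, 1.8, 2.1, 2.9, 3.3]
duration = [0.4, 1.1, 0.9, 2.0, 2.6]
r = pearson_r(voicing, duration)
print(round(r, 2), round(t_for_r(r, len(voicing)), 2))
```

The resulting t value would then be compared with a critical value of the t distribution with n - 2 degrees of freedom.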

RESULTS

As can be seen in Table 1, there was a significant increase in performance over ages for all contrasts and both sexes. This increase, however, was manifest only between the ages of 6 to 8 years and 10 to 12 years.

TABLE 1
EACH CONTRAST BY AGE ACROSS SEXES (d' SCORES)

[The d' values for the Voicing, Place, Obstruction, and Duration contrasts at ages 4 to 6, 6 to 8, 10 to 12, and 16 to 18 yr. are missing from the source.]

Source of Variance                     F              p
Main Effect of Age                     30.32          < .01
Interactions With Age                  all < Fcrit    ns
Age 4 to 6 With 6 to 8 yr.             3.30           ns
Age 6 to 8 With 10 to 12 yr.           79.87          < .01
Age 10 to 12 With 16 to 18 yr.         5.28           ns

On the question of the specific difference between voicing and duration contrasts with age, as can be seen in Table 2, we used three different analyses, which give rise to slightly different interpretations. Although, in a comparison of means, the difference between voicing and duration contrast discrimination does not appear to vary across the different age

TABLE 2
PERFORMANCE ON VOICING VERSUS DURATION CONTRASTS WITH AGE WITHIN SEXES (d' SCORES)

[Table layout recovered from the source; most cell values are missing.]

Means and standard deviations (M, SD), given separately for Girls and Boys:
  Age 4 to 6 yr.: Voicing, Duration
  Age 6 to 8 yr.: Voicing, Duration
  Age 10 to 12 yr.: Voicing, Duration
  Age 16 to 18 yr.: Voicing, Duration

Statistical Analysis (F, p):
  Analysis of Variance: Main effect of age; Main effect of contrast; Age by contrast; Higher order age by contrast
  A priori Trend Analysis, for Girls and Boys, Voicing and Duration: Linear trend, ages 4 to 18 yr.; Quadratic trend, ages 4 to 18 yr.
  Correlation of Voicing and Duration Performance, for Girls and Boys: Age 6 to 8 yr.; Age 10 to 12 yr.

Eight values survive in the source without row or column labels: 1.40, 0.46, 1.81, 0.95, 3.27, 2.65, 3.47, 3.27.

groups, an analysis by trend suggests a more complex story. For both boys and girls, performance on both contrasts increases linearly over age groups, but for voicing only there was an added quadratic trend component. That is, as expected in the introduction, the gap between voicing and duration contrasts is small at age 4 to 6 years, increases to a maximal point at age 10 to 12 years, and then becomes smaller again at age 16 to 18 years. The correlation coefficients between duration and voicing further show the value to be large for both sexes at age 6 to 8 years and to decrease significantly for boys at age 10 to 12 years. The initial high correlation is as expected if a detector mechanism or a simple acoustic mechanism (length) is used for both; the decrease is also expected if a switch is made to different mechanisms later on, e.g., learned acoustic repertoire versus foreign acoustic repertoire, or two known phonologies as opposed to known versus unknown phonology. It is not entirely clear which of these latter two would yield the lowest correlation, thereby giving rise to boys' superiority. Furthermore, place and obstruction contrasts were performed much better than the voicing contrasts by all subjects. In addition, there was a sex difference, with boys performing better than girls on foreign contrasts only.

In conclusion, the across-the-board increase in performance on all contrasts between the ages of 6 and 10 years is concordant with the findings in the general developmental literature on perceptual processes, namely, the acquisition of some complex perceptual concepts or schemas, such as faces (Carey-Block, 1978; Carey, et al., 1980), and also, quite interestingly, corresponds to learning to read and write. We interpret the fact that this improvement also affects foreign contrasts, despite generally poor performance on them, as reflecting that a number of mechanisms may be concurrently operative.
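For readers unfamiliar with trend analysis, the linear and quadratic components discussed above can be illustrated with orthogonal polynomial contrasts over four group means. The sketch below is not the analysis reported in the article: the coefficients assume four equally spaced groups of equal size (the article does not say how the unequal age spacing was handled), and the means, group size, and error mean square are hypothetical values chosen only to make the example run.

```python
# Standard orthogonal polynomial coefficients for four equally spaced groups.
LINEAR = (-3, -1, 1, 3)
QUADRATIC = (1, -1, -1, 1)


def contrast_value(means, coeffs):
    """Weighted sum of group means for one trend component."""
    return sum(c * m for c, m in zip(coeffs, means))


def contrast_ss(means, coeffs, n_per_group):
    """Sum of squares (1 df) for a contrast with equal group sizes."""
    value = contrast_value(means, coeffs)
    return n_per_group * value ** 2 / sum(c * c for c in coeffs)


def contrast_f(means, coeffs, n_per_group, ms_error):
    """F ratio for the contrast against a supplied error mean square."""
    return contrast_ss(means, coeffs, n_per_group) / ms_error


# Hypothetical mean d' scores for four age groups (not the study's data).
hypothetical_means = (1.0, 2.0, 3.5, 3.8)
for name, coeffs in (("linear", LINEAR), ("quadratic", QUADRATIC)):
    f_ratio = contrast_f(hypothetical_means, coeffs, n_per_group=12, ms_error=1.0)
    print(name, round(contrast_value(hypothetical_means, coeffs), 2), round(f_ratio, 2))
```

A quadratic contrast value that is reliably different from zero indicates that the means bend away from a straight line across groups, which is the pattern the text describes for voicing.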
The native versus foreign differences over age groups are examined in most detail, as they are the principal focus of this study. The existence of different mechanisms in the processing of speech sounds has once again been indicated by showing that the developmental trends differ for speech sounds processed by acoustic modes (linear) as opposed to speech sounds processed by linguistic modes (quadratic). Although our own model is now somewhat more complex than that, the findings support the proposed dichotomy. As for the nature of the specific processes operating at given ages, our data, though complex, allow us to propose some hypotheses tentatively. It seems clear (given the increasing gap between native and foreign speech performance and given the change from high to low correlation between native and foreign speech performance, especially in boys) that simple to complex changes in the processing of speech sounds occur across ages. Our data are in agreement with Eimas' (1985) interpretation with respect to detector-type mechanisms operating early. With respect to Werker and Tees's (1983, 1984b) interpretation of a "perceptual reorganization" at approximately age 4, our data agree, although we still interpret the "perceptual reorganization" to be of a "detector type," that is, subjects at that age seem to be judging on the basis of "learned" categories, familiar contrasts becoming easier to discriminate than unfamiliar contrasts. Only from age 10 onwards are phonological (or also possibly phonetic) mechanisms brought into play. These are more powerful than simple detector mechanisms (thereby giving rise to an over-all improvement in performance at those ages), but they also give rise to significantly better discriminative performance for native contrasts (two known phonological entities) as opposed to foreign contrasts (one known phonological entity/one unknown phonological entity). The fact that this gap narrows somewhat at ages 16 to 18 years may reflect a greater likelihood that, despite all subjects having been selected for unilingualism, the adult group may have come into contact (socially, through TV) with foreign sounds.

REFERENCES

BENKO, L., & IMRE, S. The Hungarian language. The Hague: Mouton, 1972.
BORNSTEIN, M. H., KESSEN, W., & WEISKOPF, S. Color vision and hue categorization in young human infants. Journal of Experimental Psychology: Human Perception and Performance, 1976, 2, 115-129.
CAREY, S., DIAMOND, R., & WOODS, B. Development of face recognition-a maturational component? Developmental Psychology, 1980, 16, 257-269.
CAREY-BLOCK, S. A case study: face recognition. In E. Walker (Ed.), Explorations in the biology of language. Montgomery, VT: Bradford Books, 1978. Pp. 175-201.
EIMAS, P. D. Speech perception in early infancy. In L. B. Cohen & P. Salapatek (Eds.), Infant perception: from sensation to cognition. Vol. II. Perception of space, speech and sound. New York: Academic Press, 1975. Pp. 193-231.
EIMAS, P. D. The perception of speech in early infancy. Scientific American, 1985, 252, 46-52.
GIBSON, E. J. Principles of perceptual learning and development. New York: Appleton-Century-Crofts, 1969.
LASKY, R. E., SYRDAL-LASKY, A., & KLEIN, R. E. VOT discrimination by four to six and a half month old infants from Spanish environments. Journal of Experimental Child Psychology, 1975, 20, 215-225.
LIBERMAN, A. M. The grammars of speech and language. Cognitive Psychology, 1970, 1, 301-323.
LIBERMAN, A. M. On finding that speech is special. Haskins Laboratories, New Haven, CT, Status Report on Speech Research, 1 July-31 December 1981. (ERIC Document Reproduction Service No. ED 212-010)
LIBERMAN, A. M., COOPER, F. S., SHANKWEILER, D. P., & STUDDERT-KENNEDY, M. Perception of the speech code. Psychological Review, 1967, 74, 431-461.
LING, D. Speech and the hearing-impaired child: theory and practice. (1st ed.) Washington, DC: The Alexander Graham Bell Association for the Deaf, Inc., 1976.
MERVIS, C. B., CATLIN, J., & ROSCH, E. Development of the structure of color categories. Developmental Psychology, 1975, 11, 54-60.
MILLER, C. L., MORSE, P. A., & DORMAN, M. F. Infant speech perception, memory, and the cardiac orienting response. Paper presented at the meetings of the Society for Research in Child Development, Denver, CO, April, 1975.
MORSE, P. A. The discrimination of speech and non-speech stimuli in early infancy. Journal of Experimental Child Psychology, 1972, 14, 477-492.
PASTORE, R. E. Possible psycho-acoustic factors in speech perception. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech. Hillsdale, NJ: Erlbaum, 1981. Pp. 165-200.
PISONI, D. B. Speech perception. In W. K. Estes (Ed.), Handbook of learning and cognitive processes. Vol. 6. Hillsdale, NJ: Erlbaum, 1978. Pp. 167-233.
PISONI, D. B., & LAZARUS, J. H. Categorical and noncategorical modes of speech perception along the voicing continuum. Journal of the Acoustical Society of America, 1974, 55, 328-333.
REPP, B. H. Phonetic trading relations and context effects: new experimental evidence for a speech mode of perception. Haskins Laboratories, New Haven, CT, Status Report on Speech Research, 1 July-31 December 1981. (ERIC Document Reproduction Service No. ED 212-010)
SORKIN, R. D. Extension of the theory of signal detectability to matching procedures in psychoacoustics. Journal of the Acoustical Society of America, 1962, 34, 1745-1751.
STEVENS, K. N., & BLUMSTEIN, S. E. The search for invariant acoustic correlates of phonetic features. In P. D. Eimas & J. L. Miller (Eds.), Perspectives on the study of speech. Hillsdale, NJ: Erlbaum, 1981. Pp. 1-35.
STUDDERT-KENNEDY, M. Speech perception. In N. J. Lass (Ed.), Contemporary issues in experimental phonetics. New York: Academic Press, 1976. Pp. 243-293.
STUDDERT-KENNEDY, M., LIBERMAN, A. M., HARRIS, K. S., & COOPER, F. S. Motor theory of speech perception: a reply to Lane's critical review. Psychological Review, 1970, 77, 234-249.
WERKER, J. F., GILBERT, J. H. V., HUMPHREY, K., & TEES, R. C. Developmental aspects of cross-language speech perception. Child Development, 1981, 52, 349-355.
WERKER, J. F., & TEES, R. C. Developmental changes across childhood in the perception of non-native speech sounds. Canadian Journal of Psychology, 1983, 37, 278-286.
WERKER, J. F., & TEES, R. C. Cross-language speech perception: evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 1984, 7, 49-63. (a)
WERKER, J. F., & TEES, R. C. Phonemic and phonetic factors in adult cross-language speech perception. Journal of the Acoustical Society of America, 1984, 75, 1866-1878. (b)

Accepted July 23, 1991.
