Journal of Psycholinguistic Research, Vol. 1, No. 1, 1971

Theories of Language Acquisition

Harold J. Vetter¹ and Richard W. Howell²

Received March 8, 1970

Prior to the advent of generative grammar, theoretical approaches to language development relied heavily upon the concepts of differential reinforcement and imitation. Current studies of linguistic acquisition are largely dominated by the hypothesis that the child constructs his language on the basis of a primitive grammar which gradually evolves into a more complex grammar. This approach presupposes that the investigator does not impose his own grammatical rules on the utterances of the child; that the sound system of the child and the rules he employs to form sentences are to be described in their own terms, independently of the model provided by the adult linguistic community; and that there is a series of steps or stages through which the child passes on his way toward mastery of the adult grammar in his linguistic environment. This paper attempts to trace the development of human vocalization through prelinguistic stages to the development of what can be clearly recognized as language behavior, and then progresses to transitional phases in which the language of the child begins to approximate that of the adult model. In the view of the authors, the most challenging problems which confront theories of linguistic acquisition arise in seeking to account for structure of sound sequences, in the rules that enable the speaker to go from meaning to sound and which enable the listener to go from sound to meaning. The principal area of concern for the investigator, according to the authors, is the discovery of those rules at various stages of the learning process. The paper concludes with a return to the question of what constitutes an adequate theory of language ontogenesis. It is suggested that such a theory will have to be keyed to theories of cognitive development and will have to include and go beyond a theory which accounts for adult language competence and performance, since these represent only the terminal stage of linguistic ontogenesis.

¹ Florida State University, Tallahassee, Florida.
² University of Saskatchewan, Saskatchewan, Canada.

INTRODUCTION

Lenneberg (1967), Smith and Miller (1966), and others may be correct in positing an innate predisposition for acquiring language or, more generally, a predisposition for dealing with extremely complex patterns of stimuli, yet the fact remains that each child must learn his native language. The genetic predisposition is not for a specific language. The essential independence of race, culture, and language has long been established in anthropology (Boas, 1949; Kroeber, 1948; Linton, 1936; Sapir, 1921). Any normal youngster can learn any language if he is raised in the appropriate linguistic environment. Given the potential to learn any language, how does the child learn a specific language? McKeachie and Doyle (1966, p. 333) have stated the major hypotheses in extremely simple terms: "(1) The child strings together words randomly, but is only reinforced for grammatical structures. Therefore ungrammatical structures
disappear. (2) The child imitates the phrases of his parents. As imitation increases, more and more grammatical phrases appear. (3) The child constructs his phrases on the basis of a primitive grammar which gradually evolves into a more complex grammar. . . . As soon as he utters two words together, they are related by some rule." The first two hypotheses are pretty typical of the kinds of thinking engaged in by psychologists before the field came under the influence of modern structural linguistics. As McNeill (1966a) has suggested, the assumption seems to have been that child language was adult language filtered through a great deal of cognitive noise and impoverishment of vocabulary. The categories of adult grammar were used to describe child language on the supposition that the scholar knew the grammar of the child because he knew his own grammar. Thus, it seemed important to conduct surveys of vocabulary, to make frequency counts of various grammatical classes, and to present case histories of the gradual elimination of "errors." Today, however, the student of child language has begun to adopt a different approach, as indicated in the third hypothesis stated above. Without imposing the rules of his own grammar on the utterances of the child, he takes a more detached view, investigating child language the way any modern linguist would approach the description of an alien tongue. Ervin and Miller (1963) distinguish two aspects to this approach. First of all, the sound system of the child and the set of rules he uses to form sentences are to be described in their own terms, quite independently of the model presented by the adult community. The second aspect concerns the successive steps through which the child passes on his way toward mastery of the system employed by the adults around him. The new look in child language studies places much greater emphasis on the active, creative role of the youngster.
Even imitation now seems far more complex than one would assume from the passive implications of Jenkins and Palermo (1964, p. 165): ". . . children's language begins with a form of imitation followed by the acquisition of a number of simple S-R connections between
verbal labels and salient features of the environment to which they become attached. With a core of labels available, the child attaches words with other words in sequences, and the ordering or structuring begins." The problem of imitation and labeling will be treated separately, while the problem of structuring lies at the core of psycholinguistic research. Weksel (1965) was dismayed by such cavalier treatment of basic problems and has suggested in effect that language ontogeny is not going to be accounted for by any extensions or modifications of current learning theory. What is needed is a revolution in psychological theory to match the one that has already taken place in linguistics with the advent of generative grammar. This line of thought will recur in the pages to follow, but for the moment the apparently simple process of labeling may serve as an example of the difficulties being glossed over in the simple formulation by Jenkins and Palermo above (though they have now abandoned that position, according to Lenneberg, 1967, p. 274). If one holds a book in front of an infant who has reached a certain stage of development, say between 1 and 1½ years of age, and repeats the word book a number of times, we may expect that eventually the child will say something at least vaguely recognizable as book. If, subsequently, the child says the word and points toward the object or if he touches the book and says the word, we say that he has learned the word book. An association has been established between the word and the object to which it refers. Each time the youngster says that word at some appropriate time and is rewarded by a smile or some other sign of approval, then the child is being reinforced or rewarded for the response and the association between the object and the sound sequence represented as book is strengthened. To the stimulus (book) there is a response (book) which began as an imitation of the adult response to the book.
But as the objects or events to be labeled become less concrete, less tangible, this kind of explanation of learning becomes less persuasive. It is hard to define the stimulus in the case of words such as nice or empty or why. Even an operant conditioning approach, where the question of stimulus need not be so troublesome, is just as vulnerable, as Chomsky (1959) has shown in his devastating review of Skinner (1957). Even if we could be moderately satisfied with such explanations of the naming process, this still would account only for a fraction of language learning. As we have indicated, the really serious problems arise in accounting for the structure of sound sequences, in the rules that enable the speaker to go from meaning to sound and that enable the listener to go from sound to meaning. In the pages that follow, we shall be principally concerned with discovering those rules at different stages of the learning process. The chapter will close with a return to the question of what constitutes an adequate
theory of language ontogenesis, but first we shall attempt to trace the development of human vocalization through prelinguistic stages to the beginning of what we can clearly recognize as linguistic behavior, and then on to transitional phases to the point at which the language of the child approximates that of the adult model.

PRELINGUISTIC DEVELOPMENT

Eisenson et al. (1963) have suggested that there are five stages of prelinguistic vocalization: undifferentiated crying, differentiated crying, babbling, lallation, and echolalia. Undifferentiated crying is said to be part of a total bodily response to any discomfort. This gives way to differentiated crying after the first month of life, when different stimulating conditions are capable of eliciting characteristic qualities of sound, apparently because different patterns of muscle contraction throughout the body depend upon the different stimulating circumstances (as hunger vs. loss of support, perhaps; see Osgood, 1953). It is not clear why this should not also be the case in the initial stage. In any event, Lenneberg (1967, p. 267) considers that crying "as well as other sounds more immediately related to vegetative functions seem to be quite divorced from the developmental history of the second type of vocalization, namely all of those sounds which eventually merge into the acoustic productions of speech." Yet the infant is becoming sensitized to the persons who attend his needs, and he adjusts his vocalizations to the reduction of need states, so the first weeks cannot be completely disregarded. Around the sixth week or a little later, cooing sounds may be noted. At first these seem to have the character of a reflex and may be elicited by specific stimuli, such as a nodding object that resembles a face. According to Lenneberg (1967), cooing sounds are most readily obtained between the tenth and thirteenth week. After that the visual and social stimuli become increasingly differentiated and it is more likely to be a familiar face that elicits smiling and cooing. When the infant cries the articulatory organs are held relatively still, but the tongue, at least, is involved in cooing. This results in a considerable difference, acoustically, between crying and cooing.
But even though cooing has a vowel-like quality, these sounds are acoustically, motorically, and functionally different from speech sounds. At about 6 months of age, well into the babbling stage, the cooing sounds become more differentiated into vocalic and consonantal components, and new articulatory modulations appear (Lenneberg, 1967). By the third or fourth month of life the normal infant enters the
babbling stage of prelinguistic vocalization, but this does not seem to depend upon the auditory stimulation provided by others. The vocalizations of deaf and hearing children are indistinguishable in the first 3 months, and only gradually, after the age of 6 months, is there a decrease in the range of sounds uttered by the deaf (Lenneberg, 1964). Ervin and Miller (1963, p. 110) note: "Thus, the hearing of a variety of speech sounds may increase the range of sounds used by the child, but we do not know if the hearing of a particular range of sounds influences the particular range used by the child." The human infant, then, shows a predisposition to babble, but this may or may not lead to linguistic behavior. On the one hand, deaf children are unable to exploit babbling, while, on the other hand, linguistic competence has been demonstrated in a youngster who, because of a congenital neurological defect for speech articulation, presumably never babbled or attempted to imitate sounds (Lenneberg, 1962). While there are a number of issues that center on babbling, there seems to be no evidence of any normal speakers of the model language who have not gone through a babbling stage. (Persons who have been profoundly deaf from infancy are often taught to speak intelligibly, but never normally, including the phonemes of pitch, stress, and intonation in English.) There is some evidence of a developmental sequence for the appearance of the sounds that characterize babbling (most notably in the studies of Irwin, 1946, 1948, 1951, and 1960), but the virtue of having adequate samples is offset by two serious defects, as Ervin and Miller (1963, p. 110) note: "One is that they have seldom recorded the complete range of infant sounds, providing no record of rounded front vowels (as in French "tu," German "böse"), glottal trills, implosives, or clicks.
The other is that they have not separated sounds uttered during babbling or cooing and those constituting variants in systematic language. Linguists have noted marked differences in vocal behavior even when babbling and language occurred at the same age." Carroll (1960), referring to the same material, observes that the sound sequences in babbling appear to be more or less random and bear no direct relation to the sequences observed after true linguistic behavior begins. And he points out that Jakobson's (1941) theory of the sequential acquisition of phonemic distinctive features does not apply to the babbling stage, but only to the period after true language acquisition has begun. Mowrer (1960) has suggested that there is secondary reinforcement in hearing one's self speak as the rewarding parent speaks, presumably leading to greater imitation of the model. This would account for both increasing quantity of sounds and increasing approximation to adult sounds, as acknowledged
by Ervin and Miller (1963). Yet, as indicated above, there is no direct correspondence between babbling sequences and the sequences of the model language. Moreover, Lenneberg (1964) has reported that sound spectrographic analysis has revealed objective differences between the babbling sounds of children and the speech sounds of adults, while mothers have not been able to imitate the babbling sounds of their children with any degree of success. For reasons such as these, Weksel (1965, p. 697) has expressed serious doubt that "the infant's earliest motivation for speech development is due to his discovery of the similarity between his own sounds and those made by his mother while attending to his needs." Eisenson et al. (1963) characterize lallation, their fourth stage of prelinguistic development, as involving the child's imitation of his own accidentally produced sounds, as against echolalia, the fifth and final stage, which is characterized by the imitation of the sounds produced by others. Lallation is said to occur during the second 6 months of life, while echolalia begins around the ninth or tenth month. Eisenson et al. (1963, p. 195) note: "The lallation and echolalic periods are of tremendous importance because during these stages the child acquires a repertoire of sound complexes which ultimately he will be able to produce at will, and which he must have before he can learn to speak or acquire a language in the adult sense." While Eisenson and associates have done a service in trying to sort out developmental stages in prelinguistic development, their statement carries implications which do not accord with the observations of other investigators.
Carroll (1960) has already been cited to the effect that babbling sound sequences do not correspond to the sequences of the model in any consistent way, and he notes further that after babbling (which would include lallation and echolalia) ceases, usually before the end of the first year, the child may appear to have temporarily lost the ability to produce certain sounds. Tischler (1957), cited in Ervin and Miller (1963), noted that frequency of vocalization reached a peak at 8 or 10 months of age, then declined for the 17 children in his study. Virtually all conceivable sounds occurred during the eighth to twelfth month, including some that were not in the adult language. Simply because there is no direct progression from a stage in which all sounds are random to the stage at which all sounds and sound sequences match those of the model, it should not be assumed that linguistically relevant behavior does not occur during the months prior to the production of unmistakable words. Weir (1966) and Eleanor Maccoby of Stanford sampled the vocalizations of infants between 6 and 8 months of age in households in which the primary languages were Mandarin Chinese, Syrian Arabic, and American English, respectively. They were usually able to identify the Chinese
infant by distinct pitch patterns, but were unable to easily distinguish the two Arabic babies from the American one. A subsequent, more extensive study along the same lines by the same investigators was undertaken, with some very preliminary observations reported in Weir (1966). At approximately 6 months of age, very different patterns could be seen developing for a Chinese baby and for an American and a Russian baby. Weir (1966, p. 156) notes: "The utterances produced by the Chinese baby are usually monosyllabic and only vocalic, with much tonal variation over individual vowels. A neutral single vowel with various pitches is also typical of another six-month-old Chinese infant, as well as of a still different seven-month-old one. The Russian and American babies, at six and seven months, show little pitch variation over individual syllables; they usually have a CV (consonant-vowel) syllable, often reduplicated or repeated at intervals several times, with stress patterns occurring occasionally and intonation patterns usually over a number of syllables." Among the other investigators who have noted early intonation patterns, Weir cited Kaczmarek (1953) to the effect that the intonation period, following crying, cooing, and babbling stages, marks the beginning of the language-learning process, and occurs as early as the fifth month. Weir also cited Lewis (1951, 1963) and Ohnesorg (1948, 1959) on the originally affective quality of intonation. Lewis argues for the early development of representational intonation from expressive intonation. This would seem to be a fruitful area for exploitation by learning theorists, but it appears to have been neglected by most students of child language. While it is likely that the term "linguistic behavior" will continue to be reserved for utterances which include identifiable words (Ervin and Miller, 1963, p. 109, require "at least two systematically contrasted meaningful words, a point usually reached by the end of the first year"), it seems apparent that the first unmistakable steps are taken around the sixth or seventh month with the acquisition of tonal or intonation patterns which depend upon particular linguistic environments for their distinctive characteristics.

THE LINGUISTIC ENVIRONMENT

All theories of language ontogenesis depend upon the fact that the end point of the process is some adult model. The special problems that arise when the linguistic environment contains more than one model (any multilingual community, for example) will be discussed in the next chapter. Even in the most simple case, however, it is apparent that the infant is subjected to a wide variety of auditory stimulation. A great deal of this stimulation is, by any definition, noisy and must be distinguished from relevant stimuli. White noise
is probably sufficiently unpatterned to pose a problem. And it may be that vocalizations produced beyond the vision of the infant are not immediately relevant. But if it is truly the case that vocal behavior, including crying, is keyed to the behavior of those who attend the needs of the infant, then it would seem to be important to examine that behavior. Usually it is assumed that one need consider nothing beyond the normal adult model of the relevant speech community, but this assumption may be unwarranted. As Ohnesorg (cited in Weir, 1966) points out, there is some overarticulation of intonation by the adult when addressing the child, which may lead to infant productions which approach caricatures of the normal adult patterns. There is also a possibility that there are a proportionately greater number of yes-no questions tossed at the child than would appear in adult discourse. This would mean a disproportionate number of rising intonation patterns, a possibility that Weir seemed to think might account for some of the difficulties she had encountered in interpreting her own materials. So far as labeling is concerned, it is important to note that a great number of adult vocalizations are geared to specific situations, such as feeding, preparing to depart, or just playing with the infant. There is typically a tremendous amount of repetition on the part of adults when addressing infants. The babbling character of the adult vocalizations eventually disappears and is inversely related to the prelinguistic and linguistic development of the child. The following passages (from Howell, 1967) are of interest in showing the babbling behavior of adults with different primary language backgrounds. The first passage was recorded on January 28, 1960, in Fukuoka Prefecture, Japan, for the purpose of obtaining a record of the vocalizations of 3-month-old David. The second passage was recorded for the same purpose on February 10, 1960.
Neither the parents nor the maid had the concept of adult babbling when the recordings were made. Numbers refer to the location of the passage on the tape, which was recorded at 7½ inches per second.

First Passage

008 (Mother), 009, 010, 016; 022 (Father):

Hi [each time a very long diphthong, rising, then falling]. Hi. Tsk tsk tsk, well say something! [Laughs] Come on, yes yes yes. Tsk tsk tsk. [Father asks the date and she tells him, then repeats to the infant] This is the 28th, how are you? Tsk tsk tsk, tsk tsk tsk. Come on. Coo gee gee [g is unvoiced], gee gee gee, gee gee gee, come on. Hi, hi. Tsk tsk tsk. Hi, come on. Come on. Talk to Mommy. Talk! Yeah, hi. Gee gee gee, hi. David, are you gonna be a chatterbox? Huh? Hey, be a chatterbox. Okay? Huh?

Even after both parents became aware of the babbling nature of their speech, it was extremely difficult to change the pattern.

Second Passage

080, 082, 087, 088, 090, 092, 096, 109, 117, 119, 121, 124:

Hai, Debi-chan [Yes, Davey]. Debi-chan. [Incoherent] Do shimashita? [What's the matter?] Iya desu ka? [Don't you like it?] Iya desu ka? Okashii, ne! [Funny, huh?] Okashii ne? [Funny, huh?] Okashii desu ka? Okashii desu ka? Okashii desu ka? Okashii desu ka? Hai. Hai. Hai hai hai. Hai hai hai Hai-yai-yai. Haai. [Laughs] Do shimashita? Do shimashita ka? [Baby laughs in response to series of soft, high-pitched sounds] Kuchuguttai wa [You're ticklish!], kuchiguttai wa [You're ticklish!], kuchuguttai wa. Kosobayui, Debi-chan [Davey's ticklish]. Kosobayui Debi-chan. Okashii, ne? Kuchuguttai wa, Kuchuguttai wa. Okashii. Ara ara ara! Hey hey hey! Dare desu ka? [Who's this?] Dare desu ka? Iya-iya to iu no [You say you don't like me] Dame desu yo [Shame on you] Dare desu ka? Iya-iya-ntte [You say you don't like me] Dame desu yo. [After baby has begun to cry] Hai, hai, hai. Ohoho. Oioioi. Oyi. Dare desu ka? Kate desu [Incoherent, but baby soon grows quiet.]

In all cases the adults were much more gentle in their verbalization than they ordinarily were when addressing other adults. The constant repetitions appear to be governed by the responses of the infant, with words, syllables, and facial expressions more likely to recur if the adult is rewarded by a laugh or smile on the part of the infant. At the end of the second passage, when the baby has begun to cry, the maid steps up her tempo considerably (though it does not show in the written representation). Her objective seemed to be to distract the baby. In any event, the model which was being offered throughout is not typical of adult discourse. There is considerable "baby talk" in the maid's utterances. At 096, for example, kuchuguttai (kuchiguttai) is a modification of kusuguttai, the local dialect form of kosobayui 'ticklish'. Affricating the sibilant frequently conveys a diminutive sense, as seen also in the -chan, the form of the polite suffix -san, which is appended to the names of youngsters and elsewhere signifies a particularly intimate relationship. Ohnesorg's observation on the distortion of intonation patterns has already been cited. One doubts that the child's progress toward mastery of the model is attributable in any significant way to the special characteristics of the utterances which are directed toward him by adults. Such utterances may often be syntactically simple and children may receive a certain amount of language tuition, but a great deal of the model which is presented to the child is done so indirectly. What the child overhears is normal adult speech, for the most part, and that is filled with false starts, slips, and in general the range of departures from idealized grammar that we call hesitation phenomena. We might note the presumed reinforcing effect of interaction with age mates who also have grammars which are not congruent with the model. It is obvious that the task of the child, eventually to synthesize the grammatical rules of the model from the distortions and imperfections of the linguistic performances around him, is extremely complex and is not easily accounted for by imitation. Here we might note, incidentally, that the passages illustrating adult babbling contained no tangible referents except "mommy" and the baby himself. At this stage, the adult communication seems essentially to be expressive and probably would be as effective if conveyed through nonsense syllables. In addition to the problem of what exactly does constitute the linguistic environment of the child, there is also a problem of precisely how, or even whether, adults provide or withhold rewards for the production of acceptable sound productions. We assume that this is an important feature of language ontogenesis, but there is a dearth of research material on this problem. Some parents probably accept anything they can interpret, while others may attune their responses more precisely to utterances which approximate those of the adult model. This is a problem with important practical implications because many of the so-called culturally deprived speak what are considered to be very substandard varieties of the model.
Bernstein (1966) has argued for the differentiation of restricted and elaborated codes, in which the former is more geared to expressive behavior and assumes greater familiarity between speakers; the restricted code is said to be more representative of blue-collar family interaction, though white-collar families maintain restricted codes along with their more elaborate code. Ghetto children frequently appear as speech problems in nonghetto schools, particularly in the lower grades, and this appears to be attributable, at least in part, to the possibility that the mother will respond to the underlying meaning of the communication even if it is poorly articulated. Indeed, it seems likely that some children would withhold appropriate utterances because of the greater amount of attention which may thus be obtained from the listener. It also seems likely that many mothers will accept "sublinguistic" vocalizations because it is less taxing than trying to elicit more suitable utterances. These are all problems which begin early but
continue well into stages at which we expect children to have gained an essential mastery of the model.

LINGUISTIC STAGES OF DEVELOPMENT

A great deal of important preparatory activity has taken place prior to the production of intelligible words, some time around the end of the infant's first year. There is no clear point at which we can say without fear of contradiction that true linguistic behavior has begun and that all previous behavior, however vocal, is nonlinguistic. Intonation patterns, which in English constitute one aspect of the phonemic system, show the influence of the linguistic environment around the sixth or seventh month. Yet it is the use of symbols which distinguishes language from other forms of communication, and the most elementary manifestation of this process is naming or labeling. This means using words, where there is no intrinsic association between a sound sequence and its referent. Thus there is reason to designate as the first stage of linguistic behavior the use of unmistakable words. But naming is simply an essential part of the process. If language consisted of no more than naming or the use of labels, the Jenkins-Palermo model, which depends on imitation and "simple S-R connections between verbal labels and salient features of the environment" (Jenkins and Palermo, 1964, p. 165), might with minor alteration serve quite adequately. But as Vetter (1969, p. 53) has phrased it: "The deficiencies of the Jenkins-Palermo type formulation become manifest when it is called upon to deal with the fact that children, in the natural course of language development, are required to produce various types of linguistic structures in the absence of appropriate and explicit examples of such structures in their linguistic habitats." If one-word utterances mark the first stage of linguistic development, then the first sequencing of words, the use of two-word utterances, is a convenient mark for the second stage.
Of course there is no clear differentiation between one-, two-, and multiword stages and, after a period in which two- and three-word utterances are pretty typical, events move rather quickly. Not only is there a sharp increase in the vocabulary but also in the number and complexity of grammatical patterns. By the age of 4 years the normal child has acquired the essential patterns of daily verbal interaction. There is no final
stage, of course, since many adults continue to refine their use of the language and add to the range of utterances they can comprehend. After treating the development of the phonemic system, we shall consider stages which are characterized by one-word utterances, two-word utterances, multiword utterances, and, rather briefly, some of the problems that arise through the study of reading and writing. Finally, we shall summarize with a consideration of what is necessary for the development of an adequate theory of language ontogenesis.

DEVELOPMENT OF THE PHONEMIC SYSTEM

As Ervin and Miller (1963) observed, we cannot begin to analyze the
structure of the child's language until he has at least two systematically contrasted meaningful words. This limitation applies not only to what we usually think of as "grammar," but also to the analysis of phonemic systems. When O. C. Irwin and his associates, previously cited, refer to the appearance of "phonemes" during the early months of the infant's vocalization, this is something of an abuse of terminology, as Weir (1966) and Carroll (1960) have noted. This is because a phoneme refers to a certain range of sounds in a given language that can define meanings. For example, we know that g and d are different phonemes in English because we can find different words that are otherwise pronounced identically, as bud vs. bug. On the other hand, we do not distinguish the rather strongly aspirated "p" in pin from the unaspirated "p" in spin. We say that the two kinds of "p" are representations of a single phoneme. In Chinese, however, the two kinds of "p" are just as different as the "g" and the "d" are in English, in that the use of one "p" rather than the other can result in a different word. Thus, there is no way to tell if two sounds are different phonemes or just variations (allophones) of a single phoneme unless we have a corpus of words to analyze. Since the sounds which appear during babbling have no direct bearing on the sounds that constitute the phonemic system of the model (many are produced which are later dropped, and some of them have to be re-acquired later because they are significant in the model), the babbling sounds need not be considered relevant here. Intonation is relevant and seems to begin early, but we lack sufficient evidence to relate it usefully to a general discussion of the phonemic system as it develops later, beyond Lenneberg's (1967) observation that the child reacts to whole patterns rather than to small segments.
This appears to account for the fact that babies of Chinese-speaking parents show a markedly different intonation pattern from that of the babies of English-speaking parents as early as about 6 months (Weir, 1966). The difference is generally of the sort that we might expect to see developing in a potential speaker of a tonal language as against the pattern for a potential speaker of a nontonal language. The whole-pattern characteristic continues throughout the years during which the child is gaining mastery of his phonemic system. As Lenneberg (1967, p. 279) describes the process: "The first feature of natural language to be discernible in a child's babbling is contour of intonation. Short sound sequences are produced that may have neither any determinable meaning nor definable phoneme structure, but they can be proffered with recognizable intonation such as occurs in questions, exclamations, or affirmations. The linguistic development of utterances does not seem to begin by a composition of individual, independently movable items, but as a whole tonal pattern. With further development, this whole becomes differentiated into component parts; primitive phonemes appear which consist of very large classes of sounds that contrast with each other. R. Jakobson (1941) was the first to point this out clearly."

Thus, the infant in the later stages of babbling has very likely produced an extremely broad range of sounds; such productions vary in the degree to which they are under firm cortical control. That is, not all of those sounds can be produced at will, while others may be produced in some phonetic environments but not in all phonetic environments. This results in collapsed phonemic systems, in which a number of phonemes in the adult model may be represented by a single phoneme in the system of the child. To give a simple illustration, David, the object of adult babbling in the passages given above, had a single initial d at the age of 3 years where the English model has both d and l. Thus, he would say I dike you where we would say I like you. In general, it seems that the younger the child is, the more collapsed the phonemic system will be.
That is, the younger the child, the broader the sound categories that constitute the phonemic system. This can considerably obscure the identification of vocabulary items, especially at the one-word utterance stage. An example of how difficult the problem is may be found in Morris Swadesh's observation of a pattern in his son's phonemic system (reported by Ervin and Miller, 1963, p. 115): "Final and medial consonants of the adult's words were dropped by the child. The initial consonant was replaced by a nasal if a noninitial nasal was found in the adult's word; a labial was replaced by the labial nasal m, and a nonlabial was replaced by n: blanket me, green ni, candy ne. Complicated substitutions of this type are not at all uncommon, but they are ordinarily not recognized by the parent."

It is possible to demonstrate that there is a difference between competence and performance in the development of the phonemic system. David rejected adult imitations of his system and Brown and Berko (1960) provide a similar example.

Investigator: That's your fis?
Child: No, my fis.
Investigator: That's your fish.
Child: Yes, my fis.

That is, the child's s represents the model s and sh, but the child can hear and distinguish the two. Moreover, it is clear that he considers them to be different, even though he cannot articulate them differently. His linguistic performance in this respect lags behind his linguistic competence, as we might expect.

Because of the collapsed nature of the early phonemic system, the transition from babbling to the production of one-word utterances is likely to be somewhat confused. Thus when Smith (1926) reports that the year-old child, on the average, can produce and respond to three words, it is hard to be confident that the figure is accurate. Aside from the confounding of competence and performance, it is probable that some productions simply were not recognized as such. So far as the demonstration of competence is concerned, this appears to take place over a period of years. According to Templin's (1966) data, most children have mastered all the phonemes of English by the age of 8 years, and any gross distortion of a phoneme or substitution of one phoneme for another is considered to indicate a speech (articulatory) problem.

Phonemes may be described in terms of distinctive features, which are identified through minimal pairs, that is, different words which differ only in the feature under investigation. Jakobson (1941, cited in Carroll, 1960) has suggested that the child learns to produce the distinctions required by the model in a definite developmental sequence. The order is said to reflect the prevalence of the contrasting features among the languages of the world. The distinctive features that occur most rarely in the world's languages tend to be the last which are mastered by the children who speak those languages.
Thus, the distinction between the initial sounds of free and three is required in English but in very few other languages, and it appears to be one of the relatively late distinctions learned by English-speaking children. Once a contrast is learned it tends to permeate the whole phonemic system. When Joan, a child described by Velten (1943, cited in Ervin and Miller, 1963), learned to contrast p and b she also learned to contrast t and d. That is, when she learned to distinguish the p and the b she was not simply sorting out two phonemes which had formerly been one in her system, but she was developing a more abstract distinction: voiced vs. voiceless stops. In this way, a child could double his repertoire of consonants with each pair of contrasting features. "The theory presents an economical process of learning since the number of contrasting features is much smaller than the number of phonemes. Radical changes in the system come at once rather than through the gradual approximation of the adult phonemes one by one" (Ervin and Miller, 1963, p. 112). Jakobson's approach has been applied by Gregoire (1947), Leopold (1953-1954), and Velten (1943).

On the evidence of diary reports on individual children, Ervin and Miller (1963, pp. 113-114) offer the following tentative generalizations: "(a) The vowel-consonant contrast is one of the earliest, if not the earliest, contrast for all the children. (b) A stop-continuant contrast is quite early for all children. The continuant is either a fricative (e.g., f) or a nasal (e.g., m). (c) When two consonants, differing in place of articulation but identical in manner of articulation exist, the contrast is labial vs. dental (e.g., p vs. t, m vs. n). (d) Contrasts in place of articulation precede voicing contrasts. (e) Affricates (ch, j) and liquids (l, r) do not appear in the early systems. (f) In the vowels, a contrast between low and high (e.g., a and i) precedes front vs. back (e.g., i and u). (g) Consonant clusters such as st and tr are generally late. In regard to contrasts at different positions within the word, certain tendencies are observed. Children normally acquire initial consonants before final or medial consonants, and consonantal contrasts often apply to initial position before other positions."

If there is a generally valid progression to the acquisition of contrasting features, it is probably attributable to the relative difficulty of the requisite discriminations and motor skills. From the standpoint of learning theory, it might be more profitable to focus on these characteristics than on "phonemes," since the latter are already abstractions of allophones and thus may cover a variety of articulatory differences. Unfortunately, it has not been customary in this country to present findings in these terms, but Ervin and Miller (1963) summarize such a presentation in a Russian study (Shvarchkin, 1960). They also have additional discussion of further complexities in the development of the child's phonemic system.
Even this limited treatment should serve, however, to indicate the difficulty of trying to apply any simple (or complex!) stimulus-response model to the phenomena of language ontogenesis.

In summary, babbling may result in a measure of cortical control over the quality of vocalizations, but there is no direct correspondence between the number, quality, or sequencing of sounds during babbling and the development of the child's phonemic system. Even if the child of 10 months more or less randomly produces virtually any sound which is significant in any language in the world, he can take an additional several years to master the intricacies of his phonemic system without being considered to have a serious articulatory problem. This is well beyond the stage at which he will have mastered the basic grammatical patterns of his language. During the early stages of linguistic behavior phonemic systems are likely to be collapsed, in that a limited range of sounds in the child's repertoire may represent a broad range of sounds in the model. Yet so far as competence is concerned, it can be demonstrated that the child can hear and differentially respond to the greater variation of phonemes in the model. Finally, the child has a strong tendency to apply the findings from one contrast set across the board. That is, the distinctive features, such as voicing, are applied by analogy to contrast sets other than the one for which the discovery was originally made, or demonstrated.
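
The relation between a collapsed phonemic system and the across-the-board spread of a newly learned contrast can be made concrete with a small sketch. It is offered only as an illustration: the feature table is a deliberately simplified assumption (six English stops, two features, with t and d labeled "dental" to follow the terminology quoted above), and the function name child_categories is hypothetical rather than drawn from any of the studies cited.

```python
from collections import defaultdict

# Assumed, simplified feature table for six stops of the adult model.
ADULT_STOPS = {
    "p": {"place": "labial", "voiced": False},
    "b": {"place": "labial", "voiced": True},
    "t": {"place": "dental", "voiced": False},
    "d": {"place": "dental", "voiced": True},
    "k": {"place": "velar", "voiced": False},
    "g": {"place": "velar", "voiced": True},
}


def child_categories(adult, active_features):
    """Group adult phonemes that share values on every feature the child uses.

    Phonemes in the same group are represented by a single category in the
    child's (collapsed) system.
    """
    groups = defaultdict(list)
    for phoneme, features in adult.items():
        key = tuple(features[name] for name in active_features)
        groups[key].append(phoneme)
    return sorted(sorted(group) for group in groups.values())


# Before the voicing contrast: place of articulation only.
before = child_categories(ADULT_STOPS, ["place"])
# After the voicing contrast is learned, it applies to every place at once.
after = child_categories(ADULT_STOPS, ["place", "voiced"])

print(before)
print(after)
```

With place alone the six adult stops fall into three categories; adding the single voicing feature separates all three pairs together, which is the sense in which one new contrast can double the repertoire rather than adding phonemes one by one.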

ONE-WORD UTTERANCES

For about half a year between the ages of 12 and 18 months, the vocalizations of the child typically consist of single words. There are phonological, syntactic, and semantic differences between these utterances and those of the model, which Lenneberg (1967) sees as evidence not only of maturational factors, but also of a difference in learning strategy. The child learns patterns and structure rather than constituent elements first. Most adults who are learning or teaching a second language seem to begin with particular attention to the phonetic skills required and later take up the problems of syntax and semantics. Krech and Crutchfield (1958) have reported adult studies of the relative effectiveness of learning complex skills as a whole process as against mastering constituent processes of the whole and subsequently trying to join them into a single process. Their evidence suggests that the former approach is more effective.

The single-word utterances are often called one-word sentences because it is customary to refer to the sentence as the unit of discourse, or because the word uttered by the child appears to represent a complete thought. But most of the recent studies of child language tend to bypass this universal stage and move directly to two-word or longer utterances. The implication of such an approach is that grammar actually begins with the sequencing of two words (see McNeill, 1966a, 1966b, 1966c, for example). While he does not dwell on the point, Lenneberg (1967, p. 283) suggests that "if we assume that the child's first single word utterances are, in fact, very primitive, undifferentiated forms of sentences, and that these utterances actually incorporate the germs of grammar, a number of phenomena may be explained." For example, during this period the child may have a repertoire of several dozen words, including many which will later appear in sequence.
Yet he will not join two words and he cannot be induced to join them until a later period, when he will spontaneously join them into two-word utterances. The explanation seems to be that the use of single words in the early phase is grammatically different than the use of those same words later. The single words function as complete propositions and seem to carry some of the phonological burdens of the expansion. Lenneberg (1967, p. 283) notes: "Phonologically they may be operated upon by a given rule, much the way a whole string of symbols is operated upon later on; for example, one of a variety of intonation patterns influences the utterance, such as declarative, interrogative, or hortative pitch-contours. It is reasonable to assume that the formal processes that regulate the perception and production of sounds are essentially the same as those that enter into syntax and that the one-word stage is simply a transitional stage during which the rules are extended from the interaction of articulatory movements to the interaction of larger language units, namely morphemes and words, and that the eventual acquisition and mastery of grammar has its origin right at the beginning of language development. . . ."

If Lenneberg is correct, it would seem that more attention should be paid to this stage than has been customary. It is apparent that daddy will not represent the same expansion (deep structure) every time it appears, and a study of the pitch contours with which it is associated under different contextual circumstances should provide further clues to the grammar of the one-word utterance.

The child's use of single words is important not only for the rather subtle questions of grammar it poses, but also for an understanding of early concept formation. There is a likelihood that not only is the single word a primitive sentence (at least in many cases), but it may also, simultaneously, have a fairly specific reference function. Thus, before he was 1 year old, David (mentioned above) had coined a term, da-da, that included all of the Japanese women who regularly worked in the household, but it excluded everyone else in the household (parents and siblings) and it excluded the Japanese woman who frequently visited during the day and served as babysitter in the evening; also, the term contrasted with dædiy 'daddy.' Lenneberg (1967, p.
282) has suggested that with reference to one-word utterances that adults use, it is generally correct to say that "the meaning of words is uninterpretable in social commerce, unless we have enough clues with which to construct a sentence for that word." He is thinking of single-word sentences as examples of ellipsis, but there seem to be times when the child is labeling. If the child points to a book and says something approximating book, we may, of course, offer the expansion That is a book, but the grammatical question may not be relevant at the moment. It is possible, in other words, that the utterance of a single word may not invariably be the functional equivalent of producing a sentence.

Our lack of adequate comparative material from non-Indo-European languages obscures several questions. First, different languages codify experience differently (see Fishman, 1960, for a systematization of the Whorfian hypothesis), and this fact may carry implications for the kinds of problems encountered even in first-language learning. Even though any normal child can learn any language as his native tongue, this need not imply that all languages are equally easy or difficult for all purposes. It would seem likely that a language with a relatively simple phonemic system (such as Japanese or Spanish) would pose fewer articulation problems than a language with a more complex phonemic system (such as Korean or English). This, in turn, should imply that the performances of beginning speakers of languages with simple phonemic systems should be more readily interpretable by those attending them (there would be less need to puzzle out collapsed phonemic systems), and this in turn should mean that certain kinds of communication would be simpler at the early stages. The Japanese term mamma 'food' is a term employed by and to children and is learned very early, probably before the first birthday in most cases. It conforms essentially to the phonemic pattern of the model, is easy to articulate, and serves very effectively in adult-child communication. In English, on the contrary, there seems to be nothing which quite corresponds to this. At least we are not aware of common examples of American children at the age of 11 months clearly articulating words such as food, milk, or bottle, or anything linguistic that would be functionally equivalent. Our experience is that the youngster tends to cry and thus leave it to the ingenuity of the parent to determine what the proper response is.

Another problem for which we need comparative data is the effect of different morphosyntactic structures on the problems of first-language learning. Where word order distinguishes subject and object in English, a system of suffixes may perform this function in other languages, as in Latin, for example. So far as learning is concerned, are these equally easy or difficult? It is unlikely. First of all, as Slobin (1966, p. 135) points out, "All of the world's languages make use of order in their grammatical structure, but not all languages have inflectional systems."
Moreover, Greenberg (1963, cited by Slobin, 1966) has posited, as a linguistic universal, the appearance of subject before object in the dominant actor-action construction of a language, with the two most common patterns being subject-verb-object (SVO) and subject-object-verb (SOV). In the case of Russian, the SOV order is dominant at first in the speech of the child and is later replaced by SVO (around the end of the second year). In both English and Russian, the earliest systems are unmarked for tense, gender, number, and so forth. This is not surprising for English, perhaps, since word order is so important to it [but see Bever et al. (1965), who point out, as Slobin (1966) noted, that even in English order is not so important a feature of syntactic structure as one might think]. But in Russian, thanks to its inflectional systems, word order is highly flexible. However, as Slobin observes, even though one might expect Russian children to learn the morphological markers for subject, object, and verb and then combine them in any order, since they are exposed to such a variety of word orders, they learn the morphology later: word order for the Russian children is as inflexible as it is for American children.

The extent to which different language structures condition different learning patterns remains an important problem area. Kluckhohn (cited by Casagrande, 1948) felt that Navaho children take longer to learn their extremely complicated language than do English-speaking children, whose task appears easier. Leopold (1953-1954, cited in Ervin and Miller, 1963) said, on the basis of his detailed longitudinal study of an English- and a German-speaking child, that word order (syntax) develops before morphology (arrangements of morphemes within words), which supports the previous statement about the priority of word order in English and Russian. Burling (1959) thinks that while morphology is more important than syntax in Garo (as in Russian), so far as language acquisition is concerned, morphology and syntax appeared simultaneously (note the current tendency in linguistics to refer to "morphosyntax") and appeared to be of equal importance to the child.

While the term "unmarked" is commonly used to describe early grammar, this does not imply that the forms selected by the child are somehow independent of the model. Ordinarily it means that contrasts which are grammatically marked in the model (as singular vs. plural in English) are not contrasted in the speech of the child. One form or the other is preferred. In Russian, for example, the child may run through a succession of markers, overgeneralizing each in turn until the appropriate categories are under control, at which time the markers are suitably distributed among them (Slobin, 1966). It is rather as if the child were running through a sequence of hypotheses regarding each of the markers, or more generally, a sequence of hypotheses about the grammar of the model. And, as Saporta (1967, p.
17) says, "one aspect of this puzzle is how different children exposed to different sets of sentences devise essentially the same grammars."

It is understandable that the bulk of analyses of child language begin with two-word utterances in English, yet even in English there is a great deal which remains to be learned about one-word utterances. In part, this must await evidence from quite differently structured languages because we cannot fully appreciate the significance of those features which are selected without the perspective provided by exotic tongues. Brown and Bellugi (1964), for example, point out that the single-word utterances of American children carry primary stresses and have terminal intonation contours, but what happens in a language which lacks stress as a distinctive feature (such as Japanese, among others)? What of languages in which even the simplest words consist of several morphemes?

TWO-WORD UTTERANCES

In English, at least, the sequencing of two words marks a new stage of development, and it usually begins around 18 months of age. A two-word sequence is more than a mere joining of two independent entities. As mentioned above, single words such as push and car carry primary stress and have a terminal intonation contour. But when placed into a single construction, the primary stress and the higher pitch fall on car, while push carries a lesser stress and a lower pitch (Brown and Bellugi, 1964). The terminal contour remains for car but disappears from push; the two words are no longer separated by a terminal contour. The two-word sequence thus involves a higher level of complexity and implies mastery of two new rules, one having to do with differential stress patterns.

In addition to such tactical rules that become evident at the two-word level, there are also different form classes. There is an open class with a relatively large membership that consists for the most part of words that we would call nouns. Then there is a class of modifiers (Brown and Bellugi, 1964; Brown and Berko, 1960; Brown and Fraser, 1964), operators (Ervin, 1963, 1964), or pivot words (Braine, 1963). This is a relatively closed class with few members, but with each member getting a greater piece of the verbal action. While the point may not have been explicitly demonstrated, it seems probable that the open class contains a relatively high proportion of the items employed at the one-word stage, but it is nearly certain that some of the single items appear in the pivot class [Braine's (1963) night-night and bye-bye, for example, are in the pivot class but probably were used prior to the two-word stage]. Some of the expressions in the pivot class do not appear independently, as may be judged from the following example (Brown and Bellugi, 1964) which shows how selection from each class and the sequencing are done.
The pivot class of two children between the ages of 18 and 36 months included a, big, dirty, little, more, my, poor, that, the, and two. An item would be selected from this class and would be followed by a selection from the open class, forming a noun phrase. Thus, they found the children generating noun phrases of the sort: a coat, a Becky, a celery, my mommy, that Adam, more nut, and dirty knee (Brown and Bellugi, 1964, p. 152). Some of these noun phrases correspond to the adult model, but some do not. We use the indefinite article a only to modify common count nouns in the singular and we consider it inappropriate before proper nouns (*a Becky) or mass nouns (*a celery). Similarly, the adult model requires the plural when more modifies a count noun, so that *more nut would not be a grammatical phrase for us, and eventually it will not be appropriate for the grammar of the child. At this stage, then, the child commonly has a class of words which must ultimately be divided into subclasses (definite articles, indefinite articles, demonstrative adjectives, possessive pronouns, and so forth) if its grammar is to conform to the model.

Apparently phrases such as these appear quite early, along with the push car sequences. If these really do appear at about the same time, it is rather interesting. Lenneberg, as reported above, has rather seriously proposed that one-word utterances are actually primitive, but grammatical, sentences for the child (here we are, perhaps, somewhat guilty of according others who have described one-word utterances as sentences rather cavalier treatment in implying that their identical proposals are not serious). Lenneberg seems to feel that the child is responding to grand patterns rather than components of those patterns. We have the same feeling, but would add the qualification that some one-word utterances need not be considered as sentences, but rather labels. If Lenneberg may stand without our qualification, then we should expect all of the initial two-word utterances to be best characterized as sentences. And the push car kind of sequence would seem to support this notion. On the other hand, the a Becky kind of sequence could be interpreted as easily as an expanded label rather than as a primitive sentence. Support for this idea comes in the form of noun phrase type constructions which appear in Brown and Bellugi's (1964) earliest records as components of longer constructions. The little girl in their study said, for example, Fix a Lassie and A horsie stuck when she was 18 months or so of age, evidently at the beginning of her two-word sequences. What we are trying to suggest, then, is that the earliest two-word sequences have two origins which are traceable to the two functions of the one-word utterances. One function was that of a sentence and the other was that of a label.
The idea that two-word utterances in English may derive from different meanings of single-word utterances, much the way a given surface structure may derive from different deep structures, seems to derive support from McNeill's (1966a) study of how a Japanese child learned to use the particles wa and ga. At 27 months the little girl had good control of the latter, which may be described as having an essentially grammatical function, while wa, which may be described as having an essentially reference function, was typically omitted even though it appeared twice as frequently as ga in the mother's speech. From this McNeill concluded that the development of reference and the development of grammar are quite separate in children. And if Maclay and Osgood (1959) are correct, the normal production of adult speech involves the simultaneous selection of lexical elements and grammatical processes (and incompatible choices lead to hesitation phenomena).
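
The selection of one item from the pivot class followed by one item from the open class, described earlier in this section, can be sketched as a very small generative rule. The sketch is illustrative only. The pivot list is the one reported above from Brown and Bellugi (1964); the open class is an assumption limited to the nouns attested in the quoted phrases; and the noun subclasses and the function names are hypothetical, introduced solely to apply the two adult restrictions mentioned in the text (on a and on more).

```python
from itertools import product

# Pivot class as reported in the text (Brown and Bellugi, 1964).
PIVOT = ["a", "big", "dirty", "little", "more", "my", "poor", "that", "the", "two"]

# Assumed open class: only the nouns attested in the quoted phrases.
OPEN = ["coat", "Becky", "celery", "mommy", "Adam", "nut", "knee"]

# Assumed noun subclasses, needed only for the adult-model check below.
PROPER = {"Becky", "Adam"}
MASS = {"celery"}


def child_noun_phrases(pivot, open_class):
    """The child's rule: any pivot word followed by any open-class word."""
    return list(product(pivot, open_class))


def violates_adult_model(pivot_word, noun):
    """Check only the two adult restrictions named in the text.

    Other restrictions of the adult grammar are deliberately ignored.
    """
    if pivot_word == "a":
        # "a" is inappropriate before proper nouns and mass nouns.
        return noun in PROPER or noun in MASS
    if pivot_word == "more":
        # Every noun here is singular; "more" requires the plural
        # when it modifies a count noun.
        return noun not in PROPER and noun not in MASS
    return False


phrases = child_noun_phrases(PIVOT, OPEN)
flagged = [f"*{p} {n}" for p, n in phrases if violates_adult_model(p, n)]

print(len(phrases), "phrases generated by the child's rule")
print(flagged)
```

The point of the sketch is that a single undifferentiated pivot class overgenerates relative to the model: phrases such as a coat and *a Becky come from the same rule, and only a later division of the pivot class into subclasses would exclude the starred forms.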

IMITATION

Earlier we discussed imitation in connection with early phonology and the development of the phonemic system, where it became obvious that the production of the very young child bore very little in the way of a direct relationship to the model. The limitations of imitation as an explanatory principle become even more clear through an examination of how morphosyntactic rules develop.

The most obvious evidence that the rules of the English-speaking 2-year-old are not simply based on the adult model is the abundance of utterances for which there is no adult analogue. Brown and Bellugi (1964), for instance, offer the following examples from their material: a scissor, a this truck, you naughty are, put on it, cowboy did fighting me, and put a gas in. Ervin (1964) was able to induce simple grammatical rules to account for the following utterances by a boy 2 years and 2 months old: blanket water, bow-wow dog, here big truck, where go the car? and where's a big choo choo car? Three months later it was possible to induce rules to account for four categories of utterance, the largest of which was the declarative sentence that included examples such as: there's a bus, there's a green, here's a broken, and there's all-gone. Some of these are found in the adult model and some are not, but all conform to the same rules in the speech of the child in question; i.e., they are all grammatical according to his system.

Brown and Bellugi (1964) speak of "imitation and reduction" and "imitation and expansion." In the first case, reference is to examples of the omissions which appear when the child imitates its parent. Sometimes the imitation is complete, but only if the model utterance is quite short. More generally there will be omissions in proportion to the length of the utterance to be imitated. Thus, tank car was tank car; wait a minute was wait a minute; but daddy's brief case was daddy brief case; that's an old time train was old time train; and no, you can't write on Mr. Cromer's shoe was write Cromer shoe. Brown and Bellugi make the point that the imitations preserve the word order of the original, and they suggest that this may facilitate comprehension on the part of the adult. The omissions are function words, for the most part, items which relate the lexical items that carry the bulk of information. The result is a kind of telegraphic style. It happens, in English, that the function words normally enjoy a lesser stress than the lexical or content words, raising the somewhat academic question of whether the child is responding to the stress pattern or the semantic burden of the words that are retained. Brown and Bellugi were able to induce responses which stressed function words by presenting them with an appropriately distorted model, but, as they observe, utterances based on function words at the expense of the content words would not constitute an effective system of communication.

Imitation with expansion refers to the imitation that the adult performs in response to the truncated utterances of the children. When the boy in their study said There go one, for example, the mother imitated him but added the omitted functor: There goes one. In the study in question the mothers responded to the speech of their children with expansions about 30% of the time. In general the expansions consist of the original word order plus whatever additions are necessary to transform the child's utterance into an appropriate and grammatically acceptable equivalent in the adult model. A great deal of mother-child interaction consists of alternations of reductions and expansions. It seems likely that this type of interaction would greatly facilitate mastery of the model, though it may not be necessary. Here it would be instructive to actually measure the amount of such interactions that takes place in areas characterized as culturally deprived or otherwise such that Bernstein's (1966) restricted code might be expected to constitute the normal medium of communication.

While parents and children may imitate, or repeat, each other's utterances with expansions and reductions, respectively, it should be kept in mind that mastery of the adult model is not merely a matter of successively more perfect imitations of adult utterances. Even in the parent-child interaction described by Brown and Bellugi (1964) and by Ervin (1964, p. 172), as the latter points out, "imitations under the optimal conditions, those of immediate recall, are not grammatically progressive. We cannot look to overt imitation as a source for the rapid progress children make in grammatical skill in these early years." There are consistent speech productions by the child that have no analogue in the model, as already indicated. In some cases, these can be highly original and creative.
One of the girls, Lisa, in the Ervin and Miller (1963) study, had a problem with final sibilants for a considerable period and thus was not able to employ the plural suffix -s. So she invented a syntactic device to indicate plurality. Thus, one-two shoe meant more than one shoe. Ervin and Miller (1963, p. 30) note: "She later gave this up in favor of other number combinations. According to her mother, she would pick two numbers to indicate the plural, e.g., eight four shoe, and then after a few days she would pick another combination, e.g., three five shoe. She finally developed a final θ and was able to say shoeth. There may be a relationship between Lisa's syntactic marking of the plural and her earlier use of an operator class which had no adult analogue. In addition, at a later stage of development Lisa seemed to develop some grammatical rules of her own, rules which had no counterpart in the model language."

MORPHOSYNTACTIC DEVELOPMENTS

The more advanced the stages of language acquisition being considered, the more language-bound the discussion tends to become. Even at the level of single-word utterances there may be considerable differences in accordance with the language being acquired. Thus, most of the early one-word utterances in English consist of single morphemes, but in Japanese the single-word utterances frequently contain more than one. Thus akeru 'will open' and aketa 'opened' are differently marked for aspect or tense even though they both appear as single-word utterances (for David at 22 months). Developmentally they probably correspond to two-word utterances in English, where they would not constitute acceptable sentences in terms of the adult model, yet they are perfectly acceptable in Japanese. In the speech of an English-speaking child both utterances might take the form open it, or I open this, while a literal translation would be '[I'm going to] open [it]' in the case of akeru and '[I] opened [it]' in the case of aketa. The pronouns for the subject and object are optional in Japanese but are required in English, so that their omission is not noticed in the former but raises problems in the latter. Even though it seems possible for the Japanese-speaking child to approximate the adult model earlier than for the English-speaking child, this only implies that for the Japanese child linguistic performance bears a closer correspondence to linguistic competence. Relative competence (understanding of the rules of the model) may be about the same for children in both linguistic backgrounds. American and Russian investigators have devised ingenious experiments to determine the linguistic competence of children, but we are not aware of similar assessments outside Indo-European languages. And even if there were adequate comparative materials, it is still rather difficult to decide what constitutes equivalent processes.
Are English stress, Japanese pitch accent, and Chinese tones all equivalent? They are all phonemic, but they do not all carry the same burden. If tone were ignored in Chinese the result would be chaos, but ignoring Japanese pitch accent would rarely create difficulties. On the other hand, vowel length is a critical consideration in Japanese (but nonphonemic in Chinese and English). So far as relative importance is concerned, Chinese tones are slightly more critical than Japanese vowel length, while English stress is probably somewhat less important than either but more important than Japanese pitch accent. Word order is critical for Chinese and English, but is subject to great variation in Russian, while it enjoys a more intermediate importance in Japanese. It would seem that specific features are not directly comparable because their functions and importance vary from language to language. Comparisons evidently will have to be made at a higher level of abstraction.

There are a number of interrelated processes which appear to typify child language development generally, including analogic formations, overgeneralization, the expansion of telegraphic speech, an increase in utterance length, and the increasing mastery of rules relating syntactic units. Analogic formations involve generalization of the sort cat/cats to coat/coats, but overgeneralization can yield forms which are not found in the model, as foot/foots, or of the sort hit/hitted by analogy with pit/pitted. Actually such "incorrect" forms as foots and hitted show that the child has a rule for generating the plural and past tense, respectively. It frequently happens that a youngster will have the correct irregular plural or past forms and will later drop these for the overgeneralized forms. Presumably the initial use of the irregular forms implies the learning of specific words, as against generating forms on the basis of abstract rules. Later, of course, the child must refine his rules to generate the appropriate irregular forms. This is similar to the way the Russian child overregularizes, as reported by Slobin (1966, pp. 137-138): ". . . not only must the child learn an instrumental case ending for each masculine, feminine, and neuter singular and plural noun and adjective, but within each of these subcategories there are several different phonologically conditioned suffixes. The child's solution is to seize upon one suffix at first – probably the most frequent and/or most clearly marked acoustically – and use it for every instance of that particular grammatical category." When additional case endings appear, they may be at the expense of the endings that the child has been using, though the latter eventually reappear. As Slobin (1966, p.
139) says, "Practice clearly does not ensure the survival of a form in child speech, regardless of whether or not that form corresponds to adult usage (and, presumably, regardless of whether or not its usage by the child is 'reinforced' by adults). (This is very similar to the development of the past tense in English, in which irregular strong forms, like did, are at first used correctly, only to be driven out later by overgeneralizations from the regular weak forms, giving rise to transitory though persistent forms like doed.)" In the matter of overgeneralization and the development of analogic formations, children learning English and Russian seem to be very similar, but we need further evidence to see how these would work in typologically quite different languages. The telegraphic nature of child language in English, in which function words tend to be omitted, has been abundantly illustrated. The early language of the Japanese speaker also gives the impression of being telegraphic, again with certain functors (as the particles wa, ga, ni, wo, and de, for example) being omitted. That is, the high information words are the ones more likely to appear in the early language. Subsequent development in
English and Japanese involves expansion, including not only the addition of functors but also a general increase in utterance length. In either case it is likely that maturational factors are important, but these are difficult to sort out from other considerations, such as the kinds of experience to which the child has been exposed and the relationship of such experiences to the structure of the language in question. Slobin (1966) has traced the order of development and subdivision of grammatical classes in Russian, showing the importance of the semantic and conceptual aspects of the classes. The first morphological classes learned are those with concrete references, such as number and diminutives, followed shortly by the imperative mood. Classes based on more abstract relational considerations, as cases, tenses, and persons of the verb, occur later. Even though the grammatical structure of the conditional is simple, it is learned late, presumably because the conditional is semantically difficult. Similarly, the only noun suffixes that are learned before the age of three are those of clearly concrete or emotive reference, and grammatical gender, which is the most difficult feature for the Russian child to master, is almost completely lacking in semantic correlates. The development of English in the child appears not to have been traced in quite this suggestive fashion, though it may turn out that cross-language studies will depend upon correlations with the development of the conceptual and perceptual world of the child. In any event, Slobin's analysis suggests that we might expect to find those grammatical devices which carry the greatest personal significance for the child to be learned first, even though they may be relatively less important for the adult grammar as a whole. Brown and Bellugi (1964, pp. 134-135), for example, have noted that conversations involving the child are "very much in the here and now.
From the child there is no speech of the sort that Bloomfield called 'displaced,' speech about other times and other places. Adam's utterances in the early months were largely a coding of contemporaneous events and impulses." The American studies have been strongly influenced by the linguistic theory of Noam Chomsky, opening the possibility that once a serviceable general theory is developed, with an understanding of how the general theory is realized in the specific theories that constitute specific grammars, the way will be open for comparative studies of linguistic ontogenesis in purely generative or transformational terms. In the meantime, it would be useful if studies of ontogenesis in languages other than English were to be conducted in such terms. This assumes, of course, that transformational grammar will turn out to have the universal applicability that its proponents are trying to develop.
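The English overgeneralization sequence described above (memorized irregular forms, then regularized forms such as foots, hitted, and doed, then refined rules) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not a model proposed by Slobin or by the other studies cited: the three stage labels, the function names, the small memorized lexicon, and the simplified spelling rule for the regular past tense are all hypothetical.

```python
# Toy illustration only: stage names, lexicon, and spelling rules are
# hypothetical simplifications, not a model taken from the studies cited.
VOWELS = set("aeiou")

# Irregular forms the child is assumed to have memorized word by word.
MEMORIZED_PAST = {"do": "did", "hit": "hit"}
MEMORIZED_PLURAL = {"foot": "feet"}


def regular_past(verb: str) -> str:
    """Simplified regular (weak) past tense.

    Doubles a final consonant that follows a single vowel (pit -> pitted);
    otherwise adds -d after a final e, or -ed.
    """
    if (
        len(verb) >= 3
        and verb[-1] not in VOWELS
        and verb[-1] not in "wxy"
        and verb[-2] in VOWELS
        and verb[-3] not in VOWELS
    ):
        return verb + verb[-1] + "ed"
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"


def regular_plural(noun: str) -> str:
    """Simplified regular plural: add -s."""
    return noun + "s"


def produce(word, stage, memorized, rule):
    """Return the form produced at a given (hypothetical) stage."""
    if stage == "rote":
        # Only memorized forms are available; None if the word is unknown.
        return memorized.get(word)
    if stage == "overgeneralized":
        # The rule is applied everywhere, displacing memorized irregulars.
        return rule(word)
    if stage == "refined":
        # Memorized exceptions take priority; the rule covers the rest.
        return memorized.get(word, rule(word))
    raise ValueError(f"unknown stage: {stage}")


for stage in ("rote", "overgeneralized", "refined"):
    print(
        stage,
        produce("do", stage, MEMORIZED_PAST, regular_past),
        produce("foot", stage, MEMORIZED_PLURAL, regular_plural),
    )
```

The sketch restates the argument of the text: the overgeneralized forms are evidence that a rule is in use, whereas the earlier correct irregulars need reflect nothing more than memorized words.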

So far as English is concerned, it is possible to order, to some extent, specific grammatical features according to the age at which they are acquired. Thus, Brown and Fraser (1964) have correlated the appearance of be in the progressive and the modal auxiliaries will or can with the mean number of morphemes per utterance. In a general way this correlates also with age, but the number of morphemes provides a better predictor of when the grammatical features in question will appear. For 13 children, neither be in the progressive (such as is going) nor the modal auxiliaries (such as I can go) appear while the mean number of morphemes is 2.6 or less, while both are present by the time the mean number of morphemes per utterance has become 3.5. Morphemes, by the way, appear to be a better index of utterance length than words. It has already been pointed out that many one-word utterances of Japanese children consist of two morphemes at an age when their English counterparts are producing two-word utterances of one morpheme each. Bellugi (1966, cited by Lenneberg, 1967) has conducted an analysis of the development of questions in which she sees three stages. First a string of words may be turned into a question by a gradual rise in pitch, with negatives being indicated by the simple addition of no at the head of the string. At this stage the child appears not to understand the construction of certain types of questions. But at the second stage the child indicates, through the appropriateness of its answers to questions, that understanding has been achieved. Yet the child's productions involve only the rise in intonation (for yes-no questions) or a question word (such as what, how, and so forth). The third stage appears some 10 months after the youngster has begun forming two-word utterances. It coincides with the appearance of functional auxiliary verbs and negative sentences and is characterized by the production of well-formed questions.
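Returning briefly to the morpheme-based index of Brown and Fraser (1964) described above, the difference between counting words and counting morphemes can be shown with a minimal sketch. The hyphen convention for marking morpheme boundaries, the segmentation of akeru and aketa as ake-ru and ake-ta, the sample utterances, and all function names are assumptions made for illustration; only the values 2.6 and 3.5 are taken from the text, and they describe what was reported for the children studied rather than a general law.

```python
# Illustrative sketch: the hyphen convention and the sample utterances are
# assumptions; only the 2.6 and 3.5 values come from the text.

def morpheme_count(utterance: str) -> int:
    """Count morphemes in a hand-segmented utterance.

    Words are separated by spaces; morphemes within a word by hyphens.
    """
    return sum(len(word.split("-")) for word in utterance.split())


def mean_morphemes(utterances) -> float:
    """Mean number of morphemes per utterance."""
    if not utterances:
        raise ValueError("need at least one utterance")
    return sum(morpheme_count(u) for u in utterances) / len(utterances)


def mean_words(utterances) -> float:
    """Mean number of words per utterance."""
    if not utterances:
        raise ValueError("need at least one utterance")
    return sum(len(u.split()) for u in utterances) / len(utterances)


def auxiliaries_reported(mean_length: float) -> str:
    """Restate the reported finding for progressive be and the modals."""
    if mean_length <= 2.6:
        return "absent"
    if mean_length >= 3.5:
        return "present"
    return "not stated"  # the text gives no finding between 2.6 and 3.5


# Hypothetical samples: one-word Japanese utterances of two morphemes each,
# two-word English utterances of one morpheme per word.
japanese = ["ake-ru", "ake-ta"]
english = ["open it", "open it"]

for name, sample in (("Japanese", japanese), ("English", english)):
    print(name, mean_words(sample), mean_morphemes(sample))
```

Measured in words the two samples differ, while measured in morphemes they are equal, which is the sense in which morphemes serve as the better index of utterance length across languages.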
One interesting observation which Bellugi noted was a limitation on the number of transformations under control in the third stage. If a negative and a question were both to appear in a single utterance, only one or the other aspect was under good control. One child could ask Can't it be a bigger truck? in which it appears that the negative question is understood in both respects, but the same child revealed that the question aspect was less well controlled than the negative aspect in failing to make the necessary inversion for the question Why the kitty can't stand up? As Ervin and Miller (1963, p. 125) noted, "The great advantage of a transformational account is that a small number of rules compared to a large number of possible sentences results in an economy of description and a potential economy in learning as well." Yet the transformational account of the model may not reflect the grammar of the language of the child. Ervin (1963) notes:

"Chomsky has described the various uses of do in adult English economically as based on the same rule. Do we find the use of do in negatives, interrogatives, ellipsis, and emphasis appearing concurrently? Quite clearly this is not the case. As we have seen, don't appears early in negatives. It is often the only negative signal. In interrogatives, question words or a rising pitch signal a question, and do is typically not present until months after it appears in negatives or in ellipsis. . . . Sentences which are described as generated through transformation rules in the adult grammar, may be based on different, and simpler rules in the early stages of the child's grammar."

While Ervin's account might seem discouraging for those concerned with developing a common base from which to approach ontogenesis in divergent languages, we should note that the postulation of a single rule to account for the several transformations involving do in the adult model at least provides a perspective for the understanding of the way a child acquires the model usages of do. And before moving along to a general consideration of what might constitute an adequate theory of ontogenesis, we might note that Ervin's description of don't as a general negative signal is in close correspondence to Slobin's description of the Russian child's use of nyet as a general negative signal. Negation would appear to be one kind of grammatical feature that could be approached comparatively with ease and confidence.

THEORIES OF LANGUAGE ONTOGENESIS

A full appreciation of the complexities of language acquisition probably requires some rather basic grasp of what is required in order to describe the grammar of the adult. Of course, the full description of English has yet to appear and, for reasons to be illustrated below, may never appear. Yet, as Chomsky and perhaps all other serious students of the problem state, we do have an implicit set of rules that enable us to generate grammatical sentences and to comprehend the novel utterances which have been generated by the same set of rules or grammar. And the child somehow acquires or creates a series of grammatical systems which eventually correspond in all essentials to that of the model. So even though we may not be able fully to describe English, we must develop a theory adequate to explain how we all have come to know English (or whatever our native language is). The extreme difficulty of the problem may be seen in the following discussion of a series of strings of words, some of which are readily identified as English by all speakers and some of which would be rejected as English by nearly all speakers, at least initially. Most of the strings derive from examples provided by Chomsky, though the discussion more directly follows Lenneberg (1967):

colorless green ideas sleep furiously    (1)
furiously sleep ideas green colorless    (2)

Somehow string (1) strikes us as being more grammatical or more nearly like English than the second, though neither makes much sense. Thus, meaning, in the ordinary sense, is not the criterion for grammaticality. Nor is the probability of occurrence of the strings, because both had a probability of approximately zero before Chomsky introduced them. Just as the difference between strings (1) and (2) cannot be attributed to a difference in the probabilities of the individual words, neither can it be attributed to the transitional probabilities between parts of speech. That is, we do not invariably see a chain of the form adjective-noun-verb-adverb as being grammatical, while sometimes we see the reverse sequence as being grammatical:

occasionally call warfare useless    (3)
useless warfare call occasionally    (4)

String (3) is perfectly acceptable as an instruction, but (4) is not grammatical. Yet the order of the parts of speech is the same for string (3)

occasionally call warfare useless

and string (2)

furiously sleep ideas green colorless

On the other hand, the order is the same for strings (1) and (4), though (1) colorless green ideas sleep furiously is perceived as much more nearly grammatical than (4) useless warfare call occasionally. Nor may we attribute the sense of grammaticalness to the order of the morphemes -less, -s, and -ly, because this order makes sense in

friendly young dogs seem harmless    (5)

but it does not make sense in string (2). Since we know that word order is important for distinguishing subject and object in English, it is obvious that the meaning of the individual words cannot define the meaning of the sentence:

the fox chases the dog    (6)
the dog chases the fox    (7)

Yet the relative order of subject and object is insufficient to define the meaning of the sentence, because in

the dog is chased by the fox    (8)

the order is the same as that of (7) but the meaning is that of (6).
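The argument about the order of parts of speech in strings (1) through (4) can be checked mechanically with a small sketch. The part-of-speech labels are assigned by hand and are an assumption of this illustration, as are the names used; the acceptability judgments simply restate those given in the text, where (1) and (3) are treated as grammatical or nearly so and (2) and (4) are not.

```python
from itertools import groupby

# Hand-assigned parts of speech (an assumption made for this illustration).
TAGS = {
    "colorless": "ADJ", "green": "ADJ", "useless": "ADJ",
    "ideas": "NOUN", "warfare": "NOUN",
    "sleep": "VERB", "call": "VERB",
    "furiously": "ADV", "occasionally": "ADV",
}

STRINGS = {
    1: "colorless green ideas sleep furiously",
    2: "furiously sleep ideas green colorless",
    3: "occasionally call warfare useless",
    4: "useless warfare call occasionally",
}

# Strings the text treats as grammatical or nearly so.
ACCEPTED = {1, 3}


def pos_order(string: str) -> tuple:
    """Return the order of parts of speech, collapsing adjacent repeats."""
    tags = [TAGS[word] for word in string.split()]
    return tuple(tag for tag, _ in groupby(tags))


# Compare the pairs discussed in the text: (3) with (2), and (1) with (4).
for a, b in [(3, 2), (1, 4)]:
    same_order = pos_order(STRINGS[a]) == pos_order(STRINGS[b])
    same_judgment = (a in ACCEPTED) == (b in ACCEPTED)
    print(f"({a}) vs ({b}): same order={same_order}, same judgment={same_judgment}")
```

In both pairs the order of parts of speech is identical while the judgments differ, so the order of parts of speech cannot by itself be the source of the sense of grammaticalness.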

Ultimately the child will develop a competence that will tell him that strings (3) and (5)-(8) are grammatical and make good sense. String (1) seems grammatical but is semantically anomalous. Yet (1) is acceptable as being not only English, but a particularly elegant kind of English: namely, it is poetic. Why should this be so? The answer is not to be found here, but perhaps a clue is to be found in the response of a class of graduate students in descriptive linguistics at the City University of New York. After studying the sentences for a few minutes most of them said that they could accept all of the strings, and a few expressed a preference for (2) over (1)! Again the context was poetic, but the implication is rather strong that there is something about the brain that gives us such flexibility that we can learn to accept almost anything linguistic. Presumably the "meaningless" strings are somehow structured by the listener even though this does not appear to be possible according to the ordinary rules of the model. New constructions do appear in the language and, even aside from sarcasm, meanings can be reversed. Some years ago, for example, the expression I could care less was popular among American military personnel in Japan, though the clear meaning was I couldn't care less. Evidently the negative marker was missed but the meaning was induced from the context, and then somehow the expression became fixed in its new form. Grammatically there is no problem, since both forms of the expression are acceptable, but the example is not irrelevant to the question of language acquisition. Meanings ordinarily are induced from context and may override considerations of one's own grammar. Thus, we do not ordinarily produce double negatives in our utterances, but our linguistic competence is adequate to deal with them when they are produced by individuals whose speech variety includes them.
Similarly we all must have a competence which is adequate to deal with (comprehend) normal speech, which typically is filled with various hesitation phenomena, including false starts, fragmentary utterances, and a variety of grammatically anomalous strings. This point is well understood by modern linguists, of course, but the flexibility of the brain that is able to perform the stunt of comprehending what frequently borders on gibberish has not always been appreciated by many psychologists who devote their lives to the demonstration that word meanings depend upon specific referents, even though the very nature of the symbol is that it can represent anything. The ability to induce meanings and to structure them is present very early in the ontogenesis of language. It should be apparent by now that no theory which bases language acquisition on simple imitation is going to be very useful. An adequate theory will have to account for the structuring process. The child induces the grammatical and referential meanings from the utterances in his linguistic environment; these form the basis of his linguistic
competence. But his linguistic performance is different from that of the adult, whose performance usually provides a more immediate demonstration of his competence and frequently corresponds precisely if the adult is given adequate time to prepare his utterances (if his speech is not spontaneous). The principal concern of the linguist is competence rather than performance, la langue rather than la parole, the abstract language code rather than the particular speech events. McNeill (1966a, 1966b, 1966c) feels that the student of language ontogenesis is ultimately concerned with performance as well as competence, but that understanding of the latter is prerequisite to an understanding of the former. Yet the studies cited in the foregoing pages all point to the evident fact that the child develops rules of his own which can be induced by the investigator from the child's performances, while the child's competence is always tested in terms of the model of the adults. Since we know the future result of the child's operations (mastery of the adult model), this probably creates no serious difficulties, but it means there are some logical problems in statements of the sort offered by McNeill (1966a, p. 17): "If it is possible to describe a child's linguistic performance, we must show how it derives from competence; that is, how the regularities in a child's grammatical knowledge produce regularities in his overt linguistic behavior. Nothing short of this will suffice." We have said that the linguist is principally concerned with competence, but the psycholinguist must be concerned also with performance, as McNeill indicated. Still, our understanding of performance is a major basis for inducing competence (better ways, for some purposes, involve experiments calling for differential nonlinguistic responses to appropriate linguistic stimuli).
Moreover, the very flaws of performance provide valuable clues to the development of competence as well as to a broader range of psycholinguistic problems (see Howell and Vetter, 1969, and Lashley, 1951). Also, it can be shown that there are nonlinguistic rules that govern a good part of speech performances; in fact, this is a concept basic to the whole discipline of sociolinguistics. In brief, then, an adequate theory of language ontogenesis is going to have to account for the original and creative bridges constructed by the child to get from his original experimental and maturational limitations to the rules that underlie adult linguistic performances. Such a theory will have to include and go beyond a theory that will account for adult competence and performance, since that is only the final stage of ontogenesis. Very likely such a theory will have to be keyed to theories of cognitive development: "a child's capacity to recognize, discriminate, and manipulate the features and process of the world around him" (Carroll, 1964, p. 31).

NOTE ADDED IN PROOF

As this article went to press, the authors received the first prepublication reports of the University of California Child Language Development Project from Professor Dan Slobin. There was no time, unfortunately, to do justice to this exceedingly valuable and important new source of information, but we hope to do so in a forthcoming publication.

REFERENCES

Bellugi, U. (1966). Development of negative and interrogative structures in the speech of children. In Bever, T., and Weksel, W. (eds.), Studies in Psycholinguistics, Holt, Rinehart and Winston, New York [cited in Lenneberg (1967)].
Bernstein, B. (1966). Elaborated and restricted codes: an outline. Sociol. Inquiry 36: 254-261.
Bever, T., Fodor, J. A., and Weksel, W. (1965). On the acquisition of syntax: A critique of "contextual generalization." Psychol. Rev. 72: 467-482.
Boas, F. (1949). Race, Language and Culture. The Macmillan Company, New York.
Braine, M. D. S. (1963). The ontogeny of English phrase structure: The first phase. Language 39: 1-13.
Brown, R., and Bellugi, U. (1964). Three processes in the child's acquisition of syntax. In Lenneberg, E. (ed.), New Directions in the Study of Language, The M.I.T. Press, Cambridge, Massachusetts.
Brown, R., and Berko, J. (1960). Word association and the acquisition of grammar. Child Develop. 31: 1-14.
Brown, R., and Fraser, C. (1964). The acquisition of syntax. In Bellugi, U., and Brown, R. (eds.), Monographs of the Society for Research in Child Development, Vol. 29, No. 1, pp. 43-79. Also in Cofer, C. N., and Musgrave, B. (eds.), Verbal Behavior and Learning, McGraw-Hill, New York, 1963.
Burling, R. (1959). Language development of a Garo and English speaking child. Word 15: 45-68.
Carroll, J. B. (1960). Language acquisition, bilingualism, and language change. In Encyclopedia of Educational Research, The Macmillan Company, New York. Also in Saporta, S. (ed.) (1961). Psycholinguistics. Holt, Rinehart and Winston, Inc., New York.
Carroll, J. B. (1964). Language and Thought. Prentice-Hall, Englewood Cliffs, New Jersey.
Casagrande, J. B. (1948). Comanche baby language. Intern. J. Am. Linguist. 14: 11-14.
Chomsky, N. (1959). Review of B. F. Skinner's Verbal Behavior [Appleton-Century-Crofts, Inc., 1957, New York]. Language 35: 26-58.
Eisenson, J., Auer, J. J., and Irwin, J. V. (1964). The Psychology of Communication. Appleton-Century-Crofts, New York.
Ervin, S. M. (1963). Structure in children's language. Paper presented to the International Congress of Psychology, Washington, D.C., 1963.
Ervin, S. M. (1964). Imitation and structural change in children's language. In Lenneberg, E. H. (ed.), New Directions in the Study of Language, The M.I.T. Press, Cambridge, Massachusetts, pp. 163-189.
Ervin, S. M., and Miller, W. R. (1963). Language development. In Sixty-Second Yearbook of the National Society for the Study of Education, Part I, Child Psychology. University of Chicago Press, Chicago, Illinois, pp. 108-143.
Fishman, J. (1960). A systematization of the Whorfian hypothesis. Behav. Sci. 5: 323-339.
Greenberg, J. H. (1963). Universals of Language. M.I.T. Press, Cambridge, Massachusetts [cited in Slobin (1966)].
Gregoire, A. (1947). L'apprentissage du Langage (2 vols.). Librairie Droz S.A., Geneva, Switzerland [cited in Ervin and Miller (1963)].
Howell, R. W. (1967). Linguistic Choice as an Index to Social Change. Unpublished Ph.D. dissertation, University of California, Berkeley, California.
Howell, R. W., and Vetter, H. J. (1969). Hesitation in the production of speech. J. Gen. Psychol. 81: 261-276.
Irwin, O. C. (1946). Infant speech: vowel and consonant frequency. J. Speech Disorders 2: 123-125.
Irwin, O. C. (1948). Development of vowel sounds. J. Speech Hearing Disorders 13: 31-34 [cited in Ervin and Miller (1963)].
Irwin, O. C. (1951). Infant speech: consonantal position. J. Speech Hearing Disorders 16: 159-161 [cited in Ervin and Miller (1963)].
Irwin, O. C. (1960). Language and communication. In Mussen, P. H. (ed.), Handbook of Research Methods in Child Development, John Wiley and Sons, New York, pp. 487-516 [cited in Ervin and Miller (1963)].
Jakobson, R. (1941). Kindersprache, Aphasie und allgemeine Lautgesetze. Uppsala Universitets Aarsskrift [cited in Lenneberg (1967)].
Jenkins, J. J., and Palermo, D. S. (1964). Mediation processes and the acquisition of linguistic structure. In Bellugi, U., and Brown, R. (eds.), Monographs of the Society for Research in Child Development, Vol. 29, No. 1, pp. 141-169.
Kaezmare!
