Cognitive Science 40 (2016) 455–465 Copyright © 2015 Cognitive Science Society, Inc. All rights reserved. ISSN: 0364-0213 print / 1551-6709 online DOI: 10.1111/cogs.12243

Cross-Situational Learning of Minimal Word Pairs

Paola Escudero (a), Karen E. Mulak (a), Haley A. Vlach (b)

(a) MARCS Institute, University of Western Sydney
(b) Department of Educational Psychology, University of Wisconsin-Madison

Received 7 July 2013; received in revised form 19 December 2014; accepted 19 December 2014

Abstract Cross-situational statistical learning of words involves tracking co-occurrences of auditory words and objects across time to infer word-referent mappings. Previous research has demonstrated that learners can infer referents across sets of very phonologically distinct words (e.g., WUG, DAX), but it remains unknown whether learners can encode fine phonological differences during cross-situational statistical learning. This study examined learners’ cross-situational statistical learning of minimal pairs that differed on one consonant segment (e.g., BON–TON), minimal pairs that differed on one vowel segment (e.g., DEET–DIT), and non-minimal pairs that differed on two or three segments (e.g., BON–DEET). Learners performed above chance for all pairs, but performed worse on vowel minimal pairs than on consonant minimal pairs or non-minimal pairs. These findings demonstrate that learners can encode fine phonetic detail while tracking word-referent co-occurrence probabilities, but they suggest that phonological encoding may be weaker for vowels than for consonants.

Keywords: Word learning; Cross-situational statistical learning; Statistical learning; Phonologically minimal pairs; Consonants; Vowels

Correspondence should be sent to Paola Escudero, MARCS Institute, University of Western Sydney, Locked Bag 1797, Penrith NSW 2751, Australia. E-mail: [email protected]

1. Introduction Word learning is a difficult task. Explicit connections between words and their referents are not regularly made, presenting the learner with an often wide range of possible referents for one linguistic label in any one moment in time. While research has focused on how learners acquire word-referent mappings in single trials (e.g., Markman, 1989; Smith, 2000), research has more recently shifted toward examining how learners resolve ambiguity in word-referent mappings across several points in time. This phenomenon is most commonly termed cross-situational word learning or statistical word learning


P. Escudero, K. E. Mulak, H. A. Vlach / Cognitive Science 40 (2016)

(Fazly, Alishahi, & Stevenson, 2010; Fitneva & Christiansen, 2011; Frank, Goodman, & Tenenbaum, 2009; Kachergis, Yu, & Shiffrin, 2012; Vlach & Sandhofer, 2014; Yu & Smith, 2007, 2012; Yurovsky, Yu, & Smith, 2013). In a typical cross-situational word learning paradigm, participants are presented with a series of ambiguous learning trials consisting of multiple objects and multiple words, with no explicit indication of word-object correspondences. After the learning trials, participants are presented with a forced-choice test in which they are asked to identify object-label mappings. Adult learners can infer word-object co-occurrences at test (e.g., Yu & Smith, 2007) and retain mappings over extended periods of time (Vlach & Sandhofer, 2014). In brief, this body of work has revealed that learners can acquire object-label mappings by tracking word-object co-occurrences across learning trials.

To date, cross-situational word learning experiments have used words that contain minimal to no phonological overlap, such as WUG and DAX (e.g., Fitneva & Christiansen, 2011; Vlach & Sandhofer, 2014; Yu & Smith, 2007). In these cases, encoding full phonetic details is not necessary to distinguish words. However, in real-world word learning situations, many words do share segments, some even to the point of forming minimal pairs in which two words are identical except for one consonant, as in BET and DEBT, or one vowel, as in BET and BEAT. Consequently, in natural language learning environments, the phonological details that make up words must be encoded in order for the words to be successfully learned.

In explicit word learning tasks, adults have some difficulty distinguishing minimal word pairs (Escudero, Broersma, & Simon, 2013; Escudero, Simon, & Mulak, 2014; Papagno & Vallar, 1992). For instance, Escudero et al. (2013) presented Dutch listeners with a word learning task in which they were explicitly taught 12 pseudowords and their corresponding novel visual referents.
At test, listeners made more errors for words presented in a minimal pair (e.g., PAX–PIX) than for words presented in a non-minimal pair (e.g., BEEPTOE–PIX). Because adult learners can perceive the phonological distinctions of their native language, their experienced difficulty with minimal pairs is unlikely to be a result of failing to perceive differences across word segments. Instead, adults may struggle to encode phonetic details for novel words on the fly. Adults’ poorer performance for minimal word pairs relative to non-minimal pairs leads us to the question of whether listeners’ implicit tracking of cross-situational statistics is likewise affected by phonological overlap between words, and whether the type of phonological overlap is important to word learning.

Vowels and consonants seem to be perceived differently. While consonants tend to be perceived categorically (Liberman, Cooper, Shankweiler, & Studdert-Kennedy, 1967), vowels tend to be perceived in a more continuous manner (Beddor & Strange, 1982; Fry, Abramson, Eimas, & Liberman, 1962; Polka, 1995; Stevens, Liberman, Studdert-Kennedy, & Öhman, 1969). Consonants can also play a more prominent or salient role in word recognition (Nespor, Peña, & Mehler, 2003) and lexical access (Cutler, Sebastián-Gallés, Soler-Vilageliu, & van Ooijen, 2000; New, Araújo, Bour, & Nazzi, 2008a; New, Araújo, & Nazzi, 2008b) than vowels. Accordingly, novel words that differ by only one vowel may be less easily learned than words that differ by only one consonant. Thus, this study examined whether minimal


pairs can be learned during cross-situational statistical learning, and whether performance differs for consonant minimal pairs and vowel minimal pairs. We presented adult listeners with eight novel English words and eight picture referents and did not offer any explicit instructions regarding the aim of the experiment or relationship of the words to the picture referents. The eight novel words were CVC monosyllables such as BON and DEET that, when paired, formed (a) a non-minimal pair (nonMP), in which two or all three segments in each word differed (e.g., BON–DEET); (b) a consonant minimal pair (consMP), in which the initial consonant differed, but vowel and final consonant were shared (e.g., BON–TON); or (c) a vowel minimal pair (vowelMP), in which the vowel differed but the initial and final consonants were shared (e.g., DEET–DIT). The novel words were chosen from those included in previous studies on learning of novel minimal pairs (see Curtin, Fennell, & Escudero, 2009, for the words differing in vowels, and Fikkert, 2010, for those differing in consonants). Because we chose to use CVC stimuli from previous studies on novel word learning, position of the differentiating segments in consonant and vowel minimal pairs was not controlled for in this study. In sum, we predicted that adults would have lower performance on minimal pairs than on non-minimal pairs and that they would also demonstrate lower performance on vowel pairs than on consonant pairs.
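The three pair types defined above can be made concrete with a short sketch. The Python below is illustrative only and is not part of the study's materials: it assumes each CVC word is represented as an (initial consonant, vowel, final consonant) triple, and the segment labels in `WORDS` are informal placeholders rather than phonetic transcriptions of the stimuli. The function name `pair_type` is likewise hypothetical.

```python
from typing import Dict, Tuple

# A CVC word as (initial consonant, vowel, final consonant).
Word = Tuple[str, str, str]

# Placeholder segment labels (an assumption for illustration, not transcriptions
# taken from the study's recordings).
WORDS: Dict[str, Word] = {
    "BON": ("b", "o", "n"),
    "TON": ("t", "o", "n"),
    "DEET": ("d", "ee", "t"),
    "DIT": ("d", "i", "t"),
}


def pair_type(a: Word, b: Word) -> str:
    """Classify a word pair as nonMP, consMP, or vowelMP."""
    differing = [i for i in range(3) if a[i] != b[i]]
    if len(differing) >= 2:
        return "nonMP"      # two or all three segments differ
    if differing == [0]:
        return "consMP"     # only the initial consonant differs
    if differing == [1]:
        return "vowelMP"    # only the vowel differs
    # Identical words, or words differing only in the final consonant,
    # fall outside the three categories used in the study.
    raise ValueError("pair is not covered by the three pair types")


for first, second in [("BON", "TON"), ("DEET", "DIT"), ("BON", "DEET")]:
    print(first, second, pair_type(WORDS[first], WORDS[second]))
```

Under these assumptions, BON–TON is classified as consMP, DEET–DIT as vowelMP, and BON–DEET as nonMP, matching the examples given in the text.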

2. Method

2.1. Participants Participants were 71 undergraduates at the University of Western Sydney. A language background questionnaire administered at the beginning of the session revealed that 31 participants were monolingual English speakers and 40 participants spoke two or more languages (see Note 1). Participants' mean age was 23.24 years (SD = 7.66 years), and 54 were female. Participants received course credit for their participation.

2.2. Stimuli

2.2.1. Novel words Eight monosyllabic nonsense words were recorded by a female native speaker of Australian English. As shown in Fig. 1, the words followed a CVC structure, adhered to English phonotactics, and have been used in previous research on the acquisition of minimal pairs (e.g., Curtin et al., 2009; Fikkert, 2010). Four of the words differed minimally in their first consonant, whereas the other four differed in their vowel. Two tokens of each of the eight spoken words were selected for use in the experiment so that intonation contours were comparable across words.



each named once, either left to right, or right to left, with 500 ms between spoken words. There was no indication of the order in which the visual referents would be named. Learning trials were presented in random order for each participant. The trials were controlled such that each visual referent occurred with every other visual referent at least once and no more than twice. If the same pairing occurred more than once, the left and right designations of the images were swapped so that participants never saw exactly the same visual pairing more than once. As each word appeared nine times, the occurrence of an image in the left or right position was balanced so that half of the words’ referents appeared five times on the left and four times on the right, whereas the other half appeared four times on the left and five times on the right. Whether a visual referent was named first or second, and how many times each of the two tokens of each nonsense word were heard, were balanced similarly.

The word pair presented in each trial belonged to one of three possible phonological relationship types based on the phonological similarity between the two spoken words. These three pair types were (a) non-minimal pairs (nonMP), (b) consonant minimal pairs (consMP), and (c) vowel minimal pairs (vowelMP). The training set consisted of 24 non-minimal pairs, 6 consonant minimal pairs, and 6 vowel minimal pairs, for a total of 36 pairs. Each learning trial lasted 3.5 s. An attention video followed every third trial except the last and played until participants focused on the center of the screen. The total duration of the learning phase was approximately 3 min. Examples of learning trials are presented in Fig. 2.

2.3.2. Testing phase Testing began after completion of the learning phase. Participants sat in front of a laptop computer with a 15-in. monitor, which was set up next to the monitor used for the training.
The test trials contained the same pairs of novel objects as in training, but with the left and right designations of the images randomized once, in such a way that each participant received the same trials. For each trial, once the two images had been on the screen for 500 ms, the spoken word corresponding to one of the images (the target object) was played four times, in two alternating repetitions of the two tokens, with 500 ms between repetitions. Each word served as the target four or five times. As in the training set, the test comprised 24 nonMP trials, 6 vowelMP trials, and 6 consMP trials. The order in which the trials were presented was randomized for each participant. Participants were presented with a forced-choice test and were instructed to select via keyboard press whether they thought the spoken word in a trial corresponded to the left or right image. Each trial lasted 6.5 s, resulting in a total testing phase duration of approximately 4 min. Examples of test trials are presented in Fig. 2.

Fig. 2. Examples of learning and test trials.
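The design constraints of the learning phase are mutually consistent, which can be checked with a small sketch. The Python below builds one schedule that satisfies the stated constraints; it is not the authors' actual trial list. It assumes that the eight repeated pairings are all non-minimal pairs (this follows from the stated count of 24 nonMP trials, since only 16 distinct pairings of a consonant-set word with a vowel-set word exist) and uses `C4` and `V4` as placeholder labels for the two words that this section does not name. Referents are identified by the label of their word, and the co-occurrence counter is a deliberately simple learner, not a model proposed in this paper.

```python
from collections import Counter
from itertools import combinations

# "C4" and "V4" are placeholders for stimuli not named in this section.
CONS_WORDS = ["BON", "TON", "PON", "C4"]    # differ in the initial consonant
VOWEL_WORDS = ["DEET", "DIT", "DUT", "V4"]  # differ in the vowel
ALL_WORDS = CONS_WORDS + VOWEL_WORDS


def pair_type(a, b):
    """Pair type implied by which four-word set each word belongs to."""
    if a in CONS_WORDS and b in CONS_WORDS:
        return "consMP"
    if a in VOWEL_WORDS and b in VOWEL_WORDS:
        return "vowelMP"
    return "nonMP"


def build_schedule():
    """One possible 36-trial schedule (an assumption, not the real list)."""
    trials = list(combinations(ALL_WORDS, 2))  # every pairing once: 28 trials
    # Repeat 8 cross-set pairings so that each word gains exactly 2 trials.
    for i, cons in enumerate(CONS_WORDS):
        trials.append((cons, VOWEL_WORDS[i]))
        trials.append((cons, VOWEL_WORDS[(i + 1) % len(VOWEL_WORDS)]))
    return trials


trials = build_schedule()
type_counts = Counter(pair_type(a, b) for a, b in trials)
word_counts = Counter(word for trial in trials for word in trial)
pair_counts = Counter(frozenset(trial) for trial in trials)

# Simple co-occurrence tally: each word heard in a trial is counted
# against both referents shown in that trial.
cooccurrence = Counter()
for a, b in trials:
    for word in (a, b):
        for referent in (a, b):
            cooccurrence[(word, referent)] += 1


def best_referent(word):
    """Referent that co-occurred most often with the given word."""
    return max(ALL_WORDS, key=lambda referent: cooccurrence[(word, referent)])


learning_seconds = len(trials) * 3.5  # trial time only, no attention videos
testing_seconds = len(trials) * 6.5

print(dict(type_counts), dict(word_counts))
print(learning_seconds, testing_seconds)
```

In this schedule each word co-occurs with its own referent nine times and with any other referent at most twice, so a learner that simply picks the most frequent companion recovers all eight mappings. The trial durations alone give 36 × 3.5 s = 126 s for learning (the attention videos presumably account for the rest of the roughly 3 min reported) and 36 × 6.5 s = 234 s, or about 4 min, for testing.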

3. Results We were interested in whether learners would be able to infer word-object pairings via cross-situational statistical learning when faced with the task of simultaneously distinguishing phonologically similar words and tracking object-label statistics. As shown in Fig. 3, the mean percentage of correct responses over all subjects and all CVCs was better than chance for all three types of word pairs (nonMP: M = 74.26, SE = 2.39, t[70] = 10.168, p < .001; consMP: M = 74.65, SE = 2.45, t[70] = 10.045, p < .001; vowelMP: M = 67.14, SE = 2.80, t[70] = 6.127, p < .001). Thus, learners were able to infer word-object pairings for each category of phonological relationship between words.

Fig. 3. Difference in percent accurate identification of auditory words and their visual referents from chance after cross-situational training. Participants heard one auditory word in the context of two visual referents and were asked to select the image that corresponded to the auditory word. Responses are organized by trial type, based on the type of phonological overlap across the auditory words corresponding to the visual referents in each trial. The trial types are nonMP (non-minimal pair, e.g., BON–DUT), consMP (consonant minimal pair, e.g., BON–PON), and vowelMP (vowel minimal pair, e.g., DEET–DUT). Error bars represent one standard error.

We were also interested in whether there were differences in word learning performance across the three trial types. As our data comprised categorical responses, participants’ correct and incorrect responses were examined in a mixed-effects logit model (Baayen, Davidson, & Bates, 2008; Jaeger, 2008; see also Arnon, 2010) with pair type (nonMP, consMP, vowelMP) as a fixed effect (with nonMP as the reference category), and subject, pair and target (i.e., the auditory word played during the test trial) as random effects. There was a main effect of pair type (χ²(2, n = 2556) = 6.96, p = .031). Participants had fewer correct responses to vowelMPs than nonMPs (β = −0.31, 95% CI [−0.57, −0.05], p = .020, e^β = 0.73). No difference was found between consMPs and nonMPs (β = 0.07, 95% CI [−0.14, 0.28], p = .493, e^β = 1.08). In summary, only performance on vowel minimal pairs was worse than performance on non-minimal pairs.

To compare participants’ performance across the consonant and vowel minimal pairs, we conducted a further mixed-effects logit model with pair type (consMP, vowelMP) as a fixed effect, and pair and target as random effects. There was a main effect of pair type (χ²(1, n = 840) = 6.50, p = .011); participants had fewer correct responses to vowelMPs than consMPs (β = −0.38, 95% CI [−0.68, −0.09], p = .011, e^β = 0.68).
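The reported test statistics can be approximately recovered from the summary values. The sketch below, in Python, recomputes each one-sample t statistic as (M − 50) / SE and converts each logit coefficient to an odds ratio with e^β. It assumes that chance is 50% (two response options) and that the coefficients for the two vowelMP contrasts are negative, as the reported odds ratios below 1 imply; small discrepancies from the published values are expected because the inputs are rounded. The function names are hypothetical.

```python
import math

CHANCE = 50.0  # percent correct expected when guessing between two images

# pair type -> (mean % correct, standard error, t value reported in the text)
ACCURACY = {
    "nonMP": (74.26, 2.39, 10.168),
    "consMP": (74.65, 2.45, 10.045),
    "vowelMP": (67.14, 2.80, 6.127),
}

# contrast -> (logit coefficient, odds ratio reported in the text).
# Negative signs are an assumption inferred from odds ratios below 1.
CONTRASTS = {
    "vowelMP vs nonMP": (-0.31, 0.73),
    "consMP vs nonMP": (0.07, 1.08),
    "vowelMP vs consMP": (-0.38, 0.68),
}


def t_against_chance(mean, se, chance=CHANCE):
    """One-sample t statistic from a mean and its standard error."""
    return (mean - chance) / se


def odds_ratio(beta):
    """Odds ratio corresponding to a logit (log-odds) coefficient."""
    return math.exp(beta)


for pair, (mean, se, reported_t) in ACCURACY.items():
    print(pair, round(t_against_chance(mean, se), 3), "reported:", reported_t)

for contrast, (beta, reported_or) in CONTRASTS.items():
    print(contrast, round(odds_ratio(beta), 3), "reported:", reported_or)
```

An odds ratio of about 0.73 means that the odds of a correct response on vowelMP trials were roughly three quarters of the odds on nonMP trials.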

4. Discussion This study demonstrates that adults can learn minimal pairs, as well as non-minimal pairs, in a cross-situational paradigm. Learners simultaneously acquired cross-situational statistics and encoded the novel words in fine phonological detail that was required to differentiate minimal pairs. Thus, this study extends previous findings on cross-situational learning (Fazly et al., 2010; Fitneva & Christiansen, 2011; Frank et al., 2009; Kachergis et al., 2012; Vlach & Sandhofer, 2014; Yu & Smith, 2007, 2012; Yurovsky et al., 2013) by demonstrating that a more challenging set of word-referent pairings can still be learned through tracking the co-occurrence of words and objects.

Previous work on minimal pairs has found that novel word learning performance is lower for words that share segments for native and non-native listeners (e.g., Escudero et al., 2013; Escudero, Simon, et al., 2013, 2014; Papagno & Vallar, 1992). While our results revealed that accuracy for vowel minimal pairs was lower than for non-minimal pairs, there was no difference in performance between consonant minimal pairs and non-minimal pairs. Thus, these results suggest that the type of segments that two words share also influences word-pair learning.

Data from corpus, behavioral, and modeling studies suggest that minimal pairs may be more difficult to learn than non-minimal pairs because word learning is aided by variability across novel words such that learners are not required to learn words with very similar sounds in the same context (Monaghan, Christiansen, & Fitneva, 2011). Our findings qualify this previous research by showing that vowel minimal pairs are the most difficult to learn. We propose that this finding may be due to the fact that the vowels used in our experiment were in close acoustic proximity to one another on the basis of their formant values (see Escudero, Best, Kitamura, & Mulak, 2014; Fig. 1).
This acoustic-phonetic proximity paired with the fact that linguistic processing of isolated novel words relies on sound-by-sound, bottom–up information may have made learning these vowel minimal pairs particularly difficult.


Unlike isolated novel words, sentence intelligibility tasks incorporate considerable predictive information from top–down processing. Thus, future research should investigate whether a different pattern of results is observed when words are presented in the context of fluent speech. Notably, vowel information is more important than consonant information when identifying words in fluent speech (Cole, Yan, Mak, Fanty, & Bailey, 1996; Kewley-Port, Burkle, & Lee, 2007), which may be because vowels carry prosodic information (Nespor et al., 2003), perhaps making them more naturally perceived in running speech than in individual words. Moreover, removing consonants and presenting only vowels from sentences produced in fluent speech resulted in a 2:1 benefit over presenting only consonant information for both young normal-hearing listeners and elderly hearing-impaired listeners (Kewley-Port et al., 2007). It might be expected then that in the context of fluent speech, performance for vowel minimal pairs would improve.

Interestingly, adults’ poorer performance on vowel than on consonant minimal pairs contrasts with research on infants under 17 months. When tested in the Switch task (Werker, Cohen, Lloyd, Casasola, & Stager, 1998), which is an explicit associative learning paradigm, 14-month-olds do not reliably learn consonant minimal pairs (Pater, Stager, & Werker, 2004; Stager & Werker, 1997; see also Fikkert, 2010). However, the only two studies that have tested infants’ learning of vowel minimal pairs in the same explicit associative learning task have found infants to have some success. Fifteen-month-olds learned the pair DEET–DIT (but not DEET–DOOT or DOOT–DIT; Curtin et al., 2009) and learned DEET–DIT and DEET–DOOT produced in an unfamiliar Canadian English accent (but not in their native Australian English accent; Escudero, Best et al., 2014).
This asymmetry between young infants and adults may be due to adults’ and older infants’ status as experienced word learners relative to 15-month-olds, who are in the process of developing a lexicon. Consonant information is more important than vowel information in lexical processing (e.g., Berent & Perfetti, 1995; Carreiras, Vergara, & Perea, 2007; Lee, Rayner, & Pollatsek, 2001, 2002; Perea & Carreiras, 2006; Perea & Lupker, 2004) and lexical acquisition (Bonatti, Peña, Nespor, & Mehler, 2005; Nazzi, 2005; Nazzi & New, 2007; Nespor et al., 2003; Peña, Bonatti, Nespor, & Mehler, 2002). For example, in an experiment using response time and electrophysiological measures, a delay in the presentation of orthographic consonant information was more detrimental for lexical processing than a delay in the presentation of orthographic vowel information (Carreiras, Vergara, & Perea, 2009). Overall, this line of research has proposed that the main role of consonants is to encode meaning, whereas vowels enable the identification of rhythm and syntactic structure (Nespor et al., 2003). As experienced word learners, the adult participants in our word learning task may have attended more to consonant than vowel information, which might account for their lower performance for vowel minimal pairs.

In addition, the vowel minimal pairs always contained vowels that occurred as the medial segment. Given that initial segments may have a more prominent role than later segments in lexical access and identification (Allopenna, Magnuson, & Tanenhaus, 1998; Marslen-Wilson & Zwitserlood, 1989), it is possible that our adult participants may have paid more attention to initial segments,


masking any intrinsic differences in the processing of consonants versus vowels, as all vowel minimal pairs had the same initial segment. Research in our laboratory further explores the difference between novel and experienced word learners (i.e., infants versus adults) using a cross-situational learning task. Specifically, we are currently exploring whether differences in performance occur for individual consonant and vowel minimal pairs, as found in Curtin et al. (2009), or whether consonants and vowels are processed differently overall. In sum, mapping new words to referents in the world is a difficult task due to the many possible referents available in any one situation. Nonetheless, even when phonological similarity between words was highly variable, learners were able to map words to ambiguous referents while simultaneously encoding sufficient phonological detail as to be able to distinguish words that share segments, without top–down assistance. This success supports cross-situational statistical learning as a viable model for natural implicit word learning and opens up exciting questions with regard to the generality of this process, its applicability across languages other than English, and its influence on word learning in infancy.

Acknowledgments This research was supported by MARCS Institute start-up funds awarded to the first author. The second author’s work was also supported by Australian Research Council grants DP130102181 and CE140100041.

Note 1. The results showed no effect of language background, and thus language background will not be discussed further.

References Allopenna, P. D., Magnuson, J. S., & Tanenhaus, M. K. (1998). Tracking the time course of spoken word recognition using eye movements: Evidence for continuous mapping models. Journal of Memory and Language, 38(4), 419–439. doi: 10.1006/jmla.1997.2558. Arnon, I. (2010). Rethinking child difficulty: The effect of NP type on children’s processing of relative clauses in Hebrew. Journal of Child Language, 37(1), 27–57. doi: 10.1017/S030500090900943X. Baayen, R. H., Davidson, D. J., & Bates, D. M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59(4), 390–412. doi: 10.1016/j.jml.2007.12.005. Beddor, P. S., & Strange, W. (1982). Cross-language study of perception of the oral-nasal distinction. Journal of the Acoustical Society of America, 71(6), 1551–1561. doi: 10.1121/1.387809. Berent, I., & Perfetti, C. A. (1995). A rose is a REEZ: The two-cycles model of phonology assembly in reading English. Psychological Review, 102(1), 146–184. doi: 10.1037/0033-295X.102.1.146.

464

P. Escudero, K. E. Mulak, H. A. Vlach / Cognitive Science 40 (2016)

Bonatti, L. L., Pe~na, M., Nespor, M., & Mehler, J. (2005). Linguistic constraints on statistical computations: The role of consonants and vowels in continuous speech processing. Psychological Science, 16(6), 451–459. doi: 10.1111/j.0956-7976.2005.01556.x. Carreiras, M., Vergara, M., & Perea, M. (2007). ERP correlates of transposed-letter similarity effects: Are consonants processed differently from vowels? Neuroscience Letters, 419(3), 219–224. doi: 10.1016/ j.neulet.2007.04.053. Carreiras, M., Vergara, M., & Perea, M. (2009). ERP correlates of transposed-letter priming effects: The role of vowels versus consonants. Psychophysiology, 46(1), 34–42. doi: 10.1111/j.1469-8986.2008.00725.x. Cole, R. A., Yan, Y., Mak, B., Fanty, M., & Bailey, T. (1996). The contribution of consonants versus vowels to word recognition in fluent speech. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (Vol. 2, pp. 853–856). Atlanta, GA: IEEE. doi: 10.1109/ICASSP.1996.543255. Curtin, S. A., Fennell, C., & Escudero, P. (2009). Weighting of vowel cues explains patterns of word-object associative learning. Developmental Science, 12(5), 725–731. doi: 10.1111/j.1467-7687.2009.00814.x. Cutler, A., Sebastian-Galles, N., Soler-Vilageliu, O., & van Ooijen, B. (2000). Constraints of vowels and consonants on lexical selection: Cross-linguistic comparisons. Memory & Cognition, 28(5), 746–755. doi: 10.3758/BF03198409. Escudero, P., Best, C. T., Kitamura, C., & Mulak, K. E. (2014). Magnitude of phonetic distinction predicts success at early word learning in native and non-native accents. Frontiers in Psychology, 5, 1059. doi: 10.3389/fpsyg.2014.01059. Escudero, P., Broersma, M., & Simon, E. (2013). Learning words in a third language: Effects of vowel inventory and language proficiency. Language and Cognitive Processes, 28, 746–761. doi: 10.1080/ 01690965.2012.662279. Escudero, P., Simon, E., & Mulak, K. E. (2014). 
Learning words in a new language: Orthography doesn’t always help. Bilingualism: Language and Cognition, 17(02), 384–395. doi: 10.1017/S1366728913000436. Fazly, A., Alishahi, A., & Stevenson, S. (2010). A probabilistic computational model of cross-situational word learning. Cognitive Science, 34(6), 1017–1063. doi: 10.1111/j.1551-6709.2010.01104.x. Fikkert, P. (2010). Developing representations and the emergence of phonology: Evidence from perception and production. In C. Fougeron, B. K€uhnert, & M. D’Imperio (Eds.), Laboratory phonology 10: Variation, phonetic detail and phonological representation. Vol. 10 (pp. 227–258). Berlin: De Gruyter Mouton. Fitneva, S. A., & Christiansen, M. H. (2011). Looking in the wrong direction correlates with more accurate word learning. Cognitive Science, 35, 367–380. doi: 10.1111/j.1551-6709.2010.01156.x. Frank, M. C., Goodman, N. D., & Tenenbaum, J. B. (2009). Using speakers’ referential intentions to model early cross-situational word learning. Psychological Science, 20(5), 578–585. doi: 10.1111/j.14679280.2009.02335.x. Fry, D. B., Abramson, A. S., Eimas, P. D., & Liberman, A. M. (1962). The identification and discrimination of synthetic vowels. Language & Speech, 5(4), 171–189. Jaeger, T. F. (2008). Categorical data analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models. Journal of Memory and Language, 59(4), 434–446. doi: 10.1016/j.jml.2007.11.007. Kachergis, G., Yu, C., & Shiffrin, R. M. (2012). An associative model of adaptive inference for learning wordreferent mappings. Psychonomic Bulletin & Review, 19(2), 317–324. doi: 10.3758/s13423-011-0194-6. Kewley-Port, D., Burkle, T. Z., & Lee, J. H. (2007). Contribution of consonant versus vowel information to sentence intelligibility for young normal-hearing and elderly hearing-impaired listeners. The Journal of the Acoustical Society of America, 122(4), 2365–2375. doi: 10.1121/1.2773986. Lee, H.-W., Rayner, K., & Pollatsek, A. (2001). 
The relative contribution of consonants and vowels to word identification during reading. Journal of Memory and Language, 44(2), 189–205. doi: 10.1006/jmla.2000.2725. Lee, H.-W., Rayner, K., & Pollatsek, A. (2002). The processing of consonants and vowels in reading: Evidence from the fast priming paradigm. Psychonomic Bulletin & Review, 9(4), 766–772. doi: 10.3758/ BF03196333. Liberman, A. M., Cooper, F. S., Shankweiler, D. P., & Studdert-Kennedy, M. (1967). Perception of the speech code. Psychological Review, 74(6), 431–461. doi: 10.1037/h0020279.

P. Escudero, K. E. Mulak, H. A. Vlach / Cognitive Science 40 (2016)

465

Markman, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press.
Marslen-Wilson, W. D., & Zwitserlood, P. (1989). Accessing spoken words: The importance of word onsets. Journal of Experimental Psychology: Human Perception and Performance, 15, 576–585. doi: 10.1037/0096-1523.15.3.576.
Monaghan, P., Christiansen, M. H., & Fitneva, S. A. (2011). The arbitrariness of the sign: Learning advantages from the structure of the vocabulary. Journal of Experimental Psychology: General, 140(3), 325–347. http://doi.org/10.1037/a0022924
Nazzi, T. (2005). Use of phonetic specificity during the acquisition of new words: Differences between consonants and vowels. Cognition, 98(1), 13–30. doi: 10.1016/j.cognition.2004.10.005.
Nazzi, T., & New, B. (2007). Beyond stop consonants: Consonantal specificity in early lexical acquisition. Cognitive Development, 22, 271–279. doi: 10.1016/j.cogdev.2006.10.007.
Nespor, M., Peña, M., & Mehler, J. (2003). On the different roles of vowels and consonants in speech processing and language acquisition. Lingue E Linguaggio, 2, 203–229.
New, B., Araujo, V., Bour, N., & Nazzi, T. (2008a). Consonants, but not vowels, prime lexical decision following masked priming. The Journal of the Acoustical Society of America, 123(5), 3323. doi: 10.1121/1.2933806.
New, B., Araujo, V., & Nazzi, T. (2008b). Differential processing of consonants and vowels in lexical access through reading. Psychological Science, 19(12), 1223–1227. doi: 10.1111/j.1467-9280.2008.02228.x.
Papagno, C., & Vallar, G. (1992). Phonological short-term memory and the learning of novel words: The effect of phonological similarity and item length. The Quarterly Journal of Experimental Psychology, 44A(1), 47–67. doi: 10.1080/14640749208401283.
Pater, J., Stager, C., & Werker, J. F. (2004). The perceptual acquisition of phonological contrasts. Language, 80(3), 384–402. doi: 10.1353/lan.2004.0141.
Peña, M., Bonatti, L. L., Nespor, M., & Mehler, J. (2002). Signal-driven computations in speech processing. Science, 298(5593), 604–607. doi: 10.1126/science.1072901.
Perea, M., & Carreiras, M. (2006). Do transposed-letter effects occur across lexeme boundaries? Psychonomic Bulletin & Review, 13(3), 418–422. doi: 10.3758/BF03193863.
Perea, M., & Lupker, S. J. (2004). Can CANISO activate CASINO? Transposed-letter similarity effects with nonadjacent letter positions. Journal of Memory and Language, 51(2), 231–246. doi: 10.1016/j.jml.2004.05.005.
Polka, L. (1995). Linguistic influences in adult perception of non-native vowel contrasts. Journal of the Acoustical Society of America, 97(2), 1286–1296. doi: 10.1121/1.412170.
Smith, L. B. (2000). How to learn words: An associative crane. In R. M. Golinkoff & K. Hirsh-Pasek (Eds.), Breaking the word learning barrier (pp. 51–80). Oxford, England: Oxford University Press.
Stager, C. L., & Werker, J. F. (1997). Infants listen for more phonetic detail in speech perception than in word-learning tasks. Nature, 388(6640), 381–382. doi: 10.1038/41102.
Stevens, K. N., Liberman, A. M., Studdert-Kennedy, M., & Öhman, S. (1969). Cross-language study of vowel perception. Language and Speech, 12, 1–23.
Vlach, H. A., & Sandhofer, C. M. (2014). Retrieval dynamics and retention in cross-situational statistical learning. Cognitive Science, 38, 757–774. doi: 10.1111/cogs.12092.
Werker, J. F., Cohen, L. B., Lloyd, V. L., Casasola, M., & Stager, C. L. (1998). Acquisition of word-object associations by 14-month-old infants. Developmental Psychology, 34, 1289–1309. doi: 10.1037/0012-1649.34.6.1289.
Yu, C., & Smith, L. B. (2007). Rapid word learning under uncertainty via cross-situational statistics. Psychological Science, 18(5), 414–420. doi: 10.1111/j.1467-9280.2007.01915.x.
Yu, C., & Smith, L. B. (2012). Modeling cross-situational word-referent learning: Prior questions. Psychological Review, 119, 21–39. doi: 10.1037/a0026182.
Yurovsky, D., Yu, C., & Smith, L. B. (2013). Competitive processes in cross-situational word learning. Cognitive Science, 37, 891–921. doi: 10.1111/cogs.12035.
