Mem Cogn (2015) 43:298–313 DOI 10.3758/s13421-014-0472-4

Lexico-semantic effects on word naming in Persian: Does age of acquisition have an effect? Mehdi Bakhtiar & Brendan Weekes

Published online: 17 October 2014 © Psychonomic Society, Inc. 2014

Abstract The age of acquisition (AoA) of a word has an effect on skilled reading performance. According to the arbitrary-mapping (AM) hypothesis, AoA effects on word naming are a consequence of arbitrary mappings between input and output in the lexical network. The AM hypothesis predicts that effects of AoA will be observed when words have unpredictable orthography-to-phonology (OP) mappings. The Persian writing system is characterized by a degree of consistency between OP mappings, making words transparent. However, the omission of vowels in the script used by skilled readers makes the OP mappings of many words unpredictable or opaque. In this study, we used factor analysis to test which lexico-semantic variables, including AoA, predict the reading aloud of monosyllabic Persian words with different spelling transparencies (transparent or opaque). Linear mixed-effect regression analysis revealed that a Lexical factor (loading on word familiarity, spoken frequency, and written frequency) and a Semantic factor (loading on AoA, imageability, and familiarity) significantly predict word-naming latencies in Persian. Further analysis revealed a significant interaction between AoA and transparency, with larger effects of AoA for opaque than for transparent words, and a significant interaction between imageability and AoA on reading opaque words; that is, AoA effects are more pronounced for low-imageability opaque words than for high-imageability opaque words. Interactions between these factors and spelling transparency suggest that late-acquired opaque words receive greater input from the semantic reading route. Implications for understanding the AoA effects on word naming in Persian are discussed.

M. Bakhtiar (*) · B. Weekes
Laboratory for Communication Science, Division of Speech and Hearing Sciences, University of Hong Kong, 8/F, Meng Wah Building, Hong Kong, People’s Republic of China
e-mail: [email protected]

Keywords Word naming · Transparency · Persian · Age of acquisition (AoA)

Different theories and models have been proposed to explain visual word recognition. The dual-route cascaded (DRC) model of oral reading (Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001) assumes that reading different kinds of letter strings (i.e., words and nonwords) is achieved through at least two distinct and independent pathways: a direct lexical route and a sublexical grapheme-to-phoneme route (Coltheart, 1978; Coltheart, Curtis, Atkins, & Haller, 1993). The DRC model also assumes that a supplementary semantic pathway can be activated whenever reading for meaning is necessary. The DRC model assumes that reading words with predictable or transparent mappings between orthography and phonology—for example, in German—and reading words with less predictable or opaque mappings—for example, in English— can both be explained within its theoretical framework (see, e.g., Ziegler, Perry, & Coltheart, 2000). Other theories make specific assumptions about the effect of orthographic transparency on reading. For example, the orthographic-depth hypothesis (ODH) assumes that oral reading across different scripts is determined by the degree of transparency between orthography and phonology (Frost, Katz, & Bentin, 1987). The ODH assumes that oral reading of words in very transparent scripts in which the grapheme– phoneme correspondence is highly predictable, such as Italian, Spanish, or Turkish, is mediated by assembling phonology from print using a sublexical reading route, and that reading aloud of these transparent words will be less affected by lexical factors such as word frequency and by semantic factors such as rated word imageability. The contrasting view, called the universal hypothesis (Besner, 1987), assumes that oral reading across scripts is independent of transparency. In fact, it is assumed by advocates of this hypothesis that the

most efficient and fastest way to read words in all scripts (even in very transparent orthographies) is via the lexical reading route. One way to test these competing hypotheses has been to examine the effects of different psycholinguistic variables on visual word recognition in languages that vary in orthographic depth. For example, the universal hypothesis predicts effects of word frequency and imageability on word naming, regardless of the orthographic depth of a script. In several studies, the effects of lexical variables such as word frequency, semantic variables such as imageability and/or concreteness, and sublexical variables such as word length have all been found to impact on visual word recognition in both shallow and deep orthographies (Balota, Cortese, Sergent-Marshall, Spieler, & Yap, 2004; Bates, Burani, D’Amico, & Barca, 2001; Burani, Arduino, & Barca, 2007; Crepaldi, Che, Su, & Luzzatti, 2012; Cuetos & Barbón, 2006; Davies, Barbón, & Cuetos, 2013; Liu, Hao, Shu, Tan, & Weekes, 2008; Raman, 2006). Finding an effect of frequency on word naming in transparent languages such as Turkish, Spanish, and Italian supports the universal hypothesis and indicates that a lexical route is used, despite the shallow nature of these orthographies (Burani et al., 2007; Davies et al., 2013; Raman & Baluch, 2001; Raman, Baluch, & Sneddon, 1996). On the other hand, effects of length on word naming show that a sublexical route is involved. Evidence for this has come from studies investigating length effects on word naming in English, showing significant effects of length on the reading of nonwords but not of words (three to six letters long; Weekes, 1997). 
A recent study showed a U-shaped effect of word length in visual word recognition: a facilitation effect for short words in English (three to five letters in length), a null effect for words with five to eight letters in English (New, Ferrand, Pallier, & Brysbaert, 2006), and an inhibitory effect for longer words in both English and Spanish (i.e., longer than seven or eight letters; González-Nosti, Barbón, Rodríguez-Ferreiro, & Cuetos, 2013; New et al., 2006). These results are assumed to reflect the operation of a sublexical route, which is recruited more for reading longer words and nonwords (González-Nosti et al., 2013; Weekes, 1997). A recent connectionist dual-process model of word reading in Italian (CDP++) supports the universal hypothesis by showing that a computational model can read aloud words more efficiently than nonwords (lexical effect), read high-frequency better than low-frequency words (frequency effect), and also produce a word length effect when reading in a shallow orthography (Perry, Ziegler, & Zorzi, 2014). Less is known about the effects of semantic variables (e.g., rated imageability, concreteness) on visual word recognition across different scripts. Early studies showed an effect of rated word imageability on word recognition that was greater for poor readers of English, who have lower ability when mapping grapheme–phoneme correspondences (Coltheart, Laxon,

& Keating, 1988). Studies with adult readers have reported significant effects of imageability on the reading aloud of low-frequency, irregular words in English (Strain, Patterson, & Seidenberg, 1995, 2002), whereas Monaghan and Ellis (2002b) argued that semantic effects are in fact due to the age of acquisition (AoA) of a word, which is highly correlated with rated imageability. Generally, semantic variables have a greater effect in lexical decision tasks than in word-naming tasks in English, because it is assumed that orthography-to-phonology (OP) mapping can be achieved with lower demands on semantic processes in word naming (i.e., through either lexical or sublexical routes; Balota et al., 2004; Yap, Pexman, Wellsby, Hargreaves, & Huff, 2012; though see Connell & Lynott, 2013, and Reimer, Lorsbach, & Bleakney, 2008). In studies of transparent orthographies such as Italian, it has been argued that there is no effect of semantics (i.e., imageability; Barca, Burani, & Arduino, 2002). By contrast, Davies et al. (2013), using principal component analysis (PCA), found that as well as frequency (lexical) and orthographic (neighborhood size) factors, semantic variables do predict word naming in Spanish, another transparent language. They argued that one reason for the null effect of imageability in Italian (Barca et al., 2002) might be the small number of observations sampled in that study, making it difficult to detect a milder effect of semantic variables on word naming in a transparent orthography. Davies et al. proposed that semantic variables do have an effect across different orthographies, including those with more transparent OP mappings (e.g., Turkish; see Raman, 2006), and they emphasized the effect of semantic knowledge on generating correct phonological output during word naming, even in a highly transparent language such as Spanish.
Another variable that has a significant constraint on reading skill across scripts is AoA (for reviews, see Johnston & Barry, 2006; Juhasz, 2005). Early acquired words (e.g., ball) are recognized faster than late acquired words (e.g., bald). One theoretical question about AoA effects concerns whether these effects depend on spelling transparency, and therefore whether the effects vary across scripts. Detecting an effect of AoA in visual word recognition is a controversial matter, however. One issue is whether the age of learning a word has a truly unique effect in word naming, and another concerns whether this effect is lexical, semantic, or phonological in nature. Some researchers have questioned the veracity of independent effects of AoA on word naming, arguing that AoA is an alternative measure of word frequency (Balota et al., 2004; Zevin & Seidenberg, 2002). Despite this, effects of AoA on word naming that are independent of correlated variables such as word frequency and rated imageability have been reported in languages with opaque OP mappings such as Chinese and Japanese, as well as with more transparent scripts such as Dutch, French, Italian, Spanish, and Turkish (Juhasz, 2005). One outcome from these studies is the finding that the potency

of AoA effects varies with the type of script and the type of word presented. For example, although AoA has an effect on word naming in more transparent alphabetic scripts, as well as in languages using an opaque nonalphabetic script, such as Chinese (Liu et al., 2008) and Japanese (Havelka & Tomita, 2006; Yamazaki, Ellis, Morrison, & Lambon Ralph, 1997), the size of the AoA effect on word naming is larger in English than in the more transparent Dutch (Juhasz, 2005), and is larger if a Japanese word is presented in kanji script rather than in a more transparent kana script (Havelka & Tomita, 2006). Also, AoA effects are larger for irregularly spelled English words—for instance, yacht—than for regularly spelled words (Monaghan & Ellis, 2002b), suggesting that the effects of AoA depend on spelling transparency. Why would the effect of AoA vary with spelling transparency? Zevin and Seidenberg (2002, 2004) simulated AoA effects on lexical acquisition by training a connectionist model to learn the mappings between orthographic input and phonological output and examined the effects of the time during learning at which an item was introduced into the network (i.e., AoA) on performance. They found that AoA effects depend on the nature of the mappings between orthographic input and phonological output that are formed during learning. In fact they did not find any AoA effect when the mappings between orthographic input and phonological output were predictable, whereas an effect of AoA (i.e., frequency trajectory) was evident when the relationship between input and output is arbitrary such as acquiring the name of an object. They also argue that apparent AoA effects are in fact due to cumulative frequency and produced computational models to support their argument. 
On the basis of this work, Zevin and Seidenberg (2002) proposed that an AoA effect is not likely to appear in word reading of alphabetic languages because knowledge of items acquired early in lexical development can be generalized to the learning of items acquired later. However, Monaghan and Ellis (2010) found that AoA does have an effect on reading in English when the mappings between orthography and phonology are relatively opaque— for example, choir. By contrast, words containing more transparent mappings between orthography and phonology—for example, chair—do not show AoA effects. They argued that consistent words with transparent mappings between orthography and phonology are less sensitive to late acquisition because they benefit from mappings between input and output established in the lexical network. However, for inconsistent words with opaque OP mappings, late acquisition leads to a processing cost because arbitrary mappings between orthography and phonology cannot benefit from the established network. This can be referred to as the arbitrary-mapping (AM) hypothesis (see also Ellis & Lambon Ralph, 2000; Lambon Ralph & Ehsan, 2006; Monaghan & Ellis, 2002b). The AM hypothesis assumes that AoA effects depend on spelling transparency and predicts that the effects of AoA will

be greater when OP mappings are more arbitrary. According to the AM hypothesis, there would be an interaction between the effect of AoA and the consistency of OP mappings, with AoA effects being greater for words with unpredictable OP mappings, such as inconsistent words in English (Ellis & Lambon Ralph, 2000; Zevin & Seidenberg, 2002). This interaction has been reported; that is, AoA has a larger effect on the reading of low-consistency than of high-consistency words (Cortese & Schock, 2013; Monaghan & Ellis, 2002b; see also Liu et al., 2008). Other studies challenge the AM hypothesis by reporting an effect of AoA on word naming in transparent orthographies such as Spanish and Turkish (Davies et al., 2013; Raman, 2006), suggesting that the effect of AoA is independent of spelling transparency. Instead, it has been argued that the AoA effects found in Spanish and Turkish are due to the semantic demands of the context imposed by recruiting more imageable word items in the task (Wilson, Cuetos, Davies, & Burani, 2013). Wilson et al. reported significant effects of AoA and frequency in lexical decision, and an effect of frequency but not of AoA in immediate and speeded word naming in Spanish. However, in their last experiment (Exp. 4) they did find a significant effect of AoA when naming highly imageable words, and from this they inferred that, since high-imageability words have a richer semantic representation, the effect of AoA might have a semantic locus. This account resonates with the semantic hypothesis of AoA, which assumes a conceptual locus for the AoA effect (van Loon-Vervoorn, 1989). Proponents of this hypothesis argue that concepts learned earlier in life have a tendency to shape and dominate the developing semantic network, and therefore that later-acquired concepts will be at a disadvantage during learning if they are not compatible with the extant connections in the lexical network (Steyvers & Tenenbaum, 2005). 
This means that earlier-acquired concepts will have richer semantic connections than later-acquired concepts, which offers performance advantages such as easier access to the meaning of a word during skilled performance, and also makes early-acquired words less vulnerable to brain damage (Ellis, Burani, Izura, Bromiley, & Venneri, 2006). One prediction that can be derived from the semantic hypothesis is that whenever an AoA effect is observed, one should also observe effects associated with other semantic variables, such as imageability. By contrast, the mapping hypothesis predicts that whenever one observes AoA effects, one should also observe effects associated with other variables known to affect the formation of mappings (such as frequency). Studies using factorial designs and regression analyses have reported a mixed pattern of results. Some studies (Cortese & Khanna, 2007; Monaghan & Ellis, 2002a) have reported a significant independent effect of AoA above imageability on word naming in English. Others have shown significant effects of both AoA and imageability on word

naming and lexical decision with disyllabic words (Cortese & Schock, 2013). González-Nosti et al. (2013), using mixed-effect regression, found significant independent effects of AoA, lexical word frequency, imageability, and orthographic neighborhood (ON) size on lexical decision in Spanish. Moreover, they showed that the imageability effect was more pronounced for low-frequency and late-acquired words. A different approach is to use PCA to identify the effects that cluster with AoA by forming common factors reflecting shared variance between AoA and the effects of other critical variables. According to the semantic hypothesis, the effects of AoA should cluster with the effects of other semantic variables, such as rated imageability. By contrast, the mapping hypothesis predicts that AoA effects should cluster with the effects of word frequency. Using this method, Davies et al. (2013) found a frequency-independent effect of AoA on word naming in Spanish, which loaded on a semantic factor that included imageability and familiarity. Studies to date have shown mixed results concerning the interactions between AoA, imageability, and spelling transparency, with some reporting an AoA effect for transparent scripts and others finding no effect. Researchers finding an effect of AoA with transparent scripts have argued against the mapping hypothesis and seem to locate the effect at the level of semantic processing, supporting the view that AoA effects reside within the semantic representations themselves. In the present study, we were interested in testing these hypotheses in a language containing words with both transparent and opaque OP mappings: Persian. Persian is an Indo-European language with borrowed letters from Arabic, a Semitic language (Baluch, 1992). The Arabic script was slightly modified by inventing four letters for some Persian phonemes (i.e., [p], [ch], [zh], and [g]).
Vowel transcription has a key role in determining the depth of transparency in printed Persian. This is because three vowels (i.e., long vowels) can be written in a letter format, and hence become a part of a word (i.e., transparent words), whereas the other three vowels (i.e., diacritics) are not written as part of the word (i.e., opaque words). Diacritics are only used in early reading instruction for beginning readers. In fact, Persian has a relatively high degree of grapheme-to-phoneme regularity when vowels are transcribed in words. Almost every grapheme corresponds to one phoneme, except for two graphemes (‫ ا‬and ‫)و‬, which can correspond to more than one phoneme (/V or U/ and /â or /). Persian also has a digraph (‫وا‬-), which corresponds to one phoneme (â). Persian orthography in the vowel-less format becomes more opaque and less predictable. This is the typical orthographic format presented for skilled reading. Studies of word naming in Persian have shown that transparent words are read faster and more accurately than opaque words by children and adults, perhaps because transparent words can make use of both the sublexical and lexical–semantic reading routes, whereas opaque words rely only on the

lexical–semantic routes (Baluch, 2005; Baluch & Shahidi, 1991; Rahbari & Senechal, 2009). Also, evidence from similar scripts like Arabic and Hebrew has shown that vowelized words (i.e., those containing diacritics) are read faster than words without diacritics (opaque words; Abu-Rabia, 1997; Frost, 1995). Baluch and Danaye-Tousie (2006) asked skilled readers to read aloud lists of transparent and opaque Persian words, which were matched on word frequency, imageability, and length. Participants were asked to recall the words after performing a 20-s distractor task. The researchers found that transparent words, along with high-frequency and high-imageability words, were recalled significantly better than other words. Rahbari and Senechal (2009) found that skilled readers were able to read transparent words faster than opaque words in Persian, and to read opaque words faster than nonwords, indicating the presence of lexicality effects on reading Persian. Furthermore, they showed that this lexicality effect in children increases during reading acquisition across Grades 1 to 4. However, this effect happens much earlier for transparent words (i.e., Grade 2) than for opaque words (not before Grade 4; Rahbari & Senechal, 2010). This finding may imply that the arbitrariness in OP mapping is detrimental for the process of word-reading acquisition, even from early stages, and that opaque words might need further exposure or more semantic support from context across time. Baluch and Besner (2001) also found a greater effect of imageability for both high- and low-frequency opaque words, but not for matched transparent words in an oral-reading task in Persian. The reports of stronger effects of imageability and frequency on the reading of opaque than of transparent words suggest that opaque words might need more support from the lexical–semantic routes (Baluch & Besner, 1991, 2001).
By contrast, Timmer, Vahid-Gharavi, and Schiller (2012) investigated the effect of masked onset priming on reading aloud single words in Persian. The behavioral finding indicated a significant effect of phonologically matched onset primes (vs. a mismatch prime condition) on reading aloud transparent words, but not for opaque words. However, the event-related potential data showed an early amplitude difference (80- to 160-ms time window) between the matched and mismatched conditions for both transparent and opaque words, and a late amplitude difference (300- to 480-ms time window) for transparent words only. The authors argued that these results show evidence of early activation of both the sublexical and lexical routes for transparent and opaque words in Persian, on the basis of the DRC model. The present study had two aims. First, we wanted to know whether variables that predict word naming in other languages, such as word frequency, rated AoA, and rated imageability, also have an effect on word naming in Persian, as is predicted by the universal hypothesis. This is the first large-scale study to investigate the effect of these factors on word naming in Persian. Extant studies of skilled word reading in Persian have been

characterized by small numbers of items and participants—for instance, 20 participants in Baluch and Besner (2001), and only 60 items (30 transparent and 30 opaque words) in Rahbari and Senechal (2009). Moreover, only spelling transparency, subjective frequency, and imageability have been examined, whereas other variables, such as AoA, objective spoken and written word frequencies, familiarity, and orthographic neighborhood (ON) size, have not been considered. In fact, few statistical norms are available for studying these effects in Persian. Second, we wanted to test alternative predictions derived from the AM hypothesis and the semantic hypothesis of AoA effects. According to the mapping hypothesis, AoA should have a stronger effect on the reading of opaque than of transparent words, because the effect of AoA on word naming should be greater if OP mappings are less predictable. Another prediction is that the effects of AoA should associate with variables that reflect lexical reading—for example, frequency—which are also expected according to the cumulative frequency hypothesis. According to the semantic hypothesis, we would expect the effects of AoA to associate with measures of semantic processing, such as rated imageability, for both opaque and transparent words. In order to test these predictions, it was necessary to collect fresh statistical norms for Persian words based on the procedures used to study the effects of psycholinguistic variables on word naming in other languages (e.g., Balota et al., 2004; Crepaldi et al., 2012; Cuetos & Barbón, 2006; Davies et al., 2013; Schock, Cortese, & Khanna, 2012a; Schock, Cortese, Khanna, & Toppi, 2012b). This was the first study to collect these norms for Persian.

Preparatory study

Method

No comprehensive database provides psycholinguistic norms for Persian words. Therefore, our first step was to develop such norms for a subset of Persian monosyllabic words. The word stimuli for the study were extracted from Flexicon, which is a Persian corpus including about 50,000 lexemes, built from a sample of 10 million words and developed by the Iranian Supreme Council of Information and Communication Technology (www.dadegan.ir/catalog/flexicon). About 5% of this corpus is monosyllabic words. In total, 871 monosyllabic words were selected, excluding verb roots, inappropriate words (e.g., rude words), and homograph or homophone pairs. Although the sample of words is a small fraction of the written script, the stimuli represent a reasonable proportion of the monosyllabic words in Persian, and they represent all different types of monosyllabic words (i.e., CV, CVC, and CVCC) with different frequencies. Table 1 shows summary statistics for the main psycholinguistic variables in this study.

Table 1 Descriptive summary for the main predictor variables

Variable        Mean     SD    Min     Max   Number
AoA             4.88   1.41   1.21    6.97      871
Familiarity     4.70   0.98   1.63    6.97      871
Imageability    4.53   1.44   1.37    7.00      871
Ln_Spoken       4.37   1.89   0.69   10.15      871
Ln_Written      7.12   2.08   0.00   12.95      871
Ln_ON           3.57   0.52   1.10    4.52      871
Phonemes        3.56   0.52   2.00    4.00      871
Letters         2.96   0.43   2.00    4.00      871
DT              0.83   0.13   0.67    1.00      871

AoA = age of acquisition, Ln_Spoken = log_e(Flexicon’s spoken frequency + 1), Ln_Written = log_e(Hamshahri’s written frequency + 1), Ln_ON = log_e(orthographic neighborhood size), DT = degree of transparency

Frequency measures

The spoken frequency values for word stimuli were extracted from Flexicon, and the written frequency values were extracted from the Hamshahri corpus, which is a sample from a daily Iranian newspaper published between the years of 1996 and 2002 (AleAhmad, Amiri, Darrudi, Rahgozar, & Oroumchian, 2009). This corpus has been built from 63 million words with an average length of 3.97 characters per word (Darrudi, Hejazi, & Oroumchian, 2004), and it includes about 31,500 lexemes with frequency values higher than 1 per million. In order to decrease the skew of this distribution, the raw frequency values extracted from both corpora were log-transformed (i.e., the natural log of the frequency value + 1). Values for the average bigram frequency were calculated for three- and four-letter words. However, the measure for bigram frequency used by Davies et al. (2013), that is, type and token “frequencies of the words (of the same length) that include that bigram (in the same position)” (p. 301), is not applicable to two-letter words. Due to the short word length of our stimuli (two to four letters) relative to those from Davies et al.’s study (three to ten letters), a comparable assessment of bigram frequency effects was not possible. However, we can report that for three- and four-letter words, bigram frequency correlated very highly with number of letters (r = .75) and with degree of transparency (r = .66), revealing collinearity.

Word length was calculated as both the number of phonemes and the number of letters in the word. Monosyllabic words in official Persian have three different syllabic structures (i.e., CV, CVC, and CVCC), allowing for word lengths ranging from two to four letters or phonemes. Orthographic neighborhood (ON) size was computed as the number of words in the corpus with the same orthographic length that differed from the target word in only one letter. ON sizes for this study were calculated from the Hamshahri corpus. In order to decrease the skew of the distribution, the raw ON values were log_e-transformed.
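The item-level measures described here are simple enough to sketch in code. The following is a minimal illustration, not the authors' actual pipeline: the function names are hypothetical, the toy lexicon uses Latin letters in place of Persian script, and the neighbor rule assumes a letter substitution at a single position. The last helper anticipates the degree-of-transparency ratio (letters divided by phonemes) that the Spelling transparency subsection defines.

```python
import math


def log_frequency(raw_count):
    """Natural log of (raw frequency + 1), as in the Ln_Spoken and Ln_Written columns."""
    return math.log(raw_count + 1)


def orthographic_neighbors(target, lexicon):
    """Words of the same length that differ from the target at exactly one letter position."""
    return [
        word for word in lexicon
        if len(word) == len(target)
        and sum(a != b for a, b in zip(word, target)) == 1
    ]


def degree_of_transparency(n_letters, n_phonemes):
    """DT = number of letters / number of phonemes (1.0 means fully transparent)."""
    return n_letters / n_phonemes


# Hypothetical toy lexicon, for illustration only.
toy_lexicon = ["bar", "bad", "bat", "car", "bare", "far", "fan"]
neighbors = orthographic_neighbors("bar", toy_lexicon)
on_size = len(neighbors)
# Assumes at least one neighbor; log(0) is undefined.
ln_on = math.log(on_size)
```

Note that the frequency transform adds 1 before taking the log so that words absent from a corpus (raw frequency 0) still receive a defined value, whereas the ON transform in Table 1 has no such offset, so this sketch assumes every item has at least one neighbor.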

Spelling transparency

The transparent words (n = 317) contain long vowels (â, i, u), which are written in a letter format and form consistent mappings between graphemes and phonemes, whereas the opaque words (n = 554) contain short vowels or diacritics (a, e, o), which are not written in the skilled script and make grapheme-to-phoneme mappings less consistent. The proportions of transparent and opaque words in Persian script were compared through the Flexicon corpus. The degree of transparency (DT) was computed by dividing the number of letters by the number of phonemes. For the completely transparent words, the number of letters was equal to the number of phonemes, resulting in the degree of transparency being equal to 1 (comprising 11% of the word units in the corpus). The degree of transparency for the opaque words ranged from .92 to .50, representing 89% of the words in the corpus (reflecting the relatively deep structure of Persian orthography). Therefore, the majority of words were opaque to some degree according to this metric. A transparency value of .50 for the most opaque words shows that at least 50% of the letters (i.e., diacritics) are missing in a word. Three types of words with different DTs were determined in our word stimulus set: completely transparent words (DT = 1), opaque words with three letters and four phonemes (DT = .75), and opaque words with two letters and three phonemes (DT = .67). DT values were used to assess the effect of spelling transparency on reading in the regression model.

AoA, familiarity, and imageability

Ratings for familiarity, imageability, and subjective AoA were collected from 40 native Persian speakers (ages ranged from 19 to 32 years old) who voluntarily participated in the study by completing an online survey using a website called Jotform (the details can be found at www.jotform.com). The word stimuli were randomized and entered into nine different lists.
Each list included about 100 words, and the average time to finish each list was about 30 min. For each list the instructions for rating were repeated, in order to make sure that participants retained the correct information about the task procedure. Participants received each list only after successful completion of the previous one. About 85% of the participants were able to complete all of the lists in one month. The rest of the participants completed the task within a one- to three-month period. For familiarity, participants were asked to estimate how familiar they were with the concept of the word presented, using a 7-point rating scale (1 = I have never seen, heard or used the word, 7 = I have seen, heard or used the word every day). For imageability, participants were asked to determine how easily a word provoked a mental image in the form of a picture, sound, taste, or smell using a 7-point rating scale (1 = very difficult, 7 = very easy). Finally, for the AoA rating we adopted the scale developed by Gilhooly and Logie (1980), in which participants were asked to estimate at what age they had learned the presented word,

using a 7-point rating scale (1 = birth to 2 years old, 2 = 2 to 4 years old, 3 = 4 to 6 years old, 4 = 6 to 8 years old, 5 = 8 to 10 years old, 6 = 10 to 12 years old, and 7 = 13 years old and older). The ratings for AoA, imageability, and familiarity from 40 participants were inspected following established procedures in the literature (Schock, Cortese, & Khanna, 2012a; Schock, Cortese, Khanna, & Toppi, 2012b), through the following steps. First, the general mean for each item was calculated by averaging the ratings across all 40 participants for each of the three variables (i.e., AoA, imageability, and familiarity) separately. Then the correlation coefficients were calculated between each participant’s ratings and the averaged ratings for all items. For the next step, the means and standard deviations of all correlation coefficients were computed separately for AoA (mean = .82, SD = .14), imageability (mean = .71, SD = .16), and familiarity (mean = .65, SD = .17), and ratings with correlation coefficients more than two standard deviations below the general mean were excluded from further analysis. One participant’s ratings were excluded because the correlation coefficients (rs = .02, for AoA, and .04, for imageability and familiarity) were more than three standard deviations below the mean for all three variables. Moreover, one participant’s ratings were eliminated only for familiarity (r = .07), and another one’s only for imageability (r = .34). This may indicate that these participants were not attentive to the instructions for some of the tasks. After elimination of the extreme ratings, the correlation coefficients ranged from r = .62 to .90 for AoA, from .46 to .81 for familiarity, and from .46 to .85 for imageability. Internal reliability was computed through Cronbach’s alpha separately for AoA (α = .988), imageability (α = .976), and familiarity (α = .968), indicating a very high interrater reliability for each variable.
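The rater-screening steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' procedure: the function names and the toy ratings are hypothetical, the cutoff uses the sample standard deviation of the rater-to-mean correlations, and Cronbach's alpha is computed by treating raters as the "items" of the scale and words as the cases, which is one common way to obtain interrater reliability of this kind.

```python
from statistics import mean, stdev, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length rating vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def screen_raters(ratings, n_sd=2.0):
    """ratings: dict mapping rater id -> list of ratings (one per word, same order).

    Returns the raters whose correlation with the item means is not more
    than n_sd standard deviations below the mean correlation.
    """
    raters = list(ratings)
    n_words = len(ratings[raters[0]])
    item_means = [mean(ratings[r][i] for r in raters) for i in range(n_words)]
    r_by_rater = {r: pearson_r(ratings[r], item_means) for r in raters}
    rs = list(r_by_rater.values())
    cutoff = mean(rs) - n_sd * stdev(rs)
    kept = [r for r in raters if r_by_rater[r] >= cutoff]
    return kept, r_by_rater


def cronbach_alpha(ratings, raters):
    """Alpha with raters as scale 'items' and words as cases (an assumption)."""
    k = len(raters)
    n_words = len(ratings[raters[0]])
    totals = [sum(ratings[r][i] for r in raters) for i in range(n_words)]
    rater_variance = sum(variance(ratings[r]) for r in raters)
    return k / (k - 1) * (1 - rater_variance / variance(totals))


# Hypothetical toy data: seven raters who agree up to a constant leniency
# shift, plus one rater ("p8") whose ratings are unrelated to the others.
base = [2, 3, 4, 5, 6, 4, 3, 5, 2, 6]
shifts = [0, 1, -1, 0, 1, -1, 0]
toy = {f"p{i + 1}": [v + s for v in base] for i, s in enumerate(shifts)}
toy["p8"] = [6, 2, 5, 3, 4, 2, 6, 3, 5, 4]

kept, r_by_rater = screen_raters(toy)
alpha = cronbach_alpha(toy, kept)
```

With only a handful of raters, a single deviant rater can never fall far below a 2-SD cutoff, because that rater inflates the standard deviation; a 40-rater sample, as in the study, does not have this limitation to the same degree.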
Initial phonemes

In order to partial out the effect of voice onset times on word-naming latencies, the initial phonemes of the word stimuli were categorized on the basis of the phonetic features of Persian in terms of place of articulation, manner of articulation, and voicing (Samareh, 2000). Word stimuli were coded on the basis of whether the first phoneme corresponded to each of 15 phonetic features of Persian: bilabial, labio-dental, dental, alveolar, alveo-palatal, palatal, uvular, glottal, stop, fricative, affricate, trill, nasal, liquid, and voiced (Samareh, 2000). The phoneme codes were included in the further analyses as dichotomous variables, in order to exclude the effect of initial phonemes on naming latencies.

Factor analysis

Studies that assess how different psycholinguistic factors predict lexical processing in normal participants typically face the problem of multicollinearity among the variables. Most of

these variables, which come from the subjective ratings of participants, are to some extent inherently correlated. Table 2 shows the correlation matrix for the main variables AoA, familiarity, imageability, and word frequency in this study. There are high correlations within clusters of variables, such as rated AoA, familiarity, imageability, and frequency, and also among word length, orthographic neighborhood (ON) size, and transparency. A multicollinearity diagnostic test between the variables (i.e., AoA, familiarity, imageability, log_e spoken word frequency, log_e written word frequency, log_e ON, transparency, and the numbers of phonemes and letters) revealed a condition number of 73, well above the value of about 30 that is conventionally taken to signal potentially harmful collinearity. This degree of intercorrelation between variables can make it difficult to isolate the effect of each variable using regression models and can lead to unstable results (Baayen, 2008). An alternative method used in some studies is principal component analysis (PCA) or factor analysis (Baayen, Feldman, & Schreuder, 2006; Barca et al., 2002; Bates et al., 2001; Burani et al., 2007; Davies et al., 2013), which produces orthogonalized components that are used in the regression models instead of the main variables. Our motivation for performing the factor analysis was to test whether AoA loads onto a common factor with frequency (i.e., a frequency-dependent effect), as is predicted by the AM and cumulative-frequency hypotheses, or with rated imageability (i.e., a frequency-independent effect), as is predicted by the semantic hypothesis. The principal-component factor (PCF) analysis was conducted using Stata 12, which produces the same results as PCA in SPSS (Acock, 2008). All of the main variables except DT were entered into the PCF analysis to derive independent predictors for the regression analysis. The reason for excluding DT from

the PCF analysis was to allow the interaction between the derived factors and transparency to be examined later, in the linear mixed-effect regression. A collinearity diagnostic test between the derived factors and DT revealed a condition number of 1.45, indicating no collinearity among the variables (Belsley, Kuh, & Welsch, 1980). The PCF analysis derived three factors with eigenvalues greater than 1; each of these factors accounted for more than 20% of unique variance, and together the three factors accounted for 79% of the predictor variance. Applying an orthogonal (varimax) rotation to the PCF solution made the factors easier to label, on the basis of the variables that had a loading of .40 or more on each factor. Following Davies et al. (2013), a separate PCF analysis was conducted for the initial-phoneme features, to derive uncorrelated factors to enter into the regression model instead of the raw variables. Table 3 shows the seven derived factors, named Initials1–7, according to their loadings from the different phonetic features. In the main analysis, the factor with loadings from AoA, familiarity, and imageability was labeled the Semantic factor; the factor with the highest loadings from spoken frequency, written frequency, and familiarity was labeled the Lexical factor; and finally, the factor with the highest loadings from word length (numbers of letters and phonemes) and ON size was labeled the Sublexical factor (see Table 4). Our factor loadings are similar to the main components derived by Davies et al. (2013), the only difference being that, unlike in their study, AoA loaded on the Semantic factor and not noticeably on the Lexical (or frequency) factor; that is, we found only a frequency-independent AoA effect (Brysbaert & Ghyselinck, 2006).
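To make the extraction step concrete, the following is a minimal sketch of principal-component extraction with the eigenvalue-greater-than-1 rule, together with a simplified condition number. Several things here are assumptions rather than the study's procedure: the function names are hypothetical, the input is a ready-made predictor correlation matrix, the condition number is computed from the eigenvalues of that correlation matrix (the diagnostic reported above may instead have been computed on a scaled data matrix, so values need not match), and the varimax rotation is omitted. The toy matrix is invented for illustration and is not the study's Table 2.

```python
import math


def jacobi_eigen(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues and eigenvectors (as columns) of a symmetric matrix,
    computed with cyclic Jacobi rotations."""
    n = len(matrix)
    a = [[float(x) for x in row] for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:  # off-diagonal mass is negligible: converged
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def pc_factors(corr):
    """Unrotated principal-component factors with eigenvalue > 1.

    Returns (eigenvalues in descending order, loadings per variable for
    the retained factors, proportion of variance per retained factor).
    """
    eigvals, eigvecs = jacobi_eigen(corr)
    n = len(corr)
    order = sorted(range(n), key=lambda i: -eigvals[i])
    kept = [i for i in order if eigvals[i] > 1.0]
    loadings = [[eigvecs[r][i] * math.sqrt(eigvals[i]) for i in kept] for r in range(n)]
    explained = [eigvals[i] / n for i in kept]
    return [eigvals[i] for i in order], loadings, explained


def condition_number(corr):
    """Simplified collinearity index: sqrt(largest / smallest eigenvalue).
    Assumes the matrix is not singular (smallest eigenvalue > 0)."""
    eigvals, _ = jacobi_eigen(corr)
    return math.sqrt(max(eigvals) / min(eigvals))


# Hypothetical correlation matrix for three strongly related predictors.
toy_corr = [[1.0, 0.8, 0.8],
            [0.8, 1.0, 0.8],
            [0.8, 0.8, 1.0]]
toy_eigenvalues, toy_loadings, toy_explained = pc_factors(toy_corr)
toy_kappa = condition_number(toy_corr)
```

In this toy case the three predictors collapse onto a single retained factor, which is the situation the factor analysis is meant to detect; labeling factors by loadings of .40 or more would be applied to the rotated loadings in a full analysis.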

Table 2 Correlation matrix of the main predictor variables, with their significance levels

AoA FAM p
–.82
