Brain & Language 145–146 (2015) 11–22


Brain & Language journal homepage: www.elsevier.com/locate/b&l

Balloons and bavoons versus spikes and shikes: ERPs reveal shared neural processes for shape–sound-meaning congruence in words, and shape–sound congruence in pseudowords

Jelena Sučević a,b,⇑, Andrej M. Savić c,d, Mirjana B. Popović c, Suzy J. Styles e, Vanja Ković a

a Department of Psychology, Faculty of Philosophy, University of Belgrade, Čika Ljubina 18-20, 11000 Belgrade, Serbia¹
b Department of Experimental Psychology, University of Oxford, 9 South Parks Road, United Kingdom
c University of Belgrade, School of Electrical Engineering, Bulevar Kralja Aleksandra 73, 11000 Belgrade, Serbia
d Tecnalia Serbia Ltd., 11000 Belgrade, Serbia
e Division of Psychology, School of Humanities and Social Sciences, Nanyang Technological University, 14 Nanyang Drive, 637332 Singapore, Singapore

Article info

Article history:
Received 12 February 2014
Accepted 28 March 2015
Available online 16 May 2015

Keywords:
Sound symbolism
Event related potentials
Lexical decision
Implicit interference
Language processing

Abstract

There is something about the sound of a pseudoword like takete that goes better with a spiky, than a curvy shape (Köhler, 1929:1947). Yet despite decades of research into sound symbolism, the role of this effect on real words in the lexicons of natural languages remains controversial. We report one behavioural and one ERP study investigating whether sound symbolism is active during normal language processing for real words in a speaker’s native language, in the same way as for novel word forms. The results indicate that sound-symbolic congruence has a number of influences on natural language processing: Written forms presented in a congruent visual context generate more errors during lexical access, as well as a chain of differences in the ERP. These effects have a very early onset (40–80 ms, 100–160 ms, 280–320 ms) and are later overshadowed by familiar types of semantic processing, indicating that sound symbolism represents an early sensory-co-activation effect. © 2015 Elsevier Inc. All rights reserved.

1. Introduction

More than 80 years ago, Köhler (1929:1947) revealed an intriguing puzzle for cognitive science: Given two abstract shapes and two meaningless labels, people systematically prefer to pair a blobby, curvy shape with a label like baluma and a jagged, spiky shape with takete, despite the 50:50 probability of making this selection. His finding flew in the face of established understanding of the arbitrariness of linguistic signs (Saussure, 1916:1959), and provided concrete evidence of systematic preferences for sound–shape correspondences which were beginning to be documented for novel language stimuli (Sapir, 1929). In recent years, Köhler’s effect has been replicated and extended in a variety of contexts, demonstrating that the allocation of novel labels to novel shapes is highly predictable, and seemingly culturally independent. However the place of this effect in the lexicons of natural languages remains controversial: Does the English word circle actually sound rounder than star? Or how about Serbian, krug versus zvezda? We report one behavioural and one ERP study investigating the sound symbolic congruence of natural language stimuli in Serbian, where we compare the sound-symbolic congruence of real and novel word forms, presented in different visual contexts.

⇑ Corresponding author at: Department of Experimental Psychology, University of Oxford, 9 South Parks Road, United Kingdom. E-mail address: [email protected] (J. Sučević).
¹ Former.
http://dx.doi.org/10.1016/j.bandl.2015.03.011
0093-934X/© 2015 Elsevier Inc. All rights reserved.

1.1. Sound symbolic matching in artificial contexts

Köhler introduced a ‘dual matching’ paradigm in which participants were presented with two novel, nameless drawings and two novel labels, and were asked to allocate the labels to the objects in question. Following his findings, several researchers have demonstrated that extremely high percentages (95–98%) of adults prefer to match curvy shapes to words containing rounded vowels (/o/, /u/) and voiced bilabial consonants (/b/, /m/), in words like maluma, or bouba. By contrast, they prefer to match spiky shapes to words containing high front vowels (/i/, /e/) and voiceless stop consonants (/k/, /t/), in words such as takete, or kiki, and these effects are observed for speakers of a variety of languages including English, Japanese, German and Serbian (e.g. Janković & Marković, 2001; Nielsen & Rendall, 2011; Ramachandran & Hubbard, 2001a). To rule out the possibility that the effects are driven by Latin orthography (e.g., pointy letter “k”, curved letter “b”) or other


codified Western European cultural associations, the effects have been replicated in the Himba, a remote population of Northern Namibia, who do not use written language, and have little exposure to Western cultural and environmental influences (Bremner et al., 2013). To rule out the possibility that the effects were driven by lexical associations arising out of a large, structurally biased lexicon, the effects have also been replicated in 2.5-year-olds, whose lexicons are substantially smaller than adults’ (Maurer, Pathman, & Mondloch, 2006), and even in four-month-old infants, who have not yet begun to learn words (Ozturk, Krehm, & Vouloumanos, 2013). Taken together, these studies point to a robust preference for linguistic shape-to-sound matching independent of language, culture, literacy or age, when pairs of nonsense words are matched with pairs of novel objects.

1.2. Theories of the sensory origins of sound symbolism

By way of explaining these kinds of effects, Maurer (1993) proposed that sensory summation and cross-modal confusion in infancy constitute a form of ‘synaesthetic’ perception in the immature brain – with neural hyperconnectivity between sensory areas resulting in a kind of intermingling of sensory stimulation (of which linguistic synaesthetic congruence could be one outcome). This suggestion is in line with Ramachandran and Hubbard’s (2001a) later proposal for the synaesthetic origins of language and consciousness, in which they further suggest that sound symbolic congruence like the bouba/kiki effect has its origins in the mirror neuron system, arising as a consequence of co-activation of somatosensory/motor activity involved in articulating and viewing speech – for example, correspondences between the sound of the vowel /o/ and the round shape of a speaker’s lips as they articulate it. Both theories argue that sound symbolism in adult language is an outcome of a more general system of cross-modal congruence evident in early neural development.
Maurer has also proposed that sound-shape correspondences influence not only language development in the individual, but the manner in which human languages have evolved (Maurer et al., 2006; Spector & Maurer, 2009). Imai and Kita (Imai & Kita, 2014; Imai, Kita, Nagumo, & Okada, 2008) have further formalised evidence in this field in their Bootstrapping Hypothesis for language acquisition and language evolution – according to which innate cross-modal biases provide a scaffolding for learning how words can be used to refer to objects – after which, more abstract mappings can follow. These perspectives may help to explain why sound symbolic matching is pervasive across languages of the world.

1.3. Sound symbolism in the lexicons of natural languages

Despite the body of evidence documenting sound symbolic congruence in artificial shape mapping tasks, the role of sound symbolism in natural languages remains unclear. Namely, if maluma is perceived as ‘round’ independent of language, culture and age, why is it non-obvious that the words circle and krug refer to shapes of equal roundness, while star and zvezda are both pointy? Are the shape-to-sound mappings of real words in fact arbitrary, while mappings of novel objects with novel linguistic strings represent some kind of perceptual but non-linguistic task effect? Some studies investigating the structure of natural language lexicons have suggested that certain subsets of vocabulary contain sound symbolic clues to word meaning. For example, when participants are asked to guess which of two real words from an unfamiliar language has which of two real-word meanings, highly imageable words in natural languages have been shown to be guessable at levels higher than chance (e.g., ‘bird’/‘fish’: Berlin, 1994; antonyms such as ‘big’ and ‘small’: Nygaard, Cook, & Namy, 2009; walking in a manner which is batobato or nosunosu

‘fast-heavy’ vs ‘slow-heavy’ in Japanese: Imai et al., 2008). This feature of adult ‘guessability’ has also been shown to facilitate verb learning for 4-year-olds (Imai et al., 2008; Kantartzis, Imai, & Kita, 2011). At a more comprehensive level of description, corpus analyses involving samples from 118 language families have indicated that sound symbolism is embedded in the lexicons of a large number of natural languages (e.g. Ciccotosto, 1991; Wichmann, Holman, & Brown, 2010). Furthermore, Monaghan, Shillcock, Christiansen, and Kirby (2014) have recently demonstrated that even when no a priori expectations are made about which sounds might be used to represent which meanings, monomorphemic words of English which share phonological properties also share semantic properties, and appear in similar linguistic contexts, at rates higher than would be predicted by chance. Their model also demonstrated that these properties were stronger for words which were learned earlier, lending further support to the sound symbolic bootstrapping hypothesis.

1.3.1. Sound symbolism in online language processing

Despite the degree of sound symbolism known to be encoded in the lexicons of natural languages, few studies have demonstrated consistent effects of sound symbolism during natural language processing, using standard tests of lexical access. In one of the few studies of this kind, Westbury (2005) embedded implicit parafoveal interference in a visual lexical decision task, where written stimuli (words and pseudowords) were presented inside spiky or curvy frames. The hypothesis was that if sound symbolic congruence has a cognitive influence during natural language processing, the association between a spiky frame and spiky sounding phonemes would facilitate processing of word forms containing ‘sharp’ phonological structure, but inhibit processing for ‘soft’ structure (and vice versa for curvy frames).
Westbury’s results showed a facilitatory effect of frame congruence on lexical decision, but only in the case of pseudowords: No sound-symbolic frame effect was observed for real word stimuli. Westbury suggests that the effect of sound-symbolism is therefore a non-semantic effect, as it only influences the phonological decoding of pseudowords, not the process of accessing meaningful representations of real words. There are several reasons to be cautious about Westbury’s interpretation of sound-symbolism as an effect only influencing processing when a word-form is devoid of meaning. The first reason for caution is that new words arise in language all the time – and in the process, pseudowords make the transition from hollow phonological shells into calcified homes for semantically rich concepts: The recent addition Googling, with all of its attendant ‘soft’ phonology aligns well with the corporation’s family-friendly corporate identity. By contrast, the ‘hard’ consonants, sibilant cluster and ‘pointy’ vowel of Skype, maintain a tech-heavy, space-age feel. It would be a mistake to assume that new words shed their symbolism on entry to the corpus of contemporary language – not to mention that branding experts would be out of a job if they were not at least partly right. Indeed, there is evidence to suggest that sound symbolism plays a critical role in the potential success of new candidate words, and whether or not their meanings are easily learned: In experimental contexts, word learning studies have shown that sound symbolism enhances the speed/accuracy of word learning (Parault & Schwanenflugel, 2006), especially in early childhood (Kantartzis et al., 2011; Nygaard et al., 2009) and infancy (D’Souza, Plunkett, & Styles, submitted for publication; Miyazaki et al., 2013). These experiments point to the utility of sound symbolism in normal language processes involving the learning of new words in meaningful contexts. 
From a methodological perspective, it also turns out that detecting the influence of sound symbolism in language tasks can be highly dependent on the timing and organisation of the


experimental scenario. Recent studies have demonstrated that minor variants in experimental procedure can alter whether sound symbolic effects are observed. For example, Sucevic, Jankovic, and Kovic (2013) demonstrated that sound symbolic congruence can modulate lexical decision times to both words and non-words, but only under particular experimental conditions: RTs are influenced by frame congruence for all lexical decisions if the frame and word are presented simultaneously. However, if the frame appears shortly before the onset of the written word-form, the effect is only evident for highly typical frame shapes, whereas if the frame is presented more than one second before the word-form, no effects of frame congruence are observed. This means that sound symbolic influences in online language processing may be fragile, and short lasting. The fragility of sound symbolism is especially important when one considers the well established literature on reaction times in lexical decision tasks: real words can be accepted in lexical decision faster than plausible pseudowords can be rejected (e.g. Chambers & Forster, 1975). Thus, where words and pseudowords attract different lexical decision speeds, faster RTs may attract smaller RT differences, thereby masking small or subtle differences – a kind of ceiling effect – which may have played a role in whether sound symbolic effects were detected for words in Westbury’s task. There is also reason to be cautious about the stimuli used for comparison in Westbury’s study. He compares RTs to words and pseudowords presented in curvy versus spiky frames, according to whether the stimuli contained obstruent consonants (assumed to be ‘sharp’), continuant consonants (assumed to be ‘soft’), or a mix, without providing validation for the sound symbolism of these articulatory features.
Indeed, according to the existing literature, some of the consonants employed by Westbury are atypical for the predicted shape congruence (e.g., voiced obstruents /b/ and /d/ were classified as ‘spiky’, whereas only the voiceless obstruents /k/ and /t/ have previously been linked with spiky shapes). Finally, it is worth considering that classical implementations of sound-shape congruence have dealt with the physical properties of real (if novel) shapes or objects. It may well be the case that where sound symbolism exists in natural languages, it aligns best with the physical shape of real objects described by concrete words – that is to say, the sound of the word aligns to the visual properties of the word’s meaning. For abstract words with no natural shape (e.g., tomorrow, taste, idea), Saussure’s arbitrariness may well hold true. Westbury’s study assumes that the visual shape of a frame will influence processing of a written word-form on the basis of the word’s phonology – independent of the word’s meaning. From this perspective, his study contained both concrete and abstract words (e.g., nail, noon) without consideration for whether the concrete words exhibited shape–sound congruence at the level of their meaning (e.g., nail is sound-to-meaning incongruent, with consonants classified by Westbury as ‘round’, but a naturally ‘pointy’ shape). In addition to the problem of concrete words’ phonological congruence to their meaning, the inclusion of both concrete and abstract nouns introduces further variability to the stimulus set: As proposed in classical language processing theories such as Paivio’s Dual Code Theory, concrete and abstract words differ in processing speed (Paivio, 1991), and may even have different underlying neural representations – with concrete words understood to evoke multimodal representations (Reilly, Westbury, Kean, & Peelle, 2012).

1.3.2. Neurobiology of sound symbolism

Although the phenomenon of sound symbolism has been studied for several decades, there is little consensus on the mechanisms underlying these sensory effects. Ramachandran and Hubbard (2001b) first proposed that linguistic sound symbolism such as the ‘bouba/kiki’ effect may arise out of patterns of functional


sensory connectivity shared with the condition of synaesthesia. In particular, they highlighted two possible mechanisms: multisensory integration in the angular gyrus; and linkages between the visual and auditory properties of speech in the mirror neuron system, particularly Broca’s area. They noted that damage to the left angular gyrus was associated with highly ‘literal’ comprehension, poor metaphoric understanding, and in one notable case, no propensity for the bouba/kiki effect. Later, the same lab demonstrated that individuals with autism spectrum disorder (ASD) exhibit a lower propensity for the bouba/kiki preference (Oberman & Ramachandran, 2008), a finding which the authors linked to weak co-activation of speech gestures in Broca’s area – possibly implicating an impairment of the mirror-neuron system. Others have demonstrated that individual weakness of the bouba/kiki preference is linked to severity of ASD symptoms, suggesting a more generalised property of ‘sensory coherence’ as the source of the bouba/kiki effect – since individuals with ASD often present with unusual sensory symptoms, and have difficulty integrating sensory information across modalities (Occelli, Esposito, Venuti, Arduino, & Zampini, 2013). These observations from patient populations with known neural or sensory impairments provide tantalising theoretical suggestions for the origins of these effects. However, since none of these studies involved neural recordings, it is difficult to determine whether the lack of sound-symbolism is ‘caused by’ the known deficit, or is simply co-morbid, with a different underlying cause. One recent fMRI study (Pirog Revill, Namy, Clepper DeFife, & Nygaard, 2014) lends support to the idea that sound symbolism in language may indeed share neural bases with general properties of inter-sensory processing.
When guessing the meanings of antonym pairs in unfamiliar languages, neurotypically normal participants showed increased activation in the left superior parietal cortex for symbolically congruent words compared to incongruent words. This effect was combined with a diffusion tensor imaging (DTI) result showing that the strength of the sound symbolic effect correlated with the anisotropy of the left superior longitudinal fasciculus, indicating that those with greater white matter density and fibrous alignment along this tract showed stronger sound symbolism for unknown word forms. The authors point out that this finding corresponds to DTI correlations for crossmodal integration in neurologically normal adults, and propose that the two processes – cross-modal integration and sound symbolism – share neural mechanisms. However, given the low temporal resolution of fMRI recording techniques, this finding does not allow us to extrapolate about the moment-by-moment time-course of sound-symbolism during online language processing. This is particularly important when one considers that under normal reading conditions, 3–4 words can be read per second (Masson, 1982), while the temporal resolution of fMRI is constrained by the rate of detectable haemodynamic change, and is therefore slower than the speed of an individual word. One of the first studies to investigate high-temporal resolution neural signatures of neurologically normal participants during a task with sound symbolic congruence was conducted by Kovic, Plunkett, and Westermann (2010). In this study, participants were trained to learn which novel creatures were members of which category, on the basis of their auditory labels. In behavioural terms, participants learned the categories faster when the auditory category label was congruent with the shape of the creatures’ head. In a subsequent test phase with EEG, participants saw one of the creatures shortly after hearing one of the recently acquired category labels. 
Following picture onset, congruent combinations of stimuli elicited greater negativity during the 140–180 ms time window falling across the first prominent negative peak of the visually evoked potential – an effect observed over occipital regions on both sides. These findings showed that the early stages


of sound symbolic processing are not lateralised in the way suggested by the fMRI study of Pirog Revill et al. (2014). One other EEG study also suggests that early stages of sound symbolic processing may be non-lateralized – this time in an audio-visual study of infants. Asano et al. (2015) investigated the neural basis of sound symbolism in 11-month-old infants. When presented with a series of trials in which a novel shape was presented with a novel auditory word form (e.g. spiky or curvy shape with “tiki” or “momo”), increases in the gamma band amplitude were present within 300 ms of stimulus onset in the congruent condition, but phase synchronisation in the incongruent condition occurred around 400 ms. The authors link this finding to known increases in gamma-band activity during visual object processing, and propose that differences in the frequency band dynamics represent the time course of sensory integration between vision and audition – with delayed processing in the condition of least sensory congruity. Along with differences in frequency band, differences were also evident in the moment-to-moment event related potentials (ERPs), with increased negativity evident around 350–550 ms over the scalp regions down the central midline. This time window is consistent with familiar N400-type incongruence between sound and meaning in infants around this age (for review, see Friedrich & Friederici, 2006), effects which have been observed when as little as a single vowel is incongruent with the typical label for a currently viewed picture (Duta, Styles, & Plunkett, 2012). Asano et al. (2015) propose that in the case of 11-month-old infants, sound symbolism is processed perceptually, on the basis of crossmodal binding mechanisms. The findings of Kovic et al. (2010) and Asano et al. (2015) demonstrate that the EEG method is a sensitive tool for investigating the neural processes underpinning sound symbolism during the first half second of online language processing.
Both studies suggest that these early processes are symmetrically distributed. However, since both studies investigate processing of newly learned or recently exposed words, it remains unclear whether known words in the lexicons of adults would exhibit similar patterns of activation. One intriguing EEG study of infants’ auditory evoked potentials (AEPs) demonstrated that left lateralization in the AEP emerges as words become known (Mills, Plunkett, Prat, & Shafer, 2005), so it remains to be seen whether lateralized effects will be observed for the sound symbolism of known words relative to the sound symbolism of pseudo-words.

1.3.3. The present study

In order to investigate the fine-grained time-course of sound symbolic processing for natural language stimuli, alongside the behavioural outcomes of these processes, we present two lexical decision tasks, with implicit interference based on the method of Westbury (2005). In these studies, the stimuli are more tightly controlled for visual, lexical and phonological properties, including the typicality of the frame shape, the phonemes under test, and the congruence of the word’s phonology to its meaning. In particular, since a novel pseudoword has no meaning (and therefore cannot exhibit sound-meaning mismatches), in this study we include only those concrete words where the phonology and the meaning are congruent, according to previously collected norms for the language of test. This means that only novel pseudowords and concrete nouns with clear patterns of sound symbolic phonology are included in the test set, thereby reducing potential sources of variability. As one recent study has pointed to different neural mechanisms underlying the processing of concrete and abstract words (Reilly et al., 2012), only concrete words were selected as stimuli, thereby avoiding potential confounds between concreteness and imageability.
Furthermore, given the experimental fragility of sound symbolism, we implement both a behavioural and a neuropsychological

version of the task, to determine (a) whether behavioural measures are sufficiently sensitive to detect the effects of sound symbolism during lexical access, and (b) what the underlying patterns of neural activation for sound symbolism are. The current tasks were conducted in Serbian – a language with shallow orthography-to-phonology mappings – meaning that the current studies reduce the possibility of irregular spellings interfering with the pattern of phonological activation during word reading.

2. Experiment 1

We conducted a partial replication of Westbury’s behavioural lexical access task with implicit interference, examining shape-to-sound congruence effects on reaction time and error rates in written Serbian.

2.1. Participants

Twenty-five participants took part in Experiment 1. Participants were second year psychology students at the University of Belgrade who gave informed consent and received course credit for participation. The study was approved by the Departmental ethics board. All participants reported normal or corrected-to-normal vision.

2.2. Stimuli

A recent study by the authors (Ilić, Ković, & Janković, in preparation) investigated sound symbolism in Serbian concrete nouns produced in an elicitation task. Participants were asked to list 15–30 nouns denoting sharp/angular objects and 15–30 nouns denoting round/oval objects. The resulting corpus contains over 1000 elicited concrete nouns for canonically round and spiky objects in contemporary Serbian. A previous study of Serbian phonology in a novel word generation task (Janković & Marković, 2001) had established the normative frequency with which letters of Serbian were employed in novel name generation for bouba and kiki type shapes. The studies also tracked the phonotactic structure of novel and real names for spiky and round objects, and both studies agreed on the set of consonants and syllable structures which were most strongly associated with the different shapes.
In particular, /k/, /z/, /r/, /ʧ/, /ʃ/ and consonant clusters were associated with spiky shapes, and /m/, /l/, /b/, /v/, /n/, and vowel-initial or open syllables were associated with round shapes. Taking these data sets together, the authors selected 50 ‘soft’ and 50 ‘sharp’ high-frequency words (frequencies from Kostic, 1999), which contained consonants which were typical for the shape of the object they described. The authors also attempted to identify 50 words of each kind for which the phonology did not match the shape, but extremely few words of this kind were available, meaning it was not possible to fulfil this criterion. The selected words (shape–sound congruent nouns) are therefore representative of the structure of the lexicon in contemporary Serbian. Selected words were presented in upper case letters of the Cyrillic script – one of two active writing systems in contemporary Serbian. Thus all words were either soft-sounding, round objects (e.g. .05). These effects appear around the time of the first positive deflection in the ERP over frontal regions, a time consistent with the onset of the visual evoked potential in response to a sudden onset of a visual stimulus (e.g., pattern onset, see Trick & Skarf, 2006, chap. 105). Despite appearing very early in the ERP, these effects could represent the integration of low level parafoveal information from the shape of the frames with the written forms presented to the fovea.

3.6.2. Early time window 100–160 ms

The ERP differed across the scalp (Band: FG(1.07, 10.70) = 16.23, p < .01, ηp² = 0.62; Laterality: F(2, 20) = 27.21, p < .001, ηp² = 0.37; Band × Laterality: F(4, 40) = 2.69, p < .05, ηp² = 0.21).
This window occurred across the first prominent positive–negative peak inflection, and the ERP differed significantly between congruent and incongruent trials (F(1, 10) = 13.11, p < .01, ηp² = 0.57), with generally more negativity for incongruent than for congruent trials, an effect which differed across the scalp (Frame Congruence × Band: F(2, 20) = 4.54, p < .05, ηp² = 0.31; Frame Congruence × Laterality:

[Figure 2 panels: grand average ERP waveforms (µV; −100 to 900 ms) at analysis zones FTL, Fz, FTR, CTL, Cz, CTR, POL, Pz and POR for the conditions WORD Congruent, WORD Incongruent, PSEUDO Congruent and PSEUDO Incongruent (example stimuli: BALLOON, BAVOON); electrode placement and analysis zones; bar charts for early and mid effects (FTL 40–80 ms, Fz 100–160 ms, Pz 100–120 ms, Pz 300–320 ms) and late N400 effects (left, midline and right, 400–620 ms).]

Fig. 2. Grand average ERPs for four experimental conditions at each analysed zone, and bar charts for different analysis zones, for time windows of interest (main effects and interactions shown: *p < .05; error bars represent ±1 standard error).


F(2, 20) = 5.44, p < .05, ηp² = 0.35). This time window is broadly consistent with the previously reported audio-visual congruence effects in ERPs which emerge 140–180 ms after the onset of audio (Kovic et al., 2010). To uncover the nature of these interactions, zone-by-zone analysis was conducted, revealing effects with different onsets and durations over parts of the midline. At the frontal site Fz, the ERP began to differ significantly 100 ms after the onset of the visual stimuli, a difference which lasted for 60 ms (F(1, 13) = 5.71, p < .05, ηp² = 0.30), and during this period, incongruent stimuli elicited more negativity than congruent stimuli (t(13) = 2.39, p < .05, d = 1.32), across a negative peak in the ERP. At the parietal site Pz, a difference emerged at 100 ms, lasting for only 20 ms (F(1, 11) = 6.67, p < .05, ηp² = 0.39), and during this period, the incongruent condition elicited stronger negative wave deflections (t(11) = 2.58, p < .05, d = 1.56).

3.6.3. Mid time window 280–320 ms

In the 280–300 ms time window, crossing a large positive peak, the ERP differed according to frame congruence (Frame Congruence: F(1, 10) = 18.23, p < .01, ηp² = 0.65), revealing that the ERP was more positive for stimuli presented in congruent frames than for stimuli presented in incongruent frames (Fig. 2). This effect was recorded most strongly over Pz in the time window 300–320 ms (F(1, 11) = 16.19, p < .01, ηp² = 0.59), where the ERP was more positive over a series of positive deflections, for congruent than for incongruent stimulus combinations (t(11) = 4.02, p < .01, d = 2.42).
The timing of this effect, and its direction, are congruent with reports of an ERP effect known as the phonological mapping negativity (PMN: Newman & Connolly, 2009; Steinhauer & Connolly, 2008), understood to index goodness of fit of a phonological form to expectations generated by contexts including written words (e.g., Connolly, Service, D’Arcy, Kujala, & Alho, 2001) and also picture stimuli (e.g., Duta et al., 2012). The effect has been shown to be discrete from well-known N400 effects of semantic congruence, and can be observed around 230–310 ms (Desroches, Newman, & Joanisse, 2008). The effect has also been observed to be graded, with larger deviations from expectation generating larger differences in the ERP (Duta et al., 2012). Consistent with a PMN, the effect reported here shows more negativity for stimuli presented in the incongruent condition, suggesting that the phonology of the written stimuli in the incongruent condition mismatches phonological expectations generated by the shape of the frame. Note that this effect showed no evidence of hemispheric lateralization.

3.6.4. Late time window 460–620 ms

In the late time window, the ERP differed across the scalp (Band × Laterality: F(4, 40) = 3.30, p < .05, ηp² = 0.25). Stimulus lexicality generated a characteristic difference in the ERP (Lexicality: F(1, 10) = 7.69, p < .01, ηp² = 0.43). Some regions of the scalp differed according to the congruence of the frame to the phonology of the word form (Congruence × Laterality: F(2, 20) = 9.11, p < .01, ηp² = 0.48). Congruent stimuli elicited generally more positivity across the scalp, a difference which was most pronounced along the midline, with pooled midline electrodes revealing main effects of both congruence and lexicality (Congruence: F(1, 10) = 7.52, p < .05, ηp² = 0.43; Lexicality: F(1, 10) = 9.17, p < .05, ηp² = 0.48). To further investigate the origin and timing of the lexicality effect across the scalp, a zone-by-zone analysis was conducted.
The lexicality effect was present in the majority of analysed zones, with the precise onset and duration varying across the zones of the scalp: The effects began in left frontal regions (FTL: 460–520 ms, F(1, 15) = 6.04, p < .05, ηp² = 0.29), followed by, and overlapping with, lateral centro-temporal zones (CTL: 480–620 ms, F(1, 19) = 10.21, p < .01, ηp² = 0.35; CTR: 480–600 ms, F(1, 19) = 7.85, p < .05, ηp² = 0.29). These effects were followed by further differences emerging in lateral parietal zones (POL: 500–540 ms, F(1, 19) = 6.12, p < .05, ηp² = 0.24; POR: 500–540 ms, F(1, 17) = 6.35, p < .05, ηp² = 0.27), followed by effects emerging along the midline (Pz: 540–560 ms, F(1, 11) = 11.93, p < .01, ηp² = 0.52; Fz: 540–620 ms, F(1, 13) = 6.86, p < .05, ηp² = 0.34). An identical pattern was present in all regions, i.e. pseudowords produced more negativity than words, and there were no substantial differences in hemispheric lateralization of this effect. The duration and the strength of the lexicality effect are consistent with the substantial N400 literature according to which unexpected stimuli elicit more negativity than expected/familiar stimuli (for review, see Kutas & Federmeier, 2011). In the current study, this general effect is accompanied by an effect of frame congruence, according to which incongruent stimuli tended to elicit more negativity over the midline of the scalp during the same time window as the N400 effect, suggesting that shape–sound congruence may form part of a constellation of contextual cues which influence the N400. In our observations, the influence of frame congruence on the ERP was no longer observed at significant levels after 620 ms. This could explain why in Experiment 1, the congruence of the frame did not influence correct lexical decision times, which were typically closer to 700 ms: effects of shape–sound congruence are resolved prior to lexical access responses. These results confirm that even though no differences in reaction times to congruent versus incongruent stimuli were observed in the behavioural task, differences in the underlying neural processing of congruent versus incongruent stimuli were observed in the more sensitive measures of EEG. These effects were observed even in the absence of overt attention to sound-symbolic congruence embedded in the task, since the response paradigm effectively requires the frames to be ignored.
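The effect sizes reported in Section 3.6 can be cross-checked against the test statistics printed beside them. The sketch below is illustrative only and is not the authors' analysis code. It uses the standard identity ηp² = (F · df_effect) / (F · df_effect + df_error); for Cohen's d it assumes the approximation d = 2t / √df, which the source does not state but which reproduces the reported values to within rounding. The function names are hypothetical.

```python
import math


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def cohens_d_from_t(t_value: float, df: int) -> float:
    """Approximate Cohen's d from a t statistic as 2t / sqrt(df).

    Assumption: the article does not say which formula it used for d;
    this one is chosen because it matches the reported values.
    """
    return 2 * t_value / math.sqrt(df)


if __name__ == "__main__":
    # Mid window, Frame Congruence: F(1, 10) = 18.23, reported effect size 0.65
    print(f"{partial_eta_squared(18.23, 1, 10):.2f}")
    # Late window, Congruence x Laterality: F(2, 20) = 9.11, reported effect size 0.48
    print(f"{partial_eta_squared(9.11, 2, 20):.2f}")
    # Mid window at Pz: t(11) = 4.02, reported d = 2.42
    print(f"{cohens_d_from_t(4.02, 11):.2f}")
```

A few of the reported values differ from this recomputation in the second decimal place, which is expected when the F and t statistics are themselves rounded before printing.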

4. General discussion

Decades of research have confirmed that sound symbolism in language clearly exists, but most demonstrations of these effects have been limited to somewhat artificial tasks involving novel word-forms, with no status in the lexicons of speakers of a language, and no association with possible word meanings. So far, those behavioural studies which have investigated symbolism in natural language have used only words from languages unfamiliar to the participants, and symbolism effects were observed through above-chance guessing of unfamiliar word meanings (e.g., Parault & Schwanenflugel, 2006). Although one study has attempted to investigate sound symbolic congruences while processing natural language stimuli in the native language of the participants (Westbury, 2005), it is difficult to draw general conclusions from that study given some of the limitations of the stimuli selected for test. In the current study we presented words previously demonstrated to have sound–shape-meaning congruence for speakers of Serbian, within frame shapes previously demonstrated to have canonical sound symbolic congruence for speakers of Serbian, and compared these with pseudowords generated from the real words using highly controlled phonological alternations consistent with the patterns of documented sound symbolism in Serbian. Furthermore, given the known fragility of sound symbolic effects in reaction time studies (Sucevic et al., 2013), we investigated these effects using both behavioural and neurophysiological measures of written word processing, to capture the underlying neural processing of sound symbolic effects, as well as the behavioural outcomes of these processes. Contemporary Serbian has some features which make it particularly interesting in the investigation of sound symbolism. It is a


‘shallow orthography’ language, with near one-to-one correspondence between phonemes and letter-forms, meaning that irregular spellings will not distort the direct phonological representation of each of the sounds represented. Indeed, Carello et al. (1992) have presented a convincing argument that Serbian language processing does not conform to the dual-route reading models of English (according to which it is possible to read either phonologically or from a whole-word orthographic memory), but rather, that the shallow orthography of Serbian produces a single-route ‘phonological’ reading process in all cases. If speakers of contemporary Serbian experience direct activation of phonological forms while reading, this may allow a more direct test of congruence between the phonology of a written word form and its visual frame than may be possible for speakers of English. Our Serbian speakers showed a higher rate of errors when making decisions about real words than they did when making decisions about pseudowords. This seemingly counterintuitive finding is well grounded in general psycholinguistic processes, as well as in the peculiarities of the Serbian writing system. In our analysis of the relationship between speed and accuracy, it was clear that individuals exhibited a speed/accuracy tradeoff, whereby participants who responded faster made more errors. Given that words were responded to faster than pseudowords, we might expect more errors in this condition, as a consequence of fast responding. However, when we compared models including only individual speed with models also including lexical status, it became clear that the lexical status of the stimulus was also contributing to the error rate, and we believe this effect is due to the differential effect of ambiguous letters on the interpretation of word and pseudoword stimuli. In written Serbian, for any stimulus containing ambiguous letters, more than a single percept is theoretically possible. 
For example, where a stimulus contains a single ambiguous letter (e.g., the ‘P’ in PO
