
Logopedics Phoniatrics Vocology, 2015; 40: 129–135

Original article

Perception of emotional nonsense sentences in China, Egypt, Estonia, Finland, Russia, Sweden, and the USA

Teija Waaramaa

School of Communication, Media and Theatre, University of Tampere, Tampere, Finland

Abstract

The present study focused on the identification of emotions in cross-cultural conditions on different continents and among subjects with divergent language backgrounds. The aim was to investigate whether the perception of the basic emotions from nonsense vocal samples was universal and whether it depended on voice quality, musicality, and/or gender. Listening tests for 350 participants were conducted on location in a variety of cultures: China, Egypt, Estonia, Finland, Russia, Sweden, and the USA. The results suggested that voice quality parameters played a role in the identification of emotions in the absence of linguistic content. Cultural background may affect the interpretation of the emotions more than the presumed universality. Musical interest tended to facilitate emotion identification. No gender differences were found.

Key words: Cross-cultural, emotion identification, gender, musical interests, nonsense text, voice quality

Correspondence: Teija Waaramaa, PhD, Postdoctoral Researcher, School of Communication, Media and Theatre, Kalevantie 4, 33014 University of Tampere, Tampere, Finland. Fax: +358 3 3568044. E-mail: [email protected] (Received 23 December 2013; accepted 13 April 2014)

ISSN 1401-5439 print/ISSN 1651-2022 online © 2014 Informa UK, Ltd. DOI: 10.3109/14015439.2014.915982

Introduction

The interpretation of emotional expressions may not always be unambiguous, the more so in cross-cultural conditions where the speaker and the listener do not share the same mother tongue (first language) and hence lack the familiar perceptual cues. Universality is often attributed to the expression and perception of the basic emotions (anger, disgust, fear, joy, sadness, and surprise), which are thought to be produced and interpreted similarly, independently of cultural influences. To achieve a better understanding of the vocal attributes of perceived emotions, research on cultural differences in communicating emotional content is essential.

Scherer et al. (1) conducted an investigation in which German-speaking actors produced nonsense vocal utterances expressing four emotions and neutrality. The study was conducted in seven countries (the USA, Indonesia, and European countries). The researchers noted that the closer the listeners were to the original language background of the speakers, the easier it was for them to identify the emotions expressed, and the identification accuracy decreased as the language and cultural similarities decreased. The identification accuracy was 74% in Germany, 69% in Switzerland, 68% in Great Britain, the Netherlands, and the USA, 67% in Italy, 66% in France, 62% in Spain, and 52% in Indonesia. Mean accuracy across the Western countries studied, excluding Germany, was 66%, as it was across all countries studied. Similar results were reported by Sauter et al. (2), who conducted listening tests for the identification of basic emotions expressed non-verbally by English and Himba speakers (Himba is spoken in Namibia). Like Scherer et al. (1), these researchers concluded that as the dissimilarities increased, the emotion identification accuracy decreased among the listeners.

In their cross-cultural study, Waaramaa and Leisiö (3) reported an identification percentage for emotions somewhat higher (69%) than in the study by Scherer et al. (1). Perception of valence (on an axis of positive–neutral–negative emotion) proved to be much higher: 91%. The countries studied were Estonia, Finland, Russia, Sweden, and the USA, which have partly similar cultural and/or language backgrounds, as Estonian and Finnish belong to the Finno-Ugric language family and the rest of the countries belong to the large Indo-European linguistic genus.


Non-linguistic emotional vocalizations (affect bursts) were studied by Laukka et al. (4). There, actors from India, Kenya, Singapore, and the USA served as subjects and produced short positive and negative emotional expressions without linguistic content. Swedish listeners participated in the perception test. The recognition percentage was 39%.

Yanushevskaya et al. (5) studied different combinations of fundamental frequency (F0) and voice quality using synthesized stimuli. Modal, breathy, whispery, lax-creaky, and tense voice qualities were used. The stimuli were generated from a male voice expressing a Swedish phrase 'Ja, adjö' (Yes, goodbye). The first listening test was conducted with a Hiberno-English (Irish English) group of listeners whose task was to connect the samples to an affect. The test was then also conducted with Japanese listeners (6), and cross-cultural differences in perception were studied. In all, Yanushevskaya (7) conducted perception tests not only with Hiberno-English and Japanese listeners but also with Spanish and Russian listeners. It was concluded that there is no one-to-one mapping between an affect and the voice quality or F0 variations; however, voice quality tended to be more effective in affect signaling than F0 variations. Voice quality played a role in emotion perception in all the cultures studied. All listener groups associated lax-creaky voice quality with sadness and boredom. Tense voice quality was perceived as fearless and indignant. For the Hiberno-English and Russian groups of listeners, whispery voice quality signaled intimate, apologetic, bored, and relaxed. Russian listeners also associated modal voice with fearless and formal. Scared and interested were poorly recognized from voice quality in these tests; however, with certain combinations of F0 they were recognized cross-culturally. The greatest differences between the listener groups were found for interpersonal, culturally learned stances such as intimate–formal and relaxed–stressed. The results showed that variations in F0 were of greater importance to the Japanese listeners than to the other listener groups, and variation in voice quality was of greater importance to the other listener groups than to the Japanese listeners. Interpretations of the affect perceived from the samples were strikingly different between the listener groups. Loudness of the voice in emotion perception was studied by the same researchers using synthesized stimuli (8). It was concluded that loudness may not determine the emotions perceived; however, a combination of loudness with a modal or tense voice quality may affect the perception of activity levels.

As there are similar elements involved both in vocal emotional expressions and in music, such as temporal elements, intensity, and pitch of the sound, it has been hypothesized that vocal expression and music are linked together, already in evolutionary development (9). It is possible that the origins of speech and music were first of all connected to emotional expressions and shared the same mechanisms for communicating emotions through pitch, rhythm, duration, and intensity (10). Hence, it is hypothesized that the ability to recognize emotional states may be related to musicality.

The present study focused on the identification of emotions largely in cross-cultural conditions in different countries with clearly divergent language backgrounds in order to see whether the perception of the basic emotions was universal. Listening tests for 350 participants were conducted on location on four continents, Africa, Asia, Europe, and North America, and in seven countries: China (Mandarin, a tonal language), Egypt (Arabic, a Semitic language), Estonia (Estonian, a Finno-Ugric language), Finland (Finnish, a Finno-Ugric language), Russia (Russian, an Indo-European language), Sweden (Swedish, an Indo-European language), and the USA (American English, an Indo-European language). The aim was to investigate whether the cultural or language background had an impact on the interpretation of emotions. It was also of interest to study whether the perception of emotions depended on the acoustic parameters measured. Furthermore, valence perception (on an axis of positive–neutral–negative) was studied, as well as gender differences. Nonsense vocal samples were used as material to eliminate the impact of language, since real words carry lexical meanings as such; the focus was on the relation between voice quality and emotion identification in a variety of cultures.

Materials and methods

Emotional nonsense sentences (n = 32) were produced by Finnish professional actors (n = 4) of both genders. Each sample consisted of 13 words which formed two sentences; thus, one two-sentence token formed one sample: [Elki neiku ko:tsa, fonta tegoa vi:fif:i askepan:a æspa. Fis:afi: te:ki sta:ku porkas talu]. Eight emotional states were expressed while reading the sentences: anger, disgust, fear, interest, joy, sadness, surprise, and a neutral emotional state.

The recordings were made at a professional recording studio, MediaBeat in Tampere, Finland, using the Sony Sound Forge 9.0 recording and editing system and a RØDE NTK microphone.


The microphone was placed 40 cm from the speaker's lips. The recorded two-sentence nonsense utterances were measured with the Praat software for the acoustic parameters of F0, maximum F0, SPL, duration, mean harmonics-to-noise ratio (HNR, dB), number of pulses, number and degree of voice breaks, and sample duration (11). HNR measures perturbation in the voice signal. The number of voice breaks is the number of distances between consecutive pulses that are longer than 1.25 divided by the minimum F0 (11); hence the parameter refers to the voice quality. The degree of voice breaks is the ratio between the non-voiced breaks and the duration of the signal (11). The syllable duration of each utterance was also calculated in order to study possible rhythmic differences between the emotional expressions. The alpha ratio was calculated by subtracting the SPL in the range 50 Hz–1 kHz from the SPL in the range 1–5 kHz; the alpha ratio is used to study the spectral energy distribution (12). The acoustic parameters and their correlations with the perception tests were investigated. Statistics were calculated with IBM SPSS 19.

Listening test

Listening tests for identification of the emotions were conducted on location in China, Egypt, Estonia, Finland, Russia, Sweden, and the USA. A total of 350 listeners, 50 in each country (25 males and 25 females), participated in the listening tests. The participants were randomly chosen volunteers, native speakers of the main language of the country studied. The mean age of the listeners was 30.5 years, the youngest being in China (mean age 21 years) and the oldest in Finland (mean age 47.5 years). The anonymity of the participants was ensured. The tests were conducted with the listeners one at a time. While doing the tests the listeners used Sennheiser HD 598 headphones, and they answered orally which of the eight emotion options they perceived. The test was a forced choice test.

Before the actual testing the participants completed a questionnaire eliciting their background information in order to study correlations between their musical interests and the accuracy of their identification of the emotions. The participants responded to nine statements about their interest in musical activities by choosing their answer from the alternatives 'Yes', 'No', 'Not any more', and 'Sometimes'. The statements were: 1) 'I like to listen to music'; 2) 'It is easy for me to respond to music'; 3) 'I am interested in singing'; 4) 'I play a musical instrument'; 5) 'I am interested in dancing'; 6) 'It is easy for me to dance in the correct rhythm'; 7) 'It is easy for me to learn a new melody'; 8) 'Music may affect my mood'; and 9) 'Music may cause me physical reactions'.

The listening conditions differed between the countries. This was allowed, since the idea was to replicate normal social communication situations between people. Thus, no attempt was made to eliminate random noises outside the testing room. In most of the countries a normal office room or a classroom was used; in Sweden a soundproof studio was available. In Egypt the listening tests were conducted only partly in an office room and mainly in relatively noisy conditions at a hospital. Hence, cultural differences were present in the testing procedure concerning the level of privacy and quietness. Translators were needed in Egypt and Russia to explain the testing procedure to those participants who did not speak English.
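For illustration only, the following minimal Python sketch shows how acoustic measures of the kind listed above (mean and maximum F0, mean HNR, and the alpha ratio) might be extracted with Praat through the parselmouth wrapper library. It is not the authors' analysis script; the file name and the pitch search range are assumptions.

```python
# A sketch, not the study's actual script: extracting Praat-style acoustic
# measures with the parselmouth library (a Python wrapper around Praat).
import math
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("sample.wav")        # hypothetical recorded utterance

# Fundamental frequency: mean and maximum (assumed 75-600 Hz search range)
pitch = call(snd, "To Pitch", 0.0, 75, 600)
f0_mean = call(pitch, "Get mean", 0, 0, "Hertz")
f0_max = call(pitch, "Get maximum", 0, 0, "Hertz", "Parabolic")

# Mean harmonics-to-noise ratio (HNR, dB)
harmonicity = call(snd, "To Harmonicity (cc)", 0.01, 75, 0.1, 1.0)
hnr_mean = call(harmonicity, "Get mean", 0, 0)

# Alpha ratio: level in the 1-5 kHz band minus level in the 50 Hz-1 kHz band,
# here computed as a band-energy level difference over the spectrum
spectrum = call(snd, "To Spectrum", "yes")
low = call(spectrum, "Get band energy", 50, 1000)
high = call(spectrum, "Get band energy", 1000, 5000)
alpha_ratio_db = 10 * math.log10(high / low)

print(f"F0 mean {f0_mean:.1f} Hz, F0 max {f0_max:.1f} Hz, "
      f"HNR {hnr_mean:.1f} dB, alpha ratio {alpha_ratio_db:.1f} dB")
```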

Results

Voice quality

The correlation between the acoustic parameters and the identification of the emotions was investigated. The results showed that the number of voice breaks correlated significantly with emotion and valence identification (P < 0.05), and mean harmonics-to-noise ratio (P < 0.01) and SPL (P < 0.01) with emotion identification, in both genders. Emotion identification was also affected by maximum pitch (P < 0.05) in females (Table I). The number of voice breaks was lowest for anger and highest for sadness in both genders. Positive emotions also seemed to have fewer voice breaks than negative ones, but the result for the degree of voice breaks was non-significant (Table II).

Table I. Pearson correlations between voice quality parameters and the share of emotions and valence identified.

                                          Max pitch    n voice breaks    HNR (dB)      SPL (dB)
Emotions identified, male listeners       ns           P < 0.05*         P < 0.01**    P < 0.01**
Emotions identified, female listeners     P < 0.05*    P < 0.05*         P < 0.01**    P < 0.01**
Valence identified, male listeners        ns           P < 0.05*         ns            ns
Valence identified, female listeners      ns           P < 0.05*         ns            ns

**P < 0.01; *P < 0.05; ns = non-significant in Pearson correlation.
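The study computed these correlations in SPSS; purely as an illustration of the statistic behind Table I, a short sketch with hypothetical stand-in data follows.

```python
# Illustration only: a Pearson correlation of the kind reported in Table I.
# Both arrays are hypothetical stand-ins, not data from the study.
from scipy.stats import pearsonr

n_voice_breaks = [2, 5, 1, 7, 3, 6, 4, 8]       # voice breaks per sample
share_identified = [0.81, 0.62, 0.88, 0.47,     # share of listeners who
                    0.74, 0.55, 0.69, 0.41]     # identified the emotion

r, p = pearsonr(n_voice_breaks, share_identified)
print(f"r = {r:.3f}, P = {p:.4f}")              # flagged * if P < 0.05
```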



Table II. Ratio between sample duration and unvoiced breaks (11) in percentages, both genders together.

Emotion     Degree of voice breaks (%)
Joy         51.4
Surprise    57.6
Interest    58.1
Disgust     59.2
Neutral     62.1
Sadness     65.1
Anger       66.0
Fear        70.9
Total       61.3

The mean duration of the nonsense samples was 9,652 ms. Sample duration differed significantly between sadness and anger (P = 0.039), and mean syllable duration between sadness and surprise (P = 0.026), both sentence and syllable durations being longer in sadness (Table III). The standard deviation calculated from the syllable durations differed significantly between genders (P = 0.006), being greater in females than in males for all the emotions expressed except for a neutral emotional state. A positive correlation was found between SPL and alpha ratio (r = 0.470).

Listening test and questionnaire

Emotions were identified with 66.5% and valence with 87.5% accuracy in the seven countries. When emotion identification was studied by gender, only a 2% difference in accuracy was found between the males (65.5%) and the females (67.5%), and in valence identification a 0.5% difference (males 87%, females 87.5%). Thus, no significant gender differences occurred. Cronbach's alpha was 0.876 for the perception of the emotions across the seven countries studied (a computational sketch of this reliability coefficient is given at the end of this section). The samples were perceived significantly differently in China, Egypt, Russia, Sweden, and the USA compared to Finland, and between Estonia and Russia (P < 0.05). The number of emotions identified out of the 1,600 samples replayed per country was: Finland 1,264; Estonia 1,187; Sweden 1,135; Russia 1,029; China 1,004; USA 988; and Egypt 844.

Excluding neutrality, sadness, followed by disgust and interest, was the most frequently chosen emotion as an answer, and joy the most rarely chosen. Anger and disgust, as well as surprise and interest, were frequently confused with each other. Moreover, one quarter of the anger samples were perceived as interest in Egypt (Table IV).

Student's t test did not reveal any significant connection between the 'Yes' answers given to the statements on the questionnaire and the emotions identified, with one exception: 'It is easy for me to learn a new melody' was significantly connected to the emotions identified in females (P = 0.022). The 'Yes' answers to the statements on the questionnaire were significantly connected to the country (P < 0.001). A logistic regression model of the combined effects showed no significance between the 'Yes' answers, country, and the emotions identified. Perception of valence was significantly connected with the country (P < 0.001). The logistic regression model of the combined effects showed significance in females between the country and the statements 'It is easy for me to respond to the music' (P = 0.01) and 'Music may affect my mood' (P = 0.011), and in males between the country and the statement 'I play a musical instrument' (P < 0.001).

In Egypt, only 20 of the 50 participants were able to do the listening test in conditions relatively similar to those in the other six countries; the majority of the listeners had to do the test in noisy conditions. As a result, the identification accuracy differed between these two groups in Egypt: 54% in noisy conditions and 66.5% in quiet conditions. It is generally more difficult to hear in noisy than in quiet surroundings, so it can be concluded that noisy surroundings also impede emotion perception.
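As noted above, a brief sketch of the reliability coefficient reported in this section follows. The formula is standard Cronbach's alpha; the data matrix here is a random placeholder under an assumed layout (one row per listener, one column per sample, 1 for a correct identification), not the study's data.

```python
# Cronbach's alpha (illustration; the study reported alpha = 0.876).
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """scores: (n_listeners, n_items) matrix of item scores."""
    k = scores.shape[1]                           # number of items (samples)
    item_var = scores.var(axis=0, ddof=1).sum()   # sum of per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)    # variance of listener totals
    return k / (k - 1) * (1 - item_var / total_var)

# Placeholder: 350 listeners x 32 samples, ~66.5% correct on average.
# Random independent answers give alpha near 0; consistent listener
# differences in the real data are what push alpha up toward 0.876.
rng = np.random.default_rng(0)
demo = (rng.random((350, 32)) < 0.665).astype(float)
print(f"alpha = {cronbach_alpha(demo):.3f}")
```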

Discussion

Voice quality as reflected in the number of voice breaks significantly affected the identification of emotions and valence, and HNR and SPL affected the identification of emotions. Thus, emotion evaluation seemed to be based on voice quality, especially on the degree of acoustic periodicity.

Table III. Mean values of durations and standard deviations (SD) of the samples and syllables.

                             Neutral   Sadness   Fear    Anger   Disgust   Joy     Surprise   Interest
Mean syllable duration (s)   0.20      0.24      0.20    0.21    0.24      0.25    0.18       0.20
Mean sample duration (s)     7.86      11.48     9.88    7.14    7.87      7.28    7.73       7.79
Sample SD                    1.52      3.51      3.49    0.75    1.72      2.39    1.38       1.60
Syllable SD                  0.18      0.30      0.28    0.20    0.27      0.28    0.20       0.20

SD = standard deviation.


Table IV. Confusion matrix of the answers given in the listening tests in the seven countries studied (emotion identified × emotion expressed cross-tabulation).

                                            Emotion expressed
Emotion identified   Neutral   Sadness   Fear    Anger   Disgust   Joy     Surprise   Interest   Total
Neutral              1192      29        13      163     34        47      27         116        1621
Sadness              64        1250      108     11      123       32      4          19         1611
Fear                 13        67        1141    17      28        51      25         34         1376
Anger                17        6         34      685     344       49      97         11         1243
Disgust              53        42        47      358     821       74      86         20         1501
Joy                  7         0         0       4       15        847     55         97         1025
Surprise             3         1         26      30      13        172     784        372        1401
Interest             51        5         31      132     22        128     322        731        1422
Total                1400      1400      1400    1400    1400      1400    1400       1400       11200

The diagonal figures (in italics in the original) signify matches between the intended and the identified emotion samples.
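For readers who wish to work with these counts, the sketch below re-enters Table IV as a matrix and derives the per-emotion and overall identification accuracies; the overall figure reproduces the 66.5% reported in the Results. This is an illustration added here, not part of the original analysis.

```python
# Table IV as a matrix: rows = emotion identified, columns = emotion expressed
# (1,400 presentations per expressed emotion, 11,200 in total).
import numpy as np

emotions = ["Neutral", "Sadness", "Fear", "Anger",
            "Disgust", "Joy", "Surprise", "Interest"]
confusion = np.array([
    [1192,   29,   13,  163,   34,   47,   27,  116],
    [  64, 1250,  108,   11,  123,   32,    4,   19],
    [  13,   67, 1141,   17,   28,   51,   25,   34],
    [  17,    6,   34,  685,  344,   49,   97,   11],
    [  53,   42,   47,  358,  821,   74,   86,   20],
    [   7,    0,    0,    4,   15,  847,   55,   97],
    [   3,    1,   26,   30,   13,  172,  784,  372],
    [  51,    5,   31,  132,   22,  128,  322,  731],
])

# Diagonal entries are correct identifications of each expressed emotion.
per_emotion = confusion.diagonal() / confusion.sum(axis=0)
for name, acc in zip(emotions, per_emotion):
    print(f"{name:8s} {acc:.1%}")                # e.g. anger only 48.9%
overall = confusion.diagonal().sum() / confusion.sum()
print(f"Overall  {overall:.1%}")                 # 66.5%, as reported
```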

Positive emotions tended to have fewer voice breaks than negative emotions. Fónagy (13) has suggested that perceptual regularity/continuity/predictability is related to the melodicity of speech. The degree of melodicity is further related to the positive/negative manner of speaking; for example, speech addressed to children is more predictable, regular, rhythmic, and melodic than speech addressed to other interlocutors (see e.g. (14)). Speakers with 'normal' voice quality and resonance have been shown to be rated more positively than those with voice or resonance disorders (15).

Yanushevskaya et al. (8) studied synthesized stimuli and concluded that a combination of loudness with a modal or tense voice quality may affect the perception of activity levels. In the present study of continuous natural speech, intensity (SPL) combined with voice quality tended to affect emotion identification. Characteristics of voice quality and phonation type were reflected in the acoustic periodicity and in the alpha ratio, which correlated positively with SPL. Alpha ratio and SPL are known to vary together (16,17). Voice quality has also been shown to affect emotion perception in earlier studies by Laukkanen et al. (18) and Waaramaa et al. (19). In both studies F3 and F4 seemed to play a role in the perception of positive or negative emotions, F3 and F4 being higher in frequency in positive emotions than in negative ones. In these studies attenuated vowels were investigated.

Females have conventionally been claimed to be more sensitive than males in emotion production and perception; nevertheless, gender was not a distinguishing element in the perception of emotions or valence in the present study. Koeda et al. (20) reported similar findings concerning gender.

It has to be noted, however, that the capability to recognize emotions may differ significantly between individuals.

Differences in accuracy were significant in emotion identification between the Finnish listeners and the listeners in the other countries, excluding Estonia. Hence it was concluded that the language background may have an effect on the perception of the emotional content of speech even without actual words. The perception may be based on familiarity with the speech prosody, and in Estonia on the linguistic relationship between the two languages, Estonian and Finnish. The cultural similarity between Finland and Sweden did not seem to affect the perception as much as the language background; however, the perceptual differences were the next smallest between the Finnish and Swedish listeners.

There are six basic emotions (anger, disgust, fear, joy, sadness, and surprise) which are frequently claimed to be universal in the sense that they are expressed and perceived similarly all over the world regardless of culture or language. In emotion research a neutral emotional state is often included in emotional expression and perception tests; neutrality can also serve as a control condition. The six basic emotions, a neutral emotional state, and interest were investigated in the present study. Interest was included since interest is said to be the principal force in organizing consciousness, and therefore its inclusion was seen as crucial in the present study (21). Yanushevskaya (7) found that tense or modal voice quality combined with an F0 contour expressing 'indignation' was perceived as 'interested' in all the countries studied. In the present study, a quarter of the anger samples were perceived as interest in Egypt. One reason for this result may be that the division of emotions by valence into positive and negative groups is arbitrary, since, for instance, anger may sometimes be understood as motivational (or interested and further enthusiastic) action (21).


Enthusiasm and, on the other hand, discouragement are always present as background emotions in the human mind (22). When a new emotion meets the ongoing emotion and cognition, a non-linear interaction process starts between the two. This process is unavoidable, since an emotion or affect is always present in the human mind: 'There is no such thing as affectless mind' (21). The results may imply that emotions are interpreted against the social or socio-cultural background rather than being universal.

Moreover, as Arab culture is considered one of the so-called high-context cultures, the interpretation of vocal expression there differs from that in low-context cultures (23). Low-context cultures (such as the 'Western' cultures) pay more attention to the exact meaning of a word; in high-context cultures the tone of the voice or the context in which the word is used may convey more information than the actual word. Another example of the differences between high- and low-context cultures is the result of the listening tests in China and the USA. The Chinese listeners identified the emotions with better accuracy than the US listeners even though the speakers represented the 'Western' culture. Mandarin Chinese, a tonal language, requires its speakers to listen to and learn the tones of the voice very carefully, since tone is related to the meaning of a word. Furthermore, only five out of the 50 Chinese participants reported not being interested in singing. Thus, it can be speculated that learning a tonal language as a child might facilitate pitch and timbre perception (see e.g. (24)). On the other hand, the acoustic cues may not be the only channels through which vocal information is conveyed. A recent study by Smith and Burnham (25) concerned the perception of tone by native speakers of Mandarin Chinese and by speakers of Australian English. Both groups were cochlear implant users. The tone-naïve participants outperformed the native speakers of Mandarin in the research set where only visual information was available. The researchers concluded that 'visual speech information for tone may be language-general, rather than the product of language-specific learning'.

In the present study it was assumed that the basic emotions were universal and would be perceived similarly by the majority of the listeners, no matter where they lived. The results showed, however, that this assumption was only partly correct. It turned out that the cultural aspect may be a stronger factor in the development of an individual in a cultural context than the assumed emotional universals.

Conclusions

The aim of the present study was to investigate whether the perception of the basic emotions was universal, and whether it was dependent on voice quality, musical interests, and/or gender. Listening tests for the identification of emotions were conducted on location on four continents, in seven countries, with 350 participants of different language and cultural backgrounds. The results suggested that voice quality parameters and the degree of melodicity correlated with the identification of emotions and valence in the absence of linguistic content. The listeners' musical interests tended to facilitate the identification task. No gender differences were found in the identification of emotions or valence in the seven countries studied. Finally, a similar language background and cultural contextualism may affect the interpretation of the emotions more than the presumed universality. This warrants further study.

Acknowledgements

Special thanks to the participants in the listening tests in China, Egypt, Estonia, Finland, Russia, Sweden, and the USA, and to the contact persons who made the listening tests possible. China: Director Yang Xinyi, MA, and Pirkko Luoma, MA, Beijing Foreign Studies University, Beijing, China. Egypt: Dr. Ahmed Geneid, ENT and Phoniatrics Department of Helsinki University Hospitals, Helsinki, Finland; Professor Mahmoud Youssef, the Head of the Phoniatrics Unit; and Dr. Ahmed Abul Kassem and Dr. Ahmed Mohamed Refaat, El-Demerdash and El Sahel Educational Hospitals, Ain Shams University, Cairo, Egypt. Estonia: Director, Dr. Pille Pruulmann-Vengerfeldt and the staff, Institute of Journalism and Communication, University of Tartu, Tartu, Estonia. Russia: Director, Dr. Pavel Skrelin and Tatiana Chukaeva, Department of Phonetics, Saint Petersburg State University, Saint Petersburg, Russia. Sweden: Professor Sten Ternström, the Head of the Music Acoustics group, and his students Ragnar Schön and Evert Lagerberg, KTH Royal Institute of Technology, Stockholm, Sweden. USA: Associate Professor Graham D. Bodie and Dr. Christopher C. Gearhart, Department of Communication Studies, Louisiana State University, Baton Rouge, LA, USA. The author would also like to thank Hanna-Mari Puuska, M.Sc., and Liudmila Lipiäinen, M.Sc., for the statistical analyses, Virginia Mattila, M.A., for language correction of the manuscript, and the translators for translating the questionnaire.


Declaration of interest: The author reports no conflicts of interest. Funding was received from the Academy of Finland [grant no. 139321].

References

1. Scherer KR, Banse R, Wallbott HG. Emotion inferences from vocal expression correlate across languages and cultures. J Cross-Cultural Psychol. 2001;32:76–92.
2. Sauter DA, Eisner F, Ekman P, Scott SK. Cross-cultural recognition of basic emotions through nonverbal emotional vocalizations. Proc Natl Acad Sci. 2009;107:2408–12.
3. Waaramaa T, Leisiö T. Perception of emotionally loaded vocal expressions and its connection to responses to music. A cross-cultural investigation: Estonia, Finland, Sweden, Russia and the USA. Frontiers in Emotion Science. Open access publication. 2013. Available at: http://www.frontiersin.org/emotion_science/10.3389/fpsyg.2013.00344/full.
4. Laukka P, Elfenbein HA, Söder N, Nordström H, Althoff J, Chui W, et al. Cross-cultural decoding of positive and negative non-linguistic emotion vocalizations. Front Psychol. 2013;4:353.
5. Yanushevskaya I, Gobl C, Ní Chasaide A. Voice quality and f0 cues for affect expression: implications for synthesis. Proceedings of the 9th European Conference on Speech Communication and Technology, INTERSPEECH 2005, Lisbon, 4–8 September 2005. p. 1849–52. Available at: http://www.tara.tcd.ie/handle/2262/39405.
6. Yanushevskaya I, Gobl C, Ní Chasaide A. Mapping voice quality to affect: Japanese listeners. Proceedings of Speech Prosody, Dresden, Germany, 2–5 May 2006. Available at: http://sprosig.isle.illinois.edu/sp2006/.
7. Yanushevskaya I. Vocal correlates of affective states [Academic dissertation]. Centre for Language and Communication Studies, School of Linguistic, Speech and Communication Sciences, Trinity College, Dublin; 2010.
8. Yanushevskaya I, Gobl C, Ní Chasaide A. Voice quality in affect cueing: does loudness matter? Front Psychol. 2013;4:335.
9. Juslin PN, Laukka P. Communication of emotions in vocal expression and music performance: different channels, same code? Psychol Bull. 2003;129:770–814.
10. Thompson WE, Schellenberg EG, Husain G. Decoding speech prosody: do music lessons help? Emotion. 2004;4:46–64.
11. Praat manual. Available at: http://www.fon.hum.uva.nl/praat/manual/Voice_1__Voice_breaks.html.
12. Frøkjær-Jensen B, Prytz S. Registration of voice quality. Brüel & Kjær Technical Review. 1973;3:3–17.
13. Fónagy I. Emotions, voice and music. In: Sundberg J, editor. Research aspects on singing. Royal Swedish Academy of Music; 1981;33:51–79. Available at: http://www.speech.kth.se/music/publications/kma/papers/kma33ocr.pdf.
14. Trehub SE. The developmental origins of musicality. Nat Neurosci. 2003;6:669–73.
15. Lallh AK, Rochet AP. The effect of information on listeners' attitudes toward speakers with voice or resonance disorders. J Speech Lang Hear Res. 2000;43:782–95.
16. Nordenberg M, Sundberg J. Effect on LTAS of vocal loudness variation. TMH-QPSR. 2003;45:93–100.
17. Sundberg J, Nordenberg M. Effects of vocal loudness variation on spectrum balance as reflected by the alpha measure of long-term-average spectra of speech. J Acoust Soc Am. 2006;120:453–7.
18. Laukkanen AM, Vilkman E, Alku P, Oksanen H. On the perception of emotions in speech: the role of voice quality. Scand J Logoped Phoniatr Vocol. 1997;22:157–68.
19. Waaramaa T, Alku P, Laukkanen AM. The role of F3 in the vocal expression of emotions. Logoped Phoniatr Vocol. 2006;31:153–6.
20. Koeda M, Belin P, Hama T, Masuda T, Matsuura M, Okubo Y. Cross-cultural differences in the processing of non-verbal affective vocalizations by Japanese and Canadian listeners. Front Psychol. 2013;4:105. Available at: http://www.frontiersin.org/Emotion_Science/10.3389/fpsyg.2013.00105/abstract.
21. Izard CE. Basic emotions, natural kinds, emotion schemas, and a new paradigm. Perspectives on Psychological Science. 2007;2:260–80.
22. Damasio A. Brain and mind: from medicine to society. Conference lecture (video), Barcelona, Spain, 24 May 2007. Available at: http://www.youtube.com/watch?v=KbacW1HVZVk&NR=1. Accessed 9 May 2008.
23. Hall ET. The dance of life: the other dimension of time. Garden City, NY: Anchor Press/Doubleday; 1984.
24. Strait DL, Kraus N, Skoe E, Ashley R. Musical experience and neural efficiency: effects of training on subcortical processing of vocal expressions of emotion. Eur J Neurosci. 2009;29:661–8.
25. Smith D, Burnham D. Facilitation of Mandarin tone perception by visual speech in clear and degraded audio: implications for cochlear implants. J Acoust Soc Am. 2012;131:1480–9.
