Cochlear Implants International An Interdisciplinary Journal

ISSN: 1467-0100 (Print) 1754-7628 (Online) Journal homepage: http://www.tandfonline.com/loi/ycii20

To cite this article: Valerie Looi, Elizabeth-Raye Teo & Jenny Loo (2015) Pitch and lexical tone perception of bilingual English–Mandarin-speaking cochlear implant recipients, hearing aid users, and normally hearing listeners, Cochlear Implants International, 16:sup3, S91-S104

To link to this article: http://dx.doi.org/10.1179/1467010015Z.000000000263

Published online: 12 Nov 2015.


Pitch and lexical tone perception of bilingual English–Mandarin-speaking cochlear implant recipients, hearing aid users, and normally hearing listeners

Valerie Looi1, Elizabeth-Raye Teo2, Jenny Loo2

1Sydney Cochlear Implant Centre, Sydney, Australia, 2Yong Loo Lin School of Medicine, National University Singapore, Singapore

Objectives: The purpose of this study was to investigate whether pitch, lexical tone, and/or speech-in-noise perception were significantly correlated for Singaporean teenagers or adults who spoke both Mandarin and English.

Methods: Thirty-three normal hearing or near-normal hearing listeners who did not use a hearing device (NNH group), eight postlingually deafened cochlear implant (CI) recipients (CI group), and three postlingually deafened bilateral hearing aid (HA) users (HA group) were recruited. All participants were bilingual Mandarin–English-speaking Singaporean residents. Participants were assessed on tests of pitch-ranking, lexical tone perception, and speech-in-noise.

Results: The NNH group scored significantly better than the CI group for all tests and subtests. There were no significant differences for the pitch test between the HA group and either the CI or NNH group. However, HA users scored significantly better than the CI group, and were more aligned with the NNH group’s scores, for both the lexical tone and Mandarin speech-in-noise tests. There were highly significant moderate positive correlations between all three tests.

Discussion: Overall, the performance of the CI users in this study indicates that CI recipients still struggle on pitch-related auditory perception tasks. Additionally, although the test scores from the HA users were better than those of the CI recipients, they were not as good as those of the NNH listeners. The significant moderate correlations between all three tests indicate that there is at least some degree of overlap in the skills required to accurately perceive these stimuli.

Conclusion: The overall results suggest that CI users, and to a lesser extent HA users, still struggle with complex auditory perceptual tasks, particularly when they require the perception of pitch. However, it may be possible that training one of these skills (e.g. musical pitch) may then generalize to other tasks (e.g. lexical tone and/or speech-in-noise). This is important for counseling, as well as for planning effective rehabilitation programs.

Keywords: Cochlear implants, Music, Pitch, Lexical tone, Hearing aid, Tonal language

Correspondence to: Valerie Looi, Sydney Cochlear Implant Centre, The Australian Hearing Hub, Macquarie University, Ground Floor, 16 University Avenue, 2109 NSW, Australia. Email: [email protected]

© W. S. Maney & Son Ltd 2015 DOI 10.1179/1467010015Z.000000000263

Introduction

It is generally well accepted that most postlingually deafened cochlear implant (CI) recipients achieve excellent speech perception outcomes, at least for non-tonal languages such as English, in quiet listening environments. However, outcomes for more-complex stimuli (e.g. music, tonal languages) and challenging listening environments (e.g. in background noise) tend to be somewhat poorer. One element required for the accurate perception of these more-complex stimuli is pitch. Pitch is a psychoacoustic attribute of auditory perception, primarily determined by frequency, where sounds are ordered on a scale from low to high (ANSI, 1994). Duration and intensity can also affect the pitch percept. Although it is a key element of music, pitch is also integral to speech perception. In non-tonal languages such as English, pitch cues provide non-linguistic or paralinguistic information, such as determining the identity or gender of a speaker, differentiating questions from statements, or detecting emotions in speech (Luo et al., 2007; Peters, 2006). For tonal languages such as Mandarin or Cantonese, accurate pitch perception is more crucial, providing phonemic, lexical, and semantic information (Houtsma, 1997).


Accurate pitch perception of complex tones relies on the perception of both the fundamental frequency (F0) and the harmonic components. The mechanisms related to pitch perception in normally hearing (NH) listeners are well described by others (Moore, 2007; Moore and Carlyon, 2005). For individuals with a significant sensorineural hearing loss, wider auditory filter bandwidths and less precise phase-locking result in reduced frequency selectivity and less accurate pitch perception (Arehart, 1994; Moore, 2007; Woolf et al., 1981). For more significant levels of hearing loss, cochlear dead regions can further affect pitch perception (Moore and Carlyon, 2005). For CI recipients, pitch perception is further compromised by limitations in the technology and the electrical stimulation of hearing (Drennan and Rubinstein, 2008; Limb and Roy, 2014). Looi (2008) and McDermott (2012) provide an overview of research comparing the pitch perception of CI recipients to NH listeners. Additionally, Looi et al. (2008a, 2008b) report that although CI recipients’ pitch perception accuracy is significantly poorer than that of hearing aid (HA) users who have similar levels of hearing loss, the HA users are not as accurate as NH listeners on pitch-perception tasks. In short, CI recipients and, to a lesser extent, HA users with a sensorineural hearing loss are less accurate at perceiving pitch information than NH listeners. For tonal language speakers using CIs and/or HAs, this could then adversely affect their speech perception and communicative interactions.
There have been several studies reporting that CI recipients speaking tonal languages (adults or children) experience difficulty in accurately differentiating between lexical tones, even after several years of CI experience, and that their speech perception development post-implantation does not replicate the trajectory seen for a ‘typical’ recipient whose native language is non-tonal (Au, 2003; Wei et al., 2000; Wu and Yang, 2003).

Mandarin is the most widely spoken tonal language, and the official language of China. It has four different lexical tone patterns, predominantly identified through variations in the F0 contour and duration (Peng et al., 2004). Tonal languages fall into two main categories, register tone systems and contour tone systems, and Mandarin belongs to the latter (Hyman, 1993; Yip, 1989). Tones from this system are distinguished by changes in pitch shape or contour rather than the pitch levels relative to each other, so word discrimination relies not only on the segmental structure of the utterance, but also on the variation in the pitch contour (Koelsch and Siebel, 2005). As can be seen in Fig. 1, the four lexical tonal patterns can be classified as ‘flat and high’ for Tone 1, ‘rising’ for Tone 2, ‘low and dipping’ for Tone 3, and ‘falling’ for Tone 4, characterized by the height of the F0 and the direction of the pitch contour; each of the four tones results in a different meaning for the same syllable of /ma/ (Wei et al., 2004). Tone 3 has the longest duration, and Tone 4 the shortest. Fu and Zeng (2000) reported the mean duration of each tone to be as follows: Tone 1, 339.5 milliseconds (ms); Tone 2, 374.7 ms; Tone 3, 463.3 ms; Tone 4, 334.4 ms.

Figure 1 Mandarin tonal patterns: fundamental frequency contours plotted against time (Wei et al., 2004).

A study by Fu et al. (2004) with nine CI recipients (eight children, one adult) showed that tone recognition was best for Tone 4 and worst for Tone 2 in their cohort. In that study, Fu et al. (2004) compared Mandarin tone recognition for two different speech-processing strategies (advanced combination encoder (ACE) and continuous interleaved strategy), and various different manifestations of the two strategies. The tone recognition test involved 96 test tokens: two male and two female native Mandarin speakers producing each of the four Mandarin tones for six different vowels, with a 4-alternate-forced-choice (4AFC) task response. Overall best results were obtained for the ACE speech-processing strategy using a 900 Hz rate, where mean tone recognition was 71.3% correct.

Wei et al. (2004) examined tone recognition in five CI recipients (aged 21–59 years) who used the Cochlear Ltd (Cochlear Ltd. Sydney, NSW, Australia) Nucleus 22 implant with the SPEAK (Spectral Peak) processing strategy. Four of the five recipients were postlingually deafened, and duration of implant use ranged from 10 to 30 months. The lexical tone test (LTT) was the same as the one used in this current paper, although only the male-speaker tokens were used in the Wei et al. (2004) study. There were 25 consonant–vowel combinations for each of the four tonal patterns, providing 100 tokens in the test. In the Wei et al. (2004) paper, tone recognition was compared as a function of the number of electrodes activated.
For the ‘full’ activation of 20 electrodes, mean recognition was 57% correct, ranging from 25 to 71% correct.
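The token counts and response formats quoted for these tone tests follow from simple combinatorics, and the response format also fixes the score expected from guessing alone. The Python sketch below is illustrative only: the talker and vowel labels are hypothetical placeholders (the text does not list them), and `chance_level` is a helper name introduced here, not something taken from the cited studies.

```python
from itertools import product


def chance_level(n_alternatives):
    """Proportion correct expected from pure guessing in an n-alternative forced-choice task."""
    if n_alternatives < 2:
        raise ValueError("a forced-choice task needs at least two alternatives")
    return 1.0 / n_alternatives


# Design described for Fu et al. (2004): 4 talkers x 4 tones x 6 vowels.
# Labels are placeholders for illustration only.
talkers = ["male_1", "male_2", "female_1", "female_2"]
tones = [1, 2, 3, 4]
vowels = ["vowel_%d" % i for i in range(1, 7)]

fu_tokens = list(product(talkers, tones, vowels))

# Design described for the Wei et al. (2004) LTT: 25 consonant-vowel
# combinations x 4 tones for a single talker.
syllables = ["syllable_%d" % i for i in range(1, 26)]
wei_tokens = list(product(syllables, tones))

four_afc_chance = chance_level(4)  # guessing level for a four-alternative task
two_afc_chance = chance_level(2)   # guessing level for a two-alternative task
```

Under these assumptions the two designs give 96 and 100 tokens respectively, and a four-alternative task has a guessing level of 25% correct, which is a useful reference point when reading tone-recognition scores.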


Han et al. (2009) also looked at Mandarin lexical tone recognition in a study where CI children were upgraded to a new processor. There were 21 prelingually deafened native Mandarin-speaking children, aged 3.5–16.5 years, with at least 1 year of experience using the Advanced Bionics CII or 90K device. Children were initially assessed using the HiRes speech-processing strategy, and then 1, 3, and 6 months later with the HiRes120 strategy. The tone-perception task was a 2-alternate-forced-choice (2AFC) task where the children had to discriminate between pairs of monosyllabic words which differed in their tone pattern. Two male and two female speakers were used, and 48 stimuli-pairs were presented in the test. A picture of the word was provided for children to select from in the closed-set task, and unpublished pilot work reported that NH children scored close to 100% on this task. Mean percent-correct tone recognition scores were 74, 75, 75, and 82% for the four respective time points, with no significant differences between these session means.

Zhou et al. (2013) looked at the relationship between tone perception and tone production in 110 prelingually deafened Mandarin-speaking CI children (aged 2–16 years) and 125 NH children (aged 3–10 years). It should be noted that the duration of implant use in the CI cohort was relatively short, with a mean of 1.27 years (range 0.09–4.9 years), and the age of implantation ranged from 1 to 13 years. The tone-perception task was the same as that used by Han et al. (2009). The mean percent-correct score for the CI group was 67.31% (standard deviation (SD): 13.51%), significantly poorer than the 98.7% for the NH group (SD: 2.67%).

Although it would seem reasonable to assume that musical pitch perception may be related to lexical tone perception for CI recipients, this topic has largely remained uninvestigated. Wang et al. (2011) compared 19 CI recipients (aged 14–57 years) to 10 NH listeners on a lexical tone perception and a pitch-interval discrimination task. All participants were native Mandarin-Chinese speakers. Five of the 19 CI recipients were prelingually deafened, including one child, and 8 of the 19 had less than 6 months’ implant experience. The LTT comprised 10 monosyllables, each spoken using the four different tones, once by a male and once by a female speaker. There were 160 tone tokens as each was presented twice. All tones were equal in duration, and a 4AFC task response mode was used. The pitch test (PT) used digitally synthesized complex tones of 300 ms duration, comprising the F0 plus the first three harmonics and representative of a piano timbre. Pitch discrimination was assessed by having the listener compare a pair of sequentially presented melodies, where one note of one of the melodies was changed, with the listener deciding if the melody pair was the ‘same’ or ‘different’. Two melodies were used: either the first seven notes of ‘Twinkle Twinkle Little Star’ (where the fifth note was changed), or the first six notes of ‘Happy Birthday’ (where the last note was changed). An adaptive algorithm was implemented to provide a pitch discrimination threshold (50%-correct), with testing starting at a two-octave pitch difference for the CI recipients and a half-octave difference for the NH listeners. Both the LTT and PT were presented in a free-field set-up using a single loudspeaker. Of the 19 CI recipients, three were unable to perform the PT at the largest F0 difference of two octaves. Of the remaining 16 CI recipients, the mean discrimination threshold across both melodies was 5.66 semitones (SD 5.57 semitones). The NH group’s average of 0.44 semitones (SD 0.2 semitones) was significantly better. For the tone-perception test, the CI recipients averaged 58.3%-correct (SD 19.78%), significantly poorer than the NH group’s mean of 97.3%-correct (SD 1.32%). For the CI participants, Tone 2 was significantly more poorly recognized than Tone 3 and Tone 4. There was a strong negative correlation between the pitch discrimination thresholds and tone-perception scores (r = −0.75, P < 0.001) for the 16 CI recipients who completed both tests.

In contrast to Wang et al. (2011), a more recent study by Tao et al. (2014) found no significant correlation between lexical tone and pitch perception amongst either pre- or postlingual CI recipients. There were 21 prelingually deafened children (aged 6–16 years, mean 10.8 years) and 11 postlingually deafened recipients (aged 9–26 years, mean 17.1 years). Mean implant experience was 6.5 years (range 2–11 years) for the prelingual group and 2.9 years (range 0.3–6 years) for the postlingual group. Pitch perception was assessed using a melodic contour identification task, developed and detailed by Galvin et al. (2007, 2008).
There were nine different five-note contours where each 500 ms note was representative of a piano sound. The pitch intervals between notes in the contours ranged from 1 to 6 semitones (one interval size per contour), resulting in 54 contours in the test (nine contours, six semitone intervals). A 9-alternate-forced-choice task response procedure was used. The LTT was developed by the authors, where four Mandarin monosyllables for each of the four tones were extracted from a Standard Chinese Database (Wang, 1993). There were two male and two female recordings of each monosyllable–tone combination, providing a test with 64 test items (four speakers, four tones, four syllables). Duration and amplitude cues were preserved in all stimuli to maintain the ‘natural’ speech features of the tones. Overall, the prelingual group scored 80.9 and 18.5% for the lexical tone and melodic contour tests, respectively. The postlingual group scored 81.1% and 32.3% for the two respective tests. There was no significant difference between the groups for the LTT, but a significant difference for the melodic contour test. There was also no significant correlation between the melodic contour test and the LTT for either group.

Evident from the above overview is that not only are both lexical tone and musical pitch perception compromised for CI recipients when compared to NH listeners, but also that outcomes for both are highly variable between individuals. Although pitch perception underlies both tasks, the relationship between the two tasks is still debatable. Accurate tone perception is dependent on the acquisition of linguistic contrast when learning the language, which is not required for music perception. Many of the abovementioned studies combined pre- and postlingual recipients, which may confound analyses given that prelingually deafened children have learnt to hear the pitch and lexical tone cues with their implant and do not have a ‘normal acoustic hearing’ representation of the sound to refer to.

There have been no published studies (at least, not published in English) examining relationships between lexical tone and musical pitch perception of Mandarin-speaking adults using HAs. Further, all of the above studies were conducted with native Mandarin-speaking CI recipients from mainland China. However, Mandarin is spoken widely outside of China, including in Taiwan, South-East Asia, and, with ever-increasing immigration levels, many Western countries as well. According to the seventeenth edition of Ethnologue (Paul et al., 2014), it is the most widely spoken language internationally, and the first or second language for 1.35 billion people. Slight differences arise for the same language spoken in different countries (e.g. accents, dialects, grammar, pronunciation, etc.).
For example, the Mandarin spoken in China would be subtly different from the Mandarin spoken in Taiwan, or Singapore, or other South-East Asian countries. Hence outcomes for Mandarin speakers residing outside of China may differ from the published studies conducted in mainland China. The purpose of this current study was to investigate whether pitch, lexical tone, and speech-in-noise perception were significantly correlated for teenagers or adults who spoke both Mandarin and English, and resided in Singapore. Having participants who spoke both English and Mandarin provided the additional benefit of enabling the researchers to instruct the participants in English (the official language of Singapore) and ensure that instructions were appropriately understood. Postlingually deafened CI recipients and HA users, as well as non-device users with normal to near-normal hearing, were involved.


It was hypothesized that the non-device users would perform significantly better than the HA users, who would in turn perform significantly better than the CI recipients on all tasks, and that there would be significant correlations between the pitch, lexical tone, and speech-in-noise tasks.

Methods

Participants

Three participant groups were recruited for this study: (1) normal hearing or near-normal hearing listeners who did not use a hearing device (NNH group), (2) postlingually deafened CI recipients (CI group), and (3) postlingually deafened bilateral HA users (HA group). All participants spoke both Mandarin and English proficiently, and were aged between 13 and 60 years. Given that English is the national language of Singapore and recruitment, consent, and test instructions were provided in English, a Mandarin Background questionnaire was administered to participants to determine their Mandarin competency. All participants self-rated that they were competent in written and spoken Mandarin, and all except one HA user learnt the language at age 7 or younger. Subjects who had a significant cognitive, neurological, visual, and/or physical impairment(s) that would affect their ability to participate in the study were excluded. All hearing-impaired participants were recruited by the audiologists at the National University Hospital (NUH) Audiology Clinic in Singapore. NNH participants were recruited via word of mouth, and advertisements placed around NUH.

The specific inclusion criteria for the NNH group were that their bilateral hearing thresholds had to be ≤40 dBHL at each of 0.25, 0.5, 1, 2, and 4 kHz. For the CI group, the inclusion criteria were at least 6 months of experience with their current implant, daily full-time use of the implant, and postlingual onset of their hearing loss. CI users whose main mode of communication was sign language, who had been severely to profoundly deaf without any amplification for more than 5 years, and/or had a short electrode array were excluded. For the HA group, participants had to have a bilateral, symmetrical, moderate to severe postlingually acquired sensorineural hearing loss. They also had to have bilateral HAs which they had worn every day for at least 3 months.
Again, those whose main mode of communication was sign language, and those who had a severe or worse loss without any amplification for more than five years, were excluded.

Thirty-three NNH participants, eight CI recipients, and three HA users were recruited. Unfortunately, the recruitment of bilingual, bilaterally fitted HA users proved extremely difficult, in part because at the hospital clinic where this study was conducted, many of the HA patients either had asymmetrical losses, unilateral HAs, or were aged 70 years or older, as the Singaporean Government subsidizes HAs for the elderly. Further, unlike CI recipients who had regular appointments at the clinic for MAPping, once fitted and ‘stable’, the HA patients did not return to the clinic unless a significant issue arose. Additionally, hearing-impaired children in Singapore are exempted from having to learn Mandarin at school, hence the number of bilingual Mandarin–English HA users (or CI users) is quite small relative to the number of bilingual NNH listeners.

The NNH group comprised 26 females and 7 males, ranging in age from 18 to 60 years (mean 34.7 years; SD 10.5). The eight CI recipients (four females, four males) ranged in age from 13 to 38 years (mean 23.5 years; SD 8.9), and had used their implant for between 10 and 86 months (mean 38.0 months; SD 30.8). All except one were unilateral recipients, with three using a HA in their contralateral ear. Further details appear in Table 1.

Table 1 CI group participant details

| Subject (M/F) | Age (years) | Etiology | Age diagnosed (years) | Age fitted HA (years) | Time with CI (months) | Music experience score | Daily Mandarin use (hours) | Implant | Speech-processing strategy | CI ear | Bimodal |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 (F) | 34 | Unknown | 28 | 31 | 28 | 1 | 8 | Med-El | FSP-4 | Right | N |
| 2 (F) | 15 | Unknown | 7 | 9 | 41 | 3 | 10 | Cochlear | ACE | Bilateral | N |
| 3 (F) | 24 | EVAS | 20 | 22 | 18 | 2 | 1 | Cochlear | ACE | Right | Y |
| 4 (M) | 38 | Thyroid fever | 4 | 38 | 10 | 1 | 4 | Advanced Bionics | HiRes Optima-P | Right | N |
| 5 (M) | 13 | Unknown | 4 | 11 | 27 | 2 | 8 | Cochlear | ACE | Right | Y |
| 6 (F) | 17 | Unknown | 4 | 17 | 10 | 3 | 2 | Advanced Bionics | HiRes Fidelity 120 | Right | Y |
| 7 (M) | 27 | EVAS and Mondini | 13 | 14 | 86 | 2 | 5 | Med-El | FSP | Right | N |
| 8 (M) | 25 | Unknown | 8 | 9 | 84 | 2 | 6 | Advanced Bionics | HiRes Fidelity 120 | Left | N |

EVAS: Enlarged Vestibular Aqueduct Syndrome.

The three adults (two females, one male) in the HA group were aged 26, 30, and 59. They had a range of device experience (range: 23–103 months), with two having a moderate hearing loss and one having a severe hearing loss. Table 2 provides more information.
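The NNH inclusion criterion stated in this section (bilateral thresholds of 40 dBHL or better at each of 0.25, 0.5, 1, 2, and 4 kHz) amounts to a check over every ear and test frequency. The following Python sketch shows one way such a screen could be written; the audiogram data structure, the function name, and the two example listeners are hypothetical and are not taken from the study.

```python
# Frequencies and cut-off are those stated in the inclusion criterion.
NNH_TEST_FREQUENCIES_KHZ = (0.25, 0.5, 1.0, 2.0, 4.0)
NNH_MAX_THRESHOLD_DBHL = 40.0


def meets_nnh_criterion(audiogram):
    """Return True if both ears are at or below the cut-off at every test frequency.

    `audiogram` is an assumed structure: {"left": {freq_khz: dBHL}, "right": {...}}.
    A missing ear or missing frequency is treated as failing the criterion.
    """
    for ear in ("left", "right"):
        thresholds = audiogram.get(ear)
        if thresholds is None:
            return False
        for freq in NNH_TEST_FREQUENCIES_KHZ:
            level = thresholds.get(freq)
            if level is None or level > NNH_MAX_THRESHOLD_DBHL:
                return False
    return True


# Hypothetical listeners, for illustration only.
listener_a = {
    "left": {0.25: 10, 0.5: 15, 1.0: 15, 2.0: 20, 4.0: 35},
    "right": {0.25: 5, 0.5: 10, 1.0: 15, 2.0: 25, 4.0: 40},
}
listener_b = {
    "left": {0.25: 10, 0.5: 15, 1.0: 15, 2.0: 20, 4.0: 45},  # 45 dBHL at 4 kHz
    "right": {0.25: 5, 0.5: 10, 1.0: 15, 2.0: 25, 4.0: 30},
}

a_included = meets_nnh_criterion(listener_a)
b_included = meets_nnh_criterion(listener_b)
```

Treating missing data as a failure is a design choice made for this sketch; the text does not say how incomplete audiograms were handled.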

Table 2 HA group participant details

| Subject (M/F) | Age (years) | Etiology | Age diagnosed (years) | Device experience (months) | Pure-tone average (PTA) (dBHL) | Music experience score | Daily Mandarin use (hours) | HA brand (model) |
|---|---|---|---|---|---|---|---|---|
| 1 (M) | 26 | Renal failure | 24 | 23 | 70.00 | 1 | 8.0 | Siemens Pure 301 XCLM |
| 2 (F) | 59 | Presbycusis | 59 | 4 | 47.50 | 1 | 0* | Siemens 7mi |
| 3 (F) | 30 | Gentamicin | 21 | 103 | 48.75 | 3 | 0* | Widex CIC Inteo |

*Note these two participants, although proficient in Mandarin, did not speak it on a day-to-day basis.

Materials

A summary of the tests used in this study is provided in Table 3.

Table 3 Summary of the tests

| Test | Subtests | Stimuli | Response | Scoring |
|---|---|---|---|---|
| Pitch-ranking (one male singer, one female singer) | (i) Half-octave | 24 intervals per singer | 2AFC | /48 |
| | (ii) Quarter-octave | 32 intervals per singer | 2AFC | /64 |
| | (iii) Semitones | 48 intervals per singer | 2AFC | /96 |
| Lexical tone identification | (i) Male-voiced | 100 tones: 25 monosyllables × 4 tones | 4AFC | /100 |
| | (ii) Female-voiced | 100 tones: 25 monosyllables × 4 tones | 4AFC | /100 |
| M-HINT | | 10 sentences, 10 words per sentence | Repeat sentence. Scored as number of words correct. | /100 |

PT

The pitch-ranking test developed and used by Looi et al. (2008b) and Sucher and McDermott (2007) was used in this study. Only the half-octave, quarter-octave, and semitone interval subtests were used. Each item in the subtest comprised two notes, separated by the designated interval size. The stimuli were male- and female-sung vowels (English /a/ vowel), which were recorded from professional singers, and subsequently edited to be 500 ms in length, and randomized in intensity (6 dB range). Participants had to decide which of the two notes was higher in pitch, using a 2AFC response format. Scores were converted into a percent-correct for each subtest. Specific recording and editing details are provided by Looi et al. (2008a, 2008b), and the F0s of the stimuli are in Table 4. Stimuli were presented via the software program ‘MACarena’ (Lai and Dillier, 2002), which randomized the stimuli order and enabled responses to be entered directly into the program for later analyses. The larger interval sizes were presented before the smaller interval sizes, with the male/female order being randomized.

Table 4 Fundamental frequencies of sung vowels in the PT

| Subtest | Singer | F0 range |
|---|---|---|
| Subtest 1 Half-octave | Female | 262 Hz (C4)–740 Hz (F#5) |
| | Male | 98 Hz (G2)–277 Hz (C#4) |
| Subtest 2 Quarter-octave | Female | 262 Hz (C4)–523 Hz (C5) |
| | Male | 139 Hz (C#3)–277 Hz (C#4) |
| Subtest 3 Semitones | Female | 370 Hz (F#4)–523 Hz (C5) |
| | Male | 139 Hz (C#3)–196 Hz (G3) |

LTT

Developed by Wei et al. (2004), this test comprised 200 Mandarin tone tokens, 100 spoken by a male talker and 100 by a female talker. The F0 range of the tokens was between 150 and 350 Hz for the female talker, and 80 and 250 Hz for the male talker. The 100 tokens for each talker consisted of 25 consonant–vowel combinations, each spoken with the four Mandarin tonal patterns. Wei et al. (2004) stated that the consonant–vowel monosyllabic tokens were all lexically meaningful tones, and chosen to ensure diversity of consonant–vowel combinations. The exact tokens are specified in the Wei et al. (2004) paper. The amplitudes of the tones were controlled, with the Root-Mean-Square (RMS) levels being equalized across the tones; however, the natural duration of the tone was maintained. Durations of the tones ranged from 0.4 to 0.7 seconds, with specific details for each tone provided in Table 5. Participants had to discriminate between the tones using a 4AFC task response.

Table 5 Tone durations in the LTT (in seconds)

| Tone | Mean | Min. | Max. |
|---|---|---|---|
| Tone 1 | 0.725 | 0.457 | 1.014 |
| Tone 2 | 0.710 | 0.499 | 0.999 |
| Tone 3 | 0.847 | 0.641 | 1.163 |
| Tone 4 | 0.678 | 0.405 | 0.985 |

Mandarin hearing-in-noise test

The Mandarin hearing-in-noise test (M-HINT: Wong et al., 2008), recorded by a male speaker with a Beijing accent, was used in this study. It is used clinically in Singapore for native Mandarin speakers, as there are no locally recorded Mandarin speech tests using a Singaporean speaker. Two lists from the 24 available were randomly selected for each participant. Each list had 10 sentences with 10 words, and scoring was based on the number of words correctly repeated. It should be noted that this study used a fixed signal-to-noise ratio (SNR) of +10 dB (steady-state speech noise), in keeping with the clinical procedures at NUH, to give a percent-correct score.
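The materials above specify pitch intervals in musical units and the noise level as an SNR in decibels, and both convert to simple ratios. The Python sketch below illustrates the arithmetic under two assumptions that are not stated in the text: equal-tempered tuning (12 semitones per octave), and the half-octave and quarter-octave intervals being 6 and 3 semitones respectively. The function names are introduced here for illustration.

```python
import math

# Assumed interval sizes in equal-tempered semitones (12 per octave).
INTERVAL_SIZES_SEMITONES = {"half-octave": 6, "quarter-octave": 3, "semitone": 1}


def semitones_between(f_low_hz, f_high_hz):
    """Interval between two frequencies, in equal-tempered semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


def shift_by_semitones(f_hz, n_semitones):
    """Frequency reached by moving n_semitones up (or down, if negative) from f_hz."""
    return f_hz * 2.0 ** (n_semitones / 12.0)


def snr_db_to_amplitude_ratio(snr_db):
    """Ratio of speech to noise RMS amplitude for a given SNR in dB."""
    return 10.0 ** (snr_db / 20.0)


# Female quarter-octave range from Table 4: 262 Hz (C4) to 523 Hz (C5).
octave_span = semitones_between(262.0, 523.0)  # close to 12 semitones

# Half an octave above 262 Hz (C4) lands near the 370 Hz (F#4) listed in Table 4.
half_octave_above_c4 = shift_by_semitones(262.0, INTERVAL_SIZES_SEMITONES["half-octave"])

# At +10 dB SNR the speech RMS amplitude is about 3.16 times that of the noise.
speech_to_noise_ratio = snr_db_to_amplitude_ratio(10.0)
```

Because the stimuli were sung rather than synthesized, the actual F0 steps may deviate slightly from these equal-tempered values.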

Overall procedures

Pure tone audiometry was performed for the NNH group prior to testing, using standard clinical testing protocols, to ensure they met the inclusion criteria. The order of the tests in the test battery was pseudo-randomized, and the order of the stimuli within each subtest was fully randomized. The PT and LTT were presented from an ASUS ‘Zenbook’ laptop computer connected to an external amplifier (PA210), which fed a single Canton Plus MX.2 loudspeaker placed 0.6 m away from the participant at 0° azimuth. Individually verified comfortable presentation levels were used, and participants used their everyday listening program and device settings for testing. Bimodal CI recipients were tested CI-only, removing their contralateral HA for testing (this ear was not plugged). For the M-HINT, stimuli were played from a Sony compact disk player routed through a MADSEN Orbiter 922-2 clinical audiometer and an amplifier (PA210) to a loudspeaker (FBT Project640BT) positioned at 0° azimuth, 1 m in front of the listener. The M-HINT was administered at 60 dB SPL with a 10 dB SNR. These M-HINT procedures were aligned with the standard clinical test procedures. Standardized instructions for each of the tests were given to the subjects, with no feedback provided during the test itself. Tests were administered in a single session lasting approximately 1.5 hours, with breaks as required.

In addition, two short questionnaires were administered to all participants to determine Mandarin and English competency, as well as music experience levels. The former questionnaire asked participants to estimate their competency and daily use of both languages, with a particular focus on Mandarin given that English is the national language of Singapore. The music questionnaire was based on those used by Looi et al. (2008a, 2008b, 2012a, 2012b) in previous studies. It provided an overall ‘music experience score’ where 1 = ‘no prior music training or participation in a formal music activity’; 2 = ‘less than 5 years of music training or participation in a formal music activity’; and 3 = ‘5 years or more of music training or participation in a formal music activity’.

Results

PT

The groups’ mean scores for the three pitch subtests are shown in Table 6. Paired samples t-tests for both the NNH and CI groups, and a non-parametric Wilcoxon signed ranks test for the HA group, showed no significant difference between the male- and female-sung vowels. Hence the scores were combined for further analyses. A mixed-model analysis of variance (MM-ANOVA) showed a significant within-subject effect of interval size (P < 0.001) and a significant between-subject effect of group (P < 0.001), with no significant interaction. Post hoc pairwise comparisons with Bonferroni corrections showed that the NNH group scored significantly better than the CI group (P < 0.001), with no significant difference between the NNH group and the HA group, or the HA group and the CI group. Scores on the half-octave stimuli were significantly better than both the quarter-octave (P < 0.001) and semitone (P = 0.005) stimuli, with no significant difference between these two intervals.

In view of the fact that there were only three HA users in this study, a second MM-ANOVA (with Bonferroni corrections for the post hoc analyses) was conducted to compare just the NNH and CI groups. Similar results were obtained, with the NNH scores being significantly better than the CI scores (P < 0.001), and the half-octave subtest being significantly better than both the quarter-octave (P < 0.001) and semitone (P < 0.001) subtests, with no significant difference between the latter two interval sizes.

Table 6 Mean percent-correct PT scores for each group (SD in parentheses)

Interval        Group  Male       Female     Combined
Half-octave     NNH    90 (10.4)  92 (11.5)  91 (10.1)
Half-octave     CI     76 (15.0)  72 (15.4)  74 (14.1)
Half-octave     HA     83 (19.1)  78 (25.5)  81 (22.3)
Quarter-octave  NNH    84 (14.3)  85 (14.9)  84 (14.2)
Quarter-octave  CI     59 (9.1)   59 (22.4)  59 (13.2)
Quarter-octave  HA     79 (21.9)  69 (32.9)  74 (27.4)
Semitone        NNH    84 (15.6)  83 (17.1)  84 (15.9)
Semitone        CI     61 (9.1)   60 (14.2)  61 (11.1)
Semitone        HA     76 (21.1)  73 (23.6)  74 (22.3)

Table 7 Mean percent-correct LTT scores for each group (SD in parentheses)

Group  Male       Female     Combined
NNH    93 (8.8)   90 (7.6)   91 (8.0)
CI     68 (19.0)  64 (13.9)  66 (16.1)
HA     86 (7.1)   83 (12.6)  85 (9.8)
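To make the three PT interval sizes more concrete, the following minimal sketch converts each one into a frequency ratio. It assumes equal-tempered tuning (12 semitones per octave), which this section does not state explicitly; the names `INTERVALS_IN_SEMITONES` and `frequency_ratio` are illustrative only and are not taken from the test software.

```python
# Assumption: equal-tempered tuning, i.e. 12 semitones per octave.
SEMITONES_PER_OCTAVE = 12

# Interval sizes of the three PT subtests, expressed in semitones.
INTERVALS_IN_SEMITONES = {
    "half-octave": 6,
    "quarter-octave": 3,
    "semitone": 1,
}


def frequency_ratio(semitones: float) -> float:
    """Return the F0 ratio (higher note / lower note) for an interval."""
    return 2.0 ** (semitones / SEMITONES_PER_OCTAVE)


for name, semitones in INTERVALS_IN_SEMITONES.items():
    ratio = frequency_ratio(semitones)
    percent_change = (ratio - 1.0) * 100.0
    print(f"{name}: {semitones} semitone(s), ratio {ratio:.3f}, "
          f"F0 change {percent_change:.1f}%")
```

Under this assumption, a half-octave corresponds to an F0 change of roughly 41%, a quarter-octave to roughly 19%, and a semitone to roughly 6%, which gives a sense of how much smaller the cue becomes across the three subtests.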

LTT

The groups’ mean percent-correct scores for the LTT (male speaker, female speaker, and combined) are shown in Table 7. An MM-ANOVA showed a significant effect of speaker gender (P = 0.009, with scores for the male stimuli being better than for the female stimuli) and a significant between-subject effect of group (P < 0.001), with no significant interaction. Post hoc pairwise comparisons with Bonferroni corrections showed that both the NNH group and the HA group scored significantly better than the CI group (NNH: P < 0.001; HA: P = 0.03), with no significant difference between the NNH and HA groups. Re-analysis without the HA group provided similar findings, with a significant difference between the NNH and CI groups, and a significant difference between the male and female speakers (P < 0.05 for all).

The confusion matrices for the NNH group’s male and female LTT responses are provided in Table 8, with the confusion matrices for the CI group in Table 9 and the HA group in Table 10. As an MM-ANOVA showed a highly significant interaction between group (CI and NNH) and tone for both the male- and female-speaker subtests, separate one-way ANOVAs were conducted for these two groups to determine if there were significant differences in recognition accuracy between the four tones; the effect of tone was significant in all four analyses (NNH: P < 0.001; CI: P < 0.03). Statistical analyses to compare across the four tones were not undertaken for the HA group, due to the small number of participants. Post hoc pairwise comparisons with Bonferroni corrections showed that for the NNH group on the male-spoken LTT, scores for Tone 3 were significantly worse than both Tone 1 (P = 0.001) and Tone 4 (P = 0.002). The difference between Tones 1 and 2 (Tone 1 higher) was nearly significant (P = 0.052). For the female-spoken subtest, Tone 3 was significantly poorer than all the other tones (P < 0.001 for all), with Tone 2 also being significantly poorer than Tone 1 (P = 0.047). For the CI group, none of the post hoc pairwise comparisons with Bonferroni corrections were significant for the male-spoken tones. For the female-spoken tones, only the difference between Tones 1 and 2 was significant (P = 0.015; Tone 1 higher), with the difference between Tones 1 and 3 approaching significance (P = 0.061; Tone 1 higher).

M-HINT

The mean scores of the M-HINT (averaged for the two lists) were: NNH 98% (SD: 2.6); CI 41% (SD: 24.5); and HA 83% (SD: 14.1). A one-way ANOVA with Bonferroni corrections showed the NNH and HA groups’ scores were significantly better than the CI group’s scores (P < 0.001 for both), with the difference between the NNH and HA groups approaching significance (P = 0.072). An independent samples t-test comparing only the NNH and CI groups provided the same result.

Table 8 NNH group lexical tone confusion matrix (scores are in %; columns show the stimulus presented, rows show the response given)

Male speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    97.3    2.1     0.8     0.8
Tone 2    1.1     90.1    12.2    1.7
Tone 3    1.0     6.9     86.6    1.0
Tone 4    0.6     1.0     0.4     96.5

Female speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    97.3    1.2     0.6     0.8
Tone 2    1.3     90.1    24.2    1.5
Tone 3    1.1     7.9     74.8    1.3
Tone 4    0.3     0.8     0.4     96.4

Table 9 CI group lexical tone confusion matrix (scores are in %; columns show the stimulus presented, rows show the response given)

Male speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    81      19      4       5.5
Tone 2    7.5     54.5    42      4.5
Tone 3    6.5     20      51      3
Tone 4    5       6.5     3       87

Female speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    82      15      7.5     9.5
Tone 2    5       46.5    36      6
Tone 3    7.5     34.5    52.5    8.5
Tone 4    5.5     4       4       76
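As a reading aid for the confusion matrices in Tables 8 to 10, the sketch below extracts, for each presented tone, the percent-correct score and the most frequent incorrect response. It uses the CI group’s male-speaker values from Table 9 and assumes that each column of the published table corresponds to one presented tone; the function name `summarize` is hypothetical and this is not the authors’ analysis code.

```python
# CI group, male speaker (Table 9). Keys are the tone presented; each list
# holds the percentage of responses given as Tone 1, 2, 3, and 4.
CI_MALE = {
    1: [81.0, 7.5, 6.5, 5.0],
    2: [19.0, 54.5, 20.0, 6.5],
    3: [4.0, 42.0, 51.0, 3.0],
    4: [5.5, 4.5, 3.0, 87.0],
}


def summarize(matrix):
    """Return {presented tone: (percent correct, most frequent wrong response)}."""
    summary = {}
    for tone, responses in matrix.items():
        correct = responses[tone - 1]
        # Collect the percentages of all responses other than the correct tone.
        errors = {
            index + 1: pct
            for index, pct in enumerate(responses)
            if index + 1 != tone
        }
        top_confusion = max(errors, key=errors.get)
        summary[tone] = (correct, top_confusion)
    return summary


for tone, (correct, confused_with) in summarize(CI_MALE).items():
    print(f"Tone {tone}: {correct:.1f}% correct, "
          f"most often mistaken for Tone {confused_with}")
```

Read this way, the diagonal of each matrix gives the per-tone recognition accuracy, and the largest off-diagonal entry in a column shows which tone a given stimulus was most often mistaken for (for example, Tones 2 and 3 are mainly confused with each other in this data set).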

Correlations between tests

Correlational analysis was performed to examine relationships between the PT, LTT, and M-HINT scores. For this analysis, a single mean score for each of these three tests was used for each participant, calculated by averaging the subtest scores for the PT and LTT. Also, due to the low number of participants in the CI and HA groups, the analysis was performed across all 44 participants. Significant, moderately strong positive correlations (P < 0.001 for all) were found between all three tests: PT and LTT (r = 0.669); PT and M-HINT (r = 0.508); LTT and M-HINT (r = 0.613).
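One way to interpret the reported coefficients is to convert each r into a proportion of shared variance (r squared) and into the t statistic used to test a correlation against zero. The sketch below does this for the three reported values with n = 44. It assumes the coefficients are Pearson correlations, which the text does not state, and the function names are illustrative.

```python
import math

N_PARTICIPANTS = 44  # all participants pooled, as stated in the text

# Correlation coefficients reported between the three tests.
REPORTED_R = {
    ("PT", "LTT"): 0.669,
    ("PT", "M-HINT"): 0.508,
    ("LTT", "M-HINT"): 0.613,
}


def shared_variance(r: float) -> float:
    """Proportion of variance shared by two measures (r squared)."""
    return r * r


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


for (first, second), r in REPORTED_R.items():
    print(f"{first} vs {second}: r^2 = {shared_variance(r):.2f}, "
          f"t({N_PARTICIPANTS - 2}) = {t_statistic(r, N_PARTICIPANTS):.2f}")
```

Under these assumptions the three tests share roughly 26% to 45% of their variance pairwise, which fits the description of the correlations as moderate rather than strong.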

Table 10 HA group lexical tone confusion matrix (scores are in %; columns show the stimulus presented, rows show the response given)

Male speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    93.3    0       0       0
Tone 2    2.7     78.7    25.3    2.7
Tone 3    4       21.3    73.3    0
Tone 4    0       0       1.3     97.3

Female speaker
Response  Tone 1  Tone 2  Tone 3  Tone 4
Tone 1    93.3    2.7     0       2.7
Tone 2    1.3     82.7    36      4
Tone 3    0       14.7    64      0
Tone 4    5.3     0       0       93.3

Predictive factors

In order to examine what factors may predict performance on each of the three tests, a backward regression model was built with group (i.e. NNH, CI, and HA) entered as a fixed factor, and with age, daily hours of Mandarin use, music experience score, and the mean scores from the other two tests entered as covariates in the initial model. For example, if the dependent variable was the PT score, then the LTT and M-HINT means were ‘the other two test scores’ entered into the model. As all participants’ first language was English, only the variable related to Mandarin use (‘daily hours of Mandarin use’) was entered.

For the PT, the initial model with all six factors provided an R-squared (R²) value of 0.631 and an adjusted R-squared (adj R²) of 0.559. Of the six factors, age, group, and M-HINT scores were far from significant, and were therefore removed from the model. The new model provided an R² value of 0.600 (adj R² = 0.570), with all three remaining factors being significant. That is, collectively, lexical tone perception, music experience, and daily Mandarin use accounted for approximately 60% of the variance in PT scores. The LTT score was the most significant factor (beta = 0.681; P < 0.001), followed by music experience (beta = 0.446; P = 0.02) and Mandarin use (beta = −1.407; P = 0.015).

For the LTT, the R² for the initial model was 0.673 (adj R² = 0.609), with music experience, age, and M-HINT being far from significant. After removing these three factors, the final model was able to account for approximately 65% of the variance (R² = 0.645; adj R² = 0.609). ‘Group’ had the largest impact on the model (CI referenced to NNH: beta = −18.21, P < 0.001; CI referenced to HA: beta = −16.19, P = 0.01; HA referenced to NNH was not significant). That is, the model predicted a mean difference of 18.21 points between the NNH and CI groups, and 16.19 points between the HA and CI groups for the LTT, with the CI group scoring lower for both predictions. Mandarin use (beta = 1.068, P = 0.043) and PT score (beta = 0.425; P < 0.001) were the other two significant predictors of LTT.

Finally, for the M-HINT, the initial model provided an R² value of 0.844 (adj R² = 0.814), with only ‘daily Mandarin use’ and ‘group’ being significant factors. The final model accounted for approximately 83% of the variance (R² = 0.834; adj R² = 0.821), with ‘group’ having the largest effect (CI referenced to NNH: beta = −60.19, P < 0.001; CI referenced to HA: beta = −45.26, P = 0.021; HA referenced to NNH: beta = −14.93, P = 0.021). In other words, the predicted mean difference between the NNH and CI groups for the M-HINT was 60.19 points, and 45.26 points between the HA and CI groups (CI group scoring lower). The predicted mean difference between the NNH and HA groups was 14.93 points (HA group scoring lower). The regression coefficient for daily Mandarin use was beta = 1.364 (P = 0.026).
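The relationship between the reported R² and adjusted R² values can be checked with the standard adjustment formula, adj R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below is a consistency check under two assumptions that are not stated in the text: that all 44 participants entered every model, and that ‘group’ (three levels) contributes two predictor terms while each covariate contributes one.

```python
N_PARTICIPANTS = 44  # total sample size reported in the study


def adjusted_r_squared(r_squared: float, n: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_predictors - 1)


# (label, reported R^2, assumed number of predictor terms, reported adj R^2)
# Assumption: 'group' counts as two terms; every covariate counts as one.
MODELS = [
    ("PT initial", 0.631, 7, 0.559),
    ("PT final", 0.600, 3, 0.570),
    ("LTT initial", 0.673, 7, 0.609),
    ("LTT final", 0.645, 4, 0.609),
    ("M-HINT initial", 0.844, 7, 0.814),
    ("M-HINT final", 0.834, 3, 0.821),
]

for label, r2, n_terms, reported in MODELS:
    computed = adjusted_r_squared(r2, N_PARTICIPANTS, n_terms)
    print(f"{label}: computed {computed:.3f}, reported {reported:.3f}")
```

With these assumptions, each computed value falls within 0.001 of the reported adjusted R², which is compatible with rounding of the published R² values. The agreement is suggestive only; it does not confirm how the original models were actually parameterized.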

Discussion

NNH vs. CI vs. HA comparisons

The results from the PT, LTT, and M-HINT were partially consistent with the hypotheses in that the NNH group was significantly better than the CI group for all tests and subtests. Due to the small number of HA participants, statistically significant differences between this group and the CI or NNH group were less apparent. There were no significant differences for the PT between the HA group and either the CI or NNH group. However, in the LTT, even with only three HA users, statistical analyses showed that their LTT score was significantly better than that of the CI group, and more aligned with the NNH group’s scores. Similarly, for the M-HINT, HA users scored significantly better than the CI recipients, with the difference between the NNH and HA users approaching significance. It is acknowledged that with only three HA users, findings are not generalizable and should be interpreted conservatively; however, given the effect size between the three groups for some of the tests, and the paucity of publications on pitch and lexical tone perception of HA users, the authors felt that the HA data were still relevant and worthwhile to include. These findings are very much in alignment with existing research on pitch and music perception of CI recipients. It is well accepted that CI recipients score significantly lower than NH listeners on pitch-based music tasks, which include not only pitch-perception tasks but also timbre and melody perception tasks (Gfeller et al., 2007, 2008; Looi, 2008; Looi et al., 2012a). Research comparing CI users to HA users has also shown that whilst CI users perform more poorly than HA users on pitch-ranking tests, the HA users do not perform as well as NH listeners on the same task. This finding was observed in both HA users with severe-to-profound hearing losses (Looi et al., 2008a, 2008b), as well as HA users with lesser degrees of hearing loss (Looi et al., 2012b). A cochlear hearing loss affects both the temporal and place cues which are utilized by NH individuals to perceive pitch. For example, broader auditory filter bandwidths impede the use of place cues, in turn affecting frequency selectivity. With decreased frequency resolution, it becomes harder to accurately perceive the F0 (Moore, 2007). Shallow, less finely tuned psychophysical tuning curves, asymmetric auditory filters, and cochlear dead regions are other documented changes related to sensorineural hearing loss. In addition to these considerations, CI recipients have the additional confound of electrical stimulation and the sound processing that is required by the CI to enable this.
In CI-mediated pitch perception, both place and temporal coding mechanisms are impaired, as detailed by Limb and Roy (2014) and McDermott (2012). Although the main phonetic feature for lexical tones is the F0 contour, there are other acoustic characteristics which may help in tone recognition such as vowel duration and the acoustic amplitude. Fu et al. (1998) analyzed the properties of Mandarin lexical tones reporting that Tone 3 has the longest vowel duration and lowest amplitude peak whereas Tone 4 is the shortest in duration with the highest amplitude peak. It is worthwhile comparing the confusion matrices of the recipients in this study to those from Tao et al.’s (2014) study whose LTT also maintained the original amplitude and duration cues for the four tones. The stimuli in Tao et al. (2014) were also spoken by both male and female speakers, but collated for analyses.


Tao et al. (2014) reported that for their 32 recipients, Tone 4 was the most accurately recognized and Tone 2 the least accurately recognized. Their statistical analyses revealed that Tone 2 was significantly more poorly recognized than Tone 1 (P = 0.008), Tone 3 (P = 0.006), and Tone 4 (P < 0.001), with Tone 1 being significantly more poorly recognized than Tone 4 (P = 0.037). In Wang et al.’s (2011) study which controlled duration cues, Tone 2 was the most poorly recognized tone, and most often confused with Tone 1. In the current study, analyses were separated for the male and female speakers as speaker gender had a statistically significant effect on LTT scores. Overall, there was a lack of homogeneity in the results across the groups in terms of which tone(s) were most vs. least accurately recognized. For the male speaker, Tone 4 was the most accurately recognized by the CI recipients, and Tone 3 the least accurate (although not statistically significant). For the female speaker, Tone 1 was the most accurately recognized by the CI recipients, and Tone 2 the least accurate; this difference was statistically significant (P = 0.015). For the NNH group, Tone 1 was the most accurately recognized and Tone 3 the least accurate for both the male and female speakers (P ≤ 0.001 for both speakers). For the HA group, Tone 4 was the most accurately recognized for the male speaker with Tone 1 and Tone 4 equally best recognized for the female speaker. Tone 3 was the least accurately recognized for both genders. It is interesting to note that of the four tones in this study, Tone 3’s duration was substantially longer compared to the other tones. Despite this, it was generally the poorest recognized tone across all participants, with the only exception being the female-spoken tones for CI recipients. Hence temporal cues do not appear to assist with lexical tone perception in the same way as they help with music perception. 
The authors propose that this could be primarily related to two factors: the ‘familiarity’ and awareness of the temporal cues, and the length of the temporal cue. In a melody, rhythm is a prominent feature; if the melody is known by the listener, the rhythm would provide cues to aid recognition. Even if the melody is not familiar, the listener may have an expectation of the tempo (speed) of a piece based on considerations such as the musical style or artist/band, or a characteristic rhythmic ‘feel’ that a style may impart (for example, a ‘Swing Jazz’, a ‘Latin American Tango’, or a ‘March’). However, rhythm cues within speech do not provide the same level of information about the speech signal, and one may not know the relative duration of the tones (for example, that Tone 3 is longer than the other tones). Further, the short duration of the tones, with less than 1 second difference between the means of the shortest and longest tone, would make it hard for a listener to use relative duration cues to discriminate between tones.

Interestingly, unlike the LTT and previous studies (Looi et al., 2008b, 2012b) which used the same PT as this study, there was no significant difference in pitch-ranking scores for the male vs. female singer, for any group. Psychoacoustic research has indicated that for CI recipients, whilst temporal modulations in the stimuli can provide more reliable pitch cues than place-pitch changes, these temporal cues are salient only for lower rates, up to around 300 Hz. Hence stimuli with lower F0 have been noted to result in better pitch-ranking scores than stimuli with higher F0. For example, in Looi et al. (2008b), the CI recipients were significantly better at pitch-ranking the male-sung vowels than the female-sung vowels, whereas the HA users were significantly better with the female-sung vowel stimuli than the male. It may be that the temporal pitch cues were less salient for the group of recipients in this study, and/or that they were less able to use the temporal cues available to them. However, their combined pitch-ranking scores were better than those of the 15 CI recipients in the Looi et al. (2008b) paper, who averaged 64%-correct for the half-octave subtest and 52%-correct for the quarter-octave subtest (the semitone interval was not assessed). The CI recipients in this study averaged 74%-correct and 59%-correct for the half- and quarter-octave subtests, respectively.

Relationship between pitch, lexical tone, and speech-in-noise perception

Overall, the findings of this study were consistent with the hypothesis that there would be significant correlations between the tests of pitch, lexical tone, and speech-in-noise, with highly significant moderate positive correlations found between all three tests. This is in keeping with the study by Wang et al. (2011), who reported strong correlations between pitch and lexical tone perception when duration cues were controlled for in the lexical tone task. Speech-in-noise was not assessed in the Wang et al. (2011) study. To investigate the relationship between the different perceptual tasks further, regression analyses were conducted for each test to look at predictive variables. For the PT, lexical tone perception, music experience, and daily Mandarin use accounted for approximately 60% of the variance in PT scores. The fact that group was not a significant predictor of PT scores was probably related to the small number of HA users, who collectively performed somewhat similarly to the NNH group. The LTT score was the most significant variable, followed by music experience and Mandarin use. For the LTT, approximately 65% of the variance was accounted for by the variables of ‘group’, daily Mandarin use, and PT score. Group was the most significant factor in this model, with the model predicting a mean difference of 18.21 points between the NNH and CI groups, and 16.19 points between the HA and CI groups for the LTT. Other significant predictors were daily Mandarin use and PT score. These two findings add further weight to the argument of a significant relationship between musical pitch perception and lexical tone perception. For the M-HINT, only ‘daily Mandarin use’ and ‘group’ were significant factors, accounting for approximately 83% of the variance. Again, ‘group’ had the more significant predictive value of the two factors. Age was not a significant predictor for any of the three test scores.
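The ‘group’ coefficients quoted for the regression models can be cross-checked against each other: in a single linear model with a categorical group factor, the HA-versus-NNH difference should equal the CI-versus-NNH difference minus the CI-versus-HA difference. The sketch below applies this to the reported values. It assumes the three contrasts for a test come from the same fitted model, and the dictionary layout and function name are illustrative.

```python
# Reported 'group' coefficients, in percentage points.
# Keys are (group, reference group); negative means the first group is lower.
REPORTED = {
    "LTT": {("CI", "NNH"): -18.21, ("CI", "HA"): -16.19},
    "M-HINT": {
        ("CI", "NNH"): -60.19,
        ("CI", "HA"): -45.26,
        ("HA", "NNH"): -14.93,
    },
}


def implied_ha_vs_nnh(coefs) -> float:
    """HA minus NNH, derived as (CI - NNH) - (CI - HA)."""
    return coefs[("CI", "NNH")] - coefs[("CI", "HA")]


for test, coefs in REPORTED.items():
    implied = round(implied_ha_vs_nnh(coefs), 2)
    reported = coefs.get(("HA", "NNH"))  # None when no value was reported
    print(f"{test}: implied HA vs NNH = {implied}, reported = {reported}")
```

For the M-HINT the implied difference matches the reported HA-versus-NNH coefficient of −14.93 points. For the LTT the implied difference is only about 2 points, in line with the report that the HA-versus-NNH contrast was not significant.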

General discussion

This relationship between lexical tone perception and musical pitch perception raises the question as to whether training one area may generalize to the other area. For example, would music training generalize to improving lexical tone perception or, for non-tonal languages, to improving pitch-related aspects of speech perception such as prosody or emotion identification? Existing studies involving native English-speaking CI recipients report that they are significantly poorer than both NH and HA users at emotion identification tasks, and tend to rely more on intensity cues than pitch cues (Luo et al., 2007; Pereira, 2000; Peters, 2006). Hence, training one skill could potentially have wider benefits than just improving the perceptual abilities for that specific skill (François et al., 2013; Torppa et al., 2014). Existing research has indicated that transfer of learning occurs, even in adults; that is, training can generalize to situations outside of the training session, and/or to novel stimuli (Tremblay et al., 1997). For example, Kraus et al. (1995) provided evidence that auditory training can lead to neurophysiological changes at both the cortical and subcortical areas in NH listeners. When Limb et al. (2010) used positron emission tomography scans to compare brain activation patterns of postlingually deafened CI users with those of NH listeners, they found that CI recipients used nontraditional areas of the brain for auditory processing tasks, potentially due to the need to interpret the different sound that a CI provides. This suggests that the adult brain is plastic, and capable of neural reorganization and adaptation. Overall, the performance of the CI users in this study is consistent with existing research, indicating that recipients struggle on pitch-related auditory perception tasks.
The percent-correct scores for the pitch-ranking task suggest that CI recipients perform at close to chance level when the interval between two successive notes is three semitones or less. In the lexical tone task, the regression analysis predicted that the NNH group would score 18.21 percentage points higher than the CI group. Although the small number of HA users in this study prevented robust statistical analyses and requires the findings to be interpreted with caution, it is still valuable to note that the PT and LTT scores from the HA users were better than those of the CI recipients, but not as good as those of the NNH listeners. This is consistent with the findings of Looi et al. (2008a, 2008b), who compared CI recipients to HA users, as well as Kong et al. (2005) and Gfeller et al. (2006), who looked at the impact of acoustic hearing on electrical stimulation. A HA would largely preserve much of the original acoustic information important for pitch perception (i.e. F0 and fine-temporal information); therefore, HA users with aidable levels of acoustic hearing would have some access to this information. However, it must be remembered that a sensorineural hearing loss results in changes to cochlear structures, which in turn impact on psychoacoustic percepts including pitch. In contrast, CI processors largely eliminate or substantially reduce the fine-temporal information that is conveyed to the recipient, providing predominantly envelope information only. Further, the salience and reliability of both temporal and place pitch cues are substantially reduced with electrically stimulated hearing (Limb and Roy, 2014; McDermott, 2012). Although three of the eight CI recipients in this study used a contralateral HA, they were tested in a CI-only listening condition. A possible study for the future would be to not only compare the pitch and lexical tone perception of bimodal recipients in three different listening conditions (CI-only, HA-only, and bimodal, using an intra-subject analysis), but additionally to compare these results to a group of unilateral CI users, a group of bilateral CI users, and a group of bilateral HA users (using an inter-subject analysis).
This would enable a better understanding of the impact of acoustic hearing when used in conjunction with electric hearing.

Unlike in existing studies, age was not predictive of results in any test (Gfeller et al., 2008; Looi et al., 2012b). This could be in part due to the fact that participants in this study, and particularly the CI group, tended to be younger than the participants in most other comparable CI music studies (i.e. studies testing only postlingually deafened recipients). The mean age of the CI group in this study was 23.5 years, with the eldest recipient being 38 years old. Although statistics on the average age of postlingually deafened CI recipients in Singapore are not available, it would be reasonable to state that when compared to the countries where most of the existing published CI music perception studies have originated (e.g. USA, Australia), there are fewer elderly/very elderly adults in Singapore with an implant. This is in part due to cultural reasons (e.g. preference to spend the money on their children or save it for ‘the next generation’), more conservative attitudes, higher levels of anxiety and trepidation in the Singaporean elderly with regard to surgery and anesthesia, and costs.

Limitations and future directions


In addition to the small number of HA and CI users in this study, the authors also acknowledge that the level of Mandarin fluency varied widely across participants. Although all participants were competent in listening and speaking Mandarin, and used Mandarin for communication on a regular basis, this was not formally assessed prior to the study (although the M-HINT was administered as part of the study). A self-administered questionnaire to determine Mandarin competency was used which involved both self-rating scales as well as questions such as ‘How many hours per day do you speak Mandarin’? None of the CI or HA users’ native language was Mandarin, although it should be noted that all participants scored above chance level on the M-HINT, where Mandarin sentences were presented with simultaneous noise. In comparing the results from this study to existing studies, it should be remembered that all participants in this study were bilingual. There are very few, if any, studies that compare language development and speech perception for bilingual CI recipients to monolingual recipients for the same language, so the authors cannot comment on the potential differences that may be observed for English–Mandarin bilingual recipients to the monolingual Mandarin recipients in other studies. This would be an area for future investigation. Further, as this study was conducted in Singapore, with Singaporean participants, some consideration should be given to the spoken test material used in this study (i.e. the LTT and M-HINT). The Mandarin dialect used in Singapore is slightly different to the Beijing-Mandarin that was used in the LTT and M-HINT, in terms of the accent and grammatical structure. For example, acoustic analysis of the four Mandarin tones has reported the contour of Tone 2 is different between Singaporean-Mandarin and Beijing-Mandarin. 
In Singaporean-Mandarin, the Tone 2 pitch contour exhibits a longer, lower plateau before rising, relative to the Beijing-Mandarin contour (Lee, 2010). That is, Singaporeans pronounce Tone 2 differently to Beijing residents. It would be comparable to using an American English test to assess recipients in another English-speaking nation. However, there are no Singaporean-accent recordings of any speech test material (either in English or Mandarin), with Beijing-Mandarin material being used by the NUH clinic to assess native Mandarin-speaking patients. A detailed linguistic analysis of Singaporean-Mandarin, in comparison to Beijing-Mandarin, is available in Lee (2010).

Summary

In summary, this study compared bilingual Mandarin–English NNH listeners, CI recipients, and HA users on tasks of pitch, lexical tone, and speech-in-noise perception. The CI recipients were significantly poorer than the NNH listeners on the pitch-ranking task, with no significant difference between the CI and HA groups, or the HA group and the NNH group. For the lexical tone and speech-in-noise tasks, the CI group was significantly poorer than both the NNH and HA groups, with no significant difference between the latter two groups. There were highly significant moderate correlations between all three tests, indicating that there is at least some degree of overlap in the skills required to accurately perceive these stimuli. The lexical tone perception score was the most significant predictor of pitch-perception scores, and along with music experience and daily Mandarin use, these three factors accounted for approximately 60% of the variance in the pitch scores. For lexical tone perception, group, pitch score, and daily Mandarin use accounted for approximately 65% of the variance in performance. For the Mandarin speech-in-noise test, group and daily Mandarin use accounted for approximately 83% of the variance in scores. The overall results suggest that CI users, and to a lesser extent HA users, still struggle with complex auditory perceptual tasks, particularly when the stimuli require perception of pitch. However, it may be possible that training one of these skills (e.g. musical pitch perception) may then generalize to other tasks (e.g. lexical tone and/or speech-in-noise perception). This is an important consideration both for counseling clients and for planning suitable, time-efficient rehabilitation plans.

Acknowledgments

The authors would like to thank Dr Alex Cook for statistical advice, Dr Wai Kong Lai for the MACarena software, Prof. Fan Gang Zeng and Tom Lu for the lexical tone test used in this study as well as the Matlab analyses of the tone durations, Yuhan Wong and Edmund Choo for help with the test setup, the audiologists at National University Hospital Audiology Clinic for help with recruitment, and the participants who generously gave their time to participate in this study.

Disclaimer statements

Funding: None.

Conflicts of interest: The authors have no conflict of interest to report.

Ethics approval: Ethical approval for this study was obtained from the National Healthcare Group Domain Specific Review Board in Singapore, and all procedures were in accordance with this approval. Participants did not receive any reimbursement for their participation.


References ANSI 1994. American National Standard Acoustical Terminology. New York: American National Standards Institute. Arehart, K.H. 1994. Effects of harmonic content on complex-tone fundamental-frequency discrimination in hearing-impaired listeners. Journal of the Acoustical Society of America, 95: 3574–3585. Au, D.K. 2003. Effects of stimulation rates on Cantonese lexical tone perception by cochlear implant users in Hong Kong. Clinical Otolaryngology & Allied Sciences, 28: 533–538. Drennan, W.R., Rubinstein, J.T. 2008. Music perception in cochlear implant users and its relationship with psychophysical capabilities. Journal of Rehabilitation Research and Development, 45: 779–789. François, C., Chobert, J., Besson, M., Schön, D. 2013. Music training for the development of speech segmentation. Cerebral Cortex, 23(9): 2038–2043. Fu, Q.J., Zeng, F.G. 2000. Identification of temporal envelope cues in Chinese tone recognition. Asia Pacific Journal of Speech, Language and Hearing, 5: 45–57. Fu, Q.J., Zeng, F.G., Shannon, R.V., Soli, S.D. 1998. Importance of tonal envelope cues in Chinese speech recognition. Journal of the Acoustical Society of America, 104: 505–510. Fu, Q.J., Hsu, C.J., Horng, M.J. 2004. Effects of speech processing strategy on Chinese tone recognition by Nucleus-24 cochlear implant users. Ear and Hearing, 25: 501–508. Galvin, J.J., Fu, Q., Nogaki, G. 2007. Melodic contour identification by cochlear implant listeners. Ear and Hearing, 28: 302–319. Galvin, J.J., Fu, Q., Oba, S. 2008. Effect of instrument timbre on melodic contour identification by cochlear implant users. Journal of the Acoustical Society of America, 124: EL189–EL195. Gfeller, K., Olszewski, C., Turner, C., Gantz, B., Oleson, J. 2006. Music perception with cochlear implants and residual hearing. Audiology & Neuro-otology, 11(Suppl 1): 12–15. Gfeller, K., Turner, C., Oleson, J., Zhang, X., Gantz, B., Froman, R., et al. 2007. 
Accuracy of cochlear implant recipients on pitch perception, melody recognition, and speech reception in noise. Ear and Hearing, 28: 412–423. Gfeller, K., Oleson, J., Knutson, J., Breheny, P., Driscoll, V. and Olszewski, C. 2008. Multivariate predictors of music perception and appraisal by adult cochlear implant users. Journal of the American Academy of Audiology, 19: 120–134. Han, D., Liu, B., Zhou, N., Chen, X., Kong, Y., Liu, H., et al. 2009. Lexical tone perception with HiResolution and HiResolution 120 sound-processing strategies in pediatric Mandarin-speaking cochlear implant users. Ear and Hearing, 30: 169–177. Houtsma, A.J.M. 1997. Pitch and timbre: definition, meaning and use. Journal of New Music Research, 26: 104–115. Hyman, L.M. 1993. The phonology of tone: the representation of tonal register. In: Van Der Hulst, H. and Snider, K. eds. Register tone and tonal geometry. Berlin: Mouton de Gruyter. Koelsch, S., Siebel, W.A. 2005. Towards a neural basis of music perception. Trends in Cognitive Sciences, 9: 578–584. Kong, Y., Stickney, G., Zeng, F.G. 2005. Speech and melody recognition in binaurally combined acoustic and electric hearing. Journal of the Acoustical Society of America, 117: 1351–1361. Kraus, N., McGee, T., Carrell, T., Sharma, C. 1995. Neurophysiologic bases of speech discrimination. Ear and Hearing, 16: 19–37. Lai, W., Dillier, N. 2002. MACarena: a flexible computer-based speech testing environment. 7th international cochlear implant conference, 4–6 September 2002, Manchester.


Lee, L. 2010. The tonal system of Singapore Mandarin. Proceedings of the 22nd North American Conference on Chinese Linguistics and the 18th Annual Meeting of the International Association of Chinese Linguistics. 20–22 May 2010, Harvard University, Cambridge, MA.

Limb, C.J., Molloy, A.T., Jiradejvong, P., Braun, A.R. 2010. Auditory cortical activity during cochlear implant-mediated perception of spoken language, melody, and rhythm. Journal of the Association for Research in Otolaryngology, 11: 133–143.

Limb, C.J., Roy, A.T. 2014. Technological, biological, and acoustical constraints to music perception in cochlear implant users. Hearing Research, 308: 13–26.

Looi, V. 2008. The effect of cochlear implantation of music: a review. Otorinolaringologia, 58: 169–190.

Looi, V., McDermott, H., McKay, C., Hickson, L. 2008a. The effect of cochlear implantation on music perception by adults with usable pre-operative acoustic hearing. International Journal of Audiology, 47: 257–268.

Looi, V., McDermott, H., McKay, C., Hickson, L. 2008b. Music perception of cochlear implant users compared with that of hearing aid users. Ear and Hearing, 29: 421–434.

Looi, V., Gfeller, K., Driscoll, V. 2012a. Music appreciation and training for cochlear implant recipients: a review. Seminars in Hearing, 33: 307–334.

Looi, V., King, J., Kelly-Campbell, R. 2012b. A music appreciation training program developed for clinical application with cochlear implant recipients and hearing aid users. Seminars in Hearing, 33: 361–380.

Luo, X., Fu, Q.J., Galvin, J.J. 2007. Vocal emotion recognition by normal-hearing listeners and cochlear implant users. Trends in Amplification, 11: 301–315.

McDermott, H. 2012. Music perception. In: Zeng, F.G., Popper, A., Fay, R.R. (eds.) Auditory prostheses: new horizons. New York: Springer-Verlag.

Moore, B. 2007. Cochlear hearing loss: physiological, psychological and technical issues. West Sussex: John Wiley & Sons.

Moore, B.C., Carlyon, R.P. 2005. Perception of pitch by people with cochlear hearing loss and by cochlear implant users. Pitch. New York: Springer.

Paul, L., Simons, G., Fenning, C. 2014. Ethnologue: languages of the world, 17th ed. [Online]. Dallas, TX: SIL International. Available: http://www.ethnologue.com/17/. [Accessed 14 March 2015].

Peng, S.C., Tomblin, J.B., Cheung, H., Lin, Y.S., Wang, L.S. 2004. Perception and production of Mandarin tones in prelingually deaf children with cochlear implants. Ear and Hearing, 25: 251–264.

Pereira, C. 2000. Perception and expression of emotion in speech. Unpublished PhD thesis, Macquarie University, Sydney.

Peters, K. 2006. Emotion perception in speech: discrimination, identification and the effects of talker and sentence variability. Unpublished Doctor of Audiology Capstone Project, Washington University School of Medicine.

Sucher, C.M., McDermott, H.J. 2007. Pitch ranking of complex tones by normally hearing subjects and cochlear implant users. Hearing Research, 230: 80–87.

Tao, D., Deng, R., Jiang, Y., Galvin, J.J., Fu, Q.J., Chen, B. 2014. Melodic pitch perception and lexical tone perception in Mandarin-speaking cochlear implant users. Ear and Hearing, 36: 102–110.

Torppa, R., Faulkner, A., Huotilainen, M., Järvikivi, J., Lipsanen, J., Laasonen, M., et al. 2014. The perception of prosody and associated auditory cues in early-implanted children: the role of auditory working memory and musical activities. International Journal of Audiology, 53(3): 182–191.

Tremblay, K., Kraus, N., Carrell, T.D., McGee, T. 1997. Central auditory system plasticity: generalization to novel stimuli following listening training. Journal of the Acoustical Society of America, 102: 3762–3773.

Wang, R. 1993. The Standard Chinese Database. Unpublished internal material, University of Science and Technology of China.

Wang, W., Zhou, N., Xu, L. 2011. Musical pitch and lexical tone perception with cochlear implants. International Journal of Audiology, 50: 270–278.

Wei, W.I., Wong, R., Hui, Y., Au, D.K., Wong, B.Y., Ho, W., et al. 2000. Chinese tonal language rehabilitation following cochlear implantation in children. Acta Oto-laryngologica, 120: 218–221.


Wei, C.G., Cao, K., Zeng, F.G. 2004. Mandarin tone recognition in cochlear-implant subjects. Hearing Research, 197: 87–95.

Wong, L.L., Liu, S., Han, N. 2008. The Mainland Mandarin hearing in noise test. International Journal of Audiology, 47: 393–395.

Woolf, N.K., Ryan, A.F., Bone, R.C. 1981. Neural phase-locking properties in the absence of cochlear outer hair cells. Hearing Research, 4: 335–346.


Wu, J.L., Yang, H.M. 2003. Speech perception of Mandarin Chinese speaking young children after cochlear implant use: effect of age at implantation. International Journal of Pediatric Otorhinolaryngology, 67: 247–253.

Yip, M. 1989. Contour tones. Phonology, 6: 149–174.

Zhou, N., Huang, J., Chen, X., Xu, L. 2013. Relationship between tone perception and production in prelingually deafened children with cochlear implants. Otology & Neurotology, 34: 499–506.
