Auris Nasus Larynx 41 (2014) 239–243


Auris Nasus Larynx journal homepage: www.elsevier.com/locate/anl

Gender disparity in subcortical encoding of binaurally presented speech stimuli: An auditory evoked potentials study

Mohsen Ahadi a, Akram Pourbakht a,b,*, Amir Homayoun Jafari c,d, Zahra Shirjian c, Amir Salar Jafarpisheh c

a Department of Audiology, School of Rehabilitation, Iran University of Medical Sciences, Tehran, Iran
b Rehabilitation Research Center, Iran University of Medical Sciences, Tehran, Iran
c Medical Physics and Biomedical Engineering Department, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
d Research Center of Biomedical Technology and Robotics, Tehran University of Medical Sciences, Tehran, Iran

ARTICLE INFO

Article history: Received 23 June 2013; Accepted 4 October 2013; Available online 30 October 2013

Keywords: Speech; Brain stem; Auditory brainstem response; Gender

ABSTRACT

Objectives: To investigate the influence of gender on the subcortical representation of speech acoustic parameters when presented simultaneously to both ears.

Methods: Two-channel speech-evoked auditory brainstem responses were obtained in 25 female and 23 male normal-hearing young adults using binaural presentation of the 40 ms synthetic consonant-vowel /da/. The encoding of the fast and slow elements of the speech stimulus at the subcortical level was compared between the sexes in the temporal and spectral domains using independent-samples, two-tailed t-tests.

Results: Highly detectable responses were established in both groups. Analysis in the time domain revealed earlier and larger fast onset responses in females, but there was no gender-related difference in the sustained segment or the offset of the response. Interpeak intervals between frequency following response peaks were also invariant to sex. In line with the shorter onset responses in females, composite onset measures were also sex dependent. Analysis in the spectral domain showed more robust representation of the fundamental frequency, the first formant and the high-frequency components of the first formant in females than in males.

Conclusions: Anatomical, biological and biochemical distinctions between females and males could alter the neural encoding of the acoustic cues of speech stimuli at the subcortical level. Females have an advantage in binaural processing of the slow and fast elements of speech. This could be physiological evidence for better identification of the speaker and the emotional tone of voice, as well as better perception of the phonetic information of speech, in women.

© 2013 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author at: Department of Audiology, School of Rehabilitation, Iran University of Medical Sciences, Nezam St., Shah-Nazari St., Madar Sq., Mirdamad Blvd., Tehran 1545913187, Iran. Tel.: +98 21 22228051x401; fax: +98 21 22220946. E-mail addresses: [email protected], [email protected] (A. Pourbakht).
0385-8146/$ – see front matter © 2013 Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.anl.2013.10.010

1. Introduction

Gender-related differences can be traced at almost every stage of the auditory system, from the cochlea to subcortical and cortical auditory areas. Morphological, physiological and biochemical distinctions between the sexes have already been the topic of a large number of investigations [1–4]. Females have better hearing thresholds, higher susceptibility to noise exposure, shorter latencies in their auditory brainstem responses, enhanced spontaneous otoacoustic emissions (SOAEs), and stronger click-evoked otoacoustic emissions than males. However, males show better sound localization, distinguish more binaural beats, and recognize signals in complex masking tasks better than females do [5,6].

With respect to electrophysiological measures at the subcortical level, distinct differences between female and male adults have been reported. Females show earlier latencies and larger amplitudes than males for the click-evoked auditory brainstem response (ABR), a response that reflects the activation of high-frequency regions of the cochlea [7,8] and mainly conveys rapid acoustic information of auditory signals. However, no gender differences were detected in the brainstem encoding of low-frequency sinusoids, which transmit slow acoustic information [5,9].

Simple acoustic stimuli such as clicks and pure tones bear poor resemblance to the behaviorally relevant sounds that humans encounter in real-life circumstances, and auditory neuroscience has therefore moved toward complex sounds such as speech stimuli [10]. Commonly used for investigating the subcortical encoding of the acoustic elements of speech, speech-evoked auditory brainstem responses (Speech-ABR) are scalp-recorded


M. Ahadi et al. / Auris Nasus Larynx 41 (2014) 239–243

Fig. 1. Time-domain waveform of the 40 ms speech syllable /da/. This synthetic stimulus evokes seven prominent peaks in the Speech-ABR that have been termed V, A, C, D, E, F and O.

neural events that are synchronously locked to the acoustic elements of the speech stimulus [11–13], and involve presenting a consonant-vowel (CV) speech syllable. The stimulus is made up of a sharp onset burst, a brief period of formant transition, and a longer period corresponding to the vowel [11]. Recent studies have revealed that the auditory brainstem represents the basic acoustic features of this stimulus with considerable fidelity through highly precise spectral and temporal neural codes [10,14].

In a first study to determine whether the subcortical response to a complex auditory stimulus is encoded differently between the sexes, Krizman et al. [5] recorded the Speech-ABR to the stop consonant-vowel /da/ presented to the right ear in a normal-hearing, young adult population. Their findings demonstrated gender differences in the encoding of rapid, but not slow, features of speech: females showed significantly earlier and larger responses than males only to the transient portion of the stimulus. No gender-related distinction was reported in the response to slower elements, indicating similar neural phase locking between the sexes [5]. However, in real-life situations we usually listen with both ears, and the auditory subcortical nuclei play an important role in binaural processing [15]. Binaural stimulation therefore seems preferable for studying the role of the auditory brainstem in the encoding of speech elements, and warrants further investigation.

The current study aimed to compare gender-related differences in the brainstem encoding of the acoustic parameters of speech in response to binaural presentation of the stimulus. The authors' premise was that gender has a distinct effect on the encoding of rapid (high-frequency) vs. slow (low-frequency) components of speech. More specifically, we hypothesized that binaural presentation of speech syllables would lead to earlier, stronger and more robust responses in females than in males.

2. Materials and methods

2.1. Participants

Forty-eight volunteer students from the School of Rehabilitation, Iran University of Medical Sciences, 25 female and 23 male, aged 20–28 years (females: mean ± SD = 22.56 ± 1.73; males: mean ± SD = 23.00 ± 2.37), were enrolled in the experiment. None of the subjects had a history of auditory, learning or neurologic problems. All were right-handed and monolingual Persian speakers by self-report. All had normal middle ear function confirmed by immittance findings and performed within normal limits on pure-tone audiometry (air conduction thresholds ≤ 20 dB HL for octave frequencies 250–8000 Hz). None of the female subjects had used oral contraceptives or undergone hormonal therapy. Subjects gave their written consent to participate. All procedures were approved by the Deputy of Research review board, Iran University of Medical Sciences.

2.2. Stimuli and recording parameters

Brainstem responses to the speech sound were elicited and collected using a Biologic Navigator Pro (Natus Medical Inc., San Carlos, CA, USA). The stimulus was a 40 ms synthesized stop-consonant syllable /da/ provided with the BioMARK module (Fig. 1). The syllable contains an initial noise burst and a formant transition between the consonant and a steady-state vowel, with a fundamental frequency (F0) that rises linearly from 103 to 125 Hz; voicing begins at 5 ms, with an onset release burst during the first 10 ms. The first formant (F1) frequency rises linearly from 220 to 720 Hz, while the second formant (F2) decreases from 1700 to 1240 Hz over the duration of the stimulus. The third formant (F3) falls slightly from 2580 to 2500 Hz, while the fourth (F4) and fifth (F5) formants remain constant at 3600 and 4500 Hz, respectively [14,16].

For recording the electrophysiological responses, the subjects were asked to sit calmly in a quiet room. Using Ag-AgCl electrodes, two-channel Speech-ABRs were collected with the vertex (Cz) electrode as noninverting, the earlobes as inverting and the forehead (Fpz) as ground. During the recording session, electrode impedance was kept below 5 kΩ and inter-electrode impedance below 3 kΩ. For each subject, stimuli were presented binaurally through Biologic insert earphones (580-SINSER) at 80 dB SPL in alternating polarity and at a rate of 10.9 s⁻¹. A time window of 85.33 ms (including a 15 ms pre-stimulus time) and an online filter setting of 100–2000 Hz were used for recording. Individual traces exceeding 23.8 µV were rejected from the average, and a total of 6000 artifact-free responses (two subaverages of 3000 sweeps) were obtained.

2.3. Data analysis

For each participant, the peak-picking criteria stated by Krizman et al. were used [5]. The latencies of all peaks of interest were marked by the first experimenter and subsequently verified by the second author. The Speech-ABR comprises a series of seven peaks: the onset peaks (V and A), the onset of voicing (C), the frequency following response (FFR) peaks (D, E and F) and finally the offset peak (O). To analyze the response waveform, the timing and magnitude of both the discrete peaks and the FFR were evaluated. Composite

Table 1
Means (and standard deviations) of the audiometric thresholds (dB) as a function of gender for the six test frequencies. Independent-samples t-test results for hearing threshold differences at each of the test frequencies are also provided.

Frequency (Hz)   Female mean (SD)   Male mean (SD)   t       df   p value
250              9.40 (5.26)        8.04 (4.94)      0.918   46   0.363
500              8.80 (4.62)        9.34 (5.28)      0.383   46   0.704
1000             8.60 (4.68)        7.82 (3.93)      0.617   46   0.540
2000             6.80 (2.84)        8.26 (4.15)      1.431   46   0.159
4000             8.20 (4.05)        9.56 (4.98)      1.045   46   0.301
8000             8.80 (4.62)        8.26 (4.15)      0.423   46   0.674

measures of neural synchrony to the onset of the stimulus (V/A duration, amplitude, slope and area) were also analyzed. Spectral encoding across the FFR region (11.4–40.6 ms), which includes peaks C, D, E and F, was analyzed as well using the fast Fourier transform (FFT). The FFR was analyzed with five measures: root mean square (RMS) amplitude; amplitude of the spectral component corresponding to the stimulus fundamental frequency (F0; 103–121 Hz); amplitude of the spectral component corresponding to the first formant frequencies of the stimulus (F1; 454–719 Hz); amplitude of the higher frequencies corresponding to the 7th through 11th harmonics of the stimulus F0 (HF; 721–1155 Hz); and stimulus-to-response correlation.

2.4. Statistical methods

Brainstem response measures in both the transient and sustained portions of the response were analyzed using independent-samples, two-tailed t-tests. Missing data were excluded from the analysis. All statistical tests were considered significant at p ≤ 0.05. Data processing was performed in MATLAB version R2010a (The MathWorks, Inc., Natick, Massachusetts, USA) and statistical analyses were performed in SPSS 16.0 (SPSS Inc., Chicago, USA).

3. Results

Table 1 presents a summary of the means and standard deviations of the audiometric thresholds as a function of gender for the six test frequencies. There were no significant differences in audiometric thresholds between males and females.

3.1. Latency and amplitude measures

Based on evaluation of the 48 subjects, latencies and amplitudes of all main response peaks were established and are displayed in Table 2. Detectability was very robust for almost all peaks. The onset and offset peaks (V, A and O) were detectable in 100% of subjects. The poorest detectability was for peak C. The variability of latency was smallest for the onset peaks and increased for the FFR peaks. The latencies of the onset peaks of the Speech-ABR were sex dependent, with females having significantly shorter latencies than males at peaks V (t(46) = 2.628, p = 0.012) and A (t(46) = 2.860, p = 0.006). The latency of the other peaks did not depend on the sex of the subject, including the transition peak C (t(39) = 0.097, p = 0.923), the FFR peaks D (t(43) = 0.666, p = 0.509), E (t(45) = 0.011, p = 0.991) and F (t(46) = 0.777, p = 0.441), and the offset peak O (t(46) = 0.102, p = 0.919).
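As a rough cross-check, the independent-samples t statistics reported above can be approximated from the group means, standard deviations and sample sizes listed in Table 2. The Python sketch below assumes the equal-variance (pooled) form of the test; the function name `pooled_t` is illustrative and is not part of the original analysis, which was run in SPSS. Because the tabulated values are rounded, the result is close to, but not identical with, the reported t(46) = 2.628 for peak V.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic from summary statistics.

    Assumes equal variances in the two groups (pooled variance).
    Returns the t statistic and the degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error, df


# Peak V latency (ms) from Table 2:
# males 6.89 (SD 0.42), n = 23; females 6.61 (SD 0.32), n = 25.
t_v, df_v = pooled_t(6.89, 0.42, 23, 6.61, 0.32, 25)
print(f"peak V latency: t({df_v}) = {t_v:.2f}")
```

The same function can be applied to any other row of Table 1 or Table 2; the degrees of freedom drop below 46 wherever a peak could not be identified in every subject.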

Table 2
Means and standard deviations of the latencies and amplitudes of the seven peaks of the Speech-ABR in females and males. Significant differences are indicated with asterisks.

                                  Female                     Male
Measure                           n     Mean    SD           n     Mean    SD

Latency (ms)
  V*                              25    6.61    0.32         23    6.89    0.42
  A**                             25    7.67    0.44         23    8.02    0.41
  C                               21    18.70   1.00         20    18.72   0.59
  D                               24    22.89   0.65         21    23.04   0.82
  E                               24    31.71   1.34         23    31.71   1.01
  F                               25    40.02   1.23         23    40.29   1.14
  O                               25    48.67   1.21         23    48.64   0.52
  D–E                             23    8.84    1.05         21    8.65    0.61
  E–F                             24    8.33    0.66         23    8.57    0.78

Amplitude (µV)
  V                               25    0.26    0.08         23    0.17    0.06
  A                               25    0.38    0.08         23    0.29    0.06
  C                               21    0.11    0.17         20    0.11    0.17
  D                               24    0.36    0.16         21    0.26    0.13
  E                               24    0.28    0.10         23    0.26    0.08
  F                               25    0.28    0.13         23    0.22    0.11
  O                               25    0.25    0.12         23    0.19    0.13
  SNR                             25    4.33    2.45         23    3.74    1.66

Composite onset measures
  V/A duration (ms)               25    1.05    0.25         23    1.13    0.21
  V/A amplitude (µV)**            25    0.65    0.14         23    0.47    0.11
  V/A slope (µV/ms)**             25    0.64    0.20         23    0.43    0.14
  V/A area (µV × ms)**            25    0.36    0.10         23    0.28    0.08

Spectral magnitudes (µV)
  F0*                             25    20.32   8.02         23    15.17   5.71
  F1**                            25    2.47    0.70         23    1.93    0.60
  HF**                            25    0.97    0.22         23    0.76    0.20

Stimulus–response correlations
  SR correlation (r)              25    0.07    0.03         23    0.08    0.04
  SR lag (ms)                     25    8.58    1.06         23    8.06    1.25

* Statistically significant at p < 0.05.
** Statistically significant at p < 0.01.
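The composite onset measures in Table 2 combine the latency and amplitude of peak V and trough A into a single description of the onset response. The paper does not spell out the formulas, so the Python sketch below rests on assumed definitions: duration as the A minus V latency, amplitude as the peak-to-trough voltage difference, slope as amplitude divided by duration (reported here as a magnitude), and area as the triangle spanned by the two. The function name `va_composite` and the triangle approximation are hypothetical, and the inputs are the female group means from Table 2 with trough A entered as a negative voltage.

```python
def va_composite(v_latency_ms, v_amp_uv, a_latency_ms, a_amp_uv):
    """Composite V/A onset measures under assumed definitions.

    v_amp_uv is the (positive) voltage of peak V and a_amp_uv the
    (negative) voltage of trough A. All four formulas are assumptions
    made for illustration, not the authors' documented method.
    """
    duration_ms = a_latency_ms - v_latency_ms
    amplitude_uv = v_amp_uv - a_amp_uv          # peak-to-trough difference
    slope_uv_per_ms = amplitude_uv / duration_ms
    area_uv_ms = 0.5 * amplitude_uv * duration_ms  # assumed triangle area
    return {
        "duration_ms": duration_ms,
        "amplitude_uv": amplitude_uv,
        "slope_uv_per_ms": slope_uv_per_ms,
        "area_uv_ms": area_uv_ms,
    }


# Female group means from Table 2 (A entered as a negative trough).
female = va_composite(6.61, 0.26, 7.67, -0.38)
for name, value in female.items():
    print(f"{name}: {value:.2f}")
```

Values computed from group means will not match the tabulated composite means exactly, since the study averaged per-subject measures and a mean of ratios generally differs from a ratio of means.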


Fig. 2. Grand average waveforms of the Speech-ABR obtained from 23 male and 25 female subjects in response to binaural presentation of the 40 ms speech syllable /da/. The stimulus evoked seven prominent peaks, labeled V, A, C, D, E, F and O. Peaks I and III, corresponding to the click-evoked ABR, are also marked.

Moreover, there was no significant difference between males and females in the interpeak intervals for FFR peaks D to E (t(42) = 0.736, p = 0.466) and E to F (t(45) = 1.124, p = 0.267). Because multiple factors, such as background noise level, influence the amplitude of individual peaks in the temporal domain, peak amplitude differences between the sexes were not analyzed; however, mean values are reported in Table 2. The signal-to-noise ratio (SNR), measured as the quotient of the response RMS amplitude and the prestimulus baseline RMS amplitude, is also provided in Table 2. Grand average Speech-ABR waveforms for males and females are shown in Fig. 2.

3.2. Composite onset measures

Interpeak measurements of the onset response to the speech syllable /da/, including duration, amplitude, slope and area, were further inspected; the resulting values are given in Table 2. Independent-samples t-tests showed an effect of sex on V/A amplitude (t(46) = 4.612, p < 0.001), slope (t(46) = 4.058, p < 0.001), and area (t(46) = 2.963, p = 0.005). However, the duration of the V/A complex did not approach significance (t(46) = 1.023, p = 0.312).

3.3. Spectral encoding measures

The response to the formant transitions of the stimulus was analyzed using the FFT and stimulus-to-response correlation. The largest magnitudes for all of the sustained response measures were observed in females (Table 2). Sex affected the spectral encoding of the F0 of the speech sound (t(46) = 2.544, p = 0.014), the F1 range (t(46) = 2.833, p = 0.007), and the HF response components (t(46) = 3.383, p = 0.001). The maximum stimulus-to-response correlation did not differ significantly between the two groups (t(46) = 1.073, p = 0.289), and the associated lag between the stimulus and the response, a measure of the time it takes for neural impulses to propagate through the brainstem, was also invariant to sex (t(46) = 1.535, p = 0.132).

4. Discussion

The present study set out to determine the effects of gender on the encoding of the binaurally presented acoustic elements of speech stimuli throughout the auditory subcortical nuclei. The speech syllable /da/ used here is an acoustically complex sound that begins with a transient segment followed by a sustained periodic segment [10]. The findings of the current study revealed a difference between males and females in the neural encoding of the transient and sustained portions of the speech stimulus. That is, female subjects had earlier onset peak (V and A) latencies than their male counterparts. No such discrepancy was seen for the sustained FFR or offset peaks in the time-domain analysis. The shorter latencies for the transient portion of the stimulus are largely consistent with previous work on the click-evoked ABR [3,7,8]. This result may be explained by a number of factors. Females, as a group, have smaller head size, less brain volume and perhaps less skull thickness [7,8]; on this assumption, latencies are shorter because the fiber tracts are shorter in women. Another possible explanation is that females have a shorter cochlea than males [17]. Some authors [18] have speculated that this shorter cochlea could lead to a faster traveling wave velocity and greater neural synchrony of the response across the cochlea in females than in males. The influence of sex hormones, including estrogen, is another explanation for better auditory performance in females [4]. In addition, differences in core body temperature, external ear canal volume and middle ear transfer function between the two sexes may account for these gender-related differences [7,8]. The restriction of gender effects to the encoding of the fast elements of the speech syllable also accords with the report of Krizman et al. [5].

This study, as predicted, did not detect any evidence for gender effects on the sustained segment of the response (FFR), which reflects neural activity phase-locked to the low-frequency (slow) information of the stimulus, nor on the corresponding temporal interpeak intervals, which represent the period of the fundamental frequency. These findings further support previous research on phase-locked responses to pure tones [9] and to the speech syllable /da/ [5]. The lack of a gender difference in these interpeak intervals reveals similar neural phase locking between the sexes to the period of the F0.

For deeper evaluation of the response to the onset of the speech syllable /da/, we also analyzed the interpeak measurements of peaks V and A. The V/A amplitude, area and slope were considerably larger in females than in males. This finding further supports the


previously mentioned contributing factors that lead to larger peak amplitudes and shorter latencies in women. The V/A duration was also shorter in the female group but did not reach significance. This may partly be explained by the modest sample size of the study, and caution must be applied, as the finding might not generalize to the whole population.

In contrast to time-domain analysis, which shows how the response changes over time, spectral-domain analysis shows how much of the response lies within each frequency band over a range of frequencies. The FFT is the most common algorithm for performing spectral analysis and can be used to decompose the complex waveform of the Speech-ABR into a set of sine waves. The magnitude of each sine wave corresponds to the amount of energy contained in the complex waveform at that frequency. Fourier analysis is also useful for calculating the amplitude over a range of frequencies, especially when the stimulus has time-varying features such as formant transitions [10]. Spectral analysis of the response was applied to measure the precision and magnitude of neural phase locking at the fundamental frequency, the first formant frequency and the higher frequencies of the first formant. Analyses in the spectral domain indicated that the total activity occurring around F0, F1 and HF was greater in females than in males. Despite the lack of a gender difference in the interpeak intervals of the FFR peaks in the time domain, which reflect the period of F0, the encoding of F0 and F1 amplitude was sex dependent in the spectral domain. This finding was unexpected and suggests that the precision and magnitude of neural phase locking to F0, F1 and the HF range are greater in women. The F0 is a low-frequency component of speech that results from the periodic beating of the vocal folds and contributes to the perceived pitch of an individual’s voice.
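The band-limited spectral amplitudes discussed above can be illustrated with a small Python sketch. It uses a direct discrete Fourier projection in pure Python rather than an FFT library, and it averages the single-sided amplitude over each of the three bands named in the methods (F0: 103–121 Hz, F1: 454–719 Hz, HF: 721–1155 Hz). Several things are assumptions made only for illustration: the 12 kHz sampling rate, the absence of windowing or zero-padding, the function name `band_amplitude`, and the synthetic two-tone signal that stands in for a real FFR segment. This is not the Brainstem Toolbox implementation.

```python
import cmath
import math


def band_amplitude(signal, fs_hz, f_lo_hz, f_hi_hz, step_hz=1.0):
    """Mean single-sided spectral amplitude of `signal` in a frequency band.

    The signal is projected onto complex sinusoids at `step_hz` spacing
    between f_lo_hz and f_hi_hz (inclusive) and the amplitudes are averaged.
    """
    n = len(signal)
    amplitudes = []
    freq = f_lo_hz
    while freq <= f_hi_hz:
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * k / fs_hz)
            for k, x in enumerate(signal)
        )
        amplitudes.append(2.0 * abs(coeff) / n)
        freq += step_hz
    return sum(amplitudes) / len(amplitudes)


FS_HZ = 12000.0  # assumed sampling rate, not stated in the paper
# Number of samples in the 11.4-40.6 ms FFR analysis window.
n_samples = round((40.6 - 11.4) * 1e-3 * FS_HZ)

# Synthetic stand-in for an FFR segment: a strong 110 Hz component
# (inside the F0 band) plus a weaker 600 Hz component (inside the F1 band).
segment = [
    1.0 * math.sin(2 * math.pi * 110.0 * k / FS_HZ)
    + 0.5 * math.sin(2 * math.pi * 600.0 * k / FS_HZ)
    for k in range(n_samples)
]

f0_amp = band_amplitude(segment, FS_HZ, 103.0, 121.0)
f1_amp = band_amplitude(segment, FS_HZ, 454.0, 719.0)
hf_amp = band_amplitude(segment, FS_HZ, 721.0, 1155.0)
print(f"F0: {f0_amp:.3f}  F1: {f1_amp:.3f}  HF: {hf_amp:.3f}")
```

Because the analysis window is only about 29 ms long, each component spreads over roughly 34 Hz on either side of its center frequency, which is one reason band averages are reported instead of single-bin peaks.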
F0 reflects prosodic aspects of speech, and encoding of F0 is important for identifying the speaker and the emotional tone of voice [10,13]. Formant structure, in contrast, describes a series of discrete peaks in the frequency spectrum of speech that result from an interaction between the frequency of vibration of the vocal folds and the resonances within a speaker’s vocal tract. The F1 and the HF range of the F1 have a special role in the perception of vowels and represent the phonetic information of the stimulus [10,13]. The larger subcortical representation of F0, F1 and HF in the spectral domain in the female group could be physiological evidence for women's better performance in identifying the speaker and the emotional tone of voice, as well as in perceiving the phonetic information of speech, compared to men.

The dissimilar encoding of F0 and F1 between the sexes does not support the reports of Krizman et al., but the larger higher-frequency response components in females are consistent with their findings [5]. The differences may be explained by the different stimulation mode used in the current study. Binaural presentation of speech could result in summation of stimulus energy, and a sound captured by the two ears is usually judged louder than the same sound heard with one ear [19]. A better signal-to-noise ratio in the binaural stimulation mode could positively influence the neural encoding of spectral cues in the F0 and F1 regions. This difference in findings is interesting and supports the idea that monaural and binaural speech processing employ different neural mechanisms.

Comparison of the morphology and timing of the stimulus and response using stimulus-to-response correlation, and of the associated time delay between them, did not show any significant gender effect. The stimulus-to-response lag values of the current study seem comparable with those of Hornickel et al. [20] and Vander Werff and Burns [21].

5. Conclusion

This study provides insight into the biological distinctions between females and males that influence the neural encoding of speech, the most important behaviorally relevant sound in the human soundscape, and it can also serve as a basis for investigating gender distinctions in auditory processing disorders and language impairments. The current findings add to the growing body of literature on Speech-ABR recordings in adult populations. Developing gender-specific norms for Speech-ABR parameters should also be considered by each laboratory. Further research on sex differences in the perception of voice pitch using phase analysis is recommended.

Conflict of interest

None.

Acknowledgements

This study was part of a Ph.D. dissertation supported by Tehran University of Medical Sciences, Tehran, Iran (grant no.: 218/4d/26/p). The authors would like to thank Dr. Shohreh Jalaei for assisting with the statistical analysis. We would also like to thank Erika Skoe from Northwestern University for her commitment in providing us with the Brainstem Toolbox.

References

[1] Bowman D, Brown D, Kimberley B. An examination of gender differences in DPOAE phase delay measurements in normal-hearing human adults. Hear Res 2000;142:1–11.
[2] Burman DD, Bitan T, Booth JR. Sex differences in neural processing of language among children. Neuropsychologia 2008;46:1349–62.
[3] Jerger J, Hall J. Effects of age and sex on auditory brainstem response. Arch Otolaryngol 1980;106:387.
[4] McFadden D, Martin GK, Stagner BB, Maloney MM. Sex differences in distortion-product and transient-evoked otoacoustic emissions compared. J Acoust Soc Am 2009;125:239.
[5] Krizman J, Skoe E, Kraus N. Sex differences in auditory subcortical function. Clin Neurophysiol 2012;123:590–7.
[6] McFadden D. Sex differences in the auditory system. Dev Neuropsychol 1998;14:261–98.
[7] Burkard RF, Eggermont JJ, Don M. Auditory evoked potentials: basic principles and clinical application. Philadelphia: Lippincott Williams & Wilkins; 2007.
[8] Hall JW. New handbook of auditory evoked responses. Boston: Pearson Education, Inc.; 2007.
[9] Hoormann J, Falkenstein M, Hohnsbein J, Blanke L. The human frequency-following response (FFR): normal variability and relation to the click-evoked brainstem response. Hear Res 1992;59:179–88.
[10] Skoe E, Kraus N. Auditory brainstem response to complex sounds: a tutorial. Ear Hear 2010;31:302.
[11] Chandrasekaran B, Kraus N. The scalp-recorded brainstem response to speech: neural origins and plasticity. Psychophysiology 2009;47:236–46.
[12] King C, Warrier CM, Hayes E, Kraus N. Deficits in auditory brainstem pathway encoding of speech sounds in children with learning problems. Neurosci Lett 2002;319:111–5.
[13] Russo N, Nicol T, Musacchia G, Kraus N. Brainstem responses to speech syllables. Clin Neurophysiol 2004;115:2021–30.
[14] Johnson KL, Nicol TG, Kraus N. Brain stem response to speech: a biological marker of auditory processing. Ear Hear 2005;26:424.
[15] Moore DR. Anatomy and physiology of binaural hearing. Audiology 1991;30:125–34.
[16] Karawani H, Banai K. Speech-evoked brainstem responses in Arabic and Hebrew speakers. Int J Audiol 2010;49:844–9.
[17] Sato H, Sando I, Takahashi H. Sexual dimorphism and development of the human cochlea: computer 3-D measurement. Acta Otolaryngol 1991;111:1037–40.
[18] Don M, Ponton CW, Eggermont JJ, Masuda A. Gender differences in cochlear response time: an explanation for gender amplitude differences in the unmasked auditory brain-stem response. J Acoust Soc Am 1993;94:2135.
[19] Porsolt R, Irwin R. Binaural summation in loudness of two tones as a function of their bandwidth. Am J Psychol 1967;80:384–90.
[20] Hornickel J, Skoe E, Kraus N. Subcortical laterality of speech encoding. Audiol Neurootol 2009;14:198–207.
[21] Vander Werff KR, Burns KS. Brain stem responses to speech in younger and older adults. Ear Hear 2011;32:168.
