Acoustic characteristics of Greek fricatives

Elina Nirgianaki a)
Laboratory of Phonetics, Department of Linguistics, University of Athens, Athens, Greece

(Received 2 April 2013; revised 19 March 2014; accepted 25 March 2014)

The present study examined the acoustics of Greek fricative consonants in terms of temporal, spectral, and amplitude parameters. The effects of voicing, speaker’s gender, place of articulation, and post-fricative vowel on the acoustic parameters were also investigated. The results indicated that first and second spectral moments (i.e., spectral mean and spectral variance), as well as second formant (F2) onset, and normalized amplitude values are the acoustic parameters most correlated with the Greek fricative place of articulation distinction. F2 onset and spectral mean were the parameters that distinguished all five places of articulation, while normalized amplitude differentiated sibilants from non-sibilants. In addition, normalized duration and normalized amplitude are the parameters that distinguish Greek voiced from voiceless fricatives, with high classification accuracy. © 2014 Acoustical Society of America. [http://dx.doi.org/10.1121/1.4870487]

PACS number(s): 43.70.Fq [CHS]

Pages: 2964–2976

I. INTRODUCTION

Finding the properties that distinguish among naturally produced speech sounds has always been a main challenge in speech research. Many studies (using various methods and techniques) have suggested that there are reliable acoustic properties that can be found in the speech signal (e.g., Behrens and Blumstein, 1988a,b; Forrest et al., 1988; Sussman et al., 1991; Hedrick and Ohde, 1993; Jongman et al., 2000; Nissen, 2003). Many such studies have investigated the acoustic characteristics of fricative consonants in several languages, and especially in English, focusing mainly on the acoustic properties that may distinguish their place of articulation.

Fricatives are basically produced with turbulence generated in the air flow somewhere in the pharynx or oral cavity. In order for this turbulence to be generated, a steady air flow passes through a narrow constriction, which generates random velocity fluctuations in the air flow. These fluctuations act as the source for frication noise (e.g., Stevens, 1998). According to Shadle (1990), frication noise can be generated at either an obstacle or a wall. An obstacle source occurs in fricatives in which sound is generated primarily at a rigid body perpendicular to the air flow (e.g., the production of voiceless alveolar and post-alveolar /s/ and /ʃ/): the upper and lower teeth act as the obstacle for the airflow. On the other hand, a wall source occurs in fricatives in which sound is generated primarily along a rigid body parallel to the air flow (e.g., the production of voiced and voiceless velar fricatives /ɣ/ and /x/). Frication noise can also be generated just by a turbulent jet that does not hit an obstacle or wall, as for the bilabial fricatives /ɸ, β/. Apart from the turbulence generated in the vicinity of the constriction, vocal fold vibration provides another source of sound in voiced fricative production.

a) Author to whom correspondence should be addressed. Electronic mail: [email protected]


J. Acoust. Soc. Am. 135 (5), May 2014

Greek fricative consonants are grouped into five classes according to their place of articulation: labiodental /f, v/, dental /θ, ð/, alveolar /s, z/, palatal /ç, ʝ/, and velar /x, ɣ/. In particular, the palatal place of articulation occurs mainly as a result of the coarticulation of velar fricatives with front vowels [see Table I: “xoma” (“earth”), “çina” (“goose”)], as well as in non-stressed syllables, when the phenomenon of palatalization takes place [e.g., “vraxos” (“rock”), “vraçi” (“βράχοι”), “vraça” (“rocks”/neutral type)] (e.g., Botinis, 2011). Both static and dynamic cues have been reported to contribute, to different degrees, to the identification of fricatives. The acoustic cues to fricative place and voicing distinctions that have been examined in fricatives, and mostly English fricatives, are the amplitude and spectral properties of the frication noise, the characteristics of the transition from the frication noise to the following vowel, and the duration of frication noise.

Along this line of research, the present study investigated the defining properties of fricative sounds as produced in Greek. It examined static and dynamic acoustic cues of Greek fricative consonants in word-initial position as a function of voicing (voiced, voiceless), speaker’s gender (female, male), post-fricative vowel [/a/, /e/, /o/, /i/, /u/ (e.g., Fourakis et al., 1999; Botinis, 2011)], and place of articulation (labiodental, dental, alveolar, palatal, velar). In particular, the acoustic analysis included temporal (fricative duration), spectral (spectral peak location, spectral moments, F1 and F2 onset, and locus equations), and amplitude (normalized amplitude) parameters.

Focusing on Greek fricatives is important for various reasons. Most studies on the acoustic characteristics of fricatives have focused on the fricatives of English, particularly American English.
Since there is a gap in the literature regarding research on Greek speech sounds, the present study is important, both from a theoretical and a practical point of view. Besides, Greek fricatives are articulated using most of the places of articulation in the oral cavity, with


TABLE I. The words used in the experiment (Greek spelling, phonetic transcription, English translation).

| Fricative | /a/ | /e/ | /i/ | /o/ | /u/ |
|---|---|---|---|---|---|
| /f/ | φάτα, ˈfata, ‘eat them’ | φέτα, ˈfeta, ‘feta cheese’ | φίδι, ˈfiði, ‘snake’ | φόλα, ˈfola, ‘poison-ball’ | Φούλα, ˈfula, ‘Fula’ (name) |
| /v/ | βάτα, ˈvata, ‘padding’ | Βέτα, ˈveta, ‘Veta’ (name) | βίδα, ˈviða, ‘screw’ | βόλι, ˈvoli, ‘buckshot’ | βούλα, ˈvula, ‘spot’ |
| /θ/ | Θάνο, ˈθano, ‘Thano’ (name) | θέμα, ˈθema, ‘subject’ | θήκη, ˈθici, ‘case’ | θόλο, ˈθolo, ‘dome’ | θούγια, ˈθuja, ‘thuja’ (plant) |
| /ð/ | δάκο, ˈðako, ‘disease affecting olive trees’ | δέμα, ˈðema, ‘parcel’ | δίκη, ˈðici, ‘trial’ | δόλο, ˈðolo, ‘deceit’ | δούλα, ˈðula, ‘slave’ |
| /s/ | σάλι, ˈsali, ‘muffler’ | σέλα, ˈsela, ‘saddle’ | σήτα, ˈsita, ‘screen’ | σώνω, ˈsono, ‘save’ | Σούλα, ˈsula, ‘Sula’ (name) |
| /z/ | ζάλη, ˈzali, ‘dizziness’ | Ζέτα, ˈzeta, ‘Zeta’ (name) | ζήτα, ˈzita, ‘ask’ | ζώνω, ˈzono, ‘belt’ | ζούλα, ˈzula, ‘squeeze’ |
| /ç/ | χιάζω, ˈçazo, ‘mark x’ | χένα, ˈçena, ‘henna’ | χήνα, ˈçina, ‘goose’ | χιόνι, ˈçoni, ‘snow’ | χιούμορ, ˈçumor, ‘humor’ |
| /ʝ/ | γυάλα, ˈʝala, ‘glass jar’ | γέννα, ˈʝena, ‘birth’ | γίνε, ˈʝine, ‘become’ | γιόγκα, ˈʝoga, ‘yoga’ | γιούπι, ˈʝupi, ‘yippee’ |
| /x/ | χάμω, ˈxamo, ‘on the ground’ | - | - | χώμα, ˈxoma, ‘earth’ | χούντα, ˈxuda, ‘junta’ |
| /ɣ/ | γάμο, ˈɣamo, ‘marriage’ | - | - | γόμα, ˈɣoma, ‘eraser’ | γούνα, ˈɣuna, ‘fur’ |

both a voiced and voiceless fricative occurring at each one of these places. This provides the opportunity for a larger segmental inventory compared to other languages. Moreover, the results of this investigation can be used to optimize speech synthesis and recognition software for Greek. Several analysis methods that have been described and used in this study are strongly related to both of these applications and have proven both challenging in general and complicated for fricatives in particular. Studies of disordered speech can also benefit from the outcomes of such analyses in order to identify and classify problems and variations in speech production, as well as changes in speech over time, e.g., after a cochlear implant. Hence, the knowledge of the acoustic characteristics that distinguish Greek fricatives across different speakers and contexts, and are important for their perception, is necessary not only for theoretical purposes but for a number of applications as well.

Most studies on Greek fricatives examine /s/, while there is less information about the acoustics of the other Greek fricatives. Fourakis (1986) reported an average duration of 118 ms for word-initial /s/ and Nicolaidis (2002) 113 ms for inter-vocalic /s/. Botinis et al. (1999) reported 76 ms duration for word-initial /s/, averaged across stressed and unstressed syllables. As far as the duration of the other fricatives is concerned, Fourakis (1986) reported /f/ to be on average 113 ms, /θ/ 114 ms, and /x/ 118 ms. His data also showed that all these fricatives had a shorter duration before /a/, while they were longer before /e/ and longest before /i/. Fricative duration has also been shown to distinguish English voiced from voiceless fricatives in syllable-initial position, with voiceless fricatives having longer noise durations than voiced ones (e.g., Baum and Blumstein, 1987; Behrens and Blumstein, 1988a; Crystal and House, 1988; Jongman, 1989; Jongman et al., 2000).
It has also been shown to differentiate sibilant from non-sibilant fricatives, with the former having significantly longer durations than the latter (e.g., Jongman et al., 2000).

As far as the spectral shape of fricatives is concerned, it has been reported to be determined by the size and shape of the oral cavity in front of the constriction: the longer the anterior cavity, the more well-defined the fricative spectrum (e.g., Stevens, 1998). Hence, the alveolar and palato-alveolar fricatives are characterized by well-defined spectral shapes (spectral peaks at around 4 to 5 and 2.5 to 3 kHz, respectively), while labiodental and dental fricatives display a relatively flat spectrum (e.g., Strevens, 1960; Behrens and Blumstein, 1988a; Maniwa et al., 2009). Although several studies in English fricatives have shown that spectral properties of frication noise can distinguish sibilant /s, z, ʃ, ʒ/ from non-sibilant /f, v, θ, ð/ fricatives, as well as alveolar (/s, z/) from post-alveolar (/ʃ, ʒ/) sibilants (e.g., Hughes and Halle, 1956; Strevens, 1960; Shadle, 1990; Behrens and Blumstein, 1988a; Evers et al., 1998), fricative spectral peak location has been shown to be speaker dependent (Hughes and Halle, 1956) and vowel dependent (Soli, 1981). Jongman et al. (2000) have shown that fricative spectral peak location decreases in frequency as the place of articulation moves further back in the oral cavity, reporting significant differences among all four places of English fricative articulation. Also, according to Jongman et al. (2000) spectral peak location is one of the acoustic parameters distinguishing English fricatives in terms of place of articulation with reasonable classification accuracy.

Spectral moments analysis is a statistical procedure for classifying obstruents, first used by Forrest et al. (1988). Each discrete Fourier transform is treated as a random probability distribution from which the first four moments (center of gravity, variance, skewness, and kurtosis) are computed. The center of gravity corresponds to the mean of the distribution. Standard deviation corresponds to the amount of dispersion of spectral energy around the mean (i.e., whether the energy is concentrated mainly in a small band or spread out over a wide range of frequencies). Skewness indicates whether the distribution is tilted to the low or high frequencies. Positive skewness suggests a negative tilt with a concentration of energy in the lower frequencies.
Negative skewness is associated with a positive tilt and a predominance of energy in the higher frequencies. Kurtosis indicates whether the shape deviates from that of a Gaussian distribution (basically, whether it is more peaked or more flat). Positive kurtosis values indicate a relatively high peakedness (the higher the value, the more peaked the distribution), while negative values indicate a relatively flat distribution. Most studies investigating English fricatives have shown that spectral moments can differentiate /ʃ/ from /s/ in terms of spectral mean (e.g., Nittrouer et al., 1989; Nittrouer, 1995; Tjaden and Turner, 1997; McFarland et al., 1996; Jongman et al., 2000), standard deviation (e.g., Tomiak, 1990), skewness (e.g., Nittrouer, 1995; McFarland et al., 1996), and kurtosis (e.g., Tomiak, 1990; Nittrouer, 1995; McFarland et al., 1996). Tomiak (1990) also reported that /θ/ displays a greater standard deviation, skewness, and kurtosis than /f/. Shadle and Mair (1996) showed that /ʃ/ is characterized by the lowest spectral mean among the English fricatives. Jongman et al. (2000) reported that variance is low for the sibilant fricatives and high for the non-sibilants and that skewness distinguishes all four places of articulation. Classification in terms of spectral moments has yielded poor results for the non-sibilant fricatives and high rates for the sibilants (e.g., Forrest et al., 1988; Tomiak, 1990), while Fox and Nissen (2005) demonstrated that a discriminant function based on adult tokens was generally better at classifying voiceless fricatives produced by male than female speakers, regardless of age.

Previous studies regarding the effect of a vowel’s F2 onset on preceding English fricative consonants have shown that F2 onset is progressively higher as the place of constriction moves back in the oral cavity (e.g., Wilde, 1993; Jongman et al., 2000). The only study on Greek fricative consonants that examined the effect of fricative place of articulation on vowel formants has shown that despite some variability by vowel context, F2 onset values are reliably distinct among the five places of articulation of Greek fricatives (Lee and Malandraki, 2004). The results of this study have indicated highest F2 onset values for palatals, followed by alveolars, dentals, labiodentals, and velars. Another way to examine the spectral properties of the transition from the fricative into the following vowel is through locus equations.
In particular, locus equations are linear regressions of the onset of F2 transitions on their offsets measured in the vowel nucleus (Lindblom, 1963). They are linear regression fits to data points formed by plotting onsets of F2 transitions along the y axis and their corresponding mid-vowel nuclei values along the x axis (e.g., Sussman et al., 1991; Sussman and Shore, 1996). Studies of English fricative consonants agree that the locus equation coefficients, slope and y-intercept, distinguish the labiodental place of articulation (e.g., Wilde, 1993; Fowler, 1994; Sussman, 1994; Jongman et al., 2000). Fowler (1994), however, has shown that they can provide good classification for all fricatives’ place of articulation.

Regarding the amplitude characteristics of fricatives, most studies of English fricative amplitude have focused on voiceless fricatives, revealing that sibilants (/s, ʃ/) have higher amplitude than non-sibilants (/f, θ/), with no differences within each class (e.g., Strevens, 1960; Behrens and Blumstein, 1988a). Shadle (1985) has concluded that the lower teeth act as an obstacle at the noise source of sibilant constriction. Such a configuration results in an increase in turbulence of the airflow, which in turn causes an increase in the sibilant amplitude. More importantly, it produces a strong localized source which in turn excites the transfer function typical of sibilants. It is both the stronger source type and the shape of the transfer function that cause the higher amplitude of sibilants. Non-sibilant fricatives, on the other hand, have no such obstacle, resulting in very low energy levels. Behrens and Blumstein (1988b) and Jongman et al. (2000) measured the difference between the root-mean-square (rms) amplitude of the entire frication noise (in dB) and the average rms amplitude of three consecutive pitch periods at the point of maximum vowel amplitude. The latter found that this fricative “normalized rms amplitude” can differentiate among all four places of fricatives in English and that it is also one of the acoustic parameters that distinguish the fricatives in terms of place of articulation with reasonable classification accuracy.

Thus, the particular aims of the present study were (a) to describe the acoustic characteristics of Greek fricative productions of adults in terms of multiple acoustic parameters (i.e., temporal, spectral, and amplitude); (b) to examine to what extent the acoustic characteristics of fricative productions change as a function of voicing, gender, place of articulation, and vowel context; and (c) to perform discriminant analysis to determine how well the examined acoustic parameters can categorize fricatives in terms of place of articulation and voicing distinction.

II. METHODOLOGY

A. Speakers

Twenty speakers, 10 females and 10 males, aged 20 to 35 years old, produced the experimental material. All were native speakers of Greek, born and raised in Athens, speaking what is commonly called standard Athenian. None of them had any history of speech or hearing disorders.

B. Material

The ten Greek fricative consonants /f/, /v/, /θ/, /ð/, /s/, /z/, /ç/, /ʝ/, /x/, and /ɣ/ were recorded in real, two-syllable words (CVCV) stressed on the first syllable. Each fricative was in an initial position and the following vowel varied over all five Greek vowels /a/, /e/, /o/, /i/, /u/. Words beginning with /x/ and /ɣ/ were followed only by /a/, /o/, /u/, since their allophones, [ç] and [ʝ], appear before front vowels /e/ and /i/. Table I presents all the words used in the experiment. The carrier phrase was “ipa _ ksa’na” (“I said _ again”). Each token was repeated 5 times, yielding a total of 230 tokens per speaker (8 fricatives × 5 vowels × 5 repetitions and 2 fricatives × 3 vowels × 5 repetitions).

C. Procedure

All speakers were recorded in the Phonetics Laboratory of the University of Athens, in a sound treated room, with a high-quality condenser microphone (RODE NT2-A), microphone pre-amp, and a console (SOUNDCRAFT SPIRIT M4) connected with a sound card [AUDIOPHILE 2496 (M-AUDIO)] to a computer. The distance between the speakers’ mouth and the microphone was 15 cm and the angle 45°. All recordings were sampled at 44 kHz (16 bit quantization) and saved directly to hard disk using the “CoolEditPro 2.1” software.

D. Analysis

Analysis was carried out using the Praat (version 5.1.01) and MATLAB software packages.

Fricative and vowel segmentation involved the simultaneous consultation of waveform and wideband spectrogram. Fricative onset was located at the point at which high frequency energy first appeared on the spectrogram and/or at the point at which the number of zero crossings rapidly increased. The offset of voiceless fricatives was located immediately prior to the onset of the vowel’s first pitch period. For the voiced fricatives, the earliest pitch period in the fricative showing a change in the waveform pattern from that seen throughout the fricative was identified. The zero crossing of the preceding pitch period was designated as the end of the voiced fricative (see Yeni-Komshian and Soli, 1981). Fricative duration was defined as the interval between fricative onset and offset. Since absolute duration may have varied as a function of speaking rate, fricative “normalized” duration was examined. Given that the second syllable of the words in the corpus contained different segments (in the words /pa’FV1C2V2/, C2 was realized by either a voiced or voiceless consonant of different manner and place of articulation; V2 was one of the vowels /a, o, i/), normalized duration was defined as the ratio of fricative (F) duration over utterance (/pa’FV1/) duration.

Spectral properties (spectral peak location and spectral moments) of the fricatives were examined over spectra calculated using the multitaper method with seven data tapers (Percival and Walden, 1993; Blacklock, 2004). No preemphasis was used. Spectral peak location was estimated as the frequency of the highest-amplitude peak in spectra calculated from the middle 40 ms of the fricative. Spectral moments were calculated from 40 ms at three different locations in the fricative: onset, middle, and end. Both spectral peak location and spectral moments were estimated in a frequency range between approximately 1.0 and 11 kHz.
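As a rough illustration of how the four spectral moments can be obtained from a single spectral estimate, the following Python sketch treats a power spectrum as a probability distribution over frequency within an analysis band. It is not the analysis code of the present study: the function name `spectral_moments`, its parameters, and the toy spectra are assumptions made for illustration, the spectrum itself (multitaper in the study) is taken as given, and kurtosis is expressed here as excess kurtosis so that a Gaussian shape scores zero.

```python
import math


def spectral_moments(freqs_hz, power, f_lo=1000.0, f_hi=11000.0):
    """Return (mean, sd, skewness, excess kurtosis) of a power spectrum.

    freqs_hz and power are equal-length sequences; only bins with
    f_lo <= f <= f_hi are used (band limits are illustrative defaults).
    """
    band = [(f, p) for f, p in zip(freqs_hz, power) if f_lo <= f <= f_hi]
    total = sum(p for _, p in band)
    if total <= 0:
        raise ValueError("no spectral energy inside the analysis band")

    # Normalize power so that the weights sum to 1 (a probability distribution).
    weights = [(f, p / total) for f, p in band]

    mean = sum(f * w for f, w in weights)                 # first moment
    var = sum((f - mean) ** 2 * w for f, w in weights)    # second central moment
    sd = math.sqrt(var)
    if sd == 0:
        raise ValueError("all energy falls in a single frequency bin")

    # Third and fourth central moments, standardized by the SD.
    skew = sum((f - mean) ** 3 * w for f, w in weights) / sd ** 3
    kurt = sum((f - mean) ** 4 * w for f, w in weights) / sd ** 4 - 3.0
    return mean, sd, skew, kurt


if __name__ == "__main__":
    # Toy three-bin spectra (hypothetical values, not measured data).
    freqs = [2000.0, 3000.0, 4000.0]
    print(spectral_moments(freqs, [1.0, 2.0, 1.0]))  # symmetric spectrum
    print(spectral_moments(freqs, [4.0, 2.0, 1.0]))  # energy weighted to low frequencies
```

In this sketch the second moment is returned as a standard deviation in Hz; squaring it gives the variance. A spectrum weighted toward low frequencies yields positive skewness, in line with the interpretation of skewness given in the Introduction.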
Frequencies below approximately 1000 Hz were excluded, in order to minimize the effect of low frequency acoustic energy, due to voicing. In particular, a minimum frequency value was set for each speaker, equal to the value of their mean fundamental frequency (F0) multiplied by seven.

For both voiced and voiceless fricatives, F1 and F2 were calculated at vowel onset and at the vowel midpoint using the Praat formant tracking algorithm. Specifically, F1 and F2 at vowel onset were obtained using short-term LPC analysis (Burg algorithm), at the first glottal pulse following cessation of the fricative. Similarly, F1 and F2 at vowel nucleus were estimated at the vowel’s midpoint. In addition, wideband spectrograms, and fast Fourier transform (FFT) spectra were also consulted [using a 25 ms full Hamming window (similar to Sussman et al., 1991; Fowler, 1994; Jongman et al., 2000)].

RMS amplitude in dB was measured for the entire portion of each fricative. In order to normalize for intensity differences among speakers, the difference of fricative amplitude minus vowel amplitude (“normalized amplitude”) was calculated, where vowel amplitude was defined as rms amplitude (in dB) averaged over three consecutive pitch periods at the point of maximum vowel amplitude (see Behrens and Blumstein, 1988b; Jongman et al., 2000).

The statistical analysis was carried out using the SPSS 19.0 software package. Four-way analyses of variance (ANOVAs) (voicing × gender × place of articulation × post-fricative vowel) were conducted for all the examined parameters across the analyses, except for locus equations for which a three-way ANOVA was conducted (voicing × gender × place of articulation), since they were averaged across vowel contexts. Finally, a discriminant analysis was carried out, in order to reveal the acoustic parameters that could categorize the fricatives in terms of place of articulation and voicing.

III. RESULTS

The results of the present study are presented first for the temporal parameters, followed by spectral, formant transition, and amplitude parameters.

A. Temporal parameters

1. Normalized duration

Table II shows mean fricative and utterance duration and their ratio for each fricative and each place of articulation.

TABLE II. Mean absolute duration for each fricative and word (in ms) and mean normalized duration for each fricative and each place of articulation.

| Place of articulation | Fricative | Fricative duration (ms) | Utterance duration (ms) | Normalized duration | Mean normalized duration (place of articulation) |
|---|---|---|---|---|---|
| Labiodental | /v/ | 89.1 | 399.3 | 0.223 | 0.259 |
| | /f/ | 118.4 | 400.7 | 0.295 | |
| Dental | /ð/ | 85.1 | 397.1 | 0.215 | 0.248 |
| | /θ/ | 114.2 | 407.2 | 0.281 | |
| Alveolar | /z/ | 94.8 | 413.2 | 0.230 | 0.270 |
| | /s/ | 129.6 | 417.6 | 0.311 | |
| Palatal | /ʝ/ | 82.0 | 406.2 | 0.203 | 0.246 |
| | /ç/ | 121.9 | 423.8 | 0.289 | |
| Velar | /ɣ/ | 70.4 | 408.7 | 0.173 | 0.253 |
| | /x/ | 121.4 | 409.8 | 0.296 | |

A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel) with normalized duration as the dependent variable revealed a main effect of voicing [F(1, 828) = 2551.285, p < 0.0001, η² = 0.616], indicating voiceless fricatives were significantly longer (0.294) than voiced ones (0.212). Place of articulation was also a significant factor [F(4, 828) = 40.124, p < 0.0001, η² = 0.039]. Bonferroni post hoc tests indicated that differences were significant between fricatives of all places of articulation, except for the dental and palatal ones. Sibilants exhibited the longest normalized duration (0.270), followed by labiodentals (0.259), velars (0.253), dentals (0.248), and palatals (0.246). Absolute fricative duration showed a similar pattern for the two longest classes, although the order of the remaining places differed: alveolars (112.2 ms) > labiodentals (103.75 ms) > palatals (101.95 ms) > dentals (99.65 ms) > velars (95.9 ms). The factor of following vowel was significant as well [F(4, 828) = 96.539, p < 0.0001, η² = 0.093]. Differences between all vowels were significant, except for that between the high vowels, /i/ and /u/ (fricatives before /a/: 0.229 < /e/: 0.243 < /o/: 0.253 < /i/: 0.276, /u/: 0.269).

Place of articulation interacted significantly with voicing [F(4, 828) = 28.124, p < 0.0001, η² = 0.027]; it was revealed that the difference between normalized duration of voiced and voiceless fricatives was smallest in the cases of dental and labiodental place of articulation and largest in the case of velar fricatives. A place × gender interaction [F(4, 828) = 2.946, p < 0.05, η² = 0.003] indicated that although differences were not significant, female speakers produced the front articulated fricatives longer than male speakers and male speakers produced the alveolar, palatal, and velar fricatives longer than female ones. Place of articulation also interacted significantly with post-fricative vowel [F(14, 828) = 4.555, p < 0.0001, η² = 0.015], revealing that, while fricatives before /i/ were either longer than or as long as fricatives before /u/, this was not the case for labiodental fricatives, which exhibited longer duration before /u/.
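The relation between the absolute and normalized durations can be checked with a few lines of Python. The sketch below only re-enters the mean values printed in Table II (with the fricatives written in IPA); the dictionary layout and the helper names `normalized_duration` and `mean_absolute_duration` are illustrative assumptions, not part of the study's analysis.

```python
# Mean durations in ms transcribed from Table II: (fricative, utterance).
TABLE_II = {
    "labiodental": {"v": (89.1, 399.3), "f": (118.4, 400.7)},
    "dental": {"ð": (85.1, 397.1), "θ": (114.2, 407.2)},
    "alveolar": {"z": (94.8, 413.2), "s": (129.6, 417.6)},
    "palatal": {"ʝ": (82.0, 406.2), "ç": (121.9, 423.8)},
    "velar": {"ɣ": (70.4, 408.7), "x": (121.4, 409.8)},
}


def normalized_duration(fricative_ms, utterance_ms):
    """Ratio of fricative duration to utterance duration."""
    return fricative_ms / utterance_ms


def mean_absolute_duration(place):
    """Average fricative duration (ms) over the two fricatives of a place."""
    durations = [fric for fric, _ in TABLE_II[place].values()]
    return sum(durations) / len(durations)


if __name__ == "__main__":
    for place, fricatives in TABLE_II.items():
        ratios = {
            symbol: round(normalized_duration(fric, utt), 3)
            for symbol, (fric, utt) in fricatives.items()
        }
        print(place, round(mean_absolute_duration(place), 2), ratios)
```

Averaging the two fricatives at each place reproduces the absolute durations quoted in the text (e.g., 112.2 ms for alveolars). Dividing the tabulated means gives values close to, but not always identical with, the reported normalized durations (70.4/408.7 is about 0.172, against 0.173 reported for the voiced velar); this would be expected if the reported ratios were averaged token by token, although that reading is an inference rather than something the text states.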

B. Spectral parameters

1. Spectral peak location

Figure 1 shows the values of spectral peak location (in Hz), averaged across vowels and speakers’ gender, as a function of place of articulation and voicing. Figure 2 shows the values of spectral peak location (in Hz) averaged across voiced and voiceless tokens and speakers’ gender, as a function of place of articulation and following vowel. A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel) with spectral peak location as the dependent variable revealed a main effect of place of articulation [F(4, 828) = 68.049, p < 0.0001, η² = 0.183]. Bonferroni post hoc tests indicated that spectral peak location differentiated fricatives of all places of articulation, except for the dentals from alveolars, as well as the labiodentals from palatals. In general, dental fricatives were characterized by the highest spectral peak values (5178 Hz),

FIG. 1. Spectral peak location in Hz (averaged across vowels and speakers’ gender), as a function of place of articulation and voicing.

FIG. 2. Spectral peak location in Hz (averaged across voiced-voiceless tokens and speakers’ gender), as a function of place of articulation and following vowel.

followed by alveolars (5145 Hz), labiodentals (4492 Hz), palatals (4149 Hz), and velars (2377 Hz). Voicing had a significant effect on spectral peak location [F(1, 828) = 33.936, p < 0.0001, η² = 0.023]; voiced fricatives were characterized by spectral peaks at lower frequency (4090 Hz) than voiceless ones (4776 Hz). Gender also had a significant effect on spectral peak location [F(1, 828) = 94.843, p < 0.0001, η² = 0.064], indicating that fricatives produced by females had spectral peaks at higher frequency (4947 Hz) than fricatives produced by males (3919 Hz). A main effect was also obtained for following vowel [F(4, 828) = 2.485, p < 0.05, η² = 0.007]. Post hoc tests indicated significant differences between fricatives followed by /a/ and fricatives followed by /e/, /i/, and /u/. Fricative spectral peak was highest before /i/: 4744 Hz > /e/: 4651 Hz > /u/: 4524 Hz > /o/: 4315 Hz > /a/: 4036 Hz.

A voicing × place interaction [F(4, 828) = 18.274, p < 0.0001, η² = 0.049] revealed that significant differences in the spectral peak values were observed between voiceless dental and labiodental fricatives and their voiced counterparts, contrary to the fricatives of the other places of articulation (Fig. 1). A place × vowel interaction [F(14, 828) = 7.543, p < 0.0001, η² = 0.071] was also obtained. Spectral peaks of dental, labiodental, and velar fricatives before back vowels /o/ and /u/ were higher than before the other vowels, contrary to the alveolar and palatal fricatives (Fig. 2). Post-fricative vowel also interacted significantly with voicing [F(4, 828) = 2.603, p < 0.05, η² = 0.007], revealing that voiceless fricatives had spectral peaks at higher frequency than voiced fricatives before /a/, /e/, /o/, and /i/, but at almost the same frequency before /u/. Finally, gender interacted significantly with voicing [F(1, 828) = 4.635, p < 0.05, η² = 0.003], revealing a larger difference between spectral peak frequencies of voiced and voiceless fricatives for male than female speakers.

2. Spectral moments

Table III presents the mean spectral moment values for each place of articulation, averaged across window location, speaker’s gender, voiced and voiceless tokens, and vowel context.

TABLE III. Mean spectral moment values for each place of articulation, averaged across window location, speaker’s gender, voiced and voiceless tokens, and vowel context (velar place includes only back vowel contexts).

| Place of articulation | Mean (Hz) | Variance (Hz) | Skewness | Kurtosis |
|---|---|---|---|---|
| Labiodental | 4931 | 2694 | 0.615 | 0.448 |
| Dental | 5230 | 2657 | 0.519 | 0.813 |
| Alveolar | 5482 | 1465 | 0.629 | 1.794 |
| Palatal | 4625 | 1573 | 1.486 | 3.590 |
| Velar | 3397 | 1913 | 1.962 | 7.924 |

A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel) across window locations, with four moments as dependent variables, revealed a main effect of place of articulation on spectral mean [F(4, 828) = 145.681, p < 0.0001, η² = 0.256]. According to post hoc tests all places were differentiated by first moment (alveolar > dental > labiodental > palatal > velar). Place of articulation had also a main effect on variance [F(4, 828) = 749.589, p < 0.0001, η² = 0.745]; differences among all places were significant except for that between labiodentals and dentals. Place of articulation was also significant for skewness [F(4, 828) = 64.305, p < 0.0001, η² = 0.327], although post hoc tests showed that skewness did not differentiate labiodentals, alveolars, and dentals. A main effect of place of articulation on kurtosis was obtained as well [F(4, 828) = 65.185, p < 0.0001, η² = 0.194]. Post hoc tests showed that kurtosis failed to differentiate labiodentals from dentals, as well as alveolars from dentals.

Voicing had a significant effect on first [F(1, 828) = 22.899, p < 0.0001, η² = 0.01], third [F(1, 828) = 22.095, p < 0.0001, η² = 0.012], and fourth spectral moment [F(1, 828) = 27.176, p < 0.0001, η² = 0.02]. Voiced fricatives were characterized by lower spectral mean values (4722 Hz) than voiceless ones (4976 Hz), higher values of spectral skewness (1.060 vs 0.865), and higher values of kurtosis (3.189 vs 1.767), revealing that the spectra of the former had a concentration of energy in lower frequencies and better defined peaks than those of the latter.

Gender had a significant effect on spectral mean [F(1, 828) = 310.024, p < 0.0001, η² = 0.136], indicating higher values for female than male speakers (5285 vs 4413 Hz, respectively) and thus showing that the spectra of female speakers had a concentration of energy toward higher frequencies. It was also significant for spectral variance [F(1, 828) = 5.218, p < 0.05, η² = 0.001], indicating higher values and more dispersed energy for male (2097 Hz) than female productions (2048 Hz). A main effect of gender was also obtained for skewness [F(1, 828) = 75.437, p < 0.0001, η² = 0.042] and kurtosis [F(1, 828) = 51.221, p < 0.0001, η² = 0.038]. Fricatives produced by females were characterized by lower values of skewness (0.775 vs 1.150) and kurtosis (1.495 vs 3.462) than the ones produced by males; hence, males’ fricative spectra had a concentration of energy in lower frequencies, and better defined peaks than females’ spectra.

Post-fricative vowel was a significant factor for first [F(4, 828) = 5.803, p < 0.0001, η² = 0.01], second [F(4, 828) = 9.626, p < 0.0001, η² = 0.01], third [F(4, 828) = 13.320, p < 0.0001, η² = 0.03], and fourth moments [F(4, 828) = 5.924, p < 0.0001, η² = 0.018]. Post hoc tests indicated that spectral mean did not differentiate fricatives before the front vowels, /e/ and /i/, before the back vowels /o/ and /u/ and before /a/ and /o/. Spectral variance differentiated fricatives before /a/ and /e/, /a/ and /i/, /a/ and /u/, as well as before /o/ and /u/. Skewness and kurtosis differentiated fricatives before /a/ and all the other vowels.

A place × voicing interaction was obtained for spectral mean [F(4, 828) = 22.789, p < 0.0001, η² = 0.04], skewness [F(4, 828) = 12.978, p < 0.0001, η² = 0.029], and kurtosis [F(4, 828) = 5.164, p < 0.0001, η² = 0.015]. Moreover, place of articulation interacted significantly with gender in the case of the first spectral moment [F(4, 828) = 4.383, p < 0.005, η² = 0.008]. A place × vowel interaction was also observed for all spectral moments: first [F(14, 828) = 20.516, p < 0.0001, η² = 0.126], second [F(14, 828) = 6.727, p < 0.0001, η² = 0.023], third [F(14, 828) = 5.849, p < 0.0001, η² = 0.045], and fourth [F(14, 828) = 3.928, p < 0.0001, η² = 0.041]. An interaction between voicing and gender was evident in the cases of the first [F(1, 828) = 26.114, p < 0.0001, η² = 0.011], second [F(1, 828) = 4.294, p < 0.05, η² = 0.001], third [F(1, 828) = 17.745, p < 0.0001, η² = 0.01], and fourth [F(1, 828) = 19.833, p < 0.0001, η² = 0.015] spectral moments.
Voicing interacted significantly with vowel in the cases of the first [F(4, 828) = 4.876, p < 0.005, η² = 0.009] and third [F(4, 828) = 4.153, p < 0.005, η² = 0.009] spectral moments. Finally, a place × voicing × gender interaction was also obtained for the first [F(4, 828) = 3.939, p < 0.005, η² = 0.007], third [F(4, 828) = 4.928, p = 0.001, η² = 0.011], and fourth [F(4, 828) = 4.551, p = 0.001, η² = 0.013] spectral moments. Table IV presents the effect of the examined factors as well as the interactions among them for each spectral moment (M1–M4). One-way ANOVAs for place of articulation and subsequent Bonferroni post hoc tests were conducted for each moment at each window location. Figures 3–6 present the average spectral moments (spectral mean, variance, skewness, kurtosis) for each window location, as a function of place of articulation. Spectral mean (Fig. 3) distinguished most places of articulation at the first (fricative onset) and second (fricative middle) window locations. Alveolar fricatives had the highest spectral mean and velar fricatives the lowest. Variance (Fig. 4) distinguished most places at the second window location. Variance was highest for fricatives articulated front in the oral cavity (labiodentals and dentals) and lowest for alveolars and palatals; fricatives articulated in the back oral


TABLE IV. The effect of the examined factors as well as the interactions among them for each spectral moment (M1–M4) (* = significant, n/s = not significant).

Factor                        M1     M2     M3     M4
Place                         *      *      *      *
Voicing                       *      n/s    *      *
Gender                        *      *      *      *
Vowel                         *      *      *      *
Place × voicing               *      n/s    *      *
Place × gender                *      n/s    n/s    n/s
Place × vowel                 *      *      *      *
Voicing × gender              *      *      *      *
Voicing × vowel               *      n/s    *      n/s
Gender × vowel                n/s    n/s    n/s    n/s
Place × voicing × gender      *      n/s    *      *
Place × voicing × vowel       n/s    n/s    n/s    n/s
Place × gender × vowel        n/s    n/s    n/s    n/s
Voicing × gender × vowel      n/s    n/s    n/s    n/s

cavity (velars) fell in between. Skewness (Fig. 5) distinguished most places at the first and second window locations. Skewness was highest for fricatives articulated back in the oral cavity (palatals and velars), indicating their concentration of energy in the lower frequencies. Kurtosis (Fig. 6) distinguished most places at the second and third (fricative end) window locations. Kurtosis was also highest for fricatives articulated back in the oral cavity (palatals and velars), indicating a spectrum with more clearly defined peaks, and lowest for fricatives articulated front in the oral cavity

FIG. 3. Spectral mean in Hz (averaged across vowels, voiced-voiceless tokens, and speakers’ gender) for each fricative location, as a function of place of articulation.


FIG. 4. Spectral variance in Hz (averaged across vowels, voiced-voiceless tokens, and speakers’ gender) for each fricative location, as a function of place of articulation.

(labiodentals and dentals), indicating a relatively flat spectrum. Alveolars’ kurtosis values fell in between. In general, across moments, window location 2 (middle) contained the most distinctive information regarding the fricative place of articulation.
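The four spectral moments discussed above can be illustrated with a minimal sketch that treats a normalized power spectrum as a probability distribution over frequency. This is not the analysis code used in the study: the function name `spectral_moments`, the toy spectra, and the choice to report excess kurtosis (3 subtracted) are assumptions made here for illustration, and the section does not state which kurtosis convention or windowing was applied.

```python
import math


def spectral_moments(freqs_hz, power):
    """Return (mean, variance, skewness, excess kurtosis) of one spectrum.

    freqs_hz: centre frequency of each spectral bin, in Hz.
    power:    non-negative power in each bin (same length as freqs_hz).
    """
    if len(freqs_hz) != len(power):
        raise ValueError("freqs_hz and power must have the same length")
    total = sum(power)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    # Normalize so the bin weights sum to 1, like a probability distribution.
    weights = [p / total for p in power]

    mean = sum(f * w for f, w in zip(freqs_hz, weights))
    variance = sum((f - mean) ** 2 * w for f, w in zip(freqs_hz, weights))
    if variance == 0:
        raise ValueError("all energy lies in a single bin")
    sd = math.sqrt(variance)
    skewness = sum((f - mean) ** 3 * w for f, w in zip(freqs_hz, weights)) / sd ** 3
    # Assumption: excess kurtosis (3 subtracted), so a Gaussian shape gives 0.
    kurtosis = sum((f - mean) ** 4 * w for f, w in zip(freqs_hz, weights)) / sd ** 4 - 3.0
    return mean, variance, skewness, kurtosis


# Hypothetical toy spectra (not data from the study).
freqs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0]
symmetric = spectral_moments(freqs, [1.0, 2.0, 4.0, 2.0, 1.0])
low_heavy = spectral_moments(freqs, [6.0, 3.0, 1.0, 0.5, 0.5])
print(symmetric)
print(low_heavy)
```

In this sketch, a spectrum whose energy is concentrated in the lower frequencies yields a lower spectral mean and a positive skewness, which matches the interpretation of high skewness given above for palatals and velars.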

FIG. 5. Spectral skewness (averaged across vowels, voiced-voiceless tokens, and speakers’ gender) for each fricative location, as a function of place of articulation.

FIG. 6. Spectral kurtosis (averaged across vowels, voiced-voiceless tokens, and speakers’ gender) for each fricative location, as a function of place of articulation.

C. Formant transition parameters

1. Formant onset values

Table V presents F1 and F2 onset values for male and female speakers and for each place of articulation, averaged across voiced and voiceless tokens, and vowel contexts.

TABLE V. Mean F1 and F2 onset values (Hz) (averaged across voiced and voiceless tokens, and vowels) as a function of place of articulation and speaker gender (M for male, F for female) (velar place includes only back vowel contexts).

                         F1 onset (Hz)           F2 onset (Hz)
Place of articulation    M      F      Mean      M      F      Mean
Labiodental              457    510    484       1271   1472   1371
Dental                   434    509    472       1383   1645   1514
Alveolar                 414    457    436       1490   1767   1629
Palatal                  354    403    378       1879   2294   2087
Velar                    465    538    501       992    1155   1074

a. F1 onset. A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel), with F1 onset as the dependent variable, revealed a main effect of place of articulation [F(4, 828) = 143.289, p < 0.0001, η² = 0.134]. Bonferroni post hoc tests indicated that F1 onset values did not differentiate the labiodental from the dental place of articulation. A main effect of voicing [F(1, 828) = 154.030, p < 0.0001, η² = 0.036] was obtained, since F1 onset values after voiceless fricatives (471 Hz) were significantly higher than those after voiced ones (429 Hz). As expected, there was a main effect of gender [F(1, 828) = 300.486, p < 0.0001, η² = 0.07]; F1 onset was significantly higher for female (479 Hz) than for male (421 Hz) speakers. There was also a main effect of vowel [F(4, 828) = 507.444, p < 0.0001, η² = 0.475]: F1 onset was 350 Hz in the context of /i/, 387 Hz before /u/, 467 Hz before /e/, 458 Hz before /o/, and 572 Hz before /a/. Post hoc tests indicated that F1 onset values significantly increased as a function of decreasing vowel height. All differences among vowels were significant except that between /e/ and /o/. A gender × vowel interaction [F(4, 828) = 16.356, p < 0.0001, η² = 0.015] and post hoc tests revealed that the difference between males’ and females’ F1 onset values was larger in the cases of /a/, /e/, and /i/, and smaller in the cases of the back vowels, /o/ and /u/. A voicing × vowel interaction [F(4, 828) = 29.887, p < 0.0001, η² = 0.028] revealed that while F1 onset was significantly higher for voiceless fricatives before /a/, /e/, and /o/, the difference was not significant before the high vowels /i/ and /u/. A place × vowel interaction [F(14, 828) = 9.843, p < 0.0001, η² = 0.032] was observed as well, mainly due to the significant difference of the labiodental fricatives’ F1 onset values before /i/ and /u/. Place of articulation also interacted significantly with gender [F(4, 828) = 3.990, p < 0.005, η² = 0.004], revealing that the difference between F1 onset values of male and female speakers was larger in the case of the dental place of articulation. A gender × voicing interaction [F(1, 828) = 5.430, p < 0.05, η² = 0.001] indicated that the difference between F1 onset values of male and female speakers was larger in the case of voiceless fricatives.

b. F2 onset. A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel), with F2 onset as the dependent variable, revealed a main effect of place of articulation [F(4, 828) = 717.027, p < 0.0001, η² = 0.345]. Bonferroni post hoc tests indicated that F2 onset values differentiated all places of articulation. Differences were significant between all paired comparisons.
Voicing was also significant for F2 onset [F(1, 828) = 5.143, p < 0.05, η² = 0.001], indicating higher values for voiced (1591 Hz) than voiceless fricatives (1559 Hz). As expected, there was a main effect of gender [F(1, 828) = 684.971, p < 0.0001, η² = 0.082]; F2 onset was significantly higher for female (1711 Hz) than for male speakers (1439 Hz). A main effect of vowel [F(4, 828) = 848.073, p < 0.0001, η² = 0.408] was obtained: F2 onset was 2146 Hz in the context of /i/, 1850 Hz before /e/, 1532 Hz before /a/, 1260 Hz before /o/, and 1256 Hz before /u/. Post hoc tests indicated that F2 onset values were higher for front vowels compared to back vowels. All differences among vowels were significant except that between /o/ and /u/. A gender × place interaction [F(4, 828) = 14.262, p < 0.0001, η² = 0.007] and post hoc tests indicated the largest differences between males’ and females’ F2 onset values for the palatal place of articulation. A gender × vowel interaction [F(4, 828) = 12.758, p < 0.0001, η² = 0.006] revealed larger differences between males’ and females’ F2 onset values for /a/, /e/, and /i/ and smaller ones for the back vowels /o/ and /u/. A voicing × place interaction [F(4, 828) = 15.159, p < 0.0001, η² = 0.007] was also obtained; voiced fricatives exhibited higher F2 onset values in the cases of the alveolar, palatal, and velar places of articulation and lower in the cases of the labiodental and dental ones. A voicing × vowel interaction [F(4, 828) = 15.460, p < 0.0001, η² = 0.007] revealed higher F2 onset values for voiced fricatives than voiceless ones before /a/, /o/, and /u/ and lower before /e/ and /i/. A place × vowel interaction [F(14, 828) = 19.260, p < 0.0001, η² = 0.032] was observed as well, mainly due to the fact that the dental and alveolar fricatives exhibited almost the same F2 onset values before /e/ and /i/.

2. Locus equations

Locus equation scatterplots were generated for all subjects. Slope and y-intercept values were derived for each fricative for each speaker, averaged across vowel context. Table VI presents slope and y-intercept values for each place of articulation for females and males, averaged across voiced and voiceless tokens and all vowel contexts. A three-way ANOVA (place × gender × voicing) for slope revealed a main effect of place of articulation [F(4, 180) = 88.118, p < 0.0001, η² = 0.601]. Post hoc tests indicated that the slope value did not differentiate alveolars, labiodentals, and dentals. There was also a main effect for voicing [F(1, 180) = 37.252, p < 0.0001, η² = 0.064]; slope values were significantly higher for voiceless (0.763) than for voiced (0.623) fricatives. A main effect was also obtained for gender [F(1, 180) = 4.608, p < 0.05, η² = 0.008], revealing higher slope values for male (0.717) than female speakers (0.668). A three-way ANOVA (place of articulation × speaker gender × voicing) for the y-intercept revealed a main effect for place of articulation [F(4, 180) = 205.656, p < 0.0001, η² = 0.748]; subsequent post hoc tests revealed that the only places of articulation the y-intercept did not differentiate were the labiodental and the dental. A main effect was also obtained for gender [F(1, 180) = 26.915, p < 0.0001, η² = 0.024], indicating that the y-intercept was significantly higher for female (656 Hz) than male (491 Hz) speakers. The voicing distinction had a main effect on the y-intercept [F(1, 180) = 50.182, p < 0.0001, η² = 0.046], indicating that the y-intercept values were significantly higher for voiced (686 Hz) than for voiceless (461 Hz) fricatives. No significant interaction between the examined factors was observed either for the slope or for the y-intercept values.
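As an illustration of how a slope and y-intercept can be derived, the sketch below fits a straight line by ordinary least squares, assuming the usual locus equation formulation in which F2 at vowel onset is regressed on F2 of the following vowel. The function name `fit_locus_equation` and the formant values are hypothetical and are not taken from the study, whose exact measurement points are described elsewhere in the paper.

```python
def fit_locus_equation(f2_vowel_hz, f2_onset_hz):
    """Least-squares fit of: F2 onset = slope * F2 vowel + intercept."""
    n = len(f2_vowel_hz)
    if n != len(f2_onset_hz) or n < 2:
        raise ValueError("need two equally long lists with at least 2 points")
    mean_x = sum(f2_vowel_hz) / n
    mean_y = sum(f2_onset_hz) / n
    sxx = sum((x - mean_x) ** 2 for x in f2_vowel_hz)
    if sxx == 0:
        raise ValueError("F2 vowel values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(f2_vowel_hz, f2_onset_hz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # in Hz
    return slope, intercept


# Hypothetical values for one fricative and one speaker across four vowels.
# They are constructed to lie exactly on the line onset = 0.5 * vowel + 900.
f2_vowel = [900.0, 1300.0, 1800.0, 2300.0]
f2_onset = [1350.0, 1550.0, 1800.0, 2050.0]
slope, intercept = fit_locus_equation(f2_vowel, f2_onset)
print(slope, intercept)
```

Under this formulation, a slope near 1 means that F2 onset follows the vowel closely, while a low slope means that F2 onset stays near a fixed value regardless of the vowel.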

TABLE VI. Mean slope and y-intercept (in Hz) (averaged across voiced and voiceless tokens and vowels) as a function of place of articulation and speaker gender (M for male, F for female) (velar place includes only back vowel contexts).

                         Slope                      y-intercept (Hz)
Place of articulation    M       F       Mean       M      F      Mean
Labiodental              0.725   0.600   0.662      305    506    405
Dental                   0.743   0.631   0.687      341    589    465
Alveolar                 0.607   0.563   0.585      647    807    727
Palatal                  0.419   0.466   0.442      1221   1439   1330
Velar                    1.093   1.081   1.087      −60    −62    −61

D. Amplitude parameters

1. Normalized amplitude

Table VII shows the mean amplitude values for each fricative and following vowel (in dB), as well as the normalized amplitude values for each fricative and each place of articulation. Figure 7 shows the values of normalized amplitude averaged across vowels and speakers’ gender, as a function of place of articulation and voicing. Figure 8 shows the values of normalized amplitude averaged across vowels and places of articulation, as a function of speakers’ gender and voicing. A four-way ANOVA (voicing × gender × place of articulation × post-fricative vowel), with normalized amplitude as the dependent variable, revealed a main effect of place of articulation [F(4, 828) = 248.210, p < 0.0001, η² = 0.376]. Bonferroni post hoc tests indicated that normalized amplitude did not differentiate the labiodental, dental, and velar places of articulation. A main effect of voicing [F(1, 828) = 537.907, p < 0.0001, η² = 0.204] indicated that voiced fricatives had a significantly greater amplitude relative to the following vowel, i.e., a smaller normalized amplitude value (12.2), than voiceless ones (17.1) (Fig. 7). Gender was also significant for normalized amplitude [F(1, 828) = 4.554, p < 0.05, η² = 0.002]. Fricatives produced by females were characterized by greater amplitude relative to the vowel, i.e., smaller normalized amplitude values (14.4), than the ones produced by males (14.9). Post-fricative vowel had a significant effect on normalized amplitude [F(4, 828) = 3.709, p = 0.005, η² = 0.006]; post hoc tests indicated that normalized amplitude differentiated fricatives before /a/ and /i/ and fricatives before /o/ and /i/. A voicing × place interaction [F(4, 828) = 47.347, p < 0.0001, η² = 0.072] and subsequent post hoc tests indicated that although the difference between voiced and voiceless fricatives was significant for all non-sibilant fricatives, this was not the case for sibilant (alveolar) fricatives, /s/ and /z/, which exhibited almost the same normalized amplitude (Fig. 7). Voicing also interacted significantly with speaker’s gender [F(1, 828) = 9.115, p < 0.005, η² = 0.003]; post hoc tests revealed that the difference between female and male normalized amplitude was evident in the case of voiceless fricatives (Fig. 8). A place × vowel interaction [F(14, 828) = 2.069, p < 0.05, η² = 0.011] indicated that the difference in normalized amplitude of fricatives before /a/ and /i/, as well as before /o/ and /i/, was apparent in the cases of the labiodental and palatal places of articulation.

TABLE VII. Mean rms amplitude for each fricative and vowel and mean normalized amplitude for each fricative and each place of articulation.

Fricative   Fricative amplitude (dB)   Vowel amplitude (dB)   Normalized amplitude   Mean normalized amplitude (place of articulation)
/v/         60.5                       74.3                   13.8                   17.1 (labiodental)
/f/         53.3                       73.8                   20.5
/ð/         61.2                       74.1                   12.9                   16.7 (dental)
/θ/         52.8                       73.3                   20.5
/z/         64.8                       73.5                   8.7                    8.6 (alveolar)
/s/         64.4                       72.8                   8.4
/ʝ/         62.0                       74.2                   12.3                   14.9 (palatal)
/ç/         55.6                       73.2                   17.6
/ɣ/         59.8                       74.0                   14.2                   16.8 (velar)
/x/         54.6                       73.9                   19.3

FIG. 7. Normalized amplitude (averaged across vowels, and speakers’ gender) as a function of place of articulation and voicing.

FIG. 8. Normalized amplitude (averaged across vowels, and places of articulation) as a function of speakers’ gender and voicing.

E. Discriminant analysis

In order to quantify how well the acoustic parameters examined above serve as place descriptors for Greek fricative consonants, a discriminant analysis was carried out. All the acoustic parameters examined above, except for the locus equation coefficients, which were not derived from individual productions, were used as predictor variables for place of articulation category. For the spectral moments, all fricative locations were entered. A stepwise linear discriminant analysis was conducted with 17 predictors (normalized duration, spectral peak frequency mid-fricative, 4 spectral moments × 3 time-windows within the fricative, normalized amplitude, F1 onset, F2 onset). The number of cases used was 4600 (8 consonants × 5 vowels × 20 speakers × 5 repetitions + 2 consonants × 3 vowels × 20 speakers × 5 repetitions). After cross-validation, the overall percent correct classification across the five groups was 67.7%. Of the total cases (4600), 3112 were correctly classified. The classification scores were clearly highest for the alveolar and palatal places of articulation (91.9% and 83.2%, respectively). Classification errors rarely crossed these categories, which were mainly confused with dental fricatives (5.8% and 9.3%, respectively). Labiodentals and velars were mostly confused with each other and dentals were mostly confused with labiodentals. Cross-validated classification scores for each place of articulation are shown in Table VIII. The standardized canonical discriminant function coefficients suggested that normalized amplitude, spectral variance at fricative midpoint, F2 onset, and spectral mean at fricative onset were the main parameters used for fricative classification. A subsequent discriminant analysis with those predictors yielded an overall classification rate of 60.7%. Classification rates for this analysis were as follows: labiodental: 51.9%, dental: 44.8%, alveolar: 91.2%, palatal: 67.9%, and velar: 38.7%. Furthermore, when the data were split into voiced and voiceless subgroups, the cross-validated classification accuracy was 67.7% for voiced and 83.2% for voiceless fricatives (Table IX). Spectral variance at fricative midpoint, F2 onset, and spectral mean at fricative onset served most for fricative classification in both categories, while normalized rms amplitude was also a main parameter for voiceless fricative classification.
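The case count and the overall classification rate reported above can be checked with a few lines of arithmetic. The sketch below only reproduces the numbers stated in the text; the helper name `design_cell_count` is a hypothetical label introduced here.

```python
def design_cell_count(consonants, vowels, speakers, repetitions):
    """Number of tokens in a fully crossed recording design."""
    return consonants * vowels * speakers * repetitions


# Eight fricatives in five vowel contexts plus two fricatives in three
# vowel contexts, each produced five times by twenty speakers.
total_cases = design_cell_count(8, 5, 20, 5) + design_cell_count(2, 3, 20, 5)

correctly_classified = 3112  # value reported in the text
overall_rate = 100.0 * correctly_classified / total_cases

print(total_cases)
print(round(overall_rate, 1))
```

The two printed values correspond to the 4600 cases and the 67.7% overall cross-validated rate given in the paragraph above.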
TABLE VIII. Predicted group membership (%) in terms of fricative place of articulation. Classification is based on a stepwise linear discriminant analysis with all acoustic parameters as predictors. Diagonal percentages indicate correct classification rates. Overall correct classification was 67.7%.

                         Predicted group membership
Place of articulation    Labiodental   Dental   Alveolar   Palatal   Velar
Labiodental              61.5          10.0     1.6        12.4      14.5
Dental                   20.8          48.7     6.1        18.3      6.1
Alveolar                 0.7           5.8      91.9       1.2       0.4
Palatal                  2.7           9.3      4.8        83.2      0
Velar                    35.8          15.5     3.3        2.2       43.2

TABLE IX. Predicted group membership (%) in terms of voiced and voiceless fricative place of articulation. Classification is based on a stepwise linear discriminant analysis with all acoustic parameters as predictors. Diagonal percentages indicate correct classification rates. Overall correct classification was 67.7% for voiced and 83.2% for voiceless fricatives.

                                      Predicted group membership
            Place of articulation     Labiodental   Dental   Alveolar   Palatal   Velar
Voiced      Labiodental               45.6          23.8     1.0        13.6      16.0
            Dental                    35.8          32.4     2.2        18.2      11.4
            Alveolar                  0             0        99.8       0         0.2
            Palatal                   1.6           5.0      0.2        93.0      0.2
            Velar                     23.0          5.0      2.3        2.3       67.3
Voiceless   Labiodental               73.2          2.6      1.2        8.2       14.8
            Dental                    0.2           84.0     3.0        12.6      0.2
            Alveolar                  0.8           2.4      90.2       5.4       1.2
            Palatal                   0.4           4.4      3.8        91.4      0
            Velar                     15.3          7.0      3.0        1.3       73.3

Similarly, a discriminant analysis was carried out in order to quantify how well the acoustic parameters examined above serve as voicing descriptors for Greek fricative consonants. The acoustic parameters used as predictors and the number of cases were the same as before. After cross-validation, the overall percent correct classification across the two groups was 94.7%. Of the total cases (4600), 4355 were correctly classified. The classification score for voiced fricatives was 94.3% and for voiceless fricatives 95.1%. The standardized canonical discriminant function coefficients suggested that normalized duration and normalized amplitude were the main parameters used for fricative classification. A subsequent discriminant analysis with those predictors yielded an overall classification rate of 89.0%. Hence, normalized duration and normalized amplitude served to distinguish the fricatives in terms of voicing with high classification accuracy. Classification rates for this analysis were 89.6% for voiced and 88.5% for voiceless fricatives.

IV. DISCUSSION

Several acoustic parameters were investigated in the present study with the aim of describing the acoustic characteristics of fricatives as produced by native speakers of Greek. According to the results, spectral and formant transition acoustic parameters provide the most important information for Greek fricatives’ place of articulation, while temporal and amplitude acoustic parameters provide the most important information for Greek fricatives’ voicing. In particular, first and second spectral moments are among the parameters that distinguish fricatives in terms of place of articulation with reasonable classification accuracy. However, only the mean first spectral moment distinguished all places of articulation. Across moments, a comparison of window locations suggests that window location 2 (fricative middle) contains the most distinctive information for fricative place of articulation. Most studies of the spectral moments of the English fricatives have focused on the distinction between /ʃ/ and /s/. The Greek phonetic system (the common Athenian dialect) does not include post-alveolar sibilants; hence, direct comparisons are difficult to make. Moreover, while the place of articulation of the English /s/ is alveolar or dento-alveolar, the place of articulation of the Greek /s/ is alveolar ranging to advanced post-alveolar (e.g., Nicolaidis, 1994), and this may partly account for its spectral characteristics. However, our results agree with those reported by Tomiak (1990) and Jongman et al. (2000) in terms of the spectral variance, in that it fails to differentiate


only the dental from labiodental place of articulation. They also agree with the results reported by Shadle and Mair (1996), Jongman et al. (2000), and Nissen and Fox (2005) in that variance is a more reliable indicator of fricative place of articulation than skewness, as well as with Shadle and Mair (1996) in that variance is large for the labiodental and dental fricatives. Although skewness did not distinguish fricatives in terms of place of articulation with reasonable classification accuracy, it did differentiate two groups of them: fricatives articulated at or anterior to the alveolar ridge (labiodentals, dentals, and alveolars) from fricatives articulated posterior to the alveolar ridge (palatals and velars). This result agrees with those of Jongman et al. (2000) and, for voiceless fricatives, Nissen (2003), who also reported significantly higher values for post-alveolar fricatives compared to alveolar and anterior ones. Regarding kurtosis, in our study, it was substantially higher for velar fricatives /x, ɣ/ at all window locations than for all other fricatives. Furthermore, the peakedness (high kurtosis values) of alveolar fricative spectra observed elsewhere in the literature (Tomiak, 1990; Jongman et al., 2000; Nissen, 2003) was not observed in our results. Normalized amplitude did not differentiate the labiodental, dental, and velar places of articulation. However, it did differentiate sibilants from non-sibilants, which has also been reported in many previous studies on English fricatives (e.g., Strevens, 1960; Behrens and Blumstein, 1988a,b; Jongman et al., 2000; Nissen and Fox, 2005). F1 onset did not distinguish the labiodental, dental, and velar places of articulation, while F2 onset distinguished all five Greek fricatives’ places of articulation. This latter result is in accord with the results of Lee and Malandraki (2004), which demonstrate that all five Greek fricatives’ places of articulation can be distinguished clearly by the F2 onset values.
Moreover, in both studies the results indicate highest F2 values for palatals, followed by alveolars, dentals, labiodentals, and velars. At this point, it is important to note that formant values for velar fricatives are not applicable for all vowel contexts, but only for back vowels, which is clearly reflected in the results. F2 onset has been reported not to distinguish the English fricative place of articulation (e.g., Jongman et al., 2000). However, the onset of F2 at the fricative-vowel boundary has been shown to be significantly higher for dental fricatives than for labiodental ones (Jongman et al., 2000; Nittrouer, 2002). The present results also indicated that apart from the third and fourth spectral moments, spectral peak location and fricative duration do not distinguish fricative place of articulation with adequate classification accuracy. Spectral peak location has been shown to distinguish English sibilants from non-sibilants (e.g., Hughes and Halle, 1956; Strevens, 1960; Behrens and Blumstein, 1988a; Shadle, 1990; Shadle and Mair, 1996; Jongman et al., 2000). This was not the case for Greek fricatives, since high spectral peak values were observed not only in alveolar fricatives but also in labiodental and dental voiceless fricatives. Normalized duration distinguishes the alveolar (sibilants) from all other places of articulation. English sibilant

fricatives have also been reported to be longer than non-sibilants (e.g., Behrens and Blumstein, 1988a; Jongman et al., 2000). Both slope and y-intercept of locus equations distinguish Greek palatal and velar fricatives from the other places of articulation. In terms of slope, palatals exhibit the lowest and velars the highest values, while in terms of y-intercept, palatals exhibit the highest and velars the lowest values. However, most studies on English fricatives report that the two coefficients distinguish the English labiodental fricatives from the other categories (e.g., Wilde, 1993; Sussman, 1994; Jongman et al., 2000), while others report that locus equation coefficients can provide good classification for all English fricative places of articulation (e.g., Fowler, 1994). The present results also indicated that voiced fricatives of all places of articulation are considerably shorter than voiceless ones (23%). This result is in agreement with previous results reported for English fricative consonants (e.g., Baum and Blumstein, 1987; Behrens and Blumstein, 1988a; Crystal and House, 1988; Jongman, 1989; Jongman et al., 2000; Nissen, 2003). Both normalized duration and normalized amplitude served as voicing descriptors for Greek fricative consonants with high classification accuracy. The distinction between voiced and voiceless fricatives in terms of normalized amplitude has also been reported for English fricatives (e.g., Jongman et al., 2000). However, no such distinction in normalized amplitude has been found in the case of sibilant fricatives. Moreover, F1 onset differentiated voiced from voiceless fricatives, with voiceless fricatives having significantly higher F1 onset values than voiced ones. This distinction is known for stop consonants and has been investigated extensively (e.g., Liberman et al., 1957; Benkí, 2001). Nevertheless, this distinction was not evident for fricatives before the high vowels, /i/ and /u/.
Voiced fricatives are also characterized by lower spectral peak, spectral mean, and slope values in locus equations than voiceless ones, as well as by higher values of skewness, kurtosis, and y-intercept in locus equations. Studies of English fricatives also suggest that voiced fricatives are characterized by lower spectral peak values, but less well defined peaks (lower values of kurtosis), compared to their voiceless counterparts (Jongman et al., 2000). In the present study, fricatives produced by females were shown to have significant differences from those produced by males, in terms of most of the examined factors. Regarding duration, fricatives produced by males were significantly longer than those produced by females, which agrees with previous studies reported for the English fricatives (e.g., Crystal and House, 1988; Jongman, 1989; Jongman et al., 2000). Additionally, spectra of females were characterized by higher spectral peaks, spectral mean, amplitude, and y-intercept values and by lower values of spectral variance, skewness, kurtosis, and slope. English fricative spectra of females have also been characterized by higher peak, mean, and y-intercept values and lower skewness values, but, contrary to the Greek ones, by higher kurtosis values (Jongman et al., 2000). Regarding F1 and F2 onset values, of interest is the fact that, although females exhibited

higher F1 and F2 onset values than males before /a/, /e/, and /i/, this was not the case before the back vowels /o/ and /u/. Examination of the post-fricative vowel indicated that the higher the vowel, the longer the preceding fricative. This result agrees with previous results, reported both for the English (Jongman et al., 2000) and the Greek fricatives (Fourakis, 1986). Besides, intrinsic vowel duration has been shown to correlate with the degree of jaw lowering associated with its production such that the lower the vowel the longer its duration (e.g., Lindblom, 1967), which has been shown to be the case for Greek vowels as well (Fourakis et al., 1999). In addition, the higher the vowel the lower the F1 onset value; the more front the vowel the higher the F2 onset value. This result is in agreement with the data of Fourakis et al. (1999) for Greek vowels’ F1 and F2 at midpoint, following universal tendencies as well. Regarding fricative spectral parameters, post-fricative vowel examination indicated that vowel context also influenced the measured spectral peak location, as well as the first, third, and fourth moments. The results of this study have indicated the complex nature of the speech signal and its relation to phonetic distinctions. As shown for this study of Greek fricatives, fricative place of articulation and voicing distinction are correlated with a variety of acoustic parameters, including temporal, spectral, and amplitude ones. Future research is needed to shed light on the acoustic cues that listeners make use of, and the extent to which the acoustic cues identified here account for the perception of fricative place of articulation and voicing distinction.

ACKNOWLEDGMENTS

This research was supported by a Doctoral Research Grant from the National and Kapodistrian University of Athens. Portions of this research were presented at the 3rd ISCA Workshop on Experimental Linguistics in Athens and at the 17th International Congress of Phonetic Sciences in Hong Kong. The author thanks Marios Fourakis and Antonis Botinis for their helpful comments on the first version of this paper, as well as Ann Todd and Christos Stentoumis for their help with MATLAB coding.

Baum, S. R., and Blumstein, S. E. (1987). “Preliminary observations on the use of duration as a cue to syllable-initial fricative consonant voicing in English,” J. Acoust. Soc. Am. 82, 1073–1077. Behrens, S. J., and Blumstein, S. E. (1988a). “Acoustic characteristics of English voiceless fricatives: A descriptive analysis,” J. Phonet. 16, 295–298. Behrens, S. J., and Blumstein, S. E. (1988b). “On the role of the amplitude of the fricative noise in the perception of place of articulation in voiceless fricative consonants,” J. Acoust. Soc. Am. 84, 861–867. Benkí, J. R. (2001). “Place of articulation and first formant transition pattern both affect perception of voicing in English,” J. Phonet. 29, 1–22. Blacklock, O. S. (2004). “Characteristics of variation in production of normal and disordered fricatives, using reduced-variance spectral methods,” Ph.D. thesis, University of Southampton, Southampton, United Kingdom. Botinis, A. (2011). The Phonetics of Greek (in Greek) (ISEL Editions, Athens), pp. 90–97. Botinis, A., Fourakis, M., and Prinou, I. (1999). “Prosodic effects on segmental durations in Greek,” in Proceedings of the 6th ECSCT, EUROSPEECH 99, Budapest, Vol. 6, pp. 2475–2478. Crystal, T., and House, A. (1988). “Segmental durations in connected speech signals: Current results,” J. Acoust. Soc. Am. 83, 1553–1573.


Evers, V., Reetz, H., and Lahiri, A. (1998). “Crosslinguistic acoustic categorization of sibilants independent of phonological status,” J. Phonet. 26, 345–370.
Forrest, K., Weismer, G., Milenkovic, P., and Dougall, R. N. (1988). “Statistical analysis of word-initial voiceless obstruents: Preliminary data,” J. Acoust. Soc. Am. 84, 115–124.
Fourakis, M. (1986). “A timing model for word-initial CV syllables in modern Greek,” J. Acoust. Soc. Am. 79, 1982–1986.
Fourakis, M., Botinis, A., and Katsaiti, M. (1999). “Acoustic characteristics of Greek vowels,” Phonetica 56, 28–43.
Fowler, C. A. (1994). “Invariants, specifiers, cues: An investigation of locus equations as information for place of articulation,” Percept. Psychophys. 55, 597–610.
Fox, R. A., and Nissen, S. L. (2005). “Sex-related acoustic changes in voiceless English fricatives,” J. Speech Lang. Hear. Res. 48, 753–765.
Hedrick, M. S., and Ohde, R. N. (1993). “Effect of relative amplitude of frication on perception of place of articulation,” J. Acoust. Soc. Am. 94, 2005–2027.
Hughes, G. W., and Halle, M. (1956). “Spectral properties of fricative consonants,” J. Acoust. Soc. Am. 28, 303–310.
Jongman, A. (1989). “Duration of fricative noise required for identification of English fricatives,” J. Acoust. Soc. Am. 85, 1718–1725.
Jongman, A., Wayland, R., and Wong, S. (2000). “Acoustic characteristics of English fricatives,” J. Acoust. Soc. Am. 108, 1252–1263.
Lee, C. Y., and Malandraki, G. A. (2004). “Greek fricatives: Inferring articulation from F2 at vowel onset,” J. Acoust. Soc. Am. 116(4), 2629.
Liberman, A. M., Delattre, P. C., and Cooper, F. S. (1957). “Some cues for the distinction between voiced and voiceless stops in initial position,” J. Acoust. Soc. Am. 29, 1254.
Lindblom, B. (1963). “On vowel reduction,” Report No. 29, Stockholm: Royal Institute of Technology, Speech Transmission Laboratory.
Lindblom, B. (1967). “Vowel duration and a model of lip mandible coordination,” (STL-QPSR) 8(4), 1–29.
Maniwa, K., Jongman, A., and Wade, T. (2009). “Acoustic characteristics of clearly spoken English fricatives,” J. Acoust. Soc. Am. 125, 3962–3973.
McFarland, D. H., Baum, S. R., and Chabot, C. (1996). “Speech compensation to structural modifications of the oral cavity,” J. Acoust. Soc. Am. 100, 1093–1104.
Nicolaidis, K. (1994). “Aspects of lingual articulation in Greek: An electropalatographic study,” in Themes in Greek Linguistics, edited by I. Philippaki Warburton, K. Nicolaidis, and M. Sifianou (John Benjamins, Amsterdam), pp. 225–232.
Nicolaidis, K. (2002). “Durational variability in vowel-consonant-vowel sequences in Greek: The influence of phonetic identity, context and speaker,” in Selected Papers on Theoretical and Applied Linguistics from the 14th International Symposium, edited by M. Makri-Tsilipakou, Thessaloniki, April 20–22, 2000, pp. 280–294.
Nissen, S. L. (2003). “An acoustic analysis of voiceless obstruents produced by adults and typically developing children,” Ph.D. dissertation, Ohio State University.


J. Acoust. Soc. Am., Vol. 135, No. 5, May 2014

Nissen, S. L., and Fox, R. A. (2005). “Acoustic and spectral characteristics of young children’s fricative productions: A developmental perspective,” J. Acoust. Soc. Am. 118, 2570–2578.
Nittrouer, S. (1995). “Children learn separate aspects of speech production at different rates: Evidence from spectral moments,” J. Acoust. Soc. Am. 97, 520–530.
Nittrouer, S. (2002). “Learning to perceive speech: How fricative perception changes, and how it stays the same,” J. Acoust. Soc. Am. 112, 711–719.
Nittrouer, S., Studdert-Kennedy, M., and McGowan, R. S. (1989). “The emergence of phonetic segments: Evidence from the spectral structure of fricative-vowel syllables spoken by children and adults,” J. Speech Lang. Hear. Res. 32, 120–132.
Percival, D. B., and Walden, A. T. (1993). Spectral Analysis for Physical Applications: Multitaper and Conventional Univariate Techniques (Cambridge University Press, Cambridge), pp. 1–583.
Shadle, C. H. (1985). “The acoustics of fricative consonants,” Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA.
Shadle, C. H. (1990). “Articulatory-acoustic relationships in fricative consonants,” in Speech Production and Speech Modeling, edited by W. Hardcastle and A. Marchal (Kluwer, Dordrecht), pp. 187–209.
Shadle, C. H., and Mair, S. J. (1996). “Quantifying spectral characteristics of fricatives,” in Proceedings of the International Conference on Spoken Lang. Proc. (ICSLP), pp. 1521–1524.
Soli, S. D. (1981). “Second formants in fricatives: Acoustic consequences of fricative vowel coarticulation,” J. Acoust. Soc. Am. 70, 976–984.
Stevens, K. N. (1998). Acoustic Phonetics (The MIT Press, Cambridge, MA), pp. 100–116 and 379–412.
Strevens, P. (1960). “Spectra of fricative noise in human speech,” Lang. Speech 3, 32–49.
Sussman, H. M. (1994). “The phonological reality of locus equations across manner class distinctions: Preliminary observations,” Phonetica 51, 119–131.
Sussman, H. M., McCaffrey, H. A., and Matthews, S. A. (1991). “An investigation of locus equations as a source of relational invariance for stop place categorization,” J. Acoust. Soc. Am. 90, 1309–1325.
Sussman, H. M., and Shore, J. (1996). “Locus equations as phonetic descriptors of consonantal place of articulation,” Percept. Psychophys. 58, 936–946.
Tjaden, K., and Turner, G. S. (1997). “Spectral properties of fricatives in Amyotrophic Lateral Sclerosis,” J. Speech Lang. Hear. Res. 40, 1358–1372.
Tomiak, G. R. (1990). “An acoustic and perceptual analysis of the spectral moments invariant with voiceless fricative obstruents,” Doctoral dissertation, SUNY Buffalo.
Wilde, L. (1993). “Inferring articulatory movements from acoustic properties at fricative vowel boundaries,” J. Acoust. Soc. Am. 94, 1881.
Yeni-Komshian, G. H., and Soli, S. D. (1981). “Recognition of vowels from information in fricatives: Perceptual evidence of fricative-vowel coarticulation,” J. Acoust. Soc. Am. 70(4), 966–975.


