Relation between voice-onset time and vowel duration.

Robert F. Port

Department of Linguistics, Indiana University, Bloomington, Indiana 47405

Rosemarie Rotunno

Department of Speech and Hearing Sciences, City University of New York, New York, New York 10036

(Received 26 July 1978; revised 3 April 1979)

As part of an investigation of the temporal implementation rules of English, measurements were made of voice-onset time for initial English stops and the duration of the following voiced vowel in monosyllabic words for New York City speakers. It was found that the VOT of a word-initial consonant was longer before a voiceless final cluster than before a single nasal, and longer before tense vowels than lax vowels. The vowels were also longer in environments where VOT was longer, but VOT did not maintain a constant ratio with the vowel duration, even for a single place of articulation. VOT was changed by a smaller proportion than the following voiced vowel in both cases. VOT changes associated with the vowel were consistent across place of articulation of the stop. In the final experiment, when vowel tensity and final consonant effects were combined, it was found that the proportion of vowel duration change that carried over to the preceding VOT is different for the two phonetic changes. These results imply that temporal implementation rules simultaneously influence several acoustic intervals including both VOT and the "inherent" interval corresponding to a segment, either by independent control of the relevant articulatory variables or by some unknown common mechanism.

PACS numbers: 43.70.Gr, 43.70.Bk, 43.70.Ve

INTRODUCTION

A central problem in phonetics is to discover how the phonological elements from which words are spelled (phonemes and features) are manifested in the physical medium of sound. The mismatch between the digitized segments of linguistic analysis and the finely graded timing of articulatory and acoustic events constitutes one of the most troublesome aspects of this problem (Lisker, 1974b; Lehiste, 1970; Klatt, 1976).

Some investigators treat the complexity of speech timing in roughly the same way coarticulation is often treated, as a kind of noise added to the speech signal by the production device during performance. It is sometimes proposed that physiological and mechanical properties of the articulators provide the main sources of temporal effects (Chomsky and Halle, 1968; Stevens and House, 1972); however, many of the specific hypotheses concerning such mechanical effects have not survived rigorous testing (Lisker and Abramson, 1971; Hirose and Gay, 1972; MacNeilage, 1972; Lisker, 1974a; Raphael, 1975). These results lend support to the notion that the level of phonetic recoding which converts phonological input into instructions to the articulators (Liberman, Cooper, Shankweiler, and Studdert-Kennedy, 1967; Fant, 1969; Ladefoged, 1972) must also include a set of temporal implementation rules that specify the durations of phonetic intervals according to various contextual features (Klatt, 1976).

Although there may be other ways of conceptualizing these effects (Fowler, 1977), Klatt's notion of temporal implementation rules provides a useful framework for the formulation of issues regarding the manifestation of phonological units in the temporal structure of speech. In this paper we shall report some properties of the implementation rules necessary to describe the temporal variable of voice-onset time (VOT) for aspirated word-initial stops as a function of factors known to affect the duration of vowels in English. Context effects on VOT will provide an opportunity to investigate several theoretical issues regarding the nature of the hypothesized temporal implementation rules operating in speech production in English.

Voice-onset time (VOT) is the temporal interval from the release of an initial stop to the onset of glottal pulsing (if acoustic records are examined) or to the closure of the glottis for a following vowel (if articulatory records of speech are examined) (Lisker and Abramson, 1964, 1967; Abramson, 1976). It is known to play a major role in distinguishing initial voiced and voiceless (or lax and tense) stops in English as well as in a number of other languages. In English the voiceless stops /p, t, k/ have positive VOT (with voicing lagging after stop release) greater than around 30 ms. The term aspiration is used to describe the auditory effect of the glottal turbulence accompanying this voicing lag. The voiced stops /b, d, g/ have either negative VOT (with voicing beginning before stop release) or very short positive VOT (Lisker and Abramson, 1964, 1967; Zlatin, 1974). Perceptual studies employing a multidimensional synthetic analog of the VOT continuum have validated the phonological relevance of this articulatory parameter (Lisker and Abramson, 1970; Zlatin, 1974; Summerfield and Haggard, 1976).

Although VOT is clearly a highly important parameter distinguishing voiced from voiceless stops in English, the absolute value of VOT depends not just on the voicing feature of the corresponding stop, but is also highly sensitive to various features of the context in which it occurs (Lisker and Abramson, 1967; Klatt, 1975). Although the main contrast in VOT between voiced and voiceless stops could be said to be due to the phonology directly, the context effects require either temporal implementation rules for their description, or a carefully developed mechanical explanation.

J. Acoust. Soc. Am. 66(3), Sept. 1979

0001-4966/79/090654-09$00.80 © 1979 Acoustical Society of America

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 132.236.27.111 On: Thu, 18 Dec 2014 02:50:57

Assuming that the phonology of a language yields a matrix of segmental features as output, the phonetic temporal implementation rules would convert these into descriptions of the durations of various prominent temporal intervals (such as consonant constriction durations, vowel durations, and VOT) as observed in speech production. Although there have been several attempts to formalize such a system of rules (Lindblom and Rapp, 1973; Nooteboom, 1972), we shall build primarily on the model of Klatt (1976), to be described in more detail below, since it is the most appropriate for description of the effects of segmental features on timing.

There are a number of issues regarding the form of such rules. Presumably the input to the rules is in the form of segments or segmental features as well as certain prosodic properties such as stress and word boundaries. More problematic is the domain of application of the rules. For example, are there rules that partially specify the phonetic duration of a "segment itself" in terms of its feature composition, or instead are "inherent durations" always given in advance with rules applying only to intervals in neighboring context (Klatt, 1976)? Another question regarding rule domains is whether they affect only a single interval at a time, or if a number of temporal intervals may be independently specified for durational modifications. In the latter case, for example, a single rule might modify not only the duration of a vowel, but also the duration of adjacent consonant constrictions or VOT. It has also been proposed that there may be a hierarchy of rule domains nested within each other (Lindblom and Rapp, 1973).

Another issue is the arithmetic form of the temporal changes themselves. There have been data to support rules that involve addition or subtraction of a constant (Chen, 1970), multiplication by a constant (Klatt, 1973, 1976), and the use of power functions (Nooteboom, 1972; Lindblom and Rapp, 1973). Finally, there are questions to be discussed later about the combination of several rules acting on a single interval.
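The three arithmetic forms mentioned above can be contrasted with a small sketch. It is only an illustration: the function names, the 15-ms constant, the 0.8 factor, and the 0.95 exponent are hypothetical values chosen for the example, and the power function is a generic one, not the specific formula of any of the cited authors.

```python
# Illustrative sketch only: three arithmetic forms a temporal rule could take.
# All function names and numeric values here are hypothetical.


def additive_rule(duration_ms: float, constant_ms: float) -> float:
    """Add a fixed number of ms (a negative constant subtracts)."""
    return duration_ms + constant_ms


def multiplicative_rule(duration_ms: float, factor: float) -> float:
    """Scale the interval by a constant proportion."""
    return duration_ms * factor


def power_rule(duration_ms: float, exponent: float) -> float:
    """Generic power function of the input duration (not a published formula)."""
    return duration_ms ** exponent


if __name__ == "__main__":
    # Apply each rule to a shorter and a longer hypothetical interval.
    for base_ms in (60.0, 90.0):
        print(
            base_ms,
            additive_rule(base_ms, -15.0),
            multiplicative_rule(base_ms, 0.8),
            round(power_rule(base_ms, 0.95), 1),
        )
```

The point of the comparison is that an additive rule removes the same number of milliseconds from a short and a long interval, whereas a multiplicative rule removes more from the longer one. Data that separate the two forms therefore need intervals of clearly different inherent length.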

The following experiments derive from the observation that most contexts which affect the duration of vowels have also been found to have a similar effect on the duration of VOT for a preceding aspirated stop. For example, both vowels (Fry, 1955) and VOT (Lisker and Abramson, 1967) are longer in stressed syllables than in unstressed syllables, and longer in one-syllable words than in two-syllable words (Klatt, 1973; Lisker and Abramson, 1967). Increasing speaking tempo shortens both vowels (Peterson and Lehiste, 1960; Port, 1976) and VOT (Summerfield, 1975). There is even indication that aspiration is longer in utterance-final syllables than in nonutterance-final words (Summerfield, 1975), comparable to utterance-final vowel lengthening (Oller, 1973; Lehiste, 1972). Weismer (1979) measured VOT of initial voiceless stops in CVCs before both tense and lax vowels followed by both voiced and voiceless final stops. He found a significant though small effect of both vowel tensity and final consonant voicing on VOT. One question to be explored in this paper is whether these VOT effects can be derived in some way directly from the vowel duration effects that have been observed, or whether the rules must have separate terms to account for segmental context effects on voice-onset time.

If a simple relation, such as constant ratio, holds between the VOT of initial voiceless stops and the duration of the voiced portion of the following vowel, we should predict that any factor that affects the duration of a vowel would also affect preceding VOT by the same proportion, even segmental phonological features. In our first experiment, the voicing and clustering of the postvocalic consonants were varied to see if their expected effects on vowel duration (Peterson and Lehiste, 1960; Chen, 1970) would also be observed for the VOT in a preceding voiceless stop. Since the three voiceless stops /p, t, k/ are known to have different mean VOTs, however, we also want to know if the same temporal relation applies to all three stops.

A methodological problem arises here, as elsewhere in phonetic research, since to collect data we must make measurements of physical events without knowing what kind of physical events (for example, whether articulatory or acoustic) are most appropriate to the language. We measured acoustically defined intervals and propose models to describe the results. Direct measurement of articulatorily defined intervals (e.g., duration of glottal abduction or duration of open oral cavity) might conceivably yield results that imply a rather different formal model.

I. EFFECTS OF POSTVOCALIC CONSONANTS: EXPERIMENT I

A. Methods

A symmetrical set of six monosyllabic test words was prepared beginning with the three voiceless stops /p, t, k/ and ending with either /n/ or /pt/ (in order to maximize the effects of postvocalic consonants on vowel duration). The set consisted of

pin    pipped
tin    tipped
kin    kipped

A pseudorandom list was prepared containing five occurrences of each word along with ten other words to distract attention from the test items. Eight speakers were asked to read the list at a comfortable rate and loudness. Five of the subjects were life-long residents of New York City and students at Brooklyn College; four of this group were female. The other three speakers were male students from the Midwest or South.

The recordings were made in a quiet booth on a TEAC (model A-3340S) tape deck at 7½ ips after the subject was well practiced at reading the list. Wide-band spectrograms were made of each utterance on a Kay (model 6061B) sound spectrograph. A time scale was prepared for measurement from a wide-band spectrogram of a calibration tone. Voice-onset time (VOT) was measured to the nearest 10 ms from the beginning of the release burst on the initial stop to the first visible striation representing glottal pulsing. The duration of the following vowel was measured to the nearest 10 ms from the onset of glottal pulsing to the closure for the following consonant as evidenced by the abrupt cessation of energy in the lower formants. Although in general it is perhaps most appropriate to define vow-

VOT was shorter in words ending with the voiceless cluster /pt/ than in words ending with /n/ (p ...). The differences for /p/, /t/, and /k/ are 17 ms, 12 ms, and 16 ms, respectively, amounting to an average shortening of VOT of 20%. The table also shows that the place of articulation of the stop has its familiar effect on VOT since /p/ < /t/ < /k/. These data replicate Weismer (1979) and indicate that some segmental phonological features that affect vowel duration will also affect the duration of aspiration on a preceding voiceless stop. The results further show that the shortening effect of the final consonants has roughly the same absolute change for the three initial aspirated stops.

TABLE I. Mean voice-onset time and vowel duration (in ms) pooled across subjects for the six test words in experiment 1, with standard deviations in parentheses.

        Voice-onset time          Vowel duration
        -ɪn         -ɪpt          -ɪn          -ɪpt
        m (SD)      m (SD)        m (SD)       m (SD)
p-      64 (11)     47 (10)       167 (21)     ...
t-      73 (13)     61 (13)       170 (27)     ...
k-      90 (11)     74 (13)       169 (23)     ...

A plot of VOT against vowel duration for the six words in Fig. 1 demonstrates the relation between the two. If VOT and vowel duration maintain a constant ratio across the three stops, then all six points should array themselves along a single line radiating from the origin of the graph. If instead each initial stop has its own characteristic ratio, then the pairs of words beginning with the same stop should lie along such lines. Figure 1 displays dashed lines representing constant VOT/vowel duration ratios of 1, ..., and .... From the figure it can be seen first that the vowel duration is greatly affected by the following consonants, since vowels with following /pt/ were 60% of the duration of those with following /n/. This was expected from previous research (Peterson and Lehiste, 1960; Chen, 1970). It can also be seen that although the place of articulation of the initial stop affects VOT, place does not affect the duration of the following vowel in these data. Thus, no overall constant ratio is obtained.

FIG. 1. Mean voice-onset time plotted against mean vowel duration for the six test words in experiment 1. Dashed lines represent VOT/vowel duration ratios of 1, ..., and .... Horizontal axis: vowel duration in ms for /ɪ/.

We must hypothesize, then, a rule that modifies the inherently different VOTs of the three stops. Arithmetically such a rule might subtract 15 ms from the VOT before -pt (or equivalently add 15 ms to the VOT of the -n words), or it might decrease (or increase) VOT by 20%. In an attempt to choose between the additive and multiplicative models, a coefficient of correlation was computed for predicted and observed values of the individual speaker means for the three stops (24 pairs of points) separately for the two models. There was no difference between them (for the additive model, r = 0.893; for the multiplicative model, r = 0.888). The more interesting of the two models, however, is the multiplicative one, since one might try to derive the multiplicative constant for the VOT effect (k = 80%) from the multiplicative constant (Klatt, 1976; Peterson and Lehiste, 1960; Port, 1976) for the vowel duration effect (k = 60%).

This experiment replicates Weismer's (1979) result by showing that even segmental properties, such as voicing, manner, and clustering, that are known to affect preceding vowel duration also affect the duration of the quasivocalic interval of VOT for a preceding aspirated stop. These results further indicate that the proportion by which VOT is affected is considerably smaller than the proportion by which the voiced vowel is affected.

If implementation rules are required to account for the effects of postvocalic consonants on VOT preceding a vowel, will phonological features of the vowel itself have effects on preceding VOT? The next experiment explored this question.
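The additive and multiplicative rules considered for experiment 1 can be checked against the pooled VOT means of Table I with a short sketch. It is an illustration under stated assumptions: the variable names are ours, only the pooled means are used (the correlations reported in the text were computed on individual speaker means, which are not given here, so they are not reproduced), and the 60% vowel factor is the figure quoted in the text.

```python
# Pooled mean VOT in ms from Table I (experiment 1). Variable names are ours.
VOT_BEFORE_N = {"p": 64, "t": 73, "k": 90}    # pin, tin, kin
VOT_BEFORE_PT = {"p": 47, "t": 61, "k": 74}   # pipped, tipped, kipped
VOWEL_FACTOR = 0.60  # vowels before /pt/ were 60% as long as before /n/

# Absolute and proportional change in VOT for each initial stop.
differences = {s: VOT_BEFORE_N[s] - VOT_BEFORE_PT[s] for s in VOT_BEFORE_N}
ratios = {s: VOT_BEFORE_PT[s] / VOT_BEFORE_N[s] for s in VOT_BEFORE_N}

mean_difference = sum(differences.values()) / len(differences)
mean_ratio = sum(ratios.values()) / len(ratios)

# Predicted VOT before /pt/ under three candidate rules.
additive = {s: VOT_BEFORE_N[s] - mean_difference for s in VOT_BEFORE_N}
multiplicative = {s: VOT_BEFORE_N[s] * mean_ratio for s in VOT_BEFORE_N}
# Constant-ratio hypothesis: VOT shrinks by the same factor as the vowel.
constant_ratio = {s: VOT_BEFORE_N[s] * VOWEL_FACTOR for s in VOT_BEFORE_N}

if __name__ == "__main__":
    print("mean difference (ms):", mean_difference)
    print("mean ratio:", round(mean_ratio, 3))
    for s in VOT_BEFORE_N:
        print(
            s,
            "observed", VOT_BEFORE_PT[s],
            "additive", round(additive[s], 1),
            "multiplicative", round(multiplicative[s], 1),
            "constant ratio", round(constant_ratio[s], 1),
        )
```

On these pooled means the additive and multiplicative predictions both fall within a few milliseconds of the observed values, while the constant-ratio prediction is too short for all three stops, which is the pattern the text describes.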
