Intelligibility of speech under nonexponential decay conditions
B. Yegnanarayana and B. S. Ramakrishna
Acoustics Laboratory, Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore-12, India
(Received 17 December 1974)
In this paper, subjective studies made on nonexponentially decaying sounds are described. The various decay conditions are realized by changing the position of the pickup microphone in a reverberation room with a highly absorbing sample on the floor. It is shown, by means of articulation tests, that the intelligibility of speech is higher close to a highly absorbing sample than away from it. It is also shown that the perception of decay is mainly due to the initial portion of a nonexponential decay. The significance of these studies in determining the acoustics of halls is explained.
Subject Classification: 55.20; 70.35; 55.30.
INTRODUCTION
It is generally agreed that Sabine's reverberation time is not sufficient as a criterion for acoustical quality of a room. This is because the decay of sound in rooms is
usually nonexponential and there is no unique definition of reverberation time for such decays. Several criteria¹⁻⁴ were proposed as alternatives to reverberation time. Almost all of them emphasize the importance of an initial portion of decay in some form or other. However, no systematic investigation appears to have been undertaken of the relation between the proposed criteria and the subjective assessment. The difficulty of such investigations lies in the realization of the various types of nonexponential decays under controlled conditions.
In the subjective assessment of acoustical merits of a room, two aspects are normally considered. They are the intelligibility and the quality of sounds perceived. A quantitative measure of intelligibility of speech may be obtained by counting the number of discrete speech units correctly recognized by a listener. The procedure by which this quantitative measure is obtained is known as an articulation test. In this, typically, an announcer reads
lists of syllables, words, or sentences to a group of listeners, and the percentage of items correctly recorded by these listeners is taken as a measure of the intelligibility of speech. Obviously then, the intelligibility
depends not only on the listeners and the talker but also on several other factors such as the level at which the signal is presented to the listener, the frequency response of the acoustical system, and the level of noise due to both reverberation and extraneous sources.
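The scoring rule described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the word list, the answer sheets, and the function names are hypothetical, and responses are compared by exact spelling, which is a simplification of judging phonetic correctness.

```python
def articulation_score(presented, recorded):
    """Percentage of test items that one listener recorded correctly."""
    if not presented:
        raise ValueError("the test list is empty")
    if len(presented) != len(recorded):
        raise ValueError("each presented item needs one recorded response")
    correct = sum(
        1
        for spoken, written in zip(presented, recorded)
        if spoken.strip().lower() == written.strip().lower()
    )
    return 100.0 * correct / len(presented)


def group_articulation_score(presented, answer_sheets):
    """Mean percentage score over all listeners in the group."""
    scores = [articulation_score(presented, sheet) for sheet in answer_sheets]
    return sum(scores) / len(scores)


# Hypothetical four-word list and two hypothetical answer sheets.
WORDS = ["bat", "ship", "cane", "feed"]
SHEETS = [
    ["bat", "chip", "cane", "feed"],  # 3 of 4 items correct
    ["pat", "ship", "cane", "feet"],  # 2 of 4 items correct
]

if __name__ == "__main__":
    print(group_articulation_score(WORDS, SHEETS))
```

Averaging per-listener percentages, rather than pooling all responses, keeps each listener's contribution equal when the lists have the same length.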
Theoretical and experimental studies⁵⁻⁹ have been made to estimate the intelligibility of speech under exponential decay conditions in a room. Theoretical curves are available for articulation scores as a function of noise level (reverberant and external) and speech level.
The signal or
speech level is calculated as the sum of the energy in the direct sound and the energy due to early reflections within the integration period of the ear. The reverberant noise level is calculated on the basis of the energy in the decaying sound field after the integration period.
For a given signal-to-noise ratio, maximum intelligibility is found to occur when the speech level is in the
range 65-70 dB.⁶
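The split of the decaying sound field into a useful part and a reverberant-noise part can be illustrated with a short sketch. It assumes an idealized, purely exponential energy decay and an integration period of 50 msec; both are assumptions made here for illustration, since the text does not state the integration period, and the function names are hypothetical.

```python
import math


def exponential_energy(rt60_s, fs_hz, duration_s):
    """Sampled energy decay that falls by 60 dB in rt60_s seconds."""
    n = int(duration_s * fs_hz)
    return [10.0 ** (-6.0 * i / (fs_hz * rt60_s)) for i in range(n)]


def signal_to_reverberant_noise_db(energy, fs_hz, integration_s):
    """Ratio (dB) of the energy arriving within the integration period
    (direct sound plus early reflections) to the energy arriving later."""
    split = int(round(integration_s * fs_hz))
    early = sum(energy[:split])
    late = sum(energy[split:])
    if early <= 0.0 or late <= 0.0:
        raise ValueError("both early and late energy must be positive")
    return 10.0 * math.log10(early / late)


FS_HZ = 1000            # sampling rate of the energy envelope (assumed)
INTEGRATION_S = 0.05    # assumed integration period of the ear

RATIO_LONG = signal_to_reverberant_noise_db(
    exponential_energy(1.0, FS_HZ, 3.0), FS_HZ, INTEGRATION_S)
RATIO_SHORT = signal_to_reverberant_noise_db(
    exponential_energy(0.5, FS_HZ, 3.0), FS_HZ, INTEGRATION_S)

if __name__ == "__main__":
    print(round(RATIO_LONG, 2), round(RATIO_SHORT, 2))
```

Under these assumptions a faster decay places more of the energy inside the integration period, so the ratio rises as the reverberation time falls.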
J. Acoust. Soc. Am., Vol. 58, No. 4, October 1975
In an articulation test, the evaluation of intelligibility is based on the number of test items correctly recorded by a listener, and the listener is not required to appraise the quality of the speech. By contrast, methods of subjective appraisal require the listeners to evaluate the quality of speech itself. Subjective evaluation of nonexponential decays has been made using computer-simulated reverberators.² In that experiment, exponentially and nonexponentially reverberated signals were presented in pairs to listeners over headphones at an average level of 70 dB. The subjects were asked which stimulus in each pair sounded more reverberant. By varying the reverberation time of the exponential decay, the subjective reverberation of the nonexponential decay was found as the point where the judgments were equally divided between exponentially and nonexponentially reverberated signals. It was found (for artificially reverberated signals) that the reverberation time of the nonexponential decay calculated on the basis of the first 160 msec of the decay corresponds to subjective reverberation.

In this paper, articulation tests performed under nonexponential decay conditions are described. The influence of initial decay rate on the intelligibility of speech is determined by performing the tests under different decaying states. These states are realized by simply changing the position of the pickup microphone in a reverberation room with a large area of highly absorbing material on the floor. Experiments to verify perception of nonexponential decays are also described.
Finally, the relation between these subjective studies and the decay curves is discussed. Such studies derive their significance from the fact that in listening to sound in auditoria the ear is close to the top surface of a highly sound-absorbing layer which the audience themselves furnish. All the experiments described here, however, correspond to monaural listening only.

Copyright © 1975 by the Acoustical Society of America

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 146.189.194.69 On: Sat, 20 Dec 2014 09:45:32

I. ARTICULATION TESTS UNDER NONEXPONENTIAL DECAY CONDITIONS

A. Articulation test

Since articulation tests are used to evaluate the efficiency of a room as an information channel for spoken
words, the tests should be so performed that the results obtained are influenced, as far as possible, by only the acoustic features of the room under study.¹⁰ Therefore, when studying the properties of this channel it is essential to keep the conditions on the sides of both the speaker and the listener as constant as possible. To achieve this, the test speech material is carefully recorded in an anechoic chamber. It is then played back in the test room and is again recorded. These rerecorded signals, which are modified by the conditions of the test room, are presented to a group of listeners in an anechoic chamber where they estimate the articulation of the test room. The block schematic for the test is shown in Fig. 1.

FIG. 1. Block diagram for the test system. Step 1: talker in anechoic chamber to tape recorder. Step 2: tape recorder to test room (anechoic or reverberation chamber) to tape recorder. Step 3: tape recorder to listeners in anechoic chamber.

Whatever precautions one may take, since human factors are involved, the results of articulation tests are subject to considerable variability. Articulation scores must therefore be considered indicative of the relative merit of the system by comparison with a reference system. In the present case, an anechoic chamber (19 ft × 14 ft × 13 ft) is used as a reference system. A reverberation chamber (25 ft × 20 ft × 14 ft) with a large area of highly absorbing material on the floor is used as the test system. Different decay conditions are realized in the room at different heights from the sample, and by picking up the sound at different levels above the sample we can study the intelligibility of the sound under different decay conditions. Close to the sample a fast initial decay (for the first 10-20 dB) is followed by a slow lateral decay, giving rise to a nonexponential decay. As the microphone is moved away from the sample, the decay tends to be more or less exponential, corresponding to only the later part of the nonexponential decay.

The five different test conditions for which intelligibility tests were carried out are as follows:

A   anechoic chamber (reference system);
B1  reverberation chamber with 240 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept flush with the sample;
B2  reverberation chamber with 240 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept 7 ft. above the sample;
C1  reverberation chamber with 300 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept flush with the sample; and
C2  reverberation chamber with 300 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept 7 ft. above the sample.

The level recorder decay curves for the conditions B1 to C2 are given in Figs. 2-5.

In all these tests, a ½-in. condenser microphone and a 12-in. loudspeaker were used. The microphone was always kept at a distance of 12 ft. from the loudspeaker.

B. Speech material

The speech material selected for this testing was the
Harvard phonetically balanced (PB) test words. These words are monosyllabic in structure and are in common usage. They cover a wide range of difficulty to make them suitable for most types of articulation comparison.
All the eight lists of words given in Beranek's book, Acoustic Measurements,¹¹ were used. The spread of difficulty is approximately the same in each list, and each list has nearly the same average difficulty. Each list contains fifty words, the smallest number that will satisfy the above-mentioned requirements.

FIG. 2. Decay curves under condition B1 (frequency labels: 250, 500, 1000, 2000 Hz).

FIG. 3. Decay curves under condition B2 (frequency labels: 250, 500, 1000, 2000 Hz).
C. Recording the test words

The test words were recorded on a tape recorder in the anechoic chamber. A number of trials preceded the final recording so that the talker could practice reading all the words at nearly the same average level. Each word was read at the end of a carrier sentence like "Please say and write ...," "Now pronounce clearly ...," etc. The carrier sentence helps the talker in adjusting his voice level before the actual word is uttered. Also, it alerts the listeners to concentrate on the word following the sentence. An interval of 15 sec is given between words. This interval is needed because the reverberation time of the room in which the test samples are played sometimes exceeds 10 sec, and the sound due to each word must be allowed to die down before the next one commences.
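The choice of interval can be checked with simple arithmetic. The sketch below assumes a purely exponential decay, for which the level falls 60 dB in one reverberation time; this is an idealization, since the decays studied here are nonexponential, and the function names are hypothetical.

```python
def level_drop_db(pause_s, reverberation_time_s):
    """Level fall (dB) during a pause, for an exponential decay that
    drops 60 dB in one reverberation time."""
    if reverberation_time_s <= 0:
        raise ValueError("reverberation time must be positive")
    return 60.0 * pause_s / reverberation_time_s


def pause_needed_s(target_drop_db, reverberation_time_s):
    """Pause (sec) needed for the level to fall by target_drop_db."""
    return reverberation_time_s * target_drop_db / 60.0


if __name__ == "__main__":
    # 15-sec interval in a room with a 10-sec reverberation time.
    print(level_drop_db(15.0, 10.0))
```

Under this idealization, a 15-sec interval in a room with a 10-sec reverberation time allows the sound of each word to fall by 90 dB before the next word begins.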
The talker in these experiments was the first author himself. The constancy of the average level of utterance of the sentences and words was checked by taking level recordings of the recorded signals. It was found that the fluctuation of average level was within ±2.5 dB throughout. The signal-to-noise ratio of the tape-recorded signals, obtained directly from these level recordings, was about 30 dB. The taped signals were played back in the test room through the 12-in. loudspeaker, which was facing the ceiling. The reverberated signal picked up by the condenser microphone was rerecorded. All the eight lists of words (400 words in total) were recorded under each of the five different test conditions described earlier.
D. Testing scheme

The listening tests were conducted in the anechoic chamber, free from disturbances due to external noise and reverberation, so that the scores depend only upon the recorded words the listeners hear directly from the loudspeaker. The chamber could accommodate six people comfortably at a time and the listeners were all placed along an arc of a circle of 12-ft radius with the loudspeaker as the center. It was ensured that the sound level at the ears of all six listeners was equal. There was no appreciable difference of the spectrum of the speech at different listeners due to directional effects of the loudspeaker. The level was set around 65 dB in most of the tests, as it was found to be the most comfortable level for all the listeners.

The listeners in all these tests were mostly the author's students, who were familiar with his voice, pronunciation, and accent in the classroom. The listeners were asked to concentrate on the word following each carrier sentence and were instructed to write the word only on the basis of the sound they heard. Whenever there was doubt or difficulty in spelling the sound correctly in English, they were asked to write it in a script that was most familiar to them. This was done to ensure that only the phonetic correctness of a word was judged.

The complete testing scheme is illustrated in Table I. The various symbols in the table stand for the test conditions described earlier. For each batch, the order in which the lists were presented is given.

The testing order was arranged somewhat randomly for batch No. 1 to make a detailed statistical analysis by analysis of variance technique to test the hypothesis that the differences in scores observed for various test conditions are only due to inherent differences among the conditions and not due to chance. With all other batches, our interest was only to determine the trend of variation of average scores under different test conditions. The test with each batch was completed on the same day with sufficient break after each list of 50 words.

TABLE I. Testing scheme.

Batch No. (with the number of
listeners in parentheses)        Test condition    List No.
1 (6)                            A                 1
                                 B2                2
                                 B1                7
                                 B2                4
                                 B1                5
                                 A                 8
2 (6)                            B2                1-4
                                 B1                5-8
3 (6)                            C1                1-4
                                 C2                5-8
4 (4)                            A                 1-4
                                 B2                5-8
5 (6)                            B1                1-4
                                 A                 5-8

FIG. 4. Decay curves under condition C1 (frequency labels: 250, 500, 1000, 2000 Hz).

FIG. 5. Decay curves under condition C2 (frequency labels: 250, 500, 1000, 2000 Hz).

E. Results

The results of articulation tests are summarized in Tables II and III.

F. Conclusions

The data given in Table II are multiplied by 2 to get percentage scores and the resulting data are analysed by analysis of variance technique to test the hypothesis that the differences in scores observed for various test conditions are only due to inherent differences among the test conditions. The detailed steps of analysis as given in Chap. 17 of Beranek's book, Acoustic Measurements,¹¹ have been carried out on the data. The results of the analysis of variance are summarized in Table IV.

The results show that the differences in scores obtained by the three systems are statistically significant, whereas all other sources of variation, such as listeners, blocks, and interactions of systems × blocks, blocks × listeners, and listeners × systems, are not effective in determining the differences in scores. Though it is easy to see that system A (listening in the anechoic chamber) is very much superior to systems B2 and B1, it is difficult to compare the merits of systems B2 and B1 between themselves. It can be shown with the aid of a student-t table¹² that the differences obtained between the average scores of B2 and B1 will occur by chance less than one percent of the time. Hence we can conclude that the system B1 is significantly better than the system B2. In other words, in a room with a large area of highly absorbing material on the floor the intelligibility of speech is significantly better closer to the material than away from it.

This analysis confirms the prediction one makes from the decay curves obtained at points corresponding to B1 and B2 (Figs. 2 and 3). The decay is steeper in the initial portion for condition B1, which means that the reverberation is less and hence the intelligibility is higher. The essential point to be emphasized here is the need to position the microphone properly in order to evaluate the acoustical merits of a room. In fact, in certain cases the initial decay almost disappears when the microphone is taken even slightly away from the absorbing sample.¹³ Consequently, one is likely to misjudge the acoustical qualities of a room on the basis of conventional reverberation time measurements made in the room with microphone remote from the audience.

The results in Tables II and III also indicate that the average articulation score is always higher when the microphone is kept close to the sample than away from it. The average scores obtained are slightly higher for larger areas of the material (condition C1 in Table III) with microphone flush with the sample. This may be due to steeper initial fall of the decay of the sound.

If the talker, the electroacoustic system, and the listeners were all perfect, then the scores obtained for anechoic chamber (A) must be nearly 100%. The actual scores reflect the influence of parameters other than the test system. However, in comparison with the scores for the anechoic chamber the scores for all other test conditions are very much lower, showing the effect of reverberation on the intelligibility of speech.
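The comparison between B1 and B2 can be sketched numerically from the batch No. 1 scores of Table II. This is an illustration under stated assumptions, not necessarily the authors' exact procedure: it assumes that the t statistic is formed from the difference of the two average percentage scores and the error mean square of Table IV (38.92), with 12 scores per system and 10 error degrees of freedom (the value expected for a three-factor layout without replication; it is not legible in the table). The critical value 3.169 is the two-sided 1% point of Student's t for 10 degrees of freedom from standard tables.

```python
import math

# Words correct out of 50 for batch No. 1, from Table II
# (block 1 listeners 1-6, then block 2 listeners 1-6).
B2_SCORES = [16, 15, 15, 17, 22, 18, 10, 10, 21, 18, 22, 14]
B1_SCORES = [23, 23, 19, 19, 20, 23, 24, 20, 16, 21, 23, 20]

ERROR_MEAN_SQUARE = 38.92  # Table IV, in percentage-score units
ERROR_DF = 10              # assumption, see the text above
T_CRITICAL_1_PERCENT = 3.169  # two-sided 1% point for 10 degrees of freedom


def mean_percentage(scores):
    """Average percentage score; each entry is out of 50 words."""
    return 2.0 * sum(scores) / len(scores)


def t_statistic(scores_a, scores_b, error_mean_square):
    """Difference of two system means over its standard error."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems need the same number of scores")
    difference = mean_percentage(scores_a) - mean_percentage(scores_b)
    standard_error = math.sqrt(2.0 * error_mean_square / len(scores_a))
    return difference / standard_error


T_B1_VS_B2 = t_statistic(B1_SCORES, B2_SCORES, ERROR_MEAN_SQUARE)

if __name__ == "__main__":
    print(round(T_B1_VS_B2, 2), T_B1_VS_B2 > T_CRITICAL_1_PERCENT)
```

Under these assumptions the statistic exceeds the 1% critical value, which agrees with the statement that such a difference would arise by chance less than one percent of the time.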
TABLE II. Articulation scores of batch No. 1. Each entry is the correct score for 50 words.

          Block 1 (List Nos. 1, 2, and 7)    Block 2 (List Nos. 8, 4, and 5)    Average
          Listener                           Listener                           percentage
System    1    2    3    4    5    6         1    2    3    4    5    6         score
A         38   39   37   38   35   35        41   44   41   37   41   41        77.8
B2        16   15   15   17   22   18        10   10   21   18   22   14        33.0
B1        23   23   19   19   20   23        24   20   16   21   23   20        41.85
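The main-effect sums of squares reported in Table IV can be recomputed from the Table II scores. The sketch below is an illustration with hypothetical function names: it converts the scores to percentages (multiplying by 2) and computes only the sums of squares for systems, listeners, and blocks; the interaction and error terms are not recomputed, and the F ratio uses the error mean square 38.92 quoted from Table IV.

```python
# Table II scores: SCORES[system][block][listener], words correct out of 50.
SCORES = {
    "A": [[38, 39, 37, 38, 35, 35], [41, 44, 41, 37, 41, 41]],
    "B2": [[16, 15, 15, 17, 22, 18], [10, 10, 21, 18, 22, 14]],
    "B1": [[23, 23, 19, 19, 20, 23], [24, 20, 16, 21, 23, 20]],
}
ERROR_MEAN_SQUARE = 38.92  # quoted from Table IV, not recomputed here


def main_effect_sums_of_squares(scores):
    """Sums of squares of percentage scores for each main factor."""
    # One record per observation: (system, block, listener, percentage).
    records = [
        (system, block, listener, 2.0 * value)
        for system, blocks in scores.items()
        for block, row in enumerate(blocks)
        for listener, value in enumerate(row)
    ]
    grand_mean = sum(r[3] for r in records) / len(records)

    def effect(index):
        # Group the observations by the level of one factor.
        groups = {}
        for record in records:
            groups.setdefault(record[index], []).append(record[3])
        return sum(
            len(values) * (sum(values) / len(values) - grand_mean) ** 2
            for values in groups.values()
        )

    return {"systems": effect(0), "blocks": effect(1), "listeners": effect(2)}


SS = main_effect_sums_of_squares(SCORES)
# Mean square for systems (2 degrees of freedom) over the error mean square.
F_SYSTEMS = (SS["systems"] / 2) / ERROR_MEAN_SQUARE

if __name__ == "__main__":
    print({name: round(value, 1) for name, value in SS.items()})
    print(round(F_SYSTEMS))
```

The recomputed values agree with the entries of Table IV to within rounding, which also confirms the reading order of the listener columns in Table II.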
TABLE III. Each entry is the correct score for 200 words.

Batch              Listener                                Average    Percentage
No.      System    1      2      3      4      5      6    score      score
2        B2        73     76     49     87     62     62   68.2       34.1
         B1        86     85     70     92     66     88   77.8       38.9
3        C1        85     64     93     92     84     89   84.5       42.3
         C2        74     50     69     77     68     63   66.8       33.4
4        A         175    153    154    159                160.3      80.15
         B2        86     61     64     80                 73         36.5
5        B1        76     76     71     79     76     74   74.8       37.4
         A         169    153    145    158    128    154  151.2      75.6

TABLE V. Reverberation times for different conditions in the reverberation room. (From the decay curves obtained with 0-2-kHz random noise.)

Area of the    R.T. when microphone    R.T. when microphone    R.T. with diffusers and
material       is flush with the       is 7 ft from the        microphone flush with
(sq ft)        material (sec)          material (sec)          the material (sec)
300            1.1                     3.0                     1.0
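The idea of reading a reverberation time from the initial 15-dB fall, as in Table V, can be illustrated on a synthetic decay. The sketch below assumes an idealized two-component energy decay (one fast and one slow exponential); the component reverberation times, the mixing fraction, and the function names are hypothetical and are not taken from the measurements.

```python
import math


def two_slope_level_db(t, rt_fast, rt_slow, slow_fraction):
    """Level (dB re t = 0) of an energy decay made of a fast and a slow
    exponential component, each falling 60 dB in its own time."""
    fast = (1.0 - slow_fraction) * 10.0 ** (-6.0 * t / rt_fast)
    slow = slow_fraction * 10.0 ** (-6.0 * t / rt_slow)
    return 10.0 * math.log10(fast + slow)


def reverberation_time(times, levels_db, upper_db, lower_db):
    """Least-squares slope of the decay between two levels,
    extrapolated to a 60-dB fall."""
    points = [(t, lv) for t, lv in zip(times, levels_db)
              if lower_db <= lv <= upper_db]
    if len(points) < 2:
        raise ValueError("not enough points in the requested level range")
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(lv for _, lv in points) / len(points)
    slope = (sum((t - mean_t) * (lv - mean_l) for t, lv in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    return -60.0 / slope


TIMES = [i / 1000.0 for i in range(2001)]  # 0 to 2 sec in 1-msec steps
# Hypothetical decay: 1-sec fast component, 3-sec slow component.
LEVELS = [two_slope_level_db(t, 1.0, 3.0, 0.05) for t in TIMES]

INITIAL_RT = reverberation_time(TIMES, LEVELS, 0.0, -15.0)
LATE_RT = reverberation_time(TIMES, LEVELS, -25.0, -35.0)

if __name__ == "__main__":
    print(round(INITIAL_RT, 2), round(LATE_RT, 2))
```

For such a decay the two estimates differ widely, which is why a single reverberation time read from the later part of the curve can miss the fast initial portion.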
The importance of this study in the acoustics of auditoria is now evident. Since the audience forms a highly absorptive area in a hall, the decay measured at their ear level reveals the true acoustical quality of the hall. If one measures the decay at a different height, the vital initial part is likely to be missed. In this connection it is interesting to note the following observation of Meyer¹⁴ in the Beethovenhalle concert hall in Bonn: "Though the reverberation time of the hall fits in with its use as a concert hall, one notices a surprisingly high intelligibility of speech."

TABLE IV. Analysis of variance.

                  Degrees of    Sum of       Mean                         Significance
Source of         freedom       squares      square       F (ratio of     level
variance          (df)          (ss)         (variance)   variances)      (%)
Systems (S)       2             13536.2      6768.1       174             1
Listeners (L)     5             88.8         17.76                        >5
Blocks (B)        1             15.9         15.9                         >5
S × L
L × B
S × B
Total
Error                                        38.92

II. PERCEPTION OF NONEXPONENTIAL DECAYS

The sentence, "This is a recording for subjective tests," was recorded in the anechoic chamber. The sentence was played back in the reverberation chamber and rerecorded under the following decay conditions:

R1  nonexponential decay with the microphone kept flush with the absorbing material;
R2  near exponential decay with the microphone kept 7 ft. away from the material; and
R3  exponential decay with diffusers installed in the room and with the microphone kept flush with the material.

Comparison of these recorded sentences was made in pairs in the anechoic chamber. The experiment was repeated for four different areas of the material (300, 240, 180, and 120 sq. ft.).

It was found that subjectively R1 and R3 sounded alike, whereas R2 was more reverberant compared to R1 and R3 for larger areas (300 and 240 sq. ft.) of the material. For smaller areas (180 and 120 sq. ft.) all the three decay conditions R1, R2, and R3 were found to be nearly equally reverberant, which indicates that the microphone position in these cases is less significant for the perception of reverberation. The reverberation times as measured from the initial 15-dB fall are given in Table V for the four areas of the material in the room.

Subjective experiments confirm the fact that the perception of decay is due to the initial portion of a nonexponential decay and hence the sound near a highly absorbing material in a room sounds less reverberant than at a point away from it.

¹R. Thiele, "Richtungsverteilung und Zeitfolge der Schallrückwürfe in Räumen," Acustica 3, 291-302 (1953).
²B. S. Atal, M. R. Schroeder, and G. M. Sessler, "Subjective Reverberation Time and its Relation to Sound Decay," Proc. 5th Int. Congr. Acoust., Liège, 1b, G32 (1965).
³L. L. Beranek and T. J. Schultz, "Some Recent Experiences in the Design and Testing of Concert Halls with Suspended Arrays," Acustica 15, 307-316 (1965).
⁴V. L. Jordan, "Acoustical Criteria for Auditoriums and Their Relation to Model Techniques," J. Acoust. Soc. Am. 47, 408-412 (1970).
⁵L. A. Jeffress, R. N. Lane, and Frank Seay, "Articulation Scores for Two Similar, Reverberant Rooms, One with Polycylindrical Diffusers on Walls and Ceiling," J. Acoust. Soc. Am. 27, 787-788 (1955).
⁶J. P. Lochner and J. F. Burger, "The Intelligibility of Speech Under Reverberant Conditions," Acustica 11, 195-200 (1961).
⁷J. Tolk and V. M. A. Peutz, "Speech Intelligibility in Reverberant Rooms," Proc. 5th Int. Congr. Acoust., Liège, 1a, A52 (1965).
⁸R. H. Bolt and A. D. Macdonald, "Theory of Speech Masking by Reverberation," J. Acoust. Soc. Am. 21, 577-580 (1949).
⁹J. H. Janssen, "A Method for the Calculation of the Speech Intelligibility Under Conditions of Reverberation and Noise," Acustica 7, 305-309 (1957).
¹⁰B. Nordlund, Tor Kihlman, and S. Lindblad, "Use of Articulation Tests in Auditorium Studies," J. Acoust. Soc. Am. 44, 148-157 (1968).
¹¹L. L. Beranek, Acoustic Measurements (Wiley, New York, 1949), Chap. 17.
¹²R. A. Fisher, Statistical Methods for Research Workers (Oliver and Boyd, Edinburgh, 1970), p. 176.
¹³B. Yegnanarayana and C. G. Balachandran, "Studies in a Reverberation Room with a Highly Absorbing Sample," J. Acoust. Soc. Am. 52, 465-470 (1972).
¹⁴E. G. Richardson and E. Meyer, Technical Aspects of Sound (Elsevier, Amsterdam and New York, 1962), Vol. III, p. 322.