Intelligibility of speech under nonexponential decay conditions

B. Yegnanarayana and B. S. Ramakrishna

Acoustics Laboratory, Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore-12, India

(Received 17 December 1974)

In this paper, subjective studies made on nonexponentially decaying sounds are described. The various decay conditions are realized by changing the position of the pickup microphone in a reverberation room with a highly absorbing sample on the floor. It is shown, by means of articulation tests, that the intelligibility of speech is greater close to a highly absorbing sample than away from it. It is also shown that the perception of decay is mainly due to the initial portion of a nonexponential decay. The significance of these studies in determining the acoustics of halls is explained.

Subject Classification: 55.20; 70.35; 55.30.

INTRODUCTION

It is generally agreed that Sabine's reverberation time is not sufficient as a criterion for the acoustical quality of a room. This is because the decay of sound in rooms is usually nonexponential and there is no unique definition of reverberation time for such decays. Several criteria¹⁻⁴ have been proposed as alternatives to reverberation time. Almost all of them emphasize the importance of an initial portion of the decay in some form or other. However, no systematic investigation appears to have been undertaken of the relation between the proposed criteria and subjective assessment. The difficulty of such investigations lies in realizing the various types of nonexponential decay under controlled conditions.

In the subjective assessment of the acoustical merits of a room, two aspects are normally considered: the intelligibility and the quality of the sounds perceived. A quantitative measure of the intelligibility of speech may be obtained by counting the number of discrete speech units correctly recognized by a listener. The procedure by which this measure is obtained is known as an articulation test. Typically, an announcer reads lists of syllables, words, or sentences to a group of listeners, and the percentage of items correctly recorded by these listeners is taken as a measure of the intelligibility of speech. The intelligibility therefore depends not only on the listeners and the talker but also on several other factors, such as the level at which the signal is presented to the listener, the frequency response of the acoustical system, and the level of noise due both to reverberation and to extraneous sources.
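As a minimal illustration of the scoring rule described above, the following Python sketch computes the percentage of test items that a listener recorded correctly. The function name `articulation_score` and the example words are hypothetical placeholders and are not taken from the test lists used in this study. Matching is done here on normalized spelling, which is a simplification of judging phonetic correctness.

```python
def articulation_score(presented, recorded):
    """Return the percentage of presented items that were recorded correctly.

    `presented` and `recorded` are equal-length sequences of strings, one
    entry per test item. Comparison ignores case and surrounding spaces.
    """
    if len(presented) != len(recorded):
        raise ValueError("each presented item needs exactly one response")
    if not presented:
        raise ValueError("at least one test item is required")
    correct = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(presented, recorded)
    )
    return 100.0 * correct / len(presented)


# Hypothetical four-word example (not taken from the actual test lists).
presented_words = ["bat", "ship", "cone", "mud"]
listener_responses = ["bat", "chip", "cone", "mud"]
score = articulation_score(presented_words, listener_responses)
print(f"Articulation score: {score:.1f}%")
```

In this made-up example three of the four words match, so the score is 75%.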

Theoretical and experimental studies⁵⁻⁹ have been made to estimate the intelligibility of speech under exponential decay conditions in a room. Theoretical curves are available for articulation scores as a function of noise level (reverberant and external) and speech level. The signal or speech level is calculated as the sum of the energy in the direct sound and the energy due to early reflections within the integration period of the ear. The reverberant noise level is calculated on the basis of the energy in the decaying sound field after the integration period. For a given signal-to-noise ratio, maximum intelligibility is found to occur when the speech level is in the range 65-70 dB.⁶
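The split of the sound energy into a useful part and a reverberant-noise part can be sketched for the simplest case of a purely exponential decay. The Python sketch below assumes that the energy decays by 60 dB in one reverberation time and that everything arriving within the integration period counts as signal. The 50-msec integration period and the function name `signal_to_reverberant_noise_db` are illustrative assumptions, not values or procedures taken from the studies cited above.

```python
import math


def signal_to_reverberant_noise_db(reverberation_time_s, integration_period_s):
    """Ratio (in dB) of energy inside to energy after the integration period.

    Assumes an ideal exponential energy decay of 60 dB per reverberation
    time, normalized so that the total energy is 1.
    """
    if reverberation_time_s <= 0 or integration_period_s <= 0:
        raise ValueError("both times must be positive")
    # Fraction of the total energy that arrives after the integration period.
    late = 10.0 ** (-6.0 * integration_period_s / reverberation_time_s)
    early = 1.0 - late
    return 10.0 * math.log10(early / late)


# Illustrative values only: the 50-msec integration period is an assumption.
for rt in (0.5, 1.0, 2.0):
    ratio = signal_to_reverberant_noise_db(rt, 0.05)
    print(f"RT = {rt:.1f} s -> {ratio:+.1f} dB")
```

Under these assumptions a longer reverberation time leaves less of the energy inside the integration period, so the ratio falls as the reverberation time increases.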

J. Acoust. Soc. Am., Vol. 58, No. 4, October 1975

In an articulation test, the evaluation of intelligibility is based on the number of test items correctly recorded by a listener, and the listener is not required to appraise the quality of the speech. By contrast, methods of subjective appraisal require the listeners to evaluate the quality of the speech itself. Subjective evaluation of nonexponential decays has been made using computer-simulated reverberators.² In that experiment, exponentially and nonexponentially reverberated signals were presented in pairs to listeners over headphones at an average level of 70 dB. The subjects were asked which stimulus in each pair sounded more reverberant. By varying the reverberation time of the exponential decay, the subjective reverberation of the nonexponential decay was found by determining the point where the judgments were equally divided between the exponentially and nonexponentially reverberated signals. It was found (for artificially reverberated signals) that the reverberation time of the nonexponential decay calculated on the basis of the first 160 msec of the decay corresponds to the subjective reverberation.

In this paper, articulation tests performed under nonexponential decay conditions are described. The influence of the initial decay rate on the intelligibility of speech is determined by performing the tests under different decay states. These states are realized by simply changing the position of the pickup microphone in a reverberation room with a large area of highly absorbing material on the floor. Experiments to verify the perception of nonexponential decays are also described.

Finally, the relation between these subjective studies and the decay curves is discussed. Such studies derive their significance from the fact that, in listening to sound in auditoria, the ear is close to the top surface of a highly sound-absorbing layer which the audience themselves furnish. All the experiments described here, however, correspond to monaural listening only.

Copyright © 1975 by the Acoustical Society of America

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 146.189.194.69 On: Sat, 20 Dec 2014 09:45:32

I. ARTICULATION TESTS UNDER NONEXPONENTIAL DECAY CONDITIONS

A. Articulation test

Since articulation tests are used to evaluate the efficiency of a room as an information channel for spoken words, the tests should be so performed that the results obtained are influenced, as far as possible, by only the acoustic features of the room under study.¹⁰ Therefore, when studying the properties of this channel it is essential to keep the conditions on the sides of both the speaker and the listener as constant as possible. To achieve this, the test speech material is carefully recorded in an anechoic chamber. It is then played back in the test room and is again recorded. These rerecorded signals, which are modified by the conditions of the test room, are presented to a group of listeners in an anechoic chamber, where they estimate the articulation of the test room. The block schematic for the test is shown in Fig. 1.

FIG. 1. Block diagram for the test system. Step 1: talker and tape recorder in the anechoic chamber. Step 2: tape recorder, test room (anechoic or reverberation chamber), and tape recorder. Step 3: tape recorder and listeners in the anechoic chamber.

Whatever precautions one may take, since human factors are involved, the results of articulation tests are subject to considerable variability. Articulation scores must therefore be considered indicative of the relative merit of the system by comparison with a reference system. In the present case, an anechoic chamber (19 ft × 14 ft × 13 ft) is used as a reference system. A reverberation chamber (25 ft × 20 ft × 14 ft) with a large area of highly absorbing material on the floor is used as the test system. Different decay conditions are realized in the room at different heights from the sample, and by picking up the sound at different levels above the sample we can study the intelligibility of the sound under different decay conditions. Close to the sample a fast initial decay (for the first 10-20 dB) is followed by a slow lateral decay, giving rise to a nonexponential decay. As the microphone is moved away from the sample, the decay tends to be more or less exponential, corresponding to only the later part of the nonexponential decay.

The five different test conditions for which intelligibility tests were carried out are as follows:

A   anechoic chamber (reference system);
B1  reverberation chamber with 240 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept flush with the sample;
B2  reverberation chamber with 240 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept 7 ft. above the sample;
C1  reverberation chamber with 300 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept flush with the sample; and
C2  reverberation chamber with 300 sq. ft. of 8-in.-thick polyurethane foam on the floor and microphone kept 7 ft. above the sample.

The level recorder decay curves for the conditions B1 to C2 are given in Figs. 2-5.
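The kind of decay described above, a fast initial fall followed by a slower tail, can be mimicked with a sum of two exponential energy decays. The Python sketch below is only an illustration under stated assumptions: the component reverberation times (1 sec and 3 sec), the 99% share of the fast component, and the function names are hypothetical and are not measurements from this room. It fits a straight line to a chosen level range of the decay curve and converts the slope to a reverberation time, to show how strongly the result depends on whether the initial or the later portion is evaluated.

```python
import math


def decay_level_db(t, t_fast, t_slow, fast_fraction):
    """Level (dB re initial level) of a two-component exponential decay."""
    energy = (fast_fraction * 10.0 ** (-6.0 * t / t_fast)
              + (1.0 - fast_fraction) * 10.0 ** (-6.0 * t / t_slow))
    return 10.0 * math.log10(energy)


def reverberation_time_from_range(times, levels, upper_db, lower_db):
    """Least-squares slope between two levels, extrapolated to a 60-dB fall."""
    points = [(t, lv) for t, lv in zip(times, levels) if lower_db <= lv <= upper_db]
    if len(points) < 2:
        raise ValueError("not enough samples in the requested level range")
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_l = sum(lv for _, lv in points) / n
    slope = (sum((t - mean_t) * (lv - mean_l) for t, lv in points)
             / sum((t - mean_t) ** 2 for t, _ in points))  # dB per second
    return -60.0 / slope


# Hypothetical decay: 99% of the energy decays with a 1-sec reverberation
# time, the remaining 1% with a 3-sec reverberation time.
times = [i * 0.001 for i in range(3001)]
levels = [decay_level_db(t, 1.0, 3.0, 0.99) for t in times]

rt_initial = reverberation_time_from_range(times, levels, 0.0, -15.0)
rt_late = reverberation_time_from_range(times, levels, -40.0, -55.0)
print(f"RT from first 15 dB: {rt_initial:.2f} s, RT from later portion: {rt_late:.2f} s")
```

With these assumed values, the fit to the first 15 dB gives a reverberation time close to that of the fast component, while the fit to the later portion gives a value close to that of the slow component.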

In all these tests, a •-in. condenser microphone and a 12-in. loudspeaker were used. The microphone was always kept at a distance of 12 ft. from the loudspeaker.

B. Speech material

The speech material selected for this testing was the Harvard phonetically balanced (PB) test words. These words are monosyllabic in structure and are in common usage. They cover a wide range of difficulty to make them suitable for most types of articulation comparison. All the eight lists of words given in Beranek's book, Acoustic Measurements,¹¹ were used. The spread of difficulty is approximately the same in each list, and each list has nearly the same average difficulty. Each list contains fifty words, the smallest number that will satisfy the above-mentioned requirements.

FIG. 2. Decay curves under condition B1 (250, 500, 1000, and 2000 Hz).

FIG. 3. Decay curves under condition B2 (250, 500, 1000, and 2000 Hz).

C. Recording the test words

The test words were recorded on a tape recorder in the anechoic chamber. A number of trials preceded the final recording, to practice reading all the words at nearly the same average level. Each word was read at the end of a carrier sentence such as "Please say and write ...," "Now pronounce clearly ...," etc. The carrier sentence helps the talker to adjust his voice level before the actual word is uttered. It also alerts the listeners to concentrate on the word following the sentence. An interval of 15 sec is given between words. This interval is needed because the reverberation time of the room in which the test samples are played sometimes exceeds 10 sec, and the sound due to each word must be allowed to die down before the next one commences.
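The adequacy of the 15-sec interval can be checked with a short calculation. The sketch below assumes an ideal exponential decay, that is, a fall of 60 dB in one reverberation time; the function name `level_drop_db` is illustrative.

```python
def level_drop_db(elapsed_s, reverberation_time_s):
    """Decay (dB) after `elapsed_s`, assuming 60 dB per reverberation time."""
    if reverberation_time_s <= 0:
        raise ValueError("reverberation time must be positive")
    return 60.0 * elapsed_s / reverberation_time_s


# A 15-sec pause in a room with a 10-sec reverberation time.
drop = level_drop_db(15.0, 10.0)
print(f"Level drop after the pause: {drop:.0f} dB")
```

For a reverberation time of 10 sec, the pause corresponds to a fall of 90 dB under this assumption.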

The talker in these experiments was the first author himself. The constancy of the average level of utterance of the sentences and words was checked by taking level recordings of the recorded signals. It was found that the fluctuation of the average level was within ±2.5 dB throughout. The signal-to-noise ratio of the tape-recorded signals, obtained directly from these level recordings, was about 30 dB. The taped signals were played back in the test room through the 12-in. loudspeaker, which was facing the ceiling. The reverberated signal picked up by the condenser microphone was rerecorded. All the eight lists of words (400 words in total) were recorded under each of the five different test conditions described earlier.

D. Testing scheme

The listening tests were conducted in the anechoic chamber, free from disturbances due to external noise and reverberation, so that the scores depend only upon the recorded words the listeners hear directly from the loudspeaker. The chamber could accommodate six people comfortably at a time, and the listeners were all placed along an arc of a circle of 12-ft radius with the loudspeaker as the center. It was ensured that the sound level at the ears of all six listeners was equal. There was no appreciable difference in the spectrum of the speech at different listeners due to directional effects of the loudspeaker. The level was set around 65 dB in most of the tests, as it was found to be the most comfortable level for all the listeners.

The listeners in all these tests were mostly the author's students, who were familiar with his voice, pronunciation, and accent in the classroom. The listeners were asked to concentrate on the word following each carrier sentence and were instructed to write the word only on the basis of the sound they heard. Whenever there was doubt or difficulty in spelling the sound correctly in English, they were asked to write it in a script that was most familiar to them. This was done to ensure that only the phonetic correctness of a word was judged.

The complete testing scheme is illustrated in Table I. The various symbols in the table stand for the test conditions described earlier. For each batch, the order in which the lists were presented is given.

TABLE I. Testing scheme.

Batch No. (with the number of
listeners in parentheses)       Test condition   List No.
1 (6)                           A                1
                                B2               2
                                B1               7
                                B2               4
                                B1               5
                                A                8
2 (6)                           B2               1-4
                                B1               5-8
3 (6)                           C1               1-4
                                C2               5-8
4 (4)                           A                1-4
                                B2               5-8
5 (6)                           B1               1-4
                                A                5-8

The testing order was arranged somewhat randomly for batch No. 1 to allow a detailed statistical analysis by the analysis of variance technique, to test the hypothesis that the differences in scores observed for various test conditions are only due to inherent differences among the conditions and not due to chance. With all other batches, our interest was only to determine the trend of variation of average scores under different test conditions. The test with each batch was completed on the same day, with a sufficient break after each list of 50 words.

FIG. 4. Decay curves under condition C1 (250, 500, 1000, and 2000 Hz).

FIG. 5. Decay curves under condition C2 (250, 500, 1000, and 2000 Hz).

E. Results

The results of the articulation tests are summarized in Tables II and III.

F. Conclusions

The data given in Table II are multiplied by 2 to get percentage scores, and the resulting data are analysed by the analysis of variance technique to test the hypothesis that the differences in scores observed for various test conditions are only due to inherent differences among the test conditions. The detailed steps of analysis as given in Chap. 17 of Beranek's book, Acoustic Measurements,¹¹ have been carried out on the data. The results of the analysis of variance are summarized in Table IV.

The results show that the differences in scores obtained by the three systems are statistically significant, whereas all other sources of variation, such as listeners, blocks, and the interactions systems × blocks, blocks × listeners, and listeners × systems, are not effective in determining the differences in scores. Though it is easy to see that system A (listening in the anechoic chamber) is very much superior to systems B2 and B1, it is difficult to compare the merits of systems B2 and B1 between themselves. It can be shown with the aid of a Student's t table¹² that the difference obtained between the average scores of B2 and B1 would occur by chance less than one percent of the time. Hence we can conclude that system B1 is significantly better than system B2. In other words, in a room with a large area of highly absorbing material on the floor, the intelligibility of speech is significantly better closer to the material than away from it.

This analysis confirms the prediction one makes from the decay curves obtained at points corresponding to B1 and B2 (Figs. 2 and 3). The decay is steeper in the initial portion for condition B1, which means that the reverberation is less and hence the intelligibility is greater. The essential point to be emphasized here is the need to position the microphone properly in order to evaluate the acoustical merits of a room. In fact, in certain cases the initial decay almost disappears when the microphone is taken even slightly away from the absorbing sample.¹³ Consequently, one is likely to misjudge the acoustical qualities of a room on the basis of conventional reverberation time measurements made in the room with the microphone remote from the audience.

The results in Tables II and III also indicate that the average articulation score is always higher when the microphone is kept close to the sample than away from it. The average scores obtained are slightly higher for larger areas of the material (condition C1 in Table III) with the microphone flush with the sample. This may be due to the steeper initial fall of the decay of the sound.

If the talker, the electroacoustic system, and the listeners were all perfect, then the scores obtained for the anechoic chamber (A) would be nearly 100%. The actual scores reflect the influence of parameters other than the test system. However, in comparison with the scores for the anechoic chamber, the scores for all other test conditions are very much lower, showing the effect of reverberation on the intelligibility of speech.
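One way to illustrate the comparison between B1 and B2 discussed above is a paired t statistic on the batch No. 1 scores of Table II, pairing the two scores that each listener obtained within the same block. This Python sketch is an assumption about how such a comparison could be made and is not necessarily the procedure the authors followed; the variable names are illustrative.

```python
import math

# Scores out of 50 from Table II (batch No. 1): listeners 1-6 in block 1,
# followed by listeners 1-6 in block 2.
b2_scores = [16, 15, 15, 17, 22, 18, 10, 10, 21, 18, 22, 14]
b1_scores = [23, 23, 19, 19, 20, 23, 24, 20, 16, 21, 23, 20]

# Differences in percentage points (each word counts for 2%).
differences = [2 * (b1 - b2) for b1, b2 in zip(b1_scores, b2_scores)]
n = len(differences)
mean_difference = sum(differences) / n
variance = sum((d - mean_difference) ** 2 for d in differences) / (n - 1)
standard_error = math.sqrt(variance / n)

t_statistic = mean_difference / standard_error
degrees_of_freedom = n - 1
print(f"mean difference = {mean_difference:.2f}%, "
      f"t = {t_statistic:.2f} on {degrees_of_freedom} degrees of freedom")
```

The resulting statistic would then be compared with the tabulated value of Student's t for the chosen significance level and 11 degrees of freedom.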

TABLE II. Articulation scores of batch No. 1. Each entry is the correct score for 50 words.

          Block 1 (List No. 1, 2, and 7)    Block 2 (List No. 8, 4, and 5)    Average
          Listener                          Listener                          percentage
System    1    2    3    4    5    6        1    2    3    4    5    6        score
A         38   39   37   38   35   35       41   44   41   37   41   41       77.8
B2        16   15   15   17   22   18       10   10   21   18   22   14       33.0
B1        23   23   19   19   20   23       24   20   16   21   23   20       41.85
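The main-effect sums of squares in the analysis of variance can be recomputed from the entries of Table II. The Python sketch below is a minimal illustration using the usual totals-based formulas; the variable and function names are illustrative, and the interaction and error terms are not computed here. Scores are first doubled to give percentages.

```python
# Scores out of 50 from Table II: for each system, block 1 then block 2,
# each with listeners 1-6.
raw_scores = {
    "A":  [[38, 39, 37, 38, 35, 35], [41, 44, 41, 37, 41, 41]],
    "B2": [[16, 15, 15, 17, 22, 18], [10, 10, 21, 18, 22, 14]],
    "B1": [[23, 23, 19, 19, 20, 23], [24, 20, 16, 21, 23, 20]],
}

# Convert to percentage scores (50 words per list, so 2% per word).
percent = {
    system: [[2 * score for score in block] for block in blocks]
    for system, blocks in raw_scores.items()
}

n_blocks = 2
n_listeners = 6
n_total = len(percent) * n_blocks * n_listeners
grand_total = sum(sum(sum(block) for block in blocks) for blocks in percent.values())
correction = grand_total ** 2 / n_total  # correction term G^2 / N


def main_effect_ss(totals, entries_per_total):
    """Sum of squares for one factor, computed from its level totals."""
    return sum(t * t for t in totals) / entries_per_total - correction


system_totals = [sum(sum(block) for block in blocks) for blocks in percent.values()]
listener_totals = [
    sum(blocks[b][i] for blocks in percent.values() for b in range(n_blocks))
    for i in range(n_listeners)
]
block_totals = [
    sum(sum(blocks[b]) for blocks in percent.values()) for b in range(n_blocks)
]

ss_systems = main_effect_ss(system_totals, n_blocks * n_listeners)
ss_listeners = main_effect_ss(listener_totals, len(percent) * n_blocks)
ss_blocks = main_effect_ss(block_totals, len(percent) * n_listeners)

print(f"systems: {ss_systems:.1f}, listeners: {ss_listeners:.1f}, blocks: {ss_blocks:.1f}")
```

These values can be compared with the sums of squares listed for systems, listeners, and blocks in Table IV.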


TABLE III. Each entry is the correct score for 200 words.

Batch            Listener                                 Average   Percentage
No.     System   1     2     3     4     5     6          score     score
2       B2       73    76    49    87    62    62         68.2      34.1
        B1       86    85    70    92    66    88         77.8      38.9
3       C1       85    64    93    92    84    89         84.5      42.3
        C2       74    50    69    77    68    63         66.8      33.4
4       A        175   153   154   159                    160.3     80.15
        B2       86    61    64    80                     73        36.5
5       B1       76    76    71    79    76    74         74.8      37.4
        A        169   153   145   158   128   154        151.2     75.6

TABLE V. Reverberation times for different conditions in the reverberation room. (From the decay curves obtained with 0-2-kHz random noise.)

Area of the        R.T. when microphone is         R.T. when microphone is        R.T. with diffusers and microphone
material (sq ft)   flush with the material (sec)   7 ft from the material (sec)   flush with the material (sec)
300                1.1                             3.0                            1.0

TABLE IV. Analysis of variance.

Source of        Degrees of      Sum of         Mean square   F (ratio of    Significance
variance         freedom (df)    squares (ss)   (variance)    variances)     level (%)
Systems (S)      2               13536.2        6768.1        174            1
Listeners (L)    5               88.8           17.76         ...            ...
Blocks (B)       1               15.9           15.9          ...            ...
S × L            ...             ...            ...           ...            ...
L × B            ...             ...            ...           ...            ...
S × B            ...             ...            ...           ...            ...
Total            ...             ...
Error            ...             ...            38.92

The importance of this study in the acoustics of auditoria is now evident. Since the audience forms a highly absorptive area in a hall, the decay measured at their ear level reveals the true acoustical quality of the hall. If one measures the decay at a different height, the vital initial part is likely to be missed. In this connection it is interesting to note the following observation of Meyer¹⁴ in the Beethovenhalle concert hall in Bonn: "Though the reverberation time of the hall fits in with its use as a concert hall, one notices a surprisingly high intelligibility of speech."

II. PERCEPTION OF NONEXPONENTIAL DECAYS

The sentence, "This is a recording for subjective tests," was recorded in the anechoic chamber. The sentence was played back in the reverberation chamber and rerecorded under the following decay conditions:

R1  nonexponential decay with the microphone kept flush with the absorbing material;
R2  near exponential decay with the microphone kept 7 ft. away from the material; and
R3  exponential decay with diffusers installed in the room and with the microphone kept flush with the material.

Comparison of these recorded sentences was made in pairs in the anechoic chamber. The experiment was repeated for four different areas of the material (300, 240, 180, and 120 sq. ft.).

It was found that subjectively R1 and R3 sounded alike, whereas R2 was more reverberant compared to R1 and R3 for larger areas (300 and 240 sq. ft.) of the material. For smaller areas (180 and 120 sq. ft.) all the three decay conditions R1, R2, and R3 were found to be nearly equally reverberant, which indicates that the microphone position in these cases is less significant for the perception of reverberation.

Subjective experiments confirm the fact that the perception of decay is due to the initial portion of a nonexponential decay, and hence the sound near a highly absorbing material in a room sounds less reverberant than at a point away from it. The reverberation times as measured from the initial 15-dB fall are given in Table V for the four areas of the material in the room.

¹R. Thiele, "Richtungsverteilung und Zeitfolge der Schallrückwürfe in Räumen," Acustica 3, 291-302 (1953).

²B. S. Atal, M. R. Schroeder, and G. M. Sessler, "Subjective Reverberation Time and its Relation to Sound Decay," Proc. 5th Int. Congr. Acoust., Liège 1b, G32 (1965).

³L. L. Beranek and T. J. Schultz, "Some Recent Experiences in the Design and Testing of Concert Halls with Suspended Arrays," Acustica 15, 307-316 (1965).

⁴V. L. Jordan, "Acoustical Criteria for Auditoriums and Their Relation to Model Techniques," J. Acoust. Soc. Am. 47, 408-412 (1970).

⁵L. A. Jeffress, R. N. Lane, and Frank Seay, "Articulation Scores for Two Similar, Reverberant Rooms, One with Polycylindrical Diffusers on Walls and Ceiling," J. Acoust. Soc. Am. 27, 787-788 (1955).

⁶J. P. Lochner and J. F. Burger, "The Intelligibility of Speech Under Reverberant Conditions," Acustica 11, 195-200 (1961).

⁷J. Tolk and V. M. A. Peutz, "Speech Intelligibility in Reverberant Rooms," Proceedings of Fifth International Congress on Acoustics, Liège 1a, A52 (1965).

⁸R. H. Bolt and A. D. Macdonald, "Theory of Speech Masking by Reverberation," J. Acoust. Soc. Am. 21, 577-580 (1949).

⁹J. H. Janssen, "A Method for the Calculation of the Speech Intelligibility Under Conditions of Reverberation and Noise," Acustica 7, 305-309 (1957).

¹⁰B. Nordlund, Tor Kihlman, and S. Lindblad, "Use of Articulation Tests in Auditorium Studies," J. Acoust. Soc. Am. 44, 148-157 (1968).

¹¹L. L. Beranek, Acoustic Measurements (Wiley, New York, 1949), Chap. 17.

¹²R. A. Fisher, Statistical Methods for Research Workers (Oliver and Boyd, Edinburgh, 1970), p. 176.

¹³B. Yegnanarayana and C. G. Balachandran, "Studies in a Reverberation Room with a Highly Absorbing Sample," J. Acoust. Soc. Am. 52, 465-470 (1972).

¹⁴E. G. Richardson and E. Meyer, Technical Aspects of Sound (Elsevier, Amsterdam and New York, 1962), Vol. III, p. 322.
