Hearing Research, 61 (1992) 161-166 © 1992 Elsevier Science Publishers B.V. All rights reserved 0378-5955/92/$05.00


HEARES 01774

Detection of inharmonicity in dichotic pure-tone dyads

Laurent Demany a,b and Catherine Semal b

a Laboratoire de Psychoacoustique, Université de Bordeaux II, Bordeaux, France and b Laboratoire d'Audiologie Expérimentale, Unité INSERM 229, Bordeaux, France

(Received 13 November 1991; Revision received and accepted 31 March 1992)

Thresholds for the detection of quasi-sinusoidal frequency ratio fluctuations were measured with stimuli consisting of dichotic dyads of simultaneous pure tones. The two component tones of each dyad were slowly modulated in frequency, in such a way that the ratio of their instantaneous frequencies oscillated (or not) around some standard frequency ratio (SFR). As in a previous study [Demany and Semal (1988) J. Acoust. Soc. Am. 83, 687-695], it was found that smaller oscillations could be detected when the SFR was precisely an octave (2/1) than when it was slightly smaller or larger (2/1 ± 50 or 100 cents). Similar 'harmonicity effects' were obtained here for SFRs in the vicinity of a fifth (3/2), a twelfth (3/1), or a double octave (4/1). However, these harmonicity effects were generally less pronounced than those observed in the vicinity of an octave. Each of our four subjects provided evidence for a central sensitivity to the octave harmonicity, but the same consistency could not be found with respect to other kinds of harmonicity.

Harmonicity perception; Frequency ratios; Dichotic listening; Scene analysis. PACS numbers: 43.66.Jh, 43.66.Rq, 43.75.Cd

Introduction

The vowels of speech, as well as many musical tones, are harmonic complex tones: the frequencies of their spectral components are integer multiples of one and the same audio frequency. In our natural acoustic environment, a markedly inharmonic sound stimulus is generally not produced by a single physical source, but by several sources acting simultaneously, each producing a given harmonic 'sub-stimulus'. Since the segregation of simultaneous sound sources is a crucial auditory task, it is important for the auditory system to discriminate harmonic from inharmonic frequency ratios. And indeed, most human listeners have some ability to do so (Broadbent and Ladefoged, 1957; McAdams, 1984; Moore et al., 1985b, 1986; Demany and Semal, 1988; Beerends and Houtsma, 1989; Carlyon and Stubbs, 1989; Hartmann et al., 1990; Buell and Hafter, 1991; Carlyon, 1991; Demany et al., 1991; Carlyon et al., 1992).

For complex tones with a dense spectrum, the detection of inharmonicity can be mediated by the perception of beats, due to interactions between spectral components within common cochlear filters (Plomp, 1976; Moore et al., 1985b). However, inharmonicity is also detectable in the absence of such interactions, even by quite amusical listeners. A clear demonstration of the latter point was provided by Demany and Semal (1988). They used stimuli consisting of two simultaneous pure tones about one octave apart. The two component tones of each dyad were not liable to cochlear interactions since they were presented at a low sound pressure level (generally 45 dB) and dichotically (i.e., one tone to the left ear, the other tone to the right ear). They were frequency modulated with a 2-Hz sinusoidal function, in phase or not in phase, and just-noticeable phase differences between the two modulation functions (jnd-φ's) were measured. When there was no phase difference, the frequency ratio of the two tones did not vary with time, in spite of each tone's frequency modulation. Introducing a phase difference resulted in a quasi-sinusoidal oscillation of the frequency ratio. Since the amplitude of this oscillation was roughly proportional to the phase difference, measuring a jnd-φ was equivalent to measuring a just-noticeable frequency ratio oscillation around a given central ratio. For each of the four subjects tested (two of whom had no musical education), it was found that the jnd-φ was often markedly smaller when the central ratio was equal to 1200 cents (i.e., 2/1, or one octave) than when it was equal to 1100 or 1300 cents. This showed that the central auditory system of ordinary listeners is sensitive to the harmonicity of an octave relationship, even in the complete absence of cochlear interactions between the tones, and that this sensitivity can be exploited to detect a frequency ratio fluctuation. It should be noted, however, that octave effects were never found when the carrier frequencies of the two simultaneous tones were both above 1000 Hz.

Correspondence to: L. Demany, Laboratoire de Psychoacoustique, Université de Bordeaux II, 146 rue Léo-Saignat, F-33076 Bordeaux Cedex, France. Fax: (33) 5699-0380.


The octave was the only harmonic frequency ratio investigated by Demany and Semal (1988). However, this ratio may have some uniqueness for the human auditory system. Various psychophysical data (see Demany, 1989, for a review) indicate that two pure tones one octave apart are perceived as similar, even by nonmusicians, and it has been claimed that such tones share a common sensory quality, called 'tone chroma' by Bachem (1937). The aim of the present study was to determine if results similar to those of Demany and Semal (1988) can be obtained when the sensitivity to harmonicity is investigated with pure-tone dyads forming frequency ratios in the vicinity of a fifth (3/2), a twelfth (3/1), or a double octave (4/1). From a related study, Carlyon (1991) reports results suggesting that the octave should not be an exception. However, most of his stimuli contained more than two component tones, and he used only three dyads, with the same lower-frequency tone: 400 Hz + 600 Hz (a fifth), 400 Hz + 800 Hz (an octave), and 400 Hz + 700 Hz (which can be considered as an inharmonic dyad). He found that frequency ratio fluctuations around the fifth and the octave were about equally detectable, and much more detectable than fluctuations around the inharmonic ratio (the latter fluctuations were actually not detected at all). The data reported here are for a much wider range of dyads.
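Frequency ratios are expressed in cents throughout (1200 cents per octave). As a small illustration, and not part of the original study, the conversion between a frequency ratio and its size in cents can be sketched as follows; the function names are illustrative assumptions.

```python
import math

def ratio_to_cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents = one octave, 2/1)."""
    return 1200.0 * math.log2(ratio)

def cents_to_ratio(cents: float) -> float:
    """Frequency ratio corresponding to an interval given in cents."""
    return 2.0 ** (cents / 1200.0)

# Harmonic ratios considered in the text: fifth, octave, twelfth, double octave.
for name, ratio in [("fifth", 3 / 2), ("octave", 2 / 1),
                    ("twelfth", 3 / 1), ("double octave", 4 / 1)]:
    print(f"{name}: {ratio_to_cents(ratio):.1f} cents")

# A standard ratio 50 cents above an exact octave, for a 270-Hz lower tone.
f_low = 270.0
f_high = f_low * cents_to_ratio(1200.0 + 50.0)
print(f"FH for 2/1 + 50 cents with FL = {f_low:.0f} Hz: {f_high:.1f} Hz")
```

On this scale a tempered semitone is 100 cents, so the 25-, 50- and 100-cent deviations used here correspond to a quarter, a half and a whole semitone.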

Method

Subjects

Four subjects, aged 22 to 35, were used. All had normal audiograms. Subjects LD and CS were the authors; they had previously participated as subjects in the study by Demany and Semal (1988); CC and JK were students with no previous experience in psychoacoustic experiments. LD was the only subject with some formal musical education and instrumental training.

Stimuli and Task

Each stimulus consisted of two simultaneous and dichotically presented pure tones, gated on and off synchronously and frequency modulated with a sinusoidal function of 2 Hz around some carrier frequency (FL for the lower tone, FH for the higher one). The lower tone was always presented to the right ear. The modulation of each tone had a carrier-to-peak swing corresponding to 5% of the carrier frequency, and its initial phase within a stimulus varied randomly from a given stimulus to the next one. The stimuli had a steady-state portion of 1.5 s and 200-ms rise/fall times, shaped with a raised-cosine function. Their component tones had a level of 45 dB SPL when the carrier frequency was above 200 Hz, and 50 dB SPL for 180-Hz or 200-Hz carriers.

On each trial, two successive stimuli were presented, with an inter-stimulus interval of 200 ms. Their two component tones had the same carrier frequencies. In one stimulus, the modulations of the two tones were perfectly in phase; thus, within this stimulus, the frequency ratio of the two tones had a steady value, corresponding to FH/FL. In the other stimulus, there was a phase difference, φ (< π radians), between the two modulations; thus, the frequency ratio of the two tones oscillated in a quasi-sinusoidal manner, at a 2-Hz rate, around FH/FL; since φ could not exceed π radians, the amplitude of this oscillation was a monotonic function of φ; more precisely, the minimum and maximum values of the frequency ratio respectively corresponded to (FH/FL)·[1 - 0.1 sin(φ/2)] and (FH/FL)·[1 + 0.1 sin(φ/2)] (see Demany and Semal, 1988, footnote 1). On each trial, these two stimuli were presented in a random order, and the subject had to decide which stimulus contained out-of-phase modulations, by pressing one of two labelled keys of a computer keyboard. Immediate feedback concerning response accuracy was provided via the computer screen. Subjects reported making their decisions by selecting the stimulus with the less 'fused' component tones.

The stimuli were generated in real time, by means of a two-channel digital synthesizer (DMX 1000), with a sampling rate of 27.9 kHz and a precision of 15-16 bits for each tone. The outputs of the two digital-to-analog converters were low-pass filtered with a cutoff frequency of 9.6 kHz (Frequency Devices, 48 dB/oct), amplified (Denon PMA 530), attenuated (Charybdis), and then sent to a pair of TDH 49 earphones. Subjects were tested individually in a soundproof booth.
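The relation between the modulation phase difference and the extreme values of the frequency ratio can be checked numerically. The sketch below is an illustration only, not the original synthesis code: it assumes that each tone's instantaneous frequency is its carrier multiplied by 1 + 0.05 sin(·), samples the ratio over one 2-Hz modulation period, and compares the result with the first-order expression (FH/FL)·[1 ± 0.1 sin(φ/2)]. The function names and the example values are assumptions.

```python
import math

DEPTH = 0.05    # peak frequency deviation as a fraction of the carrier (5%)
FM_RATE = 2.0   # modulation rate in Hz

def ratio_extremes(f_low, f_high, phi, n=20000):
    """Min and max of the instantaneous frequency ratio over one FM period."""
    ratios = []
    for k in range(n):
        t = k / (n * FM_RATE)                  # time within one period
        arg = 2.0 * math.pi * FM_RATE * t
        low = f_low * (1.0 + DEPTH * math.sin(arg))
        high = f_high * (1.0 + DEPTH * math.sin(arg + phi))
        ratios.append(high / low)
    return min(ratios), max(ratios)

def approx_extremes(f_low, f_high, phi):
    """First-order approximation: (FH/FL) * [1 -/+ 0.1 sin(phi/2)]."""
    swing = 2.0 * DEPTH * math.sin(phi / 2.0)
    r = f_high / f_low
    return r * (1.0 - swing), r * (1.0 + swing)

phi = math.pi / 6.0                            # example phase difference
print("sampled:", ratio_extremes(270.0, 540.0, phi))
print("approx.:", approx_extremes(270.0, 540.0, phi))
```

The two results agree to first order in the modulation depth; the small residual difference comes from second-order terms that the closed-form expression neglects.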

Procedure

Jnd-φ's were measured as a function of FH/FL, with an adaptive procedure estimating the φ values yielding 79.4% correct responses (Levitt, 1971). Trials were organized in blocks within which FL and FH were fixed. In every block, φ was initially set at a high value, divided by 1.5 after each succession of three correct responses, and multiplied by 1.5 after each incorrect response (except when the result of the multiplication would have exceeded π radians, in which case φ was set at π radians); the jnd-φ was taken as the geometric mean of the first 16 reversal points in the variation of φ. In each experimental session, FL was fixed and seven jnd-φ's were measured, one for each of seven FH values. Among the seven FH/FL ratios, one was harmonic; it was equal to 2/1, 3/2, 3/1, or 4/1. The differences between the harmonic ratio and the other six amounted to ± 25, 50, and 100 cents. The seven jnd-φ's were measured in a random order, unknown to the subject throughout each session. For a given value of FL and a given set of seven FH/FL ratios, at least six consecutive sessions were run: a variable number of training sessions plus six formal sessions. All sessions were run on different days.
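The adaptive rule described above (divide φ by 1.5 after three consecutive correct responses, multiply it by 1.5 after each error, never exceed π, and take the geometric mean of the first 16 reversals) can be sketched as a simulation. This is a minimal sketch under stated assumptions, not the original experimental software: the simulated listener `toy_listener`, the starting value of π, the seed handling, and the convention that a reversal is recorded at the φ value in force when the direction of change flips are all hypothetical choices made for illustration.

```python
import math
import random

# A 3-down 1-up rule converges where p(correct)**3 = 0.5, i.e. about 79.4%.
TARGET = 0.5 ** (1.0 / 3.0)

def run_staircase(p_correct, phi_start=math.pi, step=1.5,
                  n_reversals=16, max_trials=100000, seed=0):
    """Simulate one block; return the geometric mean of the reversal points."""
    rng = random.Random(seed)
    phi = min(phi_start, math.pi)
    streak = 0            # consecutive correct responses since the last change
    last_direction = 0    # -1 after a decrease, +1 after an increase
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if rng.random() < p_correct(phi):
            streak += 1
            if streak < 3:
                continue                    # no change yet
            streak, direction = 0, -1       # three correct: make it harder
        else:
            streak, direction = 0, +1       # one error: make it easier
        if last_direction != 0 and direction != last_direction:
            reversals.append(phi)
        # Multiplicative steps, capped at pi radians as in the text.
        phi = phi / step if direction < 0 else min(phi * step, math.pi)
        last_direction = direction
    if len(reversals) < n_reversals:
        raise RuntimeError("too few reversals; check the simulated listener")
    return math.exp(sum(math.log(v) for v in reversals) / n_reversals)

def toy_listener(phi, scale=0.3):
    """Hypothetical psychometric function: 50% at phi = 0, rising toward 100%."""
    return 1.0 - 0.5 * math.exp(-(phi / scale) ** 2)

estimate = run_staircase(toy_listener)
print(f"target proportion correct: {TARGET:.3f}")
print(f"simulated jnd-phi: {estimate:.3f} rad")
```

Because the steps are multiplicative, averaging the reversal points geometrically amounts to averaging them on a logarithmic φ scale.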

[Figure panel titles: R = 3/2, FL = 900 Hz; R = 4/1, FL = 180 Hz]

Results

Each subject's results are displayed in a separate figure (Figs. 1-4), where each panel presents the data for a given value of FL and a set of seven FH/FL ratios centered on a given harmonic value (labelled as 'R'). All the data points represent means of six jnd-φ's and the vertical bars are two standard deviations in total length. The data shown in each panel were submitted to a separate analysis of variance (ANOVA), the outcome of which is summarized in Table I. The following conclusions can be drawn:

(1) For R = 2/1 (one octave), the jnd-φ's always varied as a nonmonotonic function of FH/FL, with a minimum at or near an exact octave. Such nonmonotonic trends were found in each subject, and were very marked in CS, CC, and JK (see Figs. 1, 3, and 4). For CS and LD (Figs. 1 and 2), the dashed curves are data replotted from Demany and Semal (1988). These data were collected about three years before the present study, in almost identical conditions (the only noteworthy difference is that the stimuli lasted 1.9 s in the present study, versus 3.4 s in the previous one). For CS (FL = 180 Hz), the dashed curve (from 1985) and the

Fig. 1. Jnd-φ's measured in subject CS, as a function of the standard frequency ratio (FH/FL, corresponding to R ± 0, 25, 50 or 100 cents) and the lower carrier frequency (FL). The ordinate scale is logarithmic. Each data point is the mean of six jnd-φ's and the vertical bars are two standard deviations in total length. The left-hand panels are for R = 3/2 and the right-hand panels are for other values of R. For R = 2/1, the dashed curve displays previous data obtained from the same subject.

[Legible panel titles: R = 3/1, FL = 240 Hz; R = 3/2, FL = 240, 400 and 600 Hz; R = 2/1, FL = 180 Hz. Abscissa: deviation of FH/FL from R (in cents).]

TABLE I

Results of ANOVAs performed on the jnd-φ's: "S" indicates that the overall F ratio was significant with P < 0.01 and that a quadratic trend was obtained; "s" has the same meaning except that 0.01 < P < 0.05; "?" means that the overall F ratio was significant (P < 0.05) but that no clear quadratic trend was obtained; "0" indicates that the overall F ratio was not significant (P > 0.10).

Octave (FH = 2·FL) | Double octave (FH = 4·FL)
FL (Hz) | 180 | 180 | CS LD CC JK | S | 270 | 400 | S S S | S | 200 | 270 | ? | 0

Fifth (FH = 3·FL/2) | Twelfth (FH = 3·FL)
FL (Hz) | 240 | CS LD CC JK | s S | 300 S s 0 | 400 | 600 | 900 | 240 | s S | S 0 S 0 | 0 | S 9

solid curve (from 1988) diverge when FH/FL > 2/1. In the case of LD, however, strong correlations between the two curves can be seen, for each of the two frequency registers used; similar asymmetries were found, and for FL = 270 Hz, the tip of both curves is located 25 cents above a physically exact octave.

(2) For R = 3/2 (a fifth), significantly nonmonotonic threshold variations were also found (see Table I). However, this was true in only three subjects out of four: JK was not sensitive to the harmonicity of the fifth (Fig. 4). His negative results are probably not due to an unfortunate choice of the FL values because JK provided a strongly U-shaped curve in his 'octave' condition (FL = 270 Hz): For R = 3/2 and FL = 300 Hz, the two tones were located within the frequency register involved in the octave condition; in addition, for R = 3/2 and FL = 600 Hz, the fundamental (or quasi-fundamental) frequencies of the stimuli were similar to those of the stimuli involved in the octave condition. JK was the only subject with an apparently total 'deafness to the fifth'. However, the other three subjects also seemed somewhat less sensitive to the fifth than to the octave, in so far as they displayed generally larger harmonicity effects for octaves than for fifths.

(3) For R = 3/1 (a twelfth) or 4/1 (a double octave), the two subjects tested, LD and CS, behaved quite differently: Very clear nonmonotonic trends were found for CS, but LD provided no evidence for harmonicity detection. The negative results obtained for LD with R = 4/1 do not seem to be due to a bad choice of FL values in the light of this subject's positive results for R = 2/1.

Fig. 2. Same as Fig. 1, but for subject LD. [Legible panel titles: R = 2/1, FL = 270 and 400 Hz; R = 3/2, FL = 240 and 600 Hz; R = 4/1, FL = 270 Hz. Abscissa: deviation of FH/FL from R (in cents).]

Fig. 3. Same as Fig. 1, but for subject CC. [Legible panel titles: R = 2/1, FL = 270 Hz; R = 3/2, FL = 300 and 600 Hz. Abscissa: deviation of FH/FL from R (in cents).]

Fig. 4. Same as Fig. 1, but for subject JK. [Legible panel titles: R = 2/1, FL = 270 Hz; R = 3/2, FL = 300 and 600 Hz. Abscissa: deviation of FH/FL from R (in cents).]

Discussion

The present data confirm that normal human listeners can discriminate harmonic from inharmonic dyads of pure tones even when the component tones of the dyads are processed by completely separate neural channels at the level of the auditory nerve. What is more, they show that for such dyads, frequency ratio fluctuations are better detected when they occur around a harmonic ratio than when they occur around a markedly inharmonic ratio. In this respect, our results generalize previous findings by Demany and Semal (1988) and Carlyon (1991). In the present study, however, stronger evidence for a central sensitivity to harmonicity was found with octave intervals than with other harmonic intervals. This singularity of the octave might be related to the fact that two pure tones one octave apart have some common perceptual quality (Kallman and Massaro, 1979; Demany and Armand, 1984), that might originate from the temporal coding of frequency by the auditory nerve (Rose et al., 1967; Ohgushi, 1983).

The stimuli used here were pairs of simultaneous pure tones. It is interesting to compare our results with those obtained in experiments on the discrimination between frequency ratios formed by pairs of successive pure tones, i.e., melodic intervals. Experiments of this kind were performed by Burns and Ward (1978), who used standard frequency ratios between 250 and 550 cents. In subjects who were musicians, they found nonmonotonic variations of discrimination performance as a function of the standard frequency ratio.


However, discrimination was poorer when the standard ratio was an interval existing in the musical tempered scale of frequency (e.g., 300 or 400 cents) than when it was half-way between two such intervals (e.g., 350 cents). In the present study, on the contrary, better performances were found, overall, for standard ratios corresponding to an octave or a fifth than for standard ratios 50 cents apart from these intervals. Burns and Ward did not use the octave or the fifth as standard ratios. However, Houtsma (1968) performed a similar study using a set of frequency ratios including the octave and frequency ratios in its vicinity. He did not obtain significant threshold differences within such a set. This negative result, contrasting with the significant differences found by Demany and Semal (1988) and in the present study, suggests that quite separate neural mechanisms are at work in the perception of melodic (i.e., successive) and harmonic (i.e., simultaneous) octaves. A similar suggestion was made, from other grounds, by Demany and Semal (1990). However, it would be premature to conclude that there is no relation at all between the perception of harmonicity and the perception of melodic intervals (Demany, 1991).

Finally, some introspective observations made during the present study are noteworthy. For R = 3/2, subject LD clearly heard the 'missing fundamentals' (or quasi-fundamentals) of the stimuli, as expected from the well-known findings of Houtsma and Goldstein (1972). Such was the case even for FL = 600 Hz, although in this condition the obtained threshold curve was almost flat (see Fig. 2). However, again for R = 3/2, subject CS could not clearly hear the missing fundamentals, even when her corresponding threshold curve was definitely nonmonotonic (see Fig. 1, FL = 600 Hz). Thus, the discrimination between harmonicity and inharmonicity seemed relatively independent of the extraction of virtual pitch. This is rather surprising, but in fact consistent with previous results (Moore et al., 1985a, 1986; Demany et al., 1991; Darwin et al., 1992). Moore et al. (1985a, 1986) used spectrally rich harmonic complex tones with one component mistuned from its 'normal' frequency; they found that the mistuned component could make a full contribution to the virtual pitch of the whole complex even when it was sufficiently mistuned to be heard as a separate tone. Darwin et al. (1992) confirmed these findings; they also showed that a sine tone which was an in-tune component of a harmonic complex but a mistuned component of another, simultaneous, harmonic complex (with a markedly different fundamental frequency) could contribute to the virtual pitch of both complexes. Using dyads of pure tones in the vicinity of an octave interval, Demany et al. (1991) measured thresholds for (i) the detection of mistuning, and (ii) the discrimination of differences in pitch (presumably virtual pitch); these two kinds of threshold (expressed as relative frequency changes in the higher tone) did not vary in the same way as a function of the frequency register of the dyads.
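As a worked check on the 'missing fundamentals' discussed above, the fundamental frequency of an exactly harmonic dyad follows directly from its frequency ratio: if FH/FL = p/q in lowest terms, then FL = q·f0 and FH = p·f0. The sketch below is illustrative only; the function name `dyad` is an assumption.

```python
from fractions import Fraction

def dyad(f_low: float, ratio: Fraction):
    """Return (f_high, fundamental) for a dyad with f_high / f_low == ratio.

    Fraction reduces p/q to lowest terms, so the lower tone is the q-th
    harmonic and the higher tone the p-th harmonic of the fundamental.
    """
    f_high = f_low * ratio.numerator / ratio.denominator
    fundamental = f_low / ratio.denominator
    return f_high, fundamental

# Fifth with FL = 600 Hz: tones at 600 and 900 Hz, fundamental at 300 Hz.
print(dyad(600.0, Fraction(3, 2)))
# Octave with FL = 270 Hz: tones at 270 and 540 Hz, fundamental at 270 Hz.
print(dyad(270.0, Fraction(2, 1)))
```

This is why the fifth at FL = 600 Hz and the octave at FL = 270 Hz have fundamental frequencies in the same region (300 Hz versus 270 Hz).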

Acknowledgments

The experiments reported here were performed at the Laboratoire de Psychologie Expérimentale, Université René Descartes, Paris. Some of the results were presented at the 13th I.C.A., Belgrade, Yugoslavia (Demany and Semal, 1989). Author LD is affiliated with the Centre National de la Recherche Scientifique. Financial support for the preparation of this article was received from the Conseil Régional d'Aquitaine and the Ministère de l'Education Nationale (Action Spécifique 'Sciences de la Cognition'). We thank Christine Chauvin for her kind cooperation, and Brian C.J. Moore for comments on a previous version of the manuscript.

References

Bachem, A. (1937) Various types of absolute pitch. J. Acoust. Soc. Am. 9, 146-151.
Beerends, J.G. and Houtsma, A.J.M. (1989) Pitch identification of simultaneous diotic and dichotic two-tone complexes. J. Acoust. Soc. Am. 85, 813-819.
Broadbent, D.E. and Ladefoged, P. (1957) On the fusion of sounds reaching different sense organs. J. Acoust. Soc. Am. 29, 708-710.
Buell, T.N. and Hafter, E.R. (1991) Combination of binaural information across frequency bands. J. Acoust. Soc. Am. 90, 1894-1900.
Burns, E.M. and Ward, W.D. (1978) Categorical perception-phenomenon or epiphenomenon: Evidence from experiments in the perception of melodic musical intervals. J. Acoust. Soc. Am. 63, 456-468.
Carlyon, R.P. (1991) Discriminating between coherent and incoherent frequency modulation of complex tones. J. Acoust. Soc. Am. 89, 329-340.
Carlyon, R.P., Demany, L. and Semal, C. (1992) Detection of across-frequency differences in fundamental frequency. J. Acoust. Soc. Am. 91, 279-292.
Carlyon, R.P. and Stubbs, R.J. (1989) Detecting single-cycle frequency modulation imposed on sinusoidal, harmonic, and inharmonic carriers. J. Acoust. Soc. Am. 85, 2563-2574.
Darwin, C.J., Buffa, A., Williams, D. and Ciocca, V. (1992) Pitch of dichotic complex tones with a mistuned frequency component. In: Y. Cazals, L. Demany and K. Horner (Eds.), Auditory Physiology and Perception, Pergamon Press, Oxford, UK.
Demany, L. (1989) Perception de la hauteur tonale. In: M.C. Botte, G. Canévet, L. Demany and C. Sorin (Eds.), Psychoacoustique et Perception Auditive, Editions INSERM, Paris, pp. 43-81.
Demany, L. (1991) Les mystères de l'octave. In: G. Canévet and J.C. Risset (Eds.), Genèse et Perception des Sons. Publications du Laboratoire de Mécanique et d'Acoustique, Marseille, France, No. 128, pp. 177-188.
Demany, L. and Armand, F. (1984) The perceptual reality of tone chroma in early infancy. J. Acoust. Soc. Am. 76, 57-66.
Demany, L. and Semal, C. (1988) Dichotic fusion of two tones one octave apart: Evidence for internal octave templates. J. Acoust. Soc. Am. 83, 687-695.
Demany, L. and Semal, C. (1989) Internal templates of harmonic intervals. Proceedings of the 13th International Congress on Acoustics (Belgrade, Yugoslavia), Vol. 3, pp. 15-18.
Demany, L. and Semal, C. (1990) Harmonic and melodic octave templates. J. Acoust. Soc. Am. 88, 2126-2135.
Demany, L., Semal, C. and Carlyon, R.P. (1991) On the perceptual limits of octave harmony and their origin. J. Acoust. Soc. Am. 90, 3019-3027.
Hartmann, W.M., McAdams, S. and Smith, B.K. (1990) Hearing a mistuned harmonic in an otherwise periodic complex tone. J. Acoust. Soc. Am. 88, 1712-1726.
Houtsma, A.J.M. (1968) Discrimination of frequency ratios. J. Acoust. Soc. Am. 44, 383 (abstract W2).
Houtsma, A.J.M. and Goldstein, J.L. (1972) The central origin of the pitch of complex tones: Evidence from musical interval recognition. J. Acoust. Soc. Am. 51, 520-529.
Kallman, H.J. and Massaro, D.W. (1979) Tone chroma is functional in melody recognition. Percept. Psychophys. 26, 32-36.
Levitt, H. (1971) Transformed up-down methods in psychoacoustics. J. Acoust. Soc. Am. 49, 467-477.
McAdams, S.
