[http://dx.doi.org/10.1121/1.4892669]

Published Online 6 August 2014

A bank of beamformers implementing a constant-amplitude panning law a)

Yoomi Hur b) and Jonathan S. Abel
Center for Computer Research in Music and Acoustics, Music Department, Stanford University, Stanford, California 94305
[email protected], [email protected]

Young-cheol Park
Computer and Telecommunications Engineering Division, Yonsei University, Wonju, South Korea
[email protected]

Dae Hee Youn
DSP Laboratory, Department of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea
[email protected]

Abstract: This paper describes a technique for designing a collection of beamformers, a “beamformer bank,” that approximately produces a constant-amplitude panning law. The technique is useful in multichannel audio recording scenarios: a point source will appear with energy above a specified sidelobe level in at most two adjacent beams, and the sum of all beam signals will approximate the source signal. A design method is described in which a specified sidelobe level determines beamwidth as a function of arrival direction and frequency, leading directly to the number and placement of beams at each frequency. Simulation results are presented verifying the proposed technique’s performance. © 2014 Acoustical Society of America

PACS numbers: 43.60.Fg, 43.60.Dh [CG] Date Received: May 14, 2014 Date Accepted: July 30, 2014

a) An earlier version of this work was presented at the 129th AES Convention, Preprint 8280.
b) Author to whom correspondence should be addressed. Also at DSP Laboratory, Department of Electrical and Electronic Engineering, Yonsei University, Seoul, South Korea.

EL212 J. Acoust. Soc. Am. 136 (3), September 2014

Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 130.216.129.208 On: Thu, 18 Dec 2014 00:53:13

Hur et al.: JASA Express Letters

1. Introduction
Recently, multichannel audio techniques using microphone arrays and panning laws have been studied for soundfield reproduction [1,2], for instance in ITU standard surround formats such as 5.1 surround. In order to accurately reproduce a soundfield, a large number of loudspeakers covering the entire horizontal plane is needed. Higher Order Ambisonics (HOA) [3] and Wave Field Synthesis (WFS) [4] are examples of this approach of physical soundfield synthesis. The HOA technique records and reproduces the soundfield hierarchically based on a spherical harmonic expansion of the soundfield as a function of arrival direction. Ambisonics and HOA recording techniques include B-format microphones, which combine three orthogonal figure-of-eight microphones and an omnidirectional microphone, and spherically baffled microphone arrays, which can be configured to produce the needed spherical harmonic beam patterns [5]. By contrast, WFS synthesizes a wavefront in a listening area by controlling the driving signal of each loudspeaker in a manner that is somewhat analogous to the Huygens principle.

As a practical alternative, panning laws have been used to localize sources between adjacent loudspeakers by manipulating the source level in each loudspeaker. The first two-channel stereophony panning system was introduced by Blumlein [6], and it has since been extended to multichannel sound systems such as 5.1, 7.1, 10.2, or 22.2 surround to cover the entire 3-D space. A sine-cosine panning law providing constant power is commonly used in rooms where much of the energy arriving at the listener from different loudspeakers arrives incoherently. A constant-amplitude panning law provided by a linear crossfade between adjacent loudspeakers is often used in binaural settings in which sources are panned among coherent virtual loudspeakers [7].

One of the main issues in panning law-based multichannel reproduction is how to capture the directional soundfield using a number of microphones. Some literature, such as the Hamasaki Square [8], suggests using a set of directional microphones. However, Backman [9] describes the limits of using simple polar pattern control. He claims that the control is not sufficiently precise since the parameters are not adjustable independent of frequency, the channel separation is often inadequate, and the typical ITU loudspeaker layout with unequal angular separation requires a degree of control over the polar pattern that does not exist. To circumvent these difficulties, Backman [9] proposed a beamforming approach in which the mainlobe widths and sidelobe levels of adjacent beams are adjusted to work with the standard five-channel surround panning format. While this idea is similar in spirit to our approach, no general design method was presented.

In this paper, we present a novel beamforming design technique in which microphone array outputs are processed to form a collection of beams, termed here a “beamformer bank.” The goal is to record a directional soundfield with a set of fixed beamformers and to produce a constant-amplitude panning law over a number of loudspeakers for perceptually realistic soundfield reproduction.
For this purpose, the beamformer bank is designed according to two criteria: first, signals from point sources appear panned between adjacent beams as the source moves about the array; second, the beam sum is approximately omnidirectional. The proposed beamformer bank is designed by controlling the trade-off between mainlobe width and sidelobe level of each beam as a function of direction and frequency. Adjacent beam responses are designed to be in phase and to cross at −6 dB, yielding a design that approximately satisfies the two criteria.

Section 2 describes the beamformer bank design method, Sec. 3 presents example designs, and Sec. 4 is a summary.

2. Beamformer bank design
Consider a single source arriving at the N elements of an array in the presence of additive noise. We have

$$\mathbf{x}(\omega) = \mathbf{d}(\theta_s, \omega)\, s(\omega) + \mathbf{v}(\omega), \tag{1}$$

where $\mathbf{x}(\omega)$ and $\mathbf{v}(\omega)$ are columns of the microphone signal and noise signal transforms, respectively, and $s(\omega)$ is the source signal, all evaluated at frequency $\omega$. The quantity $\mathbf{d}(\theta_s, \omega)$ is an $N \times 1$ steering vector corresponding to source direction $\theta_s$ and evaluated at frequency $\omega$. The beamformed output signal $y(t)$ is obtained by combining filtered microphone signals,

$$y(t) = \mathcal{F}^{-1}\{\mathbf{w}(\theta_s, \omega)^H \mathbf{x}(\omega)\}. \tag{2}$$

Here, $\mathcal{F}^{-1}\{\cdot\}$ represents the inverse Fourier transform, $\mathbf{w}(\theta_s, \omega)$ is the column of beamforming coefficients or “weights,” and $H$ represents the Hermitian transpose. The associated beampattern is given by $p(\theta) = |\mathbf{w}^H \mathbf{d}(\theta)|$.

In our beamformer bank, we use a low-sidelobe design in which the sidelobes have a prescribed constant maximum level while the mainlobe gain is constrained to be one and have zero phase [10]. Doing so ensures that adjacent beam mainlobes are phase aligned, while signals will appear across only adjacent beams. This low-sidelobe design algorithm hypothesizes interfering sources placed in the sidelobe region, and weights are calculated so as to maximize the signal-to-interference-plus-noise power ratio (SINR). If the sidelobes do not achieve the desired level, interferers are added, and the process is repeated until the desired sidelobe level is achieved. Note that as the maximum sidelobe level is increased, the mainlobe width of each beamformer will decrease.

To design a beamformer bank, we first evaluate the mainlobe width $b(\theta, \omega, \lambda)$ as a function of arrival direction $\theta$ for the prescribed sidelobe level $\lambda$. This leads directly to the number of beams and their placement. The beam density function $q(\theta, \omega, \lambda)$, given by the inverse of the beamwidth $b(\theta, \omega, \lambda)$, is formed,

$$q(\theta, \omega, \lambda) = \frac{1}{b(\theta, \omega, \lambda)}. \tag{3}$$
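As a rough illustration of the signal model of Eqs. (1) and (2) and of the beampattern $p(\theta)$ defined in this section, the sketch below evaluates a beampattern for a uniform linear array. It is only a sketch under stated assumptions: a far-field plane-wave steering vector, a speed of sound of 343 m/s, and plain delay-and-sum weights stand in for the low-sidelobe weights of Ref. 10, which are not reproduced here. The function names `steering_vector` and `beampattern` are hypothetical.

```python
import numpy as np

C_SOUND = 343.0  # assumed speed of sound in m/s (not stated in the text)

def steering_vector(theta, freq_hz, positions_m, c=C_SOUND):
    """Far-field steering vector d(theta, omega) for a linear array.

    theta is measured from the array axis (0 = endfire, pi/2 = broadside).
    """
    k = 2.0 * np.pi * freq_hz / c  # wavenumber omega / c
    return np.exp(-1j * k * positions_m * np.cos(theta))

def beampattern(weights, thetas, freq_hz, positions_m):
    """p(theta) = |w^H d(theta)| evaluated on a grid of directions."""
    d = np.stack([steering_vector(t, freq_hz, positions_m) for t in thetas])
    return np.abs(d @ np.conj(weights))  # each row of d yields w^H d(theta)

# 14 elements spaced 5 cm apart at 3 kHz, as in the example of Sec. 3
positions = 0.05 * np.arange(14)
freq = 3000.0
theta_s = np.pi / 2  # look direction: broadside

# Delay-and-sum weights (an assumption): unit gain, zero phase at theta_s
w = steering_vector(theta_s, freq, positions) / len(positions)

thetas = np.linspace(0.0, np.pi, 181)  # 1-degree grid; index 90 is broadside
p = beampattern(w, thetas, freq, positions)
print(f"gain in the look direction: {p[90]:.3f}")
```

Normalizing the weights by the number of elements gives the unit mainlobe gain and zero phase that the low-sidelobe design also imposes, so adjacent beams built this way would be phase aligned, although their sidelobes would not be held at a constant level.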

The beam density function is then integrated over arrival angle,

$$g(\theta, \omega, \lambda) = \int_0^{\theta} q(u, \omega, \lambda)\, du, \tag{4}$$

to produce the number of beams within the integration limits. When the integral is evaluated over all angles, the result is the total number of beams required for a source to appear in at least one beam. Denoting by $N(\omega, \lambda)$ the number of beams needed in the beamformer bank, we have

$$N(\omega, \lambda) = \lfloor g(\theta = 2\pi, \omega, \lambda) \rfloor, \tag{5}$$

where $\lfloor \cdot \rfloor$ represents the floor function. Recall that decreasing $\lambda$ decreases the beam count $g(\theta, \omega, \lambda)$ because the mainlobe widths increase. An optimal sidelobe level $\lambda^*$ can be found using a line search so as to make the integrated beam density function over all angles nearly an integer,

$$\lambda^* = \arg\min_{\lambda} \| N(\omega, \lambda) - g(2\pi, \omega, \lambda) \|. \tag{6}$$

Finally, the beam centers $\theta_n$, $n = 1, 2, \ldots, N$, for a given beam count $N$ are determined from the integrated beam density function at the chosen sidelobe level $\lambda^*$ by finding the angles at which the integral in Eq. (4) achieves half-integer beam counts,

$$g(\theta_n, \omega, \lambda^*) = \frac{n - 1/2}{N} \int_0^{2\pi} q(u, \omega, \lambda^*)\, du. \tag{7}$$

Note that our design method is independent of the array configuration.
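The steps of Eqs. (3) through (7) can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: `toy_beamwidth` is a hypothetical closed-form stand-in for the beamwidths that would be measured from designed low-sidelobe beams, the line search of Eq. (6) is replaced by a simple grid search, and the integration runs over 0 to π on the assumption of a linear array whose pattern is symmetric about the array axis.

```python
import numpy as np

def toy_beamwidth(theta, sidelobe_db):
    """Hypothetical 6-dB beamwidth model in radians (NOT from the paper).

    Beams widen toward endfire and as the sidelobe level is lowered.
    """
    broadside = np.deg2rad(8.0) * (1.0 - sidelobe_db / 40.0)
    return broadside / np.maximum(np.sin(theta), 0.35)

def integrated_density(theta, width):
    """g(theta): running trapezoidal integral of q = 1 / b, Eqs. (3)-(4)."""
    q = 1.0 / width
    steps = 0.5 * (q[1:] + q[:-1]) * np.diff(theta)
    return np.concatenate(([0.0], np.cumsum(steps)))

def design_bank(theta, sidelobe_grid_db, beamwidth=toy_beamwidth):
    """Grid search for the level of Eq. (6), then beam centers from Eq. (7)."""
    best = None
    for sl in sidelobe_grid_db:
        g = integrated_density(theta, beamwidth(theta, sl))
        n_beams = int(np.floor(g[-1]))       # Eq. (5)
        if n_beams < 1:
            continue
        excess = g[-1] - n_beams             # |N - g| in Eq. (6)
        if best is None or excess < best[0]:
            best = (excess, sl, n_beams, g)
    _, sl, n_beams, g = best
    # Half-integer fractions of the total beam count, Eq. (7)
    targets = (np.arange(1, n_beams + 1) - 0.5) / n_beams * g[-1]
    centers = np.interp(targets, g, theta)   # invert the increasing g
    return sl, n_beams, centers

theta = np.linspace(0.0, np.pi, 2001)
sl_opt, n_beams, centers = design_bank(theta, np.arange(-40.0, -19.9, 0.1))
print(sl_opt, n_beams, np.rad2deg(centers).round(1))
```

Because the density is strictly positive, the integrated density is strictly increasing, which is what allows the beam centers to be read off by interpolation. Several sidelobe levels may give a nearly integer total; the grid search simply keeps the one with the smallest fractional excess.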

3. Example design
We now present beamformer bank example designs. We begin with a 14-element linear array with elements uniformly spaced at 5 cm intervals. Each beam is designed using a low-sidelobe beamformer [10]. A set of beams with different sidelobe levels is designed at 3 kHz (making the 5 cm sensor spacing roughly one-half wavelength), and the beamformer mainlobe width is calculated as a function of beam angle from sets of beams with −20, −30, and −40 dB sidelobe levels, as in Fig. 1(a). Here, the mainlobe width is measured as the distance between the angles at which the level is 6 dB below its maximum. Note that the mainlobe widths tend to be wider at endfire than at broadside for the linear array. Figure 1(b) shows the beam density function $q(\theta, \omega, \lambda)$, and Fig. 1(c) shows the corresponding integrated beam densities $g(\theta, \omega, \lambda)$ for different sidelobe levels and different frequencies. Those associated with lower sidelobe levels have fewer beams. Then, the optimal sidelobe level, which makes the integrated beam density function over all angles nearly an integer, is found, and the beam centers are calculated using Eq. (7), as seen in Fig. 1(d). Figure 1(e) shows the final beamformer bank designed using the optimal sidelobe level of −29.2 dB, with the beam centers at 15°, 43°, 61°, 76°, 90°, 104°, 119°, 137°, and 165°. The adjacent beam responses cross at roughly −6 dB and the sidelobe level is constant. Also, the summed beamformer response, shown as a thick solid line, is nearly independent of direction. Accordingly, the design goals have been met.

FIG. 1. (Color online) Example beamformer bank design procedure for a 14-element uniform linear microphone array: (a) calculated 6-dB mainlobe beamwidth with several sidelobe levels at f = 3 kHz; (b) the associated beam density functions at f = 3 kHz; (c) the integrated beam density function for different sidelobe levels at (i) f = 3 kHz, (ii) f = 1.5 kHz, (iii) f = 750 Hz; (d) integrated beam density function with optimal sidelobe level −29.2 dB and computed beam centers denoted by markers; and (e) final beamformer bank beampatterns, plotted along with the summed beam response (black line, top) at f = 3 kHz.

In Figs. 2(a) and 2(c), the integrated beam density function with optimal sidelobe level is presented for f = 1.5 kHz and f = 750 Hz, respectively, and the final beamformer bank beampatterns are plotted in Figs. 2(b) and 2(d) for f = 1.5 kHz and f = 750 Hz, respectively. Note that the characteristics depend on frequency, with fewer beams at lower frequencies. The beamformer bank can be implemented in practice by splitting the frequency range into several bands and using a frequency-independent beamformer bank for each band.

Another beamformer bank example is shown using a 14-element non-uniformly spaced linear array with microphones located at 0, 10, 17, 22, 27, 30, 32, 34, 36, 39, 44, 49, 56, and 66 cm. Figure 3(a) shows 6-dB beamwidths for a calculated optimal sidelobe level of −25 dB, Fig. 3(b) shows the beam density function, and Fig. 3(c) shows the integrated beam density function and calculated beam centers at 23°, 60°, 82°, 98°, 120°, and 157°. Figure 3(d) shows the final beamformer bank beam magnitude responses and the summed beamformer response. Again, the design method produced beams crossing at roughly −6 dB and sidelobes below a specified level. The standard deviation of the beam sum was about 0.8 dB.
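One way to quantify how flat a beam sum is, in the spirit of the standard deviation quoted above, is sketched below. The metric definition (standard deviation of the summed magnitude response in dB over a direction grid) is an assumption, since the text does not spell out how the figure was computed. The beams used here are idealized in-phase triangular responses placed at the beam centers reported for the uniform array, not the designed beampatterns; `beam_sum_ripple_db` and `triangular_bank` are hypothetical helper names.

```python
import numpy as np

def beam_sum_ripple_db(beam_responses):
    """Standard deviation, in dB, of the summed beam response over direction.

    beam_responses has shape (num_beams, num_directions) and holds beam
    gains (complex, or real when in phase) on a common direction grid.
    """
    total = np.abs(np.sum(beam_responses, axis=0))
    return float(np.std(20.0 * np.log10(total)))

def triangular_bank(theta, centers):
    """Idealized in-phase beams: linear crossfades between adjacent centers."""
    eye = np.eye(len(centers))
    return np.stack([np.interp(theta, centers, row) for row in eye])

# Beam centers reported for the uniform array at 3 kHz
centers = np.deg2rad([15, 43, 61, 76, 90, 104, 119, 137, 165])
theta = np.linspace(centers[0], centers[-1], 721)

bank = triangular_bank(theta, centers)
ripple_db = beam_sum_ripple_db(bank)
print(f"beam-sum ripple of the idealized bank: {ripple_db:.2f} dB")
```

The idealized bank sums to exactly one between the first and last centers, so its ripple is zero. Substituting real beampatterns sampled on the same grid would show how far a given design departs from that ideal.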

FIG. 2. (Color online) The integrated beam density function and computed beam centers (a) at f = 1.5 kHz with optimal sidelobe level −30 dB and (c) at f = 750 Hz with optimal sidelobe level −28 dB, and the final beamformer bank beampatterns, plotted along with the summed beam response (black line, top), (b) at f = 1.5 kHz and (d) at f = 750 Hz.
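The band-splitting implementation mentioned in Sec. 3 could be organized along the following lines. This is only a sketch: the choice of geometric-mean band edges, the sample rate, and the FFT size are assumptions made for illustration, and `band_index` is a hypothetical helper. Each frequency bin is assigned to the beamformer bank whose design frequency (750 Hz, 1.5 kHz, or 3 kHz in the examples) is nearest on a logarithmic scale.

```python
import numpy as np

# Design frequencies of the example banks in Figs. 1 and 2
design_freqs_hz = np.array([750.0, 1500.0, 3000.0])

def band_index(freqs_hz, centers_hz=design_freqs_hz):
    """Index of the bank to use for each frequency.

    Band edges are the geometric means of adjacent design frequencies
    (an assumed convention, not taken from the paper).
    """
    edges = np.sqrt(centers_hz[:-1] * centers_hz[1:])
    return np.searchsorted(edges, freqs_hz)

fs, n_fft = 16000, 512  # hypothetical sample rate (Hz) and FFT size
bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
bands = band_index(bin_freqs)

for b, f0 in enumerate(design_freqs_hz):
    in_band = bin_freqs[bands == b]
    print(f"bank designed at {f0:.0f} Hz covers "
          f"{in_band.min():.0f} to {in_band.max():.0f} Hz")
```

In a frequency-domain implementation, the weights of the selected bank would then be applied to every bin in its band, so the beam count and beam centers stay fixed within each band.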


FIG. 3. (Color online) Example beamformer bank design procedure for a 14-element non-uniformly spaced linear microphone array: (a) calculated 6-dB mainlobe beamwidth with sidelobe level −25 dB at f = 3 kHz, (b) the associated beam density function, (c) the integrated beam density function with computed beam centers, and (d) final beamformer bank beampatterns, plotted along with the summed beam response (black line, top).

4. Summary
A technique for designing a collection of beamformers that approximately produces a constant-amplitude panning law and maintains a given maximum sidelobe level was presented. The procedure is to compute the beam density function by inverting the mainlobe beamwidth as a function of arrival direction and frequency for a given sidelobe level, and then to integrate the density to determine the beamformer bank beam count and beam center positions. Note that, while 6 dB crossover points were used in this paper to measure beamwidth, 3 dB crossover points could be used with the same procedure to implement a constant-power panning law.

With the proposed technique, one can efficiently design beamformer banks for arbitrary multichannel configurations, and the source signals impinging on the array will be automatically panned between adjacent beams according to their arrival direction. Also note that for surround and similar recording applications, since the beam count and beam center positions are directly connected to loudspeaker number and position, respectively, an optimal microphone array configuration can be iteratively designed, using the methods presented here, for use with a given loudspeaker reproduction system.

Finally, the algorithm presented in this paper was illustrated using 1-D linear arrays. However, the algorithm is not limited to linear or 1-D arrays and can be used with multidimensional arrays of arbitrary shape. For example, future work may include designing beamformer banks for microphone arrays to be used with multichannel reproduction systems such as standard 5.1 and 7.1 surround systems.
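The relation between crossover level and panning law noted in the summary can be checked with a short numerical sketch. It compares a linear crossfade (constant amplitude) with a sine-cosine law (constant power) between two adjacent channels; the pan variable `x` and the function names are illustrative choices rather than notation from the paper.

```python
import numpy as np

def linear_crossfade(x):
    """Constant-amplitude law: the two gains sum to one. x runs from 0 to 1."""
    return 1.0 - x, x

def sine_cosine(x):
    """Constant-power law: the two squared gains sum to one."""
    return np.cos(0.5 * np.pi * x), np.sin(0.5 * np.pi * x)

x = np.linspace(0.0, 1.0, 101)  # pan position between two adjacent channels
lin_a, lin_b = linear_crossfade(x)
pow_a, pow_b = sine_cosine(x)

# Gain of either channel at the midpoint, in dB
lin_cross_db = 20.0 * np.log10(linear_crossfade(0.5)[0])
pow_cross_db = 20.0 * np.log10(sine_cosine(0.5)[0])

print(f"linear crossfade crossover: {lin_cross_db:.2f} dB")
print(f"sine-cosine crossover:      {pow_cross_db:.2f} dB")
```

The linear law crosses about 6 dB below the peak and the sine-cosine law about 3 dB below, which is why measuring beamwidth at the 6 dB points targets constant amplitude while the 3 dB points would target constant power.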

Acknowledgments
This research was supported in part by the Stanford Presidential Fund for Innovation in the Humanities, granted for “Icons of Sound: Architectural Psychoacoustics in Byzantium,” and by the Stanford Art Institute for the Chavín de Huántar Archaeological Acoustics project.

References and links
[1] V. Pulkki, “Spatial sound generation and perception by amplitude panning techniques,” Ph.D. thesis, Helsinki University of Technology, Helsinki, Finland, 2001.
[2] F. Rumsey, “Novel surround sound microphone and panning techniques—A digest of selected recent AES convention and conference papers,” J. Audio Eng. Soc. 62(1/2), 74–80 (2004).
[3] M. Poletti, “Three-dimensional surround sound systems based on spherical harmonics,” J. Audio Eng. Soc. 53(11), 1004–1025 (2005).
[4] S. Spors and J. Ahrens, “Analysis and improvement of pre-equalization in 2.5-dimensional wave field synthesis,” AES 128th Convention, London (May 2010).
[5] Z. Li and R. Duraiswami, “Flexible and optimal design of spherical microphone arrays for beamforming,” IEEE Trans. Audio Speech and Language Proc. 15(2), 702–714 (2007).
[6] A. D. Blumlein, “Improvements in and relating to sound-transmission, sound-recording and sound-reproducing systems,” U.K. Patent 394325 (1932–1933).
[7] D. Griesinger, “Stereo and surround panning in practice,” presented at the AES 112th Convention, Munich (May 2002).
[8] K. Hamasaki and K. Hiyama, “Reproducing spatial impression with multichannel audio,” presented at the AES 24th International Conference on Multichannel Audio (June 2003).
[9] J. Backman, “Microphone array beam forming for multichannel recording,” presented at the AES 114th Convention, Amsterdam (March 2003).
[10] C. A. Olen and R. T. Compton, “A numerical pattern synthesis algorithm for arrays,” IEEE Trans. Antennas Propagation 38(10), 1666–1676 (1990).
