November 15, 2014 / Vol. 39, No. 22 / OPTICS LETTERS

6549

Optical voice recorder by off-axis digital holography

Osamu Matoba,1,* Hiroki Inokuchi,1 Kouichi Nitta,1 and Yasuhiro Awatsuji2

1 Department of Systems Science, Graduate School of System Informatics, Kobe University, Rokkodai 1-1, Nada, Kobe 657-8501, Japan
2 Division of Electronics, Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan
*Corresponding author: matoba@kobe-u.ac.jp

Received September 4, 2014; revised October 8, 2014; accepted October 17, 2014; posted October 22, 2014 (Doc. ID 221487); published November 14, 2014

An optical voice recorder capable of recording and reproducing propagating sound waves by using off-axis digital holography, as well as quantitative visualization, is presented. Propagating sound waves temporally modulate the phase distribution of an impinging light wave via refractive index changes. This temporally modulated phase distribution is recorded in the form of digital holograms by a high-speed image sensor. After inverse propagation using Fresnel diffraction of a series of the recorded holograms, the temporal phase profile of the reconstructed object wave at each three-dimensional position can be used to reproduce the original sound wave. Experimental results using a tuning fork vibrating at 440 Hz and a human voice are presented to show the feasibility of the proposed method. © 2014 Optical Society of America

OCIS codes: (090.1995) Digital holography; (120.2880) Holographic interferometry; (070.1060) Acousto-optical signal processing; (110.6915) Time imaging.
http://dx.doi.org/10.1364/OL.39.006549

Sound waves and image data are important communication media in our daily life. Unlike image data, sound waves are invisible, and it is difficult to see their propagation distribution because of their high velocity in air (about 340 m/s). Visualization of propagating sound waves is helpful for many engineering applications, such as the design of musical instruments, rooms, and artificial mouths, including vocal cords, and is also important for understanding certain scientific phenomena. Visualization of ultrasound waves is also helpful in the design of surface acoustic wave devices and the measurement of shock waves.

There are several methods of measuring the behavior of sound waves, including microphone arrays and optical methods [1–4]. With a microphone array, the direction of the sound wave can be detected, but it is difficult to avoid reflections at the microphones themselves and to reduce the size of the measurement system. Among optical methods, Schlieren photography is a simple way to visualize the phase distribution of sound waves, but it is difficult to measure the phase modulation quantitatively with it. Another approach uses a heterodyne technique to reduce the frequency of the object waves modulated by a sound wave so that they can be measured by a normal image sensor operating at 30 frames per second [1]. This approach allows the phase distribution to be visualized even with a low-frame-rate digital image sensor. However, the bandwidth of the measurable sound waves is limited by the bandwidth of the image sensor. Moreover, in Ref. [1] the three-dimensional field of the sound wave was neither recorded nor reproduced, and three-dimensional reconstruction of the sound wave was not discussed.

In this Letter, we propose an optical voice recorder based on digital holography for recording and reproducing propagating sound waves. To the best of our knowledge, there is no reported optical method for reproducing sound waves themselves.
Digital holography [5–9] is widely used in many applications, such as the measurement of particle velocimetry distributions [10], phase distributions [11–13], and fluorescence in biological fields [14]. Advantages of digital holography include the large depth-of-field made possible by numerical focusing and the ability to quantitatively analyze the amplitude and the phase. These advantages make digital holography effective for measuring dynamic events in a three-dimensional field. For instance, measurement of a phase distribution at 180,000 frames per second has been achieved by parallel phase-shifting digital holography [15].

First, we briefly explain how sound waves modulate a refractive-index distribution [1]. From the basic theory, the refractive-index change $\Delta n$ is described by

$$\Delta n = (n - 1)\left(\frac{P}{P_0} - 1\right), \tag{1}$$

where $n$ is the refractive index of the medium at atmospheric pressure $P_0$, and $P$ is the increased pressure due to the sound waves. This refractive-index change causes phase retardation of a plane light wave incident on the medium. This temporal phase retardation can be measured by optical interferometry or laser Doppler vibrometry.

Figure 1(a) shows the optical setup used to record the propagating sound waves by off-axis digital holography. The system was based on a Mach–Zehnder interferometer. A Nd:YVO4 laser operating at a wavelength of 532 nm was used as the coherent light source. The laser beam was divided into two beams: one served as the object beam and the other as the reference beam. Each beam was expanded by a beam expander. The object beam was modulated by the temporally changing phase distribution caused by the propagating sound waves. The object and reference beams were made to interfere via a beam splitter, and the resulting interference patterns were recorded as digital holograms by an image sensor with sufficiently high recording speed. We used an image sensor with 512 × 512 pixels and a maximum frame rate of 2000 frames per second. The pixel size of the image sensor was 16 μm × 16 μm. One advantage of the proposed system is that it can measure sound waves that are located far from the image sensor.

Fig. 1. (a) Schematic diagram of the optical recording system using off-axis digital holography. (b) Process of recovering sound waves by extracting the temporal phase distribution.

Figure 1(b) shows the process of recovering the sound waves from the reconstructed object wave. For each hologram, the object wave was reconstructed by numerically propagating the recorded light wave backward from the hologram plane over an appropriate distance, and its phase distribution was extracted. From the series of holograms, a temporal phase profile was obtained at each pixel of the reconstructed phase distribution. This temporal phase signal was taken to represent the sound wave. According to the sampling theorem, the sampling frequency should be larger than twice the maximum frequency of the sound wave. When the maximum frequency of the sound wave is $f_{\max}$, the recording interval of the image sensor should be shorter than $1/(2 f_{\max})$. In our system shown in Fig. 1, the maximum frequency of the detected sound wave was 1 kHz.

To show the feasibility of the proposed system, we used a tuning fork, as shown in Fig. 2(a). The tuning fork emitted sound waves at a single frequency of 440 Hz. Figure 2(b) shows one of the holograms recorded by the image sensor. In off-axis digital holography, the DC term and the conjugate term are eliminated by a bandpass filter. The object wave was then reconstructed by numerical Fresnel propagation over an appropriate inverse propagation distance. The reconstructed amplitude and phase distributions for a propagation distance of 240 mm are presented in Figs. 3(a) and 3(b), respectively. An example of the phase profile at position (100, 257) of the reconstructed phase image is presented in Fig. 4(a). Any position outside the tuning fork can be used for the reproduction of the sound wave. If we could measure a large field such as 1 m², we could see the spatial propagation of the sound wave with a maximum frequency of 1 kHz.

Fig. 2. Observation of the sound wave from the tuning fork: (a) picture of the tuning fork and (b) one of the holograms.

Fig. 3. Reconstructed object wave: (a) amplitude distribution and (b) phase distribution. The dark region of the amplitude distribution is the tuning fork.

In Fig. 4(a), we can see a sinusoidal profile; however, the average phase decreases as a function of time. By subtracting the local average phase, the phase profile can be improved, as shown in Fig. 4(b). The sound can be heard from the phase profile (hear the sound file of Media 1). The spectrum obtained by taking the Fourier transform of Fig. 4(b) is shown in Fig. 4(c). A peak is clearly seen at 440.4 Hz. Figure 5 shows the spectrogram of Fig. 4(b). From Fig. 5, it is clearly seen that there is a single frequency component and that the power of the sound wave gradually decreases with time.

In the next experiment, we recorded a human voice speaking the five Japanese vowels, namely, /a/, /i/, /u/, /e/, and /o/. The propagation distance was set to 250 mm. Figures 6(a)–6(c) show the spectrograms of the five vowels reconstructed by the proposed method. A comparison of the spectrograms shows that the intense frequencies differ among the five vowels. These spectral characteristics are called formants [16]. We also compared them with those recorded by a microphone, as shown in Figs. 7(a)–7(c). Here, the sound wave signal detected by the microphone was modified by setting the maximum frequency to 1 kHz, which corresponds to a sampling rate of 2 kHz. As seen in Figs. 6 and 7, although the details were different, similar features, namely, the intense frequencies, were obtained.
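The reconstruction of a single hologram described above (band-pass filtering of the off-axis hologram followed by numerical Fresnel back-propagation) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the circular filter window, the carrier position given in FFT bins, the sign convention for the distance, and the transfer-function form of Fresnel propagation are all choices made here. The test hologram is synthetic (a plane wave with a uniform phase offset), not measured data.

```python
import numpy as np


def reconstruct_object_wave(hologram, carrier, radius, wavelength, pitch, distance):
    """Recover the complex object wave from one square off-axis hologram.

    carrier  -- (cx, cy) position of the object term in FFT bins (assumed known)
    radius   -- radius of the circular band-pass window, in FFT bins (assumption)
    distance -- propagation distance in metres (negative taken as back-propagation)
    """
    n = hologram.shape[0]
    cx, cy = carrier
    bins = np.rint(np.fft.fftfreq(n) * n)  # integer bin index of each FFT sample
    ky, kx = np.meshgrid(bins, bins, indexing="ij")

    # Band-pass filter: keep only the object term, dropping the DC and conjugate terms.
    spectrum = np.fft.fft2(hologram)
    window = (kx - cx) ** 2 + (ky - cy) ** 2 <= radius ** 2
    sideband = np.where(window, spectrum, 0.0)

    # Remove the carrier by moving the selected sideband to zero frequency.
    baseband = np.roll(sideband, shift=(-cy, -cx), axis=(0, 1))

    # Fresnel transfer function (constant phase factor omitted).
    f = np.fft.fftfreq(n, d=pitch)  # spatial frequencies in cycles per metre
    fy, fx = np.meshgrid(f, f, indexing="ij")
    transfer = np.exp(-1j * np.pi * wavelength * distance * (fx ** 2 + fy ** 2))
    return np.fft.ifft2(baseband * transfer)


def synthetic_hologram(phase, n=64, carrier=(16, 16)):
    """Hologram of a unit plane wave with a uniform phase offset (test data only)."""
    cx, cy = carrier
    y, x = np.mgrid[0:n, 0:n]
    reference = np.exp(-2j * np.pi * (cx * x + cy * y) / n)
    obj = np.exp(1j * phase) * np.ones((n, n))
    return np.abs(obj + reference) ** 2


# One frame: 532 nm light, 16 um pixels, 240 mm back-propagation as in the text.
frame = synthetic_hologram(phase=0.3)
wave = reconstruct_object_wave(frame, (16, 16), 8, 532e-9, 16e-6, -0.240)
recovered_phase = np.angle(wave)
```

Applying the same function to every frame of a recording and reading `recovered_phase` at one fixed pixel would give the temporal phase profile that the text treats as the sound signal.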


Fig. 6. Spectrograms of the five Japanese vowels measured by digital holography: (a) /a/ and /i/, (b) /u/ and /e/, and (c) /o/.
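A spectrogram of this kind can be computed as a short-time Fourier transform of the recovered phase signal. The NumPy sketch below is only an illustration: the paper does not state its window length or overlap, so `segment=256` and `hop=64` are assumptions, and the two tones in the example are synthetic placeholders rather than measured vowel data.

```python
import numpy as np


def spectrogram(signal, frame_rate, segment=256, hop=64):
    """Short-time power spectrum; returns (times, freqs, power[time, freq])."""
    window = np.hanning(segment)
    starts = np.arange(0, signal.size - segment + 1, hop)
    frames = np.stack([signal[s:s + segment] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(segment, d=1.0 / frame_rate)
    times = (starts + segment / 2) / frame_rate
    return times, freqs, power


# Synthetic signal sampled at 2000 frames per second:
# one second of a 300 Hz tone followed by one second of a 700 Hz tone.
frame_rate = 2000
t = np.arange(2 * frame_rate) / frame_rate
signal = np.where(t < 1.0, np.sin(2 * np.pi * 300 * t), np.sin(2 * np.pi * 700 * t))

times, freqs, power = spectrogram(signal, frame_rate)
dominant = freqs[np.argmax(power, axis=1)]  # strongest frequency in each time slice
```

The highest frequency on the `freqs` axis is half the frame rate, which matches the 1 kHz limit quoted for the 2000 frames-per-second sensor.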

Fig. 4. (a) Temporal phase profile at one position of the reconstructed phase image, (b) temporal phase profile after the data processing (hear sound file of Media 1), and (c) its spectrum distribution.
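The processing from Fig. 4(a) to Fig. 4(c), that is, subtracting the local average phase and then taking the Fourier transform, could look roughly like the following NumPy sketch. The paper only says that the "local average phase" is subtracted, so the moving-average filter and its 25-sample window are assumptions, as are the function names. The input is a synthetic 440 Hz tone with an added linear drift, sampled at 2000 frames per second.

```python
import numpy as np


def remove_local_average(signal, window):
    """Subtract a moving average of `window` samples (edges padded by reflection)."""
    pad = window // 2
    padded = np.pad(signal, pad, mode="reflect")
    kernel = np.ones(window) / window
    local_mean = np.convolve(padded, kernel, mode="valid")[: signal.size]
    return signal - local_mean


def peak_frequency(signal, frame_rate):
    """Frequency in Hz of the strongest non-DC component of a real signal."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / frame_rate)
    return freqs[1:][np.argmax(spectrum[1:])]


# Synthetic phase profile: 440 Hz oscillation plus a slow downward drift.
frame_rate = 2000
t = np.arange(frame_rate) / frame_rate  # one second of samples
raw_phase = 0.2 * np.sin(2 * np.pi * 440 * t) - 1.5 * t

detrended = remove_local_average(raw_phase, window=25)
raw_peak = peak_frequency(raw_phase, frame_rate)
clean_peak = peak_frequency(detrended, frame_rate)
```

In this synthetic case the drift dominates the raw spectrum at very low frequency, while the detrended signal peaks at the tone frequency. A moving average also slightly alters the tone amplitude, so a different detrending method may be preferable in practice.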

Fig. 5. Spectrogram of the wave reconstructed by digital holography.

Fig. 7. Spectrograms of the five Japanese vowels measured by microphone: (a) /a/ and /i/, (b) /u/ and /e/, and (c) /o/.

In a third experiment, we recorded a human voice speaking the word “love”. The word could be heard from the reconstructed waveform (hear the sound files of Media 2, 3, and 4; Media 3 is the sound file of Media 4 downconverted from a maximum frequency of 8 kHz to 1 kHz). The temporal signal reconstructed by digital holography is presented in Fig. 8.

Fig. 8. Spectrogram of the sound wave recorded while pronouncing the word “LOVE” (hear the sound file of Media 2; Media 3 and Media 4 are the sound files recorded by a microphone where the maximum frequencies are 1 and 8 kHz, respectively).

In conclusion, we have presented an optical voice recorder based on off-axis digital holography. To the best of our knowledge, this is the first demonstration of optical reproduction of a sound wave, especially a human voice. An advantage of the proposed method is that it can measure a sound wave even though it is far from the image sensor. The modulated field can be recovered as a phase distribution. This property could be used in remote sensing of shock waves in femtosecond laser processing, stellar explosions in the universe, and so on. However, when the sound wave is far from the image sensor, the spatial resolution of the reproduced sound wave becomes worse.

By using an image sensor with a frame rate of 2000 frames per second, the sound waves from a tuning fork vibrating at a frequency of 440 Hz and a human voice at frequencies up to 1 kHz were reconstructed successfully. The maximum frequency of the sound waves reconstructed by the proposed method was half of the frame rate of the image sensor. The fastest image sensor available with present technology can be operated at $10^7$ frames per second, giving a maximum sound-wave-recording frequency of 5 MHz. This is in the range of ultrasound waves. By extending the measurement frequency to the ultrasonic region, the proposed method can be used in many applications. We can also use the heterodyne technique to measure a high-frequency sound wave by modulating the reference wave as described in Ref. [1]. The proposed system uses only the high-speed image sensor. In future work, we will evaluate the detectable power of the sound wave, noise levels, and so on to assess the capability of the proposed system.

Part of this work was supported by a Grant-in-Aid for Exploratory Research within the Grant-in-Aid for Scientific Research (KAKENHI) program.

References
1. O. Lokberg, Appl. Opt. 33, 2574 (1994).
2. P. Zheng, E. Li, J. Zhao, J. Di, W. Zhou, H. Wang, and R. Zhang, Opt. Commun. 282, 4339 (2009).
3. H. Takei, T. Hasegawa, K. Nakamura, and S. Ueha, Jpn. J. Appl. Phys. 46, 4555 (2007).
4. K. Mizutani, M. Nemoto, T. Ezure, H. Masuyama, and K. Nagai, Jpn. J. Appl. Phys. 42, 3072 (2003).
5. J. W. Goodman and R. W. Lawrence, Appl. Phys. Lett. 11, 77 (1967).
6. U. Schnars and W. Juptner, Appl. Opt. 33, 179 (1994).
7. T. Kreis, Handbook of Holographic Interferometry (Wiley, 2005).
8. Y. Frauel, T. J. Naughton, O. Matoba, E. Tajahuerce, and B. Javidi, Proc. IEEE 94, 636 (2006).
9. T. C. Poon, Nat. Photonics 2, 131 (2008).
10. S. Murata, D. Harada, and Y. Tanaka, Jpn. J. Appl. Phys. 48, 09LB01 (2009).
11. B. Kemper and G. von Bally, Appl. Opt. 47, A52 (2008).
12. M.-K. Kim, SPIE Rev. 1, 018005 (2010).
13. T. Tahara, R. Yonesaka, S. Yamamoto, T. Kakue, P. Xia, Y. Awatsuji, K. Nishio, S. Ura, T. Kubota, and O. Matoba, IEEE J. Sel. Topics Quantum Electron. 18, 1387 (2012).
14. J. Rosen and G. Brooker, Nat. Photonics 2, 190 (2008).
15. T. Kakue, R. Yonesaka, T. Tahara, Y. Awatsuji, K. Nishio, S. Ura, T. Kubota, and O. Matoba, Opt. Lett. 36, 4131 (2011).
16. A. H. Benade, Fundamentals of Musical Acoustics (Oxford University, 1976).
