The Effects of Emotional Expression on Vibrato

*Christopher Dromey, *Sharee O. Holmes, †J. Arden Hopkin, and *Kristine Tanner, *†Provo, Utah

Summary: Objectives/Hypothesis. The purpose of this study was to investigate the effect of emotional expression on several acoustic measures of vibrato, including its rate, extent, and steadiness. We hypothesized that singing a passage with emotional content would influence these variables.
Study Design. This study used a within-subjects, repeated-measures design. Singer performance under different conditions was analyzed.
Methods. Ten graduate student singers (eight women, two men) completed a series of tasks including sustained sung vowels at several pitch and loudness levels, an assigned song that was judged to have relatively neutral emotion, and a personal selection that included passages of intense emotion. Vowel tokens were extracted from the recordings and averaged for each task. Dependent measures included the mean fundamental frequency (F0), mean intensity, frequency modulation (FM) rate, FM extent, and measures of FM rate and extent variability.
Results. The FM rate and extent were higher and the modulation variability was lower for the more emotional song than for the sustained vowels. Mean F0 and intensity were higher for the emotional song than for the neutral song.
Conclusions. Singing an emotional passage influences acoustic features of vibrato when compared with isolated, sustained vowels. The wider dynamic and pitch ranges for emotional passages only partly explain vibrato differences between emotional and neutral singing.
Key Words: Vibrato–Singing–Emotion–Vibrato extent–Vibrato rate–Modulation.

INTRODUCTION
Vocal vibrato has been the subject of research for many years. C.E. Seashore pioneered the use of acoustic measures to examine vibrato as early as the 1930s.
Seashore1 defined vibrato as “a periodic pulsation, generally involving pitch, intensity, and timbre, which produces a pleasing flexibility, mellowness, and richness of tone” (p. 623). Vocal vibrato is understood to be a natural feature of a well-balanced singing voice,2 contributing to a listener’s perception of the performer’s technical and artistic skill. Vibrato is also considered one of the means a singer may use to express emotion.1,3

Accepted for publication June 10, 2014.
From the *Department of Communication Disorders, Brigham Young University, Provo, Utah; and the †School of Music, Brigham Young University, Provo, Utah.
Address correspondence and reprint requests to Christopher Dromey, Department of Communication Disorders, Brigham Young University, 133 John Taylor Building, Provo, UT 84602. E-mail: [email protected]
Journal of Voice, Vol. 29, No. 2, pp. 170-181. 0892-1997/$36.00. © 2015 The Voice Foundation. http://dx.doi.org/10.1016/j.jvoice.2014.06.007

Vibrato acoustics and vocal beauty
On the surface it may seem simple to define a specific set of acoustic and physical measures that characterize a pleasing voice, but many factors are involved in beautiful singing. One factor identified in the literature is vibrato. Robison, Bounous, and Bailey4 compared the vocal ratings from a panel of expert judges to several acoustic measures of vocal performance. Singers with the highest ratings of vocal beauty were those whose vibrato occupied proportionally more of their total singing time. Other predictors of vocal beauty included cleanness of voice and adequate breath management. Several acoustic features of vibrato have been studied in detail since Seashore made his first observations, including the rate, extent, and periodicity of the vocal modulations.3,5 Vibrato rate is defined as the number of fundamental frequency (F0) and amplitude pulses per second.1 Frequency modulation (FM) and amplitude modulation (AM) both

contribute to vibrato rate. In vibrato, AM is largely an epiphenomenon that arises from the resonance-harmonic interaction or RHI.6,7 The RHI is the interaction between rising and falling harmonic frequencies (as a result of fluctuating F0) and the formant peaks in the vocal tract transfer function that determine the overall intensity of a sound. This interaction creates an involuntary modulation in amplitude with the modulation in F0. There is also a laryngeal component of AM that can be measured through electroglottography,8 although the target behavior for a singer is believed to be the modulation of fundamental frequency. Therefore, for the remainder of this article, FM of vibrato will be the focus of measurement and discussion, and AM will not be considered further. The rate of vocal vibrato for an individual is not fixed; it can be modified by the singer with conscious effort. However, singers have a natural speed of vibrato, and rate can only be changed modestly with volitional control.9 The average rate of vibrato has been reported to be within the 5–7 Hz range,1,3,5 with higher or lower rates depending on when the note occurs within a musical passage,10 or the amount of vocal training.11 Prame’s10 research showed that vibrato rate does not always remain the same throughout a sustained note. Several studies have also shown that with vocal training, the vibrato characteristics of an individual singer may change.12,13 With training, inexperienced singers with an unusually fast rate tend to gradually slow down closer to an average pace, and singers who begin with a slow rate tend to speed up and move closer to the average rate over a period of time.13 As defined by Seashore,1 vibrato extent is the distance between the crest and trough of the F0 trace, and is measured in semitones (ST). 
The average extent of vibrato is reported to lie between 0.41 and 1.58 ST.3 Vibrato extent has generally increased over the course of the past century,12 and it also increases in individual singers after a significant period of vocal training.13 In a study of the link between acoustics and listener ratings, excessive amplitude modulation, delayed onset of vibrato, and complete absence of vibrato all had negative

effects on perceptual measures of the voice.2,3 A moderate vibrato rate and extent are also important for professional singers, and a balance in rate and extent has been identified as important to vocal beauty.2,4 Diaz and Rothman14 concluded that extent was an important aspect of vibrato, one that was reflective of overall vocal quality. They suggested that periodicity of vibrato, or the regularity of the modulation, was also among the most significant indicators of vocal beauty. Although much has been learned about the contributions of vibrato rate, extent, and periodicity to the overall performance and beauty of singing, much remains to be discovered, including the effects of emotional expression on these measures of vibrato. Natural versus simulated emotion The role of emotions in the human experience has been investigated extensively. It has been suggested that the primary purpose of emotional reaction in any species is to either protect an organism from impending threats (negative valence emotions linked to fight or flight), or to increase the chances of both short and long-term survival of the species (positive valence responses linked to food or mating).15 Researchers differ in their views as to whether specific emotions such as anger, fear, joy, or surprise have their own autonomic hallmarks, or whether emotional arousal is less finely differentiated at a physiologic level. Kreibig’s thorough review of the data from 134 studies suggests that the body’s autonomic response tends to be specific for a given emotion.16 In theater, the ability to assume the role of a character while convincingly portraying emotion is a vital skill. An actor is both himself and a fictional character at once. This duality of an actor on stage is an essential element of theater. An actor makes what is artificial seem genuine, and evokes an emotion in the audience that is not necessarily felt by the actor, but by the character. 
The skilled simulation or feigning of emotion can be sufficient to invoke autonomic responses in a theatrical or operatic audience. Baltes, Avram, Miclea, and Miu17 found that experiencing music through listening, watching, and learning the plot of an opera led to physiologic changes in a viewer. It has been suggested that emotional responses to music may be linked to the mirror neuron system, which activates a subset of motor neurons when an action is observed, in much the same way those neurons would be active in actually performing a task. In the case of an artistic expression, an audience member experiences an emotion that is evoked not by the listener’s direct experience of an event, but by a potentially innate capacity, mediated by mirror neurons, to respond emotionally to a musical performance.18 Given the capacity of musical performance to arouse an affective response in the listener, it could be speculated that certain acoustic features might characterize singing that is more rather than less emotional. This reasoning has led to a number of studies of the connection between emotion and the physiology of human phonation. Emotion and the voice During his early research on vibrato, Seashore briefly addressed emotion as a contributor to vibrato characteristics. At that time

it was suggested that vibrato had been found throughout the ages in many cultures, and that it occurred during emotional singing, or singing with feeling. Although Seashore1 suggested an emotional contribution to the emergence of vibrato, there was no clear evidence at the time that emotion had a direct effect on the characteristics of vibrato. The respiratory system is the energy source for phonation, and because it is under autonomic and volitional control, it is reasonable to anticipate that emotional arousal may influence its activity, which in turn may have an impact on the voice. Foulds-Elliott and collaborators19 asked professional opera singers to sing in two ways. One involved technical singing, as the artist might use during warm-up or rehearsal, and the other was emotionally connected singing, or the type of singing that meaningfully communicates with an audience during a performance. The key respiratory difference was that the emotionally connected singing involved initiating phonation at a higher lung volume level and using more air. An examination of sound pressure levels showed that the dynamic range was greater in the emotional singing and more uniformly loud in the technical condition. The authors speculated that performers may rely on autonomic nervous system activation to allow a convincing performance, in much the same way that a photographer elicits a more natural smile from a subject by telling a joke than by asking for a smile.19 The work of Klaus Scherer in the study of emotion in communication has been extensive.
He succinctly summed up the rationale for using acoustics to understand the mechanisms of emotional expression: ‘‘If it is demonstrated that emotion can be correctly diagnosed from the voice, then clearly the emotions must differentially affect the vocalization mechanism and, in consequence, yield demonstrable differences in acoustic patterning of the resulting sound waves’’ (p.236–237).20 He acknowledged the ethical challenges associated with invoking true emotions in a laboratory, and noted that in most research into the acoustic features of individual emotions, actors have supplied the samples, raising the concern that the results may not reflect the features of a truly emotional experience. In discussing singing, Scherer noted that strong emotional involvement appears necessary for a successful performance, but that we cannot be sure whether these emotions were actually felt as opposed to skillfully feigned.20 The autonomic nervous system has been suggested as potentially responsible for functional voice disorders, where no organic pathology can explain the dysphonia. This reasoning led to a study of laryngeal muscle activation during a task known to invoke an autonomic response. Helou et al21 had their volunteers immerse a hand in ice water, and compared the activity levels of several intrinsic laryngeal muscles to cardiovascular indexes of autonomic activity. Along with the anticipated increases in heart rate and blood pressure, activation of vocal fold adductors, abductors, and tensors was observed, which lasted beyond the return to baseline of the cardiovascular measures after the ice water task was over. The authors concluded that the larynx is sensitive to autonomic nervous system activation, which in the present study may imply that vibrato characteristics could be affected by emotion.

A number of studies have been conducted that show the effects of emotion on specific aspects of the voice. Howes et al3 found that judges were able to correctly identify the emotion of a singer during a short cadenza, which confirmed that the singers were able to effectively portray the target emotion; however, this still does not explain the effect of the singers’ emotion on the acoustic features of their vibrato. In another study, rate, pitch height, and loudness were features of both speech and music that helped listeners to decode emotion.22 The work of Sundberg et al23 has shown that the extent of frequency modulation in vibrato may increase for an emotional as opposed to a neutral performance. The researchers in this study avoided a paradigm where the singer was asked to sing the same passage with several contrasting emotions, considering such a request to be at odds with the intended emotion embedded into a musical passage by the composer. In contrast, other researchers have attempted to distinguish between specific emotions (tenderness, happiness, sadness, and anger) by having performers sing the same short phrase using each of these emotions.24 In the latter study, none of the emotions affected the acoustic measures of vibrato (rate, extent, or steadiness); however, the modulation extent and steadiness increased when the voice was louder. In another report, Sundberg noted that during the performance of a song the vibrato rate was higher than in isolated, sustained tones.25 It could be speculated that the increased emotional engagement required for a performance, whether those emotions were real or simulated, could account for this increase in vibrato rate. The purpose of the present study was to investigate changes in several acoustic measures of vibrato as singers performed songs that were judged to be higher or lower in their emotional content.
We hypothesized that singing a more emotional passage would lead to changes in vibrato rate, extent, and variability. It has been reported that greater emotional arousal is associated with a higher vibrato rate,2 but it is unclear how the other variables might be affected. This research could potentially yield insights into singing physiology and the mechanics of modifying vibrato. This may have value for professionally trained singers in their quest for vocal beauty by shedding light on the connection between the expression of emotion and a balanced vibrato.

METHOD

Participants
Ten graduate student singers with high vocal competency ratings from the classical voice program at the School of Music at Brigham Young University participated in this study. All singers were rated between 3.0 and 4.05 on a scale of 1.0 to 5.0 for vocal technique. A score of 3.0 is required to pass a junior level recital and 4.0 for a graduate recital. A score of 5.0 represents professional caliber performance. Students who were already assigned a score of 3.0 or higher in the music program were invited to participate, and no further screening was conducted as part of the study. The mean age of the singers was 23.9 years (SD = 2.08). Eight were women and two were men. All participants reported good health and denied a

history of hearing or voice disorders. The experimenters also listened to each participant’s speaking voice and found it to be perceptually within normal limits. Each participant signed a consent form that was approved by the Brigham Young University Institutional Review Board. Equipment Recordings were made in a sound-treated studio. A sound level meter (Extech Instruments [Nashua, NH] 407736) was positioned 50 cm from the microphone to calibrate the audio signal for vocal intensity. A Neumann (Berlin, Germany) TLM 49 condenser microphone was placed inside a sound isolation shell, with the pickup pattern of the microphone facing away from the piano to reduce signal bleed. An Audient (Herriard, UK) 8024 Analog Recording Console, Grace (Lyons, CO) Model 201 two channel preamplifier, and ProTools 10 HD2 Recording System (Avid Technology Inc, Burlington, MA) were used for the recordings at 44.1 kHz. Procedure The singers warmed up their voices before the recording. A pianist accompanied each singer. They performed the following tasks in randomized order to minimize the likelihood of a sequence effect. Personal selection. Each participant sang a song in classical style that they had been practicing with their vocal instructor. The singers were invited at the time of recruitment to the study to be expressive with the emotion of their selection, which should have passages of both high and low levels of emotion. Emotion was not operationally defined for the singers; it was left to them to identify a passage that they judged to be emotionally expressive. At the time of recording, no further instruction regarding emotion was given. Thus, the emotions expressed in the songs were not standardized across participants. 
Previous work has reported no changes to vibrato in association with the targeted expression of specific emotions,24 and for the purpose of the present study, it was decided to have the singers express the emotion intended by the composer, regardless of its specific character or even valence. The participants brought a copy of their musical selection with annotations that they had made to indicate the emotional intensity of each part of the song. These markings guided segmentation during signal analysis to identify and extract for analysis only those vowels marked as having the highest emotion. During statistical analysis the dependent measures (reflecting vibrato rate, extent, and steadiness) for the other conditions (assigned song, isolated vowels) were compared with this emotional passage to evaluate changes that might be attributable to emotional arousal. The personal selection was considered to be a performance task, meaning it was most representative of singing in a concert, because the participant sang an entire song. It is important to acknowledge that the researchers did not attempt to directly influence the singers’ emotional state by any experimental means. Thus, it can be assumed that the performers were feigning the intended emotion required for each

part of their chosen song. The singers were not questioned following the recording regarding their level of perceived emotional arousal. Assigned song. Each singer sang the first 12 measures of the song ‘‘Caro mio ben’’ by Giordanni, which was chosen for three reasons. First, it was well known to the students in the program as a beginning level song. Secondly, this song is almost always sung as part of a skill development exercise without encouragement of emotional expression. Finally, it is relatively slow, with limited dynamic range and pitch range, and includes several prolonged vowels that would be suitable for modulation analysis. This short song was sung once at a self-selected comfortable pitch and loudness. This selection was considered a performance task (albeit without deliberate emotional engagement) along with the personal selection, because the participant sang multiple phrases within the context of a familiar song. Sustained vowels. Each participant sang the isolated vowels /ɑ/, /u/, and /i/ across pitch and loudness continua. They sang each vowel at a comfortable pitch, at three different loudness levels: low, medium, and high. They also sang each vowel at a comfortable loudness, at three different pitches: low, medium, and high. The purpose of these tasks was to allow the measurement of vibrato changes across pitch and loudness conditions, which might overlap those used in the songs, but without any emotional component. It was anticipated that passages with the highest emotional content might also be sung with a wide dynamic range; therefore, it was important to measure vibrato changes that might be attributable to elevated pitch or loudness of the voice aside from any contribution of emotional arousal. These tasks were assumed to have the most neutral emotion, and were also considered isolated phonation tasks, meaning they had no real context and they were the least representative of a concert performance. 
Tasks within the sustained vowel section were also randomized to minimize the likelihood of an order effect.

Data analysis
Digital sound files from the recording studio were transferred to a laboratory computer for analysis. The files were first segmented to extract isolated vowel tokens from the singing passages as individual 44.1 kHz wav files. These vowel tokens were opened with Praat acoustic analysis software (version 5.3.03; Paul Boersma and David Weenink, University of Amsterdam, The Netherlands) to generate an F0 contour, which was exported as a text file with values reported at 1 ms intervals. The wav audio files were also analyzed with custom Matlab software (MathWorks, Natick, MA, 2009) to create a root mean square (RMS) contour, also at 1 ms sample intervals. Acoustic measurements were derived from the recordings with a custom Matlab application, to compute variables reflecting vibrato rate, extent, and steadiness, as well as means of F0 and intensity. During vowel segmentation, the individual tokens were trimmed minimally, leaving the longest duration possible for each vowel; therefore, vowel tokens varied in length across all tasks. Instances of delayed onset of vibrato were also

included in the analysis. Vowel tokens were excluded from analysis when the piano intensity overcame the acoustic shielding and affected the voice recording to the extent that the analysis yielded a visibly contaminated F0 trace. Personal selection vowel tokens were chosen from the sections of highest emotion, as indicated by the singers’ markings on their music score. The acoustic measures of vibrato (reflecting rate, extent, steadiness, fundamental frequency, and vocal intensity) from the first 20 high-emotion vowel tokens from each singer were averaged to create the personal selection data set. The assigned song data set was created from the same five vowels for each participant. These vowels were chosen for their length, providing vowel tokens with comparable duration to those from the personal selection. Pitch and loudness summary values of the dependent measures were generated by averaging data from all /ɑ/, /u/, and /i/ vowels for each condition, because initial analysis revealed no differences between the three places of articulation. Figure 1 illustrates how the dependent variables were defined. FM rate. FM rate was measured in Hz, and was calculated through the use of a peak- and trough-picking algorithm that identified the temporal location of each FM cycle. The rate was computed as the inverse of the mean period of each cycle. FM extent. FM extent was measured in ST. This was calculated by taking the maximum value (peak) minus the minimum value (trough), averaged over all cycles. FM rate coefficient of variation (COV). This variable was a measure of the regularity of the FM rate. It was computed by dividing the standard deviation by the mean of the FM period of a vowel token (coefficient of variation), which was then multiplied by 100 to make the numbers more convenient to interpret. FM extent coefficient of variation. This variable was a measure of the regularity of the FM extent. 
It was computed by dividing the standard deviation by the mean of the FM extent for the modulation cycles within a vowel token, which was then multiplied by 100 to make the numbers more convenient to interpret. Mean F0. The mean F0 was measured in Hz. This was the average fundamental frequency during each vowel token. Mean intensity (dB). The intensity mean was measured in dB SPL at 50 cm. This variable was calculated as the average intensity during each vowel token.
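The definitions above can be illustrated with a minimal Python sketch. It is not the custom Matlab application used in the study: the function names, the simple local-extremum rule, and the synthetic test contour are all assumptions made for illustration. It takes an F0 contour sampled at 1 ms intervals, converts it to semitones, picks peaks and troughs, and derives FM rate, FM extent, and the two coefficients of variation as described in the text. A real recording would probably need smoothing before peak picking, which is omitted here.

```python
import math
from statistics import mean, stdev


def hz_to_semitones(f_hz, ref_hz):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)


def find_extrema(contour):
    """Return sample indices of local peaks and troughs (hypothetical rule:
    a sample higher/lower than its left neighbour and not exceeded on the right)."""
    peaks, troughs = [], []
    for i in range(1, len(contour) - 1):
        if contour[i - 1] < contour[i] >= contour[i + 1]:
            peaks.append(i)
        elif contour[i - 1] > contour[i] <= contour[i + 1]:
            troughs.append(i)
    return peaks, troughs


def vibrato_measures(f0_hz, dt=0.001):
    """Compute vibrato measures from an F0 contour sampled every dt seconds.
    Needs at least three peaks so that a standard deviation can be taken."""
    mean_f0 = mean(f0_hz)
    st = [hz_to_semitones(f, mean_f0) for f in f0_hz]
    peaks, troughs = find_extrema(st)
    # One FM period per pair of successive peaks, in seconds.
    periods = [(b - a) * dt for a, b in zip(peaks, peaks[1:])]
    # One extent per cycle: peak value minus the next trough value, in ST.
    extents = []
    for p in peaks:
        following = next((t for t in troughs if t > p), None)
        if following is not None:
            extents.append(st[p] - st[following])
    return {
        "fm_rate_hz": 1.0 / mean(periods),  # inverse of the mean period
        "fm_extent_st": mean(extents),
        "fm_rate_cov": 100.0 * stdev(periods) / mean(periods),
        "fm_extent_cov": 100.0 * stdev(extents) / mean(extents),
        "mean_f0_hz": mean_f0,
    }


def synthetic_contour(rate_hz=5.5, extent_st=1.0, centre_hz=440.0,
                      dur_s=2.0, dt=0.001):
    """Idealized sinusoidal vibrato, used only to exercise the sketch."""
    n = int(round(dur_s / dt))
    return [
        centre_hz * 2 ** ((extent_st / 2) * math.sin(2 * math.pi * rate_hz * i * dt) / 12)
        for i in range(n)
    ]


measures = vibrato_measures(synthetic_contour())
```

On the idealized contour, the recovered rate and extent should sit close to the values used to generate it, and both coefficients of variation should be near zero because every cycle is identical apart from the 1 ms sampling grid.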

Statistical analysis
Univariate repeated-measures analyses of variance (ANOVAs) were used to evaluate the statistical significance of changes in the dependent measures across the vocal task conditions. An initial analysis comparing the vowels /ɑ/, /u/, and /i/ showed no significant differences in the dependent measures. Because the data were comparable for all vowels, they were averaged for each pitch and loudness level before further analysis. Contrast tests within the ANOVA model compared the task with the highest level of emotion (the personal selection) with each of the other tasks.
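The logic of a one-way repeated-measures ANOVA can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the statistical software used in the study: the function names and the toy data are hypothetical, and the single-degree-of-freedom contrast is computed here as the square of a paired t statistic, which is one common way such contrasts are tested. The sketch shows how variance between singers is removed from the error term before the F-ratio is formed.

```python
import math
from statistics import mean, stdev


def rm_anova_f(scores):
    """One-way repeated-measures ANOVA.
    scores[s][c] is the value for subject s under condition c.
    Returns (F, df_condition, df_error)."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[c] for row in scores) / n for c in range(k)]
    subj_means = [sum(row) / k for row in scores]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    # Between-subject variance is removed from the error term.
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = k - 1, (n - 1) * (k - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error


def contrast_f(scores, a, b):
    """F (1, n - 1) for condition a versus condition b, taken as the
    square of the paired t statistic on within-subject differences."""
    diffs = [row[a] - row[b] for row in scores]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))
    return t * t


# Toy data (hypothetical): 3 singers x 3 tasks, last column = personal selection.
toy = [[1, 2, 3], [2, 3, 5], [3, 5, 6]]
f_main, df1, df2 = rm_anova_f(toy)
f_contrast = contrast_f(toy, 2, 0)

# Shifting one singer's scores by a constant leaves F unchanged, because
# each singer is compared only with their own performance across tasks.
shifted = [toy[0], [x + 100 for x in toy[1]], toy[2]]
f_shifted, _, _ = rm_anova_f(shifted)
```

The shifted example mirrors a data set in which some singers have much higher F0 values than others: the between-singer offset changes the subject sum of squares but not the F-ratio for the task effect.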

FIGURE 1. Frequency modulation (FM) and amplitude modulation (AM) traces for an individual vowel token, with software-derived peak markers that were used to calculate FM rate, FM extent, FM rate coefficient of variation (COV), and FM extent COV. Vertical axis units are Hz for FM and dB SPL at 50 cm for AM.

Although women’s speaking voices are generally about an octave higher than men’s, it was reasoned that it would be appropriate to analyze the F0 data for both men and women together because the repeated-measures ANOVA essentially tests each singer against their own performance across the tasks in the study. Large intersubject variance in a standard ANOVA would lead to high levels of error variance, but the repeated-measures computation accounts for this variance because the samples across the conditions are assumed to be correlated and not independent. The F0 range among a mixed group of singers will necessarily be larger than when men and women are considered separately; however, the statistical test still allows significant changes to be identified within singers across the levels of the independent variable. The statistical results are reported in their unaltered form, without explicit adjustments to minimize the potentially inflated risk for type I errors when multiple tests are conducted. Certain kinds of error reduction adjustments (such as Bonferroni) carry

with them the risk of increasing type II errors because they are overly conservative.26 All P values below 0.05 are reported in Tables 1 and 2, but the reader is encouraged to critically evaluate the relative strength of the results for each variable.

RESULTS
Figures 2 and 3 show means and standard deviations for the dependent variables for the pitch (Figure 2) and loudness (Figure 3) continua. The assigned song and personal selection are also presented for comparison with the sustained vowel tasks. Tables 1 and 2 show the F-ratios and P values for each of the statistically significant findings for the pitch and loudness continua, respectively.

FM rate
There was a significant main effect of vocal task on FM rate of vibrato. Figures 2 and 3 show an upward trend, with FM rate increasing from the neutral-emotion tasks to the more

TABLE 1. Inferential Statistics for the Dependent Measures in the Pitch Continuum, Including Main Effect and Contrast Analyses Against the Personal Selection Task

Variable       Main Effect       Low Pitch         Comf Pitch        High Pitch        CMB
               F        P        F        P        F        P        F        P        F        P
FM rate        5.475    0.002    26.336   0.001    29.263   0.000    13.183   0.005    -        -
FM extent      28.238   0.000    53.030   0.000    45.933   0.000    29.436   0.000    -        -
FM rate COV    8.043    0.000    6.453    0.032    -        -        9.606    0.013    -        -
FM ext COV     4.194    0.007    6.457    0.032    -        -        5.133    0.050    -        -
Mean f0        54.459   0.000    97.604   0.000    14.150   0.004    14.006   0.005    25.928   0.001
Mean dB        42.657   0.000    81.041   0.000    7.225    0.025    -        -        26.849   0.001

Abbreviations: CMB, Caro mio ben; comf, comfortable; ext, extent. A hyphen marks a contrast for which no statistically significant value was reported.

TABLE 2. Inferential Statistics for the Dependent Measures in the Loudness Continuum, Including Main Effect and Contrast Analyses Against the Personal Selection Task

Variable       Main Effect       Low Loud          Comf Loud         High Loud         CMB
               F        P        F        P        F        P        F        P        F        P
FM rate        8.708    0.001    17.053   0.003    31.840   0.000    54.636   0.000    -        -
FM extent      33.408   0.000    35.545   0.000    21.124   0.001    30.134   0.000    -        -
FM rate COV    7.200    0.000    -        -        -        -        -        -        -        -
FM ext COV     3.977    0.027    6.542    0.031    -        -        -        -        -        -
Mean f0        15.460   0.000    35.470   0.000    22.683   0.001    22.927   0.001    25.928   0.001
Mean dB        19.716   0.000    63.883   0.000    11.522   0.008    -        -        26.849   0.001

Abbreviations: CMB, Caro mio ben; comf, comfortable; ext, extent. A hyphen marks a contrast for which no statistically significant value was reported.

emotional task. According to the contrast analysis, there was no significant difference in FM rate between the neutral-emotion song and the high-emotion song. There were, however, statistically significant increases in FM rate from the sustained vowel tasks to the personal selection in both the pitch and the loudness continua.

FM extent In both pitch and loudness continua, there was a significant main effect of vocal task on FM extent. Figure 2 shows a general increase from the sustained vowel tasks to both performance tasks: the assigned song and the personal selection. Figure 3 shows the same pattern for the loudness continuum.

FIGURE 2. Mean and standard deviation of FM rate, FM extent, FM rate COV, FM extent COV, mean F0 and mean dB across all tasks, within the pitch continuum. CMB, Caro mio ben (neutral) song; PS, personal selection (high-emotion) song.

FIGURE 3. Mean and standard deviation of FM rate, FM extent, FM rate COV, FM extent COV, mean F0 and mean dB across all tasks, within the loudness continuum. CMB, Caro mio ben (neutral) song; PS, personal selection (high-emotion) song.

A contrast analysis revealed that the FM extent for the personal selection was significantly higher than for the sustained vowel low, comfortable, and high tasks for both pitch and loudness. The difference between the assigned song and the personal selection was not significant.

FM rate COV
A significant main effect of vocal task on FM rate COV was found for both pitch and loudness continua. A relative decrease in FM rate variability is notable in Figures 2 and 3. Through the contrast analysis, it was found that in the pitch continuum, low- and high-pitch tasks were significantly more inconsistent in FM rate than the personal selection; however, the comfortable pitch and assigned song tasks yielded no significant difference from the personal selection. In the loudness continuum, only a main effect was found, with no significant differences found between individual tasks.

FM extent COV
The main effect of vocal task on FM extent COV was statistically significant in both the pitch and the loudness continua.

Figure 2 shows a decrease in FM extent COV for the performance tasks. Figure 3 shows a similar pattern, with more stability for the comfortable loudness task than other sustained vowel tasks. The contrast analysis showed that in the pitch continuum, low- and high-pitch vowels were significantly more inconsistent than personal selection vowels. For the loudness continuum, only low loudness vowels were significantly more inconsistent than personal selection vowels. Mean F0 There was a statistically significant main effect of vocal task on mean F0. For the pitch continuum, Figure 2 shows a clear increase in F0 from low- to comfortable- and to high-pitch tasks, as would be anticipated. The assigned song mean F0 was between the low and comfortable pitch sustained vowel mean F0 values. The personal selection mean F0, however, was between the comfortable and high-pitch means. On the loudness continuum, Figure 3 shows consistency between all tasks with the exception of the personal selection, which has an increased mean F0. The contrast analysis confirmed a statistically significant difference between the personal selection and
all other tasks (sustained low, comfortable, high, and the assigned song) individually for both the pitch continuum and the loudness continuum. Mean dB For both the pitch and loudness continua, there were statistically significant main effects of vocal task on mean dB. A contrast analysis of the pitch continuum revealed that the low pitch and comfortable pitch sustained vowel tasks, and the assigned song, had a significantly lower mean dB than the personal selection. The high-pitch task, however, showed no significant difference in mean dB from the personal selection. For the loudness continuum, the low loudness and comfortable loudness conditions, and the assigned song, were significantly lower in mean dB than the personal selection. The high loudness task showed no significant difference in mean dB compared with the personal selection. DISCUSSION This study investigated the potential effects of emotional arousal on the acoustic features of vocal vibrato. On the basis of previous evidence that emotion can affect speech, song, and overall voice quality, it was anticipated that singing passages considered to be higher in emotional content might cause vocal vibrato to change in its rate, extent, and/or steadiness. It is important to recognize that emotional arousal is not the only factor that may have led to changes in vibrato in the present study. Physical or cognitive arousal may play a role in preparing a singer for performance, and thus in the present study, differences between vibrato in the isolated vowels and the songs may be attributable to factors other than real or feigned emotion. The data revealed that there were significant changes in vibrato as a function of vocal task; however, the extent to which the changes were due to the level of emotional arousal remains unclear. The results include two main trends, which were seen in Figures 2 and 3. 
First, there was a general increase in vibrato FM rate across tasks with presumably increasing emotional engagement. Second, there was an increase in FM extent from the isolated sustained vowel tasks to the tasks that involved songs. Further examination of the individual acoustic measures led to more detailed speculation about what may have contributed to these changes. FM rate FM rate is a key component of vocal vibrato. Several studies have shown that the average FM rate of vibrato is approximately 5–7 Hz.1,3,5 The average vibrato rate in this study was in the 5 Hz range, with the slowest vowel tokens around 4 Hz, and the highest reaching approximately 6 Hz. In Figure 2, FM rate steadily increased with each task across the pitch continuum, from low pitch to the personal selection. Because the pattern in FM rate differed from that of mean F0 across the pitch continuum and mean dB across the loudness continuum, the data suggest that the FM rate of vibrato was potentially influenced by emotional arousal in the task, rather than just changing as a consequence of a higher pitched or
louder voice. This inference is supported by the observation that although the difference in FM rate between the assigned song and personal song did not reach statistical significance, there was a visible increase in rate for the personal song. This would be consistent with the report of Ekholm et al2 of an increase in vibrato rate for more emotional singing. The finding that both of the songs in the present study had a higher vibrato rate than the isolated vowels is similar to the report from Sundberg.25 It could be inferred from this finding that singing a meaningful song activates the mechanism underlying vibrato to a different degree than the production of vowels in isolation. Although the assigned song was neutral in its intended emotional content, it could nevertheless be speculated that singing either song could engage the autonomic nervous system in such a way as to lead to a slightly faster vibrato. Titze et al27 have suggested a reflex resonance model of vibrato, which relies on muscle spindle afference and elevated feedback gains to generate the muscle tension modulations that result in vibrato. If the singers experienced increased autonomic activity for either song when compared with isolated vowels, it might be that this resulted in higher levels of muscle contraction, as reported in a previous study of autonomic activation.21 This could have influenced the timing and magnitude of the oscillations in the neural circuits responsible for vibrato. FM extent The FM extent of vibrato has previously been reported in the range of 0.41 to 1.58 ST.3 In this study, the average extent was about 1.5 ST, with a range from approximately 1.1 ST to 2.0 ST. The patterns for FM extent across the pitch and loudness continua are very similar, allowing the results to be considered together. Figures 2 and 3 show intriguing patterns of change for FM extent. First, there is a modest but steady increase in extent from the low- to the high-pitch and loudness conditions. 
For the sustained vowel tasks, the FM extent increased with mean F0, which was also associated with a dB increase. Second, and perhaps more significantly, there was a greater difference between the isolated vowels and the song tasks in the extent of vibrato. The personal selection showed almost no difference from the assigned selection, which suggested that the level of intended emotional arousal may not have had much of an effect on vibrato extent. Instead, the nature of the task had a significant impact on FM extent. It could be speculated, based on this finding, that FM extent is not tied to emotional arousal, but rather increases during performance, in contrast to the sustained, isolated vowel tasks that are not representative of concert performance. Sundberg25 reported that FM extent increased slightly with vocal loudness, but the current data reveal greater increases for the performance of a song than would be anticipated from intensity change alone, especially because the vibrato extent for the neutral song was higher, whereas the dB level was lower, than for the loudest isolated vowel. Thus, singing a song appears to involve factors that can influence vibrato extent that are missing in the production of isolated vowels, possibly the realistic nature of the task, because singers train to perform songs rather than sustained vowels. Anticipated differences in
the level of emotional arousal between the two songs do not appear to be influential in this context. FM rate COV Vibrato rate steadiness was measured in this study as FM rate COV, which indexes the consistency of the FM rate. In a previous study, vibrato rate periodicity was described as an important component of vocal beauty.11 The term periodicity usually includes both rate and extent measures to assess the overall steadiness of a sound in comparison to a sine wave. In this study, however, rate steadiness and extent steadiness were examined individually to isolate the timing and amplitude components of vocal vibrato. In a previous study, a steadier vibrato was favored by expert listeners over a less steady vibrato.4 Therefore, the examination of FM rate COV could give insights into the way that emotion affects the overall beauty of vibrato. In this study, the FM rate COV patterns across the pitch and loudness continua were found to be alike, as seen in Figures 2 and 3. In both graphs, the FM rate COV decreased slightly, indicating less unsteadiness, with an increase in the pitch and loudness for the sustained vowel tasks. This decrease was subtle compared with the substantial decrease in FM rate COV between the sustained vowels and the assigned song. The size of this change suggests that FM rate COV is affected by the performance nature of the task. The FM rate COV for the personal selection was likewise significantly lower than for the sustained vowel tasks. It is important to note, however, that the personal selection was slightly higher in FM rate COV than the assigned song. Although this difference was not statistically significant, there is a visible difference between the songs in Figures 2 and 3. The increase in unsteadiness from the assigned song to the personal selection may permit speculation that a higher level of emotional arousal increases FM rate inconsistency.
This may be linked to a previously identified relationship between fear, or anxiety, and a quivering voice.28 What is less clear is why sustaining isolated vowels would result in greater vibrato rate unsteadiness than singing a song. Although we do not have data on the singers’ perceptions during the different tasks, it is possible that because their training targets vocal beauty in performance, no such expectation is present for isolated vowels. Previous work has linked steadier vibrato to more positive ratings of vocal beauty,4 and singers may naturally produce a steadier vibrato in association with performance of a song, regardless of its emotional content. FM extent COV FM extent COV is the measure of inconsistency in the width of the vibrato extent during each vowel token. Inferences about this vibrato characteristic mirror those of the FM rate COV. A slight decrease in extent variability was noted between the low pitch and loudness tasks and the high-pitch and loudness tasks within the sustained vowels, showing an inverse relationship between pitch/loudness and FM extent COV. The most significant difference was between the isolated vowel tasks and the
performance tasks, with FM extent COV decreasing significantly for the performance tasks. These results, when applied to real performance, could suggest that the FM extent is more stable during performance and less stable during vocal tasks that are not part of a singer’s concert performance, such as warm-up activities. With regard to emotion, FM extent COV did not appear to be affected by the presumed increased emotional arousal during the personal selection in the way that FM rate COV was affected. There was no noticeable difference between the extent COV for the neutral-emotion assigned song and the high-emotion personal selection. It is not possible on the basis of the present data to infer mechanistic differences in the way singers may regulate the steadiness of FM in its rate as opposed to its extent. Previous work has suggested a possible trade-off between rate and extent in vibrato,29 but this does not allow confident conclusions about the separate contributions of regularity in rate and extent to the overall steadiness of the modulation.
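
As an illustration of how rate steadiness and extent steadiness can be quantified separately, the following Python sketch computes a coefficient of variation (COV) for cycle-by-cycle FM rate and FM extent. It is not intended to reproduce the analysis procedure used in this study. The function names, the peak-to-trough definition of extent in semitones, the expression of COV as a percentage of the mean, and all example values are assumptions made only for illustration.

```python
import math
import statistics


def hz_to_semitones(f_hz, ref_hz):
    """Interval from ref_hz up to f_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def cycle_measures(peak_times_s, peak_f0_hz, trough_f0_hz):
    """Per-cycle FM rate (Hz) and FM extent (ST) for one vowel token.

    Assumed inputs (hypothetical, e.g. marked on an F0 contour):
      peak_times_s  times of successive F0 peaks; N + 1 peaks bound N cycles
      peak_f0_hz    F0 at the peak of each of the N cycles
      trough_f0_hz  F0 at the trough of each of the N cycles
    """
    # Rate of each cycle is the reciprocal of its peak-to-peak period.
    rates = [1.0 / (t1 - t0) for t0, t1 in zip(peak_times_s, peak_times_s[1:])]
    # Extent is taken here as the peak-to-trough F0 excursion in semitones.
    extents = [hz_to_semitones(p, t) for p, t in zip(peak_f0_hz, trough_f0_hz)]
    return rates, extents


def cov_percent(values):
    """Coefficient of variation: sample SD divided by the mean, in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical token: five F0 peaks delimit four vibrato cycles.
peak_times = [0.00, 0.19, 0.37, 0.56, 0.74]
peak_f0 = [452.0, 454.0, 451.0, 453.0]
trough_f0 = [428.0, 427.0, 430.0, 428.0]

rates, extents = cycle_measures(peak_times, peak_f0, trough_f0)
print("mean FM rate (Hz):", round(statistics.mean(rates), 2))
print("mean FM extent (ST):", round(statistics.mean(extents), 2))
print("FM rate COV (%):", round(cov_percent(rates), 2))
print("FM extent COV (%):", round(cov_percent(extents), 2))
```

Under these assumptions, a lower COV corresponds to a steadier modulation, and computing it separately for rate and extent mirrors the separation of the timing and amplitude components of vibrato described above.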

Mean F0 Mean F0 for the personal selection was higher than for the assigned song in the pitch continuum (Figure 2) and higher than all other conditions in the loudness continuum (Figure 3). Because the singers identified sections of the musical score representing high and low emotion, and the experimenter selected vowels from the sections marked as higher in emotion, the results necessarily reflected a high mean F0 for these high-emotion vowels. This would be consistent with the use of higher pitch by the composer as one element of emotional expression, along with other factors, such as loudness, tempo, and the choice of words in the song. The assigned song mean F0 was between the low and comfortable pitch sustained vowel tasks, and the personal selection mean F0 was between the comfortable and high-pitch tasks, further suggesting the importance of elevated pitch in emotional expression. The fundamental frequency of each vowel token was measured to examine the possibility that mean F0 might be a causal factor for changes in the dependent measures of vibrato rate, extent, and steadiness. In the graphs and statistical analyses, the patterns in mean F0 for each task were compared with patterns in the variables reflecting vibrato rate, extent, and steadiness for each task. Figure 2 shows the mean F0 for tasks of the pitch continuum, in which the mean F0 for sustained vowel tasks was intentionally modified. This graph also shows the mean F0 of the assigned song and the personal selection. Because the personal selection and the assigned song were both within the F0 range of the sustained vowel tasks, it appears unlikely that the increases in the rate, extent, and steadiness of the modulation with high emotion were simply a function of mean F0.
If mean F0 for the personal selection or the assigned song had been out of the range of sustained vowel mean F0 for low- to high-pitch tasks, the impact on these vibrato indexes might have simply been a product of increasing fundamental frequency above this range. However, this was not the case, as mean F0 was within the range that the singers produced during sustained vowel tasks for the pitch continuum.

Mean dB The mean intensity of the vowel tokens was examined to determine whether changes in the rate, extent, and steadiness of vibrato might be attributable to a difference in intensity as opposed to the level of emotional arousal. In Figure 2, the mean dB for the personal selection was higher than for all conditions except the high-pitch task, including the assigned song. Thus at first glance, it would appear that the high dB levels associated with emotional expression in the personal selection may have contributed to changes in the rate and extent of vibrato, and thus it would be difficult to disentangle the effects of emotional arousal and vocal loudness. However, a closer examination of the data in Figure 3 reveals that vibrato rate and extent did not climb stepwise with loudness across the dB continuum, implying that emotional expression in vibrato may rely on more than a simple increase in vocal intensity, although highly emotional passages of singing tend to be performed with a louder voice. The general trend was for mean dB to follow mean F0: when there was an increase in mean F0, there was also a comparable increase in mean dB. This finding is consistent with the physiologic explanation that a higher subglottic pressure is needed to overcome the increased resistance of stiffer vocal folds during higher notes.30 Differences between sustained vowels and songs The purpose of including sustained vowels at several levels of pitch and loudness was to learn whether these fundamental adjustments to laryngeal function would have consistent effects on the rate, extent, and steadiness of vibrato. It was reasoned that this knowledge would be important to give context to the interpretation of any vibrato changes when singers sang more emotionally involved passages of a song. 
In other words, because emotional expression in singing can involve increases in pitch and loudness,24 knowing the effects of these changes in the absence of emotional engagement may help to isolate or at least more clearly interpret the effects of emotional expression on vibrato. As previously mentioned during the discussion of the individual acoustic measures, there were few differences in vibrato rate, extent, or steadiness between the two songs. This could be interpreted to mean that although true emotions can and do affect speech,20 the expression of emotion by a singer may not have substantial effects on vibrato, at least in the context of a recording session in a studio. This finding, however, does not clarify whether singing while experiencing a genuine emotion, along with its associated autonomic responses,16 might affect vibrato by means of increased muscle activation,21 especially in the presence of a responsive audience. Most of the significant differences in the results were between the songs and the isolated vowels, and a number of factors may have contributed to this finding. The linguistic content of the songs may have contributed a cognitive load to the task that was not present for isolated vowels. The completion of linguistic tasks concurrently with sentence repetition has previously been reported to influence measures of articulatory stability.31 It is possible that singing the words of a song likewise influences laryngeal behavior when some of the brain’s neural resources are dedicated to language. Another possible explanation is that the act of articulating words may alter the activity of the larynx. 
Studies of laryngeal-articulatory coupling32,33 have provided evidence that the vocal tract subsystems are far from isolated in their function, and that biomechanical and/or neural linkages may be responsible for adjustments to one component leading to changes in another.34 Vocal modulation on the surface seems like a relatively simple phenomenon that is brought about by rhythmic adjustments to the level of cricothyroid muscle activation.35 However, the complexity of the control circuitry of the lungs and larynx during phonation means that several sources of neural input can influence the behavior of the lungs and vocal folds. These include volitional adjustments to the expiratory muscles and those that control the position, length, and tension of the folds, and reflexive responses based on sensory signals from the upper airway, and also the influence of the autonomic nervous system in response to emotional arousal. In the act of singing with emotion—either genuine or feigned—it would be anticipated that a blend of signals from different components of the central nervous system would influence the muscles that control phonation. Thus the factors influencing vibrato are complex, making it difficult to interpret evidence of change in a straightforward manner. Limitations of the study and directions for future research A number of assumptions were made in the design of the present study that limit the strength of the inferences we may draw from the results. Foremost was the belief that when singing a passage recognized as emotionally expressive, a singer would experience autonomic nervous system activation that would influence the physiology of singing.
Previous work that has examined respiratory behaviors during emotionally engaged singing19 and laryngeal responses to autonomic nervous system activation21 would support the hypothesis that for a listener to perceive emotion in a song, there must be features of the sound production that differentiate it from a more emotionally neutral performance. However, because singers, like actors, may be highly skilled at feigning emotions, it is entirely possible that no autonomic changes occurred as singers performed the personal selection in the present study, even if the acoustics reflected a convincing portrayal of an emotion.20 This may be one reason why the acoustic indexes of vibrato did not differ between the two songs, although mean F0 and mean dB were significantly higher for the personal selection. Furthermore, the personal selection song was the only condition in the study under which the singers would be anticipated to perform with intense emotional arousal. However, singing in the recording studio as part of an experiment would only poorly simulate the experience of performing before a large audience. Thus, the personal selection may have been more representative of a performance practice session, because the engagement with a live audience was missing. A further concern about the personal selection was that the length was greater than for the
assigned song, and the length also differed across singers because each chose a different song. The length of the song may have influenced the singer’s capacity to maintain a given intensity of emotional expression, and may thus have affected the results. One way to understand links between emotion and singing more fully would be through an examination of the physiological changes in the singer while performing with emotion during a live stage event. Relevant measures could include cardiac, electrodermal, or vascular measures such as cardiac interbeat intervals, skin conductance level, diastolic blood pressure, and mean arterial pressure. These measures have been used in previous studies to assess physiological changes in individuals while they listen to emotional operatic music,17 and could potentially be adapted to assess the emotional arousal of the singer during performance. This type of study could give a clearer understanding of whether singers genuinely experience emotions during a performance, or whether they are instead highly skilled at simulation, having practiced the emotional song so many times that they need not experience the actual emotion during performance to convincingly evoke it in the audience. A recent study of vocal performance students revealed that the psychological stress associated with a jury examination led to increases in heart rate, but there were divergent effects on singing accuracy depending on the training level of the singer.36 Because the vowels extracted for analysis in the more emotionally expressive personal selection came from a different song for each performer, vowel segment durations were not controlled for during analysis.
Previous work has shown that FM extent can be lower for longer vowels,37 and also that FM rate can increase toward the end of a longer vowel.10 Because the FM rate and extent in the present study were measured as the average across vowel segments of varying duration, potentially important differences in vibrato rate and extent for vowels of different length were missed. The degree to which we can generalize from the present study was limited because there were only 10 participants— eight females and two males. In the future, a larger number of participants with approximately equal representation of men and women might yield results that are easier to interpret, particularly with regard to any differences between males and females in the influence of emotional arousal on vibrato. In the present study it was difficult to directly compare the personal selections with other tasks because the personal selection was different for each participant, although all other tasks were completed in the same way for each singer. A possible solution in future research would be to have all participants sing the same high-emotion song. In this study, high-emotion and neutral-emotion were the only two categories used to describe the emotion in the singing tasks. In future studies, the type of emotion could be further examined in several ways, including comparisons of positive and negative emotions, or specific emotions such as anger, fear, pride, sadness, and so on. A more specific method of classifying emotion may lead to a clearer understanding of the mechanisms by which emotions influence the singing voice,
particularly given Kreibig’s report of different physiologic responses for specific emotional states.16 CONCLUSION The purpose of this study was to learn whether the intensity of emotion expressed by a singer during performance would influence the acoustic characteristics of their vibrato. In spite of the limitations identified previously, the results not only link certain aspects of vibrato to emotional expression, but also to singing in performance as opposed to the production of isolated vowels. Given the improvements in vibrato steadiness for the more emotional passages, the importance of emotional engagement in a performance appears worthy of further consideration in the pursuit of vocal beauty. Acknowledgment We express our appreciation to the singers and accompanists who participated in this study. We are also grateful for the financial support provided by the David O. McKay School of Education at Brigham Young University. This manuscript is based on the master’s thesis research of the second author. REFERENCES 1. Seashore C. The natural history of the vibrato. Proc Natl Acad Sci U S A. 1931;17:623–626. 2. Ekholm E, Papagiannis GC, Chagnon FP. Relating objective measurements to expert evaluation of voice quality in Western classical singing: critical perceptual parameters. J Voice. 1998;12:182–196. 3. Howes P, Callaghan J, Davis P, Kenny D, Thorpe W. The relationship between measured vibrato characteristics and perception in Western operatic singing. J Voice. 2004;18:216–230. 4. Robison CW, Bounous B, Bailey R. Vocal beauty: a study proposing its acoustical definition and relevant causes in classical baritones and female belt singers. J Sing. 1994;51:19–30. 5. Corso JF, Lewis D. Preferred rate and extent of the frequency vibrato. J Appl Psychol. 1950;34:206–212. 6. Horii Y, Hata K. A note on phase relationships between frequency and amplitude modulations in vocal vibrato. Folia Phoniatr. 1988;40:303–311. 7. Horii Y. 
Acoustic analysis of vocal vibrato: a theoretical interpretation of data. J Voice. 1989;3:36–43. 8. Dromey C, Reese L, Hopkin JA. Laryngeal-level amplitude modulation in vibrato. J Voice. 2009;23:156–163. 9. Dromey C, Carter N, Hopkin A. Vibrato rate adjustment. J Voice. 2003;17: 168–178. 10. Prame E. Measurements of the vibrato rate of ten singers. J Acoust Soc Am. 1994;96:1979–1984. 11. Mendes AP, Rothman HB, Sapienza C, Brown WS Jr. Effects of vocal training on the acoustic parameters of the singing voice. J Voice. 2003; 17:529–543. 12. Ferrante I. Vibrato rate and extent in soprano voice: a survey on one century of singing. J Acoust Soc Am. 2011;130:1683–1688. 13. Mitchell HF, Kenny DT. Change in vibrato rate and extent during tertiary training in classical singing students. J Voice. 2010;24:427–434. 14. Diaz JA, Rothman HB. Acoustical comparison between samples of good and poor vibrato in singers. J Voice. 2003;17:179–184. 15. Lang PJ, Bradley MM. Emotion and the motivational brain. Biol Psychol. 2010;84:437–450. 16. Kreibig SD. Autonomic nervous system activity in emotion: a review. Biol Psychol. 2010;84:394–421. 17. Baltes FR, Avram J, Miclea M, Miu AC. Emotions induced by operatic music: psychophysiological effects of music, plot, and acting: a scientist’s tribute to Maria Callas. Brain Cogn. 2011;76:146–157.

18. Molnar-Szakacs I, Overy K. Music and mirror neurons: from motion to ‘e’motion. Soc Cogn Affect Neurosci. 2006;1:235–241. 19. Foulds-Elliott SD, Thorpe CW, Cala SJ, Davis PJ. Respiratory function in operatic singing: effects of emotional connection. Logoped Phoniatr Vocol. 2000;25:151–168. 20. Scherer KR. Expression of emotion in voice and music. J Voice. 1995;9:235–248. 21. Helou LB, Wang W, Ashmore RC, Rosen CA, Abbott KV. Intrinsic laryngeal muscle activity in response to autonomic nervous system activation. Laryngoscope. 2013;123:2756–2765. 22. Ilie G, Thompson WF. A comparison of acoustic cues in music and speech for three dimensions of affect. Music Percept. 2006;23:319–329. 23. Sundberg J, Iwarsson J, Hagegard H. A singer’s expression of emotions in sung performance. STL-QPSR. 1994;35:81–92. 24. Guzman MA, Dowdall J, Rubin AD, Maki A, Levin S, Mayerhoff R, Jackson-Menaldi MC. Influence of emotional expression, loudness, and gender on the acoustic parameters of vibrato in classical singers. J Voice. 2012;26:675.e5–675.e11. 25. Sundberg J. Acoustic and psychoacoustic aspects of vocal vibrato. STL-QPSR. 1994;35:45–68. 26. O’Keefe DJ. Searching for a defensible application of alpha-adjustment tools. Hum Commun Res. 2003;29:464–468. 27. Titze IR, Story B, Smith M, Long R. A reflex resonance model of vocal vibrato. J Acoust Soc Am. 2002;111(5 pt 1):2272–2282.

28. Merritt L, Richards A, Davis P. Performance anxiety: loss of the spoken edge. J Voice. 2001;15:257–269. 29. Sundberg J, Thörnvik MN, Söderström AM. Age and voice quality in professional singers. Logoped Phoniatr Vocol. 1998;23:169–176. 30. Solomon NP, Ramanathan P, Makashay MJ. Phonation threshold pressure across the pitch range: preliminary test of a model. J Voice. 2007;21:541–550. 31. Dromey C, Benson A. Effects of concurrent motor, linguistic or cognitive tasks on speech motor performance. J Speech Lang Hear Res. 2003;46:1234–1246. 32. Dromey C, Nissen S, Roy N, Merrill RM. Articulatory changes following treatment of muscle tension dysphonia: preliminary acoustic evidence. J Speech Lang Hear Res. 2008;51:196–208. 33. Dromey C. Laryngeal articulatory coupling in three speech disorders. In: Van Lieshout P, Maasen B, Terband H, eds. Speech Motor Control: New Developments in Basic and Applied Research. Oxford, UK: Oxford University Press; 2010:283–296. 34. Cookman S, Verdolini K. Interrelation of mandibular laryngeal functions. J Voice. 1999;13:11–24. 35. Horii Y. Frequency modulation characteristics of sustained /a/ sung in vocal vibrato. J Speech Hear Res. 1989;32:829–836. 36. Larrouy-Maestri P, Morsomme D. The effects of stress on singing voice accuracy. J Voice. 2014;28:52–58. 37. Prame E. Vibrato extent and intonation in professional Western lyric singing. J Acoust Soc Am. 1997;102:616–621.
