Why and how should we study infant cry?

International Journal of Pediatric Otorhinolaryngology, 24 (1992) 145-159
© 1992 Elsevier Science Publishers B.V. All rights reserved 0165-5876/92/$05.00

PEDOT 00808

H.S. Gopal and Sanford E. Gerber
Department of Speech and Hearing Sciences, University of California, Santa Barbara, CA (USA)

(Received 5 March 1991)
(Revised version received 25 November 1991)
(Accepted 29 November 1991)

Key words: Infant; Cry; Acoustical analyses

Abstract

The study of the acoustics of infants’ vocalizations has important implications for the study of infant development and for clinical prediction. We suggest procedures that enhance the probability of discriminating among sub-glottal, periglottal, and supraglottal sites of pathology. With these procedures, then, it should become possible to contribute to diagnosis and therefore to treatment.

Introduction

The general answer to why we should study infant cry, at least one which accords with our view, was stated by Lieberman [11]: “the biological substrate of human speech involves an interplay between biological mechanisms that have other vegetative functions and neural and anatomical mechanisms that appear to have evolved primarily for their role in facilitating human vocal communication” (p. 29).

In the same volume, Boukydis [1] considered crying “to be a gradation of a pre-verbal distress signal” (p. 210). His notion, evidently, is that crying and its perception constitute an interpersonal event. Although this is undoubtedly true, for us it is not the primary reason why we should study infant cry or other vocalizations. For us, the vocalizations of infants are to be studied as phonatory-respiratory events including the necessary neural substrates. As Hollien [7] observed, cries relate to some physiological event or condition as well as to behavioral events or states. What follows is a review of our experience to date and our perspective of what may be accomplished by the analysis of babies’ cries.

Correspondence to: H.S. Gopal, Department of Speech and Hearing Sciences, University of California, Santa Barbara, CA 93106-7050, USA.

Presented at the Pacific Voice Conference, San Francisco, 1990.

Fig. 1. Sagittal sections of different species from domestic dog to adult human. (Source: Jeffrey T. Laitman, The evolution and development of the human upper respiratory tract. In: R.J. Ruben, T.R. van de Water, and E.W. Rubel (Eds.), The Biology of Change in Otolaryngology, Amsterdam: Excerpta Medica. 1986.)

Parry and Quattromani [19] described the cries of infants who have airway obstructions, and then said that radiography and bronchoscopy can confirm the diagnosis. But crying and other vocalizations are signals which also can be used to evaluate neurorespiratory (and phonatory) function. It is for this reason that there has been so much interest in the cries of high risk newborns (see, e.g., Ref. 26). In fact, we had earlier claimed that evaluation of phonatory behavior “may assist in the identification of ... neonates who have airway obstruction and/or inadequate neuromotor control of respiration” (Ref. 21, p. 1).

One of the most important laws of biology, as we all know, is that ontogeny recapitulates phylogeny. Consider, therefore, Fig. 1 [10]. Beginning at the upper left and proceeding left to right, we see first the domestic dog (A), the domestic cat (B), spider monkey (C), chimpanzee (D), a newborn human (E), and a 58-year-old human (F). Laitman [10] pointed out the high position of the larynx and the fact that the epiglottis is in contact with the velum in all specimens except the human adult. The phylogeny is evident, but also consider the ontogeny as displayed in Fig. 2 [HI. Observe that the infant’s face is small with respect to head size. The craniofacial ratio decreases from 8:1 in infancy to 2.5:1 in adulthood. These basic anatomical facts must lead us to some acoustical expectations, viz., we need first to examine the acoustics of vocalizations of normal infants. Only with this knowledge can we know if the phonatory output of any given infant is what we would like it to be.

Fig. 2. Premature human infant and adult skulls. (Source: Becky L. McGraw and Randolph R. Cole, Pediatric maxillofacial trauma, Arch. Otolaryngol. Head Neck Surg., 116 (1990) 41-45.)

Consider the consequences of these anatomical differences. A major characteristic of the adult vocal tract is the right-angle configuration it forms at the craniovertebral junction (refer to Fig. 1). The oral cavity lies along a horizontal axis aligned with the base of the cranium, whereas the pharyngeal and laryngeal cavities are along a vertical axis aligned with the spinal cord. This makes the adult vocal tract a cavity that is sharply bent around the oropharynx. This sharply bent configuration is absent in every other animal species as well as in the human infant. In the infant under six months of age, there is only a gradual bending of the oropharyngeal cavity. The infant configuration is due largely to the fact that the epiglottis is positioned very high in the oropharyngeal tract such that it is in contact with the velum. Additionally, the larynx is also positioned quite high, somewhere around the level of C2 (the second cervical vertebra). In contrast, the adult larynx and the epiglottis are positioned much lower, i.e., close to C5. This lowered position of the epiglottis and the larynx provides for much larger oropharyngeal and pharyngeal cavities. By adulthood, the epiglottis and the velum are well separated and no longer in contact with each other. The descent of the larynx, which takes place around 4 to 6 months of age [9,21], produces the sharp bending of the vocal tract in the human adult.

Knowing why the adult and infant configurations differ, what is the significance of this difference? One major consequence is that the infant configuration allows for simultaneous feeding and respiration. Infants have no control over the soft palate; therefore, they are obligatory nose breathers. Because the epiglottis and the velum are juxtaposed, infants can continue to breathe while feeding. The larynx needs to be closed off by the epiglottis only during the brief moments of a swallow.

Another major consequence, and a more important one from the standpoint of speech production, is that the lack of control of the velum impacts the repertoire of possible speech sounds. First and foremost, the infant is unable to produce consonant sounds. Production of stops, fricatives and affricates requires sufficient amounts of air pressure in the oral cavity. In the human adult, this is made possible by sealing off the nasal passage and narrowing the constriction through which air flows in the oral cavity. In the infant, a failure to seal the nasal cavity results in a loss of air pressure; therefore, production of most consonants is compromised. Second, the distinction between nasal and non-nasal sounds is lost. Generation of nasal vs. non-nasal sounds requires the ability to control the soft palate at will. In the absence of this control, most vocalic sounds, even vowels, have a strong nasal component [8,9]. Thus, the developmental anatomy contributes significantly to the type of speech sounds possible in the human infant. This information is of great benefit to us.
First, it tells us that, in normal infants, we would expect primarily vocalic (and nasalized) sounds. In our acoustical analyses of cries, we would expect a large amount of periodicity or regularity when dealing with normal cries. Hence, a great degree of irregularity (or aperiodicity) should warn us that something is amiss, e.g., a lack of normal control, a narrowing of the airway, etc. Secondly, these analyses may provide a way to distinguish airway obstructions below the nasopharyngeal region from those within the nasopharyngeal space. If normal infants’ cries have a high degree of nasality due to lack of control of the soft palate, absence of such nasality could be an indication of nasal passage obstruction.

With this objective in mind, a number of investigators have employed acoustical analyses of various sorts on the cries of both normal and non-normal infants [4,12,13,16-18,22-25]. For example, for over 30 years, several Finnish and Swedish researchers [12,13,16-18,22-25] have used the sound spectrograph and conducted systematic acoustical analyses of cries. They have provided large amounts of acoustical data on cries of normal infants as well as on cries of several different types of non-normal infants. This vast body of research has provided much of the justification and inspiration for using acoustical analyses of cries as an additional diagnostic tool in clinical pediatrics.

More recently, sophisticated and automated computer analyses of cries have been undertaken with the same objective in mind. For example, Golub and Corwin [4], employing sophisticated acoustical techniques, analyzed the cries of 87 infants. They extracted 88 acoustic features and, by a grouping of some combination of these features, they designed a set of eight cry tests which was based on their model of cry production. They reported that a majority of the normal cries (about 82%) had none of the abnormal cry features, whereas the majority of the cries from the non-normal infants evidenced one or more abnormal features. These newer, automated, sophisticated analyses provide increasing promise for cry analysis as a powerful non-invasive tool in the future. However, more studies are needed to establish standardized procedures and templates for acoustical analyses of cries before the true potential of cry analysis may be evaluated and harnessed. To reiterate our emphasis, we first need to examine the acoustics of vocalizations of normal infants in a systematic way. This will help us substantially in understanding the development of speech production as well as in our attempts to use infant cry analysis for diagnostic purposes.
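The expectation stated in this section, that normal cries are largely periodic and that marked aperiodicity is a warning sign, can be illustrated with a simple measurement. What follows is only a minimal sketch and not a procedure taken from any of the studies cited: it assumes a single short frame of a digitized cry and uses the peak of the normalized autocorrelation as a periodicity index. The function name `periodicity` and the fundamental frequency search range (`f0_min`, `f0_max`) are hypothetical, illustrative choices.

```python
import numpy as np

def periodicity(frame, sample_rate, f0_min=200.0, f0_max=1000.0):
    """Return (strength, f0_estimate_hz) for one frame of a recording.

    strength is the peak of the normalized autocorrelation within the lag
    range implied by [f0_min, f0_max] (assumed, illustrative limits).
    Values near 1 suggest a strongly periodic frame; values near 0 suggest
    an aperiodic, noise-like frame.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()                      # remove any DC offset
    energy = np.dot(x, x)
    lag_min = int(sample_rate / f0_max)
    lag_max = min(int(sample_rate / f0_min), len(x) - 1)
    if energy == 0.0 or lag_max <= lag_min:
        return 0.0, 0.0                   # silent or too-short frame
    # Autocorrelation at non-negative lags, normalized so that ac[0] == 1.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:] / energy
    best = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return float(ac[best]), sample_rate / best
```

In use, one would call `periodicity(frame, sample_rate)` on successive frames of a cry and look at how often the strength falls well below that of a reference set of normal cries. Any threshold would have to be established empirically; none is implied here.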

Hypotheses

In embarking upon our work on infants’ vocalizations, we established some hypotheses. These hypotheses, in turn, could lead us to some fulfilled wishes.

The first hypothesis is that some infants don’t sound like some other infants. This, we trust, is self-evident. Those of us who spend time in newborn nurseries have had the experience of a nurse or other care giver commenting that “that kid doesn’t sound right” or “this one has a funny cry.” They are usually right, but “a funny cry” is not diagnostically useful. It does tell us, though, that these care givers recognize the cry of a well baby, and know when they hear something else. For us, this means that acoustical analyses of infant cries may help to separate the normal babies from the non-normal ones.

The second hypothesis is that this recognizably unusual cry is caused by some disturbance, even abnormality, of respiratory function. This disruption could be due to faulty neural control of respiration (including phonation) as might be produced by Down syndrome, cri du chat syndrome, or damage to the recurrent nerve, inter alia. On the other hand, it could be produced by some disarrangement in the airway itself, for example, tracheostenosis, laryngomalacia, or upper airway abnormality.

The third hypothesis is that we should be able to tell the difference between subglottal and supraglottal sites of disruption and tell the difference between airway and nervous system origins by performing acoustical analyses of these cries.

The fourth hypothesis - maybe this is the one which is a wish - is that we should eventually be able to come to a state of comprehension such that we could know what is wrong. That is, we hope to arrive at a level of understanding such that we would be able to say confidently that, e.g., “Yes, this is clearly the cry of an infant with choanal atresia,” or “Obviously, this baby has had a lesion of the recurrent nerve”. However, this may be a utopian expectation. The relationship between specific disorders and diseases and their acoustic consequences is not one-to-one. As a minimum, though, we hope to achieve a degree of diagnostic skill that could aid us in inferring whether the disorder is structure-based (extra growth, narrowing of passage etc.) or function-based (paralysis of vocal folds, additional vibrating structures etc.). Moreover, it may aid us in locating the site of lesions: periglottal, subglottal or supraglottal.

Work to date

The underlying assumption, obviously, of these hypotheses and of the work so far undertaken is that study of the acoustics of infants’ cries leads to an understanding (or a contribution to it) of the anatomy and/or physiology of what we are calling the neurorespiratory system. So, we began our work with the normal infant. Here, too, we made some rather obvious assumptions. One was that infants cry for reasons other than pain, and it is therefore not necessary to inflict pain on a baby in order to record a cry. Indeed, there are properties of the pain cry that may be diagnostically useful, especially cry latency; we haven’t studied them. A second observation was that an infant has only one vocal tract to use for vocalization and it does not change its structure with different causes of crying. That is, an infant uses the same vocal tract for crying, cooing, burping etc. and for pain cries, hunger cries, discomfort cries etc., although it may be that there are differences in the acoustic properties of these various cries.

Fig. 3 [2] shows a one-third-octave band analysis of cry and non-cry sounds averaged over 16 children who ranged in age from 9 to 62 h, who had no Apgar scores lower than 7, and who had no Prechtl [20] score above 2. These data make the point that there are no important differences between non-cry and cry vocalization in the full-term, well newborn. Admittedly, there are more sophisticated techniques than one-third-octave band analysis; nevertheless, we did not find any important differences between crying and other phonatory events.

Fig. 3. One-third-octave band analysis of normal cry (horizontal axis: frequency in Hz, 100 to 10,000). (Source: Sanford E. Gerber, Acoustical analyses of neonates’ vocalizations, Int. J. Pediatr. Otorhinolaryngol., 10 (1985) 1-8.)

We took one baby of the 16 whom we considered to be typical, and did some more analyses. Fig. 4 is the spectrum and a section of the waveform of that baby’s cry, and Fig. 5 is an analysis that corresponds to them but based on linear predictive coding (LPC). The spectrum contains the detailed acoustical information from both the laryngeal source and the resonances of the vocal tract, whereas the LPC analysis primarily captures the vocal tract resonances. So, we believed that we had established a template, if you will, for both the laryngeal and vocal tract acoustic characteristics of a normal cry.

Fig. 4. Spectrum and waveform section of normal infant cry.

Others have employed different analysis techniques including cinefluorography, one-third and one-half octave band analyses, spectrographic analyses, waveform analyses, spectral analysis, and even listener transcriptions. So far, we cannot claim any inherent superiority of our technique. But we are optimistic that a battery of these analysis techniques may help to provide templates for several broad categories of vocalizations in distinguishing between normal and several non-normal cries. Given the kind of template just described, we supposed it possible to compare a cry of any infant to it. Furthermore, if we knew what was wrong with the infant, we could be on the way to developing a template for that particular problem.
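The distinction drawn above between the full spectrum (source plus vocal tract) and the LPC analysis (primarily vocal tract resonances) can be made concrete with a short sketch. The text does not describe how the LPC analysis was implemented; the code below assumes the textbook autocorrelation method solved by the Levinson-Durbin recursion, and the function names `lpc_coefficients` and `lpc_envelope_db`, as well as the choice of model order, are hypothetical.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns (a, err): a = [1, a1, ..., ap] is the prediction-error filter
    A(z) = 1 + a1 z^-1 + ... + ap z^-p, and err is the residual energy.
    """
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:                    # degenerate (e.g. silent) frame
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                    # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_envelope_db(a, err, n_freqs=512):
    """Smooth all-pole envelope sqrt(err)/|A(e^jw)| in dB, from 0 Hz up to
    just below half the sampling rate, at n_freqs equally spaced points."""
    A = np.fft.rfft(a, 2 * n_freqs)[:n_freqs]
    gain = np.sqrt(max(err, 1e-12))
    return 20.0 * np.log10(gain / np.maximum(np.abs(A), 1e-12))
```

Plotting `lpc_envelope_db` over the magnitude spectrum of the same windowed frame would show the harmonic fine structure of the source in the spectrum and only the broad resonance peaks in the envelope. A low model order keeps the envelope smooth; the appropriate order for infant cries is not given in the text and would need to be chosen empirically.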
Fig. 5. LPC (linear predictive coding) analysis of normal cry (horizontal axis: frequency in Hz).

For example, an acoustical analysis of infants with unilateral vocal fold paralysis ought to reveal the three commonly expected events in such a case: first, two fundamental frequencies, one for each fold; second, high frequency noise due to turbulent air passage through the glottis because the vocal folds never close in the mid-line; and, third, weak intensity and inadequate variation of intensity due to lack of complete glottal closure. All of these ought to be revealed by acoustical analyses. Fig. 6 is a display of the spectrum and a section of the waveform of the cry of a 1-week-old infant with a unilateral vocal fold paralysis evidently as a consequence
