Cerebral Cortex Advance Access published May 16, 2014 Cerebral Cortex doi:10.1093/cercor/bhu099

Hierarchical Encoding of Social Cues in Primate Inferior Temporal Cortex

Elyse L. Morin1, Fadila Hadj-Bouziane1, Mark Stokes2,3, Leslie G. Ungerleider1 and Andrew H. Bell1

1Laboratory of Brain and Cognition, NIMH/NIH, Bethesda, MD, USA, 2Oxford Centre for Human Brain Activity and 3Department of Experimental Psychology, University of Oxford, Oxford, UK

Address correspondence to Andrew H. Bell, MRC Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, CB2 7EF, UK. Email: [email protected]

Keywords: face processing, facial expression, identity, gaze direction, monkey

Introduction

Faces are dynamic and complex stimuli that convey information about the identity and emotional state of an individual. A highly influential model of face processing, first proposed by Bruce and Young (1986) and later refined by Haxby and colleagues (2000), suggests that the different aspects of a face, such as its identity and expression, are processed by functionally and anatomically separated pathways. According to this model, regions in the inferior occipital gyrus first process low-level aspects of face stimuli (e.g., gender). From there, information concerning the specific aspects of a face is fed into 2 divergent pathways. The first pathway, which projects along the superior temporal sulcus (STS), is thought to be responsible for encoding the changeable aspects of a face (i.e., facial expression, gaze direction, and lip movement) (see also Allison et al. 2000). The second pathway, which projects ventrally along the fusiform gyrus, is thought to be responsible for encoding the invariant aspects of a face (i.e., identity). Collectively, these 2 pathways have been labeled the “core system” for face processing (Haxby et al. 2000). More recently, however, the degree to which these 2 pathways are indeed anatomically segregated, and at what level(s) this segregation is present, has come under debate (Calder and Young 2005; Engell and Haxby 2007; Hoffman et al. 2007; see Graham and LaBar 2012 for review).

Neurons responsive to face attributes have been found throughout macaque inferior temporal (IT) cortex, including the superior and inferior banks of the STS and the IT gyrus (Gross et al. 1969; Bruce et al. 1981; Perrett et al. 1982, 1984, 1985; Desimone et al. 1984; Rolls 2000). More recently, several groups have begun to parcellate these neuronal populations into functionally (and, in some cases, anatomically) distinct groups based on their selectivity for various facial features (Yamane et al. 1988; Hasselmo et al. 1989; Eifuku et al. 2004, 2011; Freiwald et al. 2009; Ghazanfar et al. 2010; Ohayon et al. 2012), which may provide clues as to if, where, and when the pathways responsible for processing faces might segregate. For example, a small proportion of neurons in the inferior bank, fundus, and superior bank of the STS were found to be sensitive to head orientation and gaze direction (Perrett et al. 1985) as well as facial expressions and different identities (Baylis et al. 1985; Hasselmo et al. 1989; Young and Yamane 1992; Sugase et al. 1999). The degree to which these cues interact appears to increase as one moves anteriorly along the STS (De Souza et al. 2005; Freiwald and Tsao 2010). Neurons sensitive to facial identity are found in more anterior and ventral areas of IT cortex (Eifuku et al. 2004; Leopold et al. 2006; Freiwald and Tsao 2010). To provide further neurophysiological evidence addressing how social cues are encoded along the face-processing network, we examined the sensitivity of IT neurons to changeable and invariant properties of face stimuli. We chose to focus our recordings on 2 subdivisions of the inferior bank of the STS in monkeys, approximately corresponding to the face-selective patches recently identified using fMRI (Tsao et al. 2006, 2008; Pinsk et al. 2008; Bell et al. 2011; Ku et al. 2011).
Three different facial expressions were used (neutral, fear grin, and threat), each presented with the gaze either directed toward the observer, or averted, from 8 different identities. We targeted 3 fundamental questions concerning face processing in the IT cortex: (1) How are social cues encoded in IT cortex? (2) How does the encoding of social cues progress along the STS?, and (3) Are social cues (i.e., facial expression, gaze direction, and identity) processed by separate neural populations?
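As an illustration of the factorial stimulus design just described (3 expressions, 2 gaze directions, 8 identities), the full set of face conditions can be enumerated as below. This is only a sketch: the identity labels and variable names are hypothetical placeholders, not the names used for the actual stimulus files.

```python
from itertools import product

# Factor levels named in the text; identity labels are hypothetical placeholders.
EXPRESSIONS = ("neutral", "fear grin", "threat")
GAZES = ("direct", "averted")
IDENTITIES = tuple(f"identity_{i}" for i in range(1, 9))

# Every identity is shown with every expression and every gaze direction.
face_conditions = [
    {"identity": ident, "expression": expr, "gaze": gaze}
    for ident, expr, gaze in product(IDENTITIES, EXPRESSIONS, GAZES)
]

# 8 identities x 3 expressions x 2 gaze directions = 48 face conditions
print(len(face_conditions))
```

Crossing the factors fully is what allows each cue, and each conjunction of cues, to be assessed independently of the others.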

Methods

Animal Subjects and Recording Techniques

All procedures were approved by the National Institute of Mental Health Animal Care and Use Committee and observed all NIH guidelines. Two adult male rhesus monkeys (Macaca mulatta), weighing between 10 and 14 kg, were used in this study. MR-compatible head posts (Applied Prototype, Inc., Franklin, MA, USA) and recording chambers (Crist Instruments, Hagerstown, MD, USA) were surgically implanted under aseptic conditions. The recording chambers were

Published by Oxford University Press 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.

Downloaded from http://cercor.oxfordjournals.org/ at Georgetown University on September 7, 2014

Faces convey information about identity and emotional state, both of which are important for our social interactions. Models of face processing propose that changeable versus invariant aspects of a face, specifically facial expression/gaze direction versus facial identity, are coded by distinct neural pathways, yet neurophysiological data supporting this separation are incomplete. We recorded activity from neurons along the inferior bank of the superior temporal sulcus (STS), while monkeys viewed images of conspecific faces and nonface control stimuli. Eight monkey identities were used, each presented with 3 different facial expressions (neutral, fear grin, and threat). All facial expressions were displayed with both a direct and averted gaze. In the posterior STS, we found that about one-quarter of face-responsive neurons are sensitive to social cues, the majority of which were sensitive to only one of these cues. In contrast, in anterior STS, not only did the proportion of neurons sensitive to social cues increase, but so too did the proportion of neurons sensitive to conjunctions of identity with either gaze direction or expression. These data support a convergence of signals related to faces as one moves anteriorly along the inferior bank of the STS, which forms a fundamental part of the face-processing network.

placed over 19 mm craniotomies in the right hemisphere of both subjects, centered 12–14 mm anterior to the interaural axis. Neuronal data were recorded from the right IT cortex (ventral bank of the STS and the immediately adjacent IT gyrus) using procedures described elsewhere (Bell et al. 2011) (Supplementary Fig. 1). During recording sessions, between 1 and 3 electrodes were lowered into IT cortex, guided by transdural guide tubes held in place by a delrin grid (Crist Instruments). Waveform data were sampled at 40 kHz and later sorted into individual units using Offline-Sorter (Plexon Systems, Dallas, TX, USA).

Data Analysis

Spike trains were converted into spike-density functions using a normal Gaussian kernel (σ = 10 ms) and summed to yield a single density function for each trial. Neurons were defined as visually responsive if the average firing rate (50–400 ms following stimulus onset) to any/all of the 3 stimulus categories (neutral direct gaze faces, body parts, and objects) was significantly different from baseline (200 ms prior to 50 ms after stimulus onset) (Wilcoxon rank sum tests, P < 0.05). We examined all neurons that showed a significant response to faces (average of all 8 neutral, direct gaze face stimuli), regardless of whether the neuron responded more strongly to faces or stimuli from either non-face category. Data from neurons that responded most strongly to faces versus those that responded to faces but more strongly to stimuli from another category were quantitatively and qualitatively similar and so are considered together and are henceforth collectively referred to in this study as “face-responsive neurons.” Neuronal responses to different facial features (facial expression, gaze direction, and identity) and interactions between the various factors were analyzed using repeated-measures 3-way ANOVAs (factors: expression, gaze direction, and identity). Neurons that showed a main effect of identity, facial expression, or gaze direction were considered sensitive to that feature.
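The spike-density step described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes spike times are given in ms relative to stimulus onset, evaluates the density on a 1-ms grid, and uses invented spike times. The function names `spike_density` and `mean_rate` are hypothetical. The statistical comparison against baseline (Wilcoxon rank sum test across trials) is not shown.

```python
import math

def spike_density(spike_times_ms, t_start_ms, t_stop_ms, sigma_ms=10.0, step_ms=1.0):
    """Return (times, rate_hz): a Gaussian-smoothed firing-rate estimate for one trial."""
    # Each spike contributes a Gaussian with unit area (area measured in ms).
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    n_steps = int(round((t_stop_ms - t_start_ms) / step_ms)) + 1
    times = [t_start_ms + i * step_ms for i in range(n_steps)]
    rate_hz = []
    for t in times:
        density = sum(
            norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2) for s in spike_times_ms
        )
        rate_hz.append(density * 1000.0)  # spikes per ms -> spikes per s
    return times, rate_hz

def mean_rate(times, rate_hz, win_start_ms, win_stop_ms):
    """Average the density function over a closed time window."""
    vals = [r for t, r in zip(times, rate_hz) if win_start_ms <= t <= win_stop_ms]
    return sum(vals) / len(vals)

# Hypothetical spike times for a single trial (ms relative to stimulus onset).
spikes = [-150.0, 80.0, 95.0, 120.0, 160.0, 210.0, 300.0]
times, rate = spike_density(spikes, -200.0, 500.0)

# Windows taken from the text: baseline -200 to 50 ms, response 50 to 400 ms.
baseline = mean_rate(times, rate, -200.0, 50.0)
response = mean_rate(times, rate, 50.0, 400.0)
print(f"baseline {baseline:.1f} Hz, response {response:.1f} Hz")
```

Because each kernel has unit area, integrating the density function over the trial recovers the spike count, which is a convenient sanity check on the normalization.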


Results

We recorded activity from 637 neurons from the IT cortex (between ∼5–19 mm anterior to the interaural axis) of 2 monkeys (463 from monkey A and 174 from monkey B; see Supplementary Fig. 1 for recording locations), of which 64% (439/637) responded to at least one of the visual stimuli presented (summarized in Table 1). Approximately 44% (278/637) of the neurons encountered were responsive to face stimuli, 2 examples of which are shown in Figure 1. Both of these neurons were highly selective for faces. The neuron on the left also showed a significant effect of facial expression on the magnitude of the face responses (ANOVA, F(2,160) = 16.08, P < 10^−6), such that the response to threat (direct gaze) was greater than that to either neutral or fear grin (Fig. 1B, left). In contrast, the neuron on the right showed approximately equal responses to all 3 facial expressions (Fig. 1B, right), but showed a significantly greater response to direct gazes (neutral faces) compared with averted gazes (ANOVA, F(1,186) = 39.66, P < 10^−8) (Fig. 1C, right).

Encoding of Social Cues by Individual Neurons in IT Cortex

Of the 278 face-responsive neurons sampled, 23% (65/278) were sensitive to facial expression (ANOVAs with significant main effect of facial expression, P < 0.05). The average spike-density function for these 65 neurons is shown in Figure 2A. At the population level, threatening faces elicited the greatest response, followed by neutral faces. Fear grin/submissive faces did not elicit statistically different responses compared with neutral faces. From these data, it might be tempting to assume that neurons sensitive to facial expressions all exhibited this pattern of activity (threat > neutral > fear grin). However, closer examination of these neurons revealed that, in fact, many showed the opposite effect. Figure 2B shows the relative (normalized) responses to threat versus fear grin for all 278 face-responsive neurons.
Filled symbols indicate neurons that exhibited a significant main effect of facial expression (n = 65, 23%). Those shown in red are neurons whose response to threat was greater than fear grin (n = 44/65), and those shown in blue are those whose response to threat was less than fear grin (n = 21/65). Therefore, while at the population level (Fig. 2A, inset), the greatest response was found for threatening faces, a third (32%) of these neurons responded more strongly to fear grin faces. A similar trend was seen for gaze direction (Fig. 2C,D). At the population level, averted gaze stimuli (neutral only) elicited stronger responses among the 28% (77/278) of neurons that showed a main effect of gaze direction (n = 45/77, Fig. 2C,

Table 1 Breakdown of the sampled population of face-responsive neurons

Face-responsive neurons (n = 278)    n               %
Total (monkey A/monkey B)            278 (183/95)    44% (40%/55%)
Effect of facial expression
  Significant                        65 (44/21)      23% (24%/22%)
  Not significant                    213 (139/74)    77% (76%/78%)
Effect of gaze direction
  Significant                        77 (49/28)      28% (27%/29%)
  Not significant                    201 (134/67)    72% (73%/71%)
Effect of identity
  Significant                        128 (85/43)     46% (46%/45%)
  Not significant                    150 (98/52)     54% (54%/55%)
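The percentages in Table 1 follow directly from the reported counts. The short check below recomputes them. It is a sketch for the reader's convenience: the variable names are hypothetical, and it assumes percentages are rounded to the nearest whole number. The proportion of face-responsive neurons is taken relative to all 637 recorded neurons, and the cue-sensitivity proportions relative to the 278 face-responsive neurons.

```python
# Counts reported in the text and in Table 1.
N_RECORDED = 637                             # all recorded neurons
N_RECORDED_BY_MONKEY = {"A": 463, "B": 174}
N_FACE = 278                                 # face-responsive neurons
N_FACE_BY_MONKEY = {"A": 183, "B": 95}
N_SENSITIVE = {"facial expression": 65, "gaze direction": 77, "identity": 128}

def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100.0 * part / whole)

face_pct = percent(N_FACE, N_RECORDED)
face_pct_by_monkey = {
    m: percent(N_FACE_BY_MONKEY[m], N_RECORDED_BY_MONKEY[m]) for m in ("A", "B")
}
sensitive_pct = {cue: percent(n, N_FACE) for cue, n in N_SENSITIVE.items()}

print(face_pct, face_pct_by_monkey, sensitive_pct)
```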


Fixation Task and Stimuli

Monkeys were rewarded for passively viewing images of conspecific faces and non-face control stimuli (objects and monkey body-part images). Each trial began with a fixation interval of 100–300 ms. The stimulus was presented for 300 ms followed by an additional 100-ms fixation interval. Eye position was sampled at 240 Hz using an infrared eye-tracking system (ISCAN, Inc., Woburn, MA, USA). Monkeys were given a liquid reward for maintaining stable fixation within 2–3° from a central fixation point for the entire duration of a single trial. Trials where the animal failed to maintain fixation were removed. Forty-eight different conspecific face stimuli were used in this study (Gothard and Erickson 2004; Gothard et al. 2006), selected from the same stimulus set used previously in neuroimaging studies of social cue encoding in the temporal cortex of monkeys (Hoffman et al. 2007; Hadj-Bouziane et al. 2008). They consisted of 8 different facial identities (taken from monkeys at a different institution, therefore not familiar to the 2 subjects used in this study; male and female, juvenile to adult), each presented with 3 different facial expressions (neutral, fear grin, and threat) with the gaze (head and eyes) directed toward or averted relative to the observer. In addition, 20 images of monkey body parts (e.g., arms, torsos, etc.) and 20 familiar objects (i.e., objects that the monkeys saw/interacted with daily, such as water bottles, toys, sipper tubes, etc.) were also presented (Bell et al. 2008). Stimuli were presented using an LCD monitor (refresh rate: 60 Hz). All stimuli were full color, approximately 5 × 5° in size, and presented on a neutral gray background. They were controlled for overall luminance (variation in image intensity was
