Cognition 133 (2014) 443–456


Eye spy: The predictive value of fixation patterns in detecting subtle and extreme emotions from faces

Avinash R. Vaidya*, Chenshuo Jin, Lesley K. Fellows
Montreal Neurological Institute, Dept. of Neurology & Neurosurgery, McGill University, 3801 University St., Montreal, QC H3A 2B4, Canada

Article info

Article history: Received 26 November 2013; Revised 9 July 2014; Accepted 10 July 2014

Keywords: Emotion; Face perception; Expression; Modeling; Eye-tracking; Fixation

Abstract

Successful social interaction requires recognizing subtle changes in the mental states of others. Deficits in emotion recognition are found in several neurological and psychiatric illnesses, and are often marked by disturbances in gaze patterns to faces, typically interpreted as a failure to fixate on emotionally informative facial features. However, there has been very little research on how fixations inform emotion recognition in healthy people. Here, we asked whether fixations predicted detection of subtle and extreme emotions in faces. We used a simple model to predict emotion detection scores from participants' fixation patterns. The best fit of this model heavily weighted fixations to the eyes in detecting subtle fear, disgust and surprise, with less weight, or zero weight, given to mouth and nose fixations. However, this model could not successfully predict detection of subtle happiness, or extreme emotional expressions, with the exception of fear. These findings argue that detection of most subtle emotions is best served by fixations to the eyes, with some contribution from nose and mouth fixations. In contrast, detection of extreme emotions and subtle happiness appeared to be less dependent on fixation patterns. The results offer a new perspective on some puzzling dissociations in the neuropsychological literature, and a novel analytic approach for the study of eye gaze in social or emotional settings.

© 2014 Elsevier B.V. All rights reserved.

* Corresponding author. Present address: Rm 276, Montreal Neurological Institute, McGill University, 3801 University St., Montreal, QC H3A 2B4, Canada. Tel.: +1 514 398 2083. E-mail address: [email protected] (A.R. Vaidya). http://dx.doi.org/10.1016/j.cognition.2014.07.004

1. Introduction

Day-to-day social situations require us to continuously interpret the emotional states of individuals with whom we interact. We use information from many sources in forming these interpretations, including body language, tone of voice and contextual factors (Barrett, Lindquist, & Gendron, 2007; Meeren, van Heijnsbergen, & de Gelder, 2005). The communication of emotional state through facial expressions has long been of particular interest, as stereotyped emotional expressions are well conserved across species and are thought to be universal among humans (Darwin, 1896; Ekman & Friesen, 1971). Recognizing these basic emotions requires searching for and detecting the emotional content in a face. Expressive information is largely conveyed through dynamic changes in facial features such as the width of the eyes, position of the jaw, or the curving of the lips (Calder, Burton, Miller, Young, & Akamatsu, 2001). The distinct pattern of features involved in each expression suggests that sampling of information-rich features might be an effective strategy for distinguishing between facial emotions. Smith, Cottrell, Gosselin, and Schyns (2005) confirmed that individual features observed in isolation are more or less useful in distinguishing between basic emotional expressions (e.g. eyes were more useful for fear, mouth for happiness), by requiring participants to judge an emotional expression where only parts of the face were visible (Bubbles method;


(Gosselin & Schyns, 2001)). Smith et al. (2005) demonstrated the sufficiency of specific facial features in conveying different emotions. However, it is not clear how sampling facial features informs emotion recognition in the more usual case, when we search for emotional content in a fully visible face. Tracking eye movements while participants assess facial emotions provides information about which features are foveated, and thus processed with the greatest visual acuity and contrast sensitivity (Robson & Graham, 1981). Fixations to face stimuli are generally distributed within the central features: mostly the eyes, nose and mouth (Bindemann, Scheepers, & Burton, 2009; Haith, Bergman, & Moore, 1977; Janik, Wellens, Goldberg, & Dell’Osso, 1978; Yarbus, 1967). The pattern of gaze is functionally important for detecting emotional expressions: Asking healthy subjects to fixate away from the eye region worsens their ability to recognize facial emotions (Peterson & Eckstein, 2012; but also see Arizpe, Kravitz, Yovel, & Baker, 2012). Studies of clinical populations impaired in detecting emotional expressions also commonly show disturbed gaze behavior. Patients with schizophrenia, autism and prosopagnosia demonstrate unusual fixation patterns while examining facial stimuli (Klin, Jones, Schultz, Volkmar, & Cohen, 2002; Pelphrey et al., 2002; Schwarzer et al., 2007; Streit, Wolwer, & Gaebel, 1997). Adolphs et al. (2005) found that a patient with bilateral amygdala damage avoided fixating the eye region and was impaired at detecting fear in emotional faces. Asking this patient to directly fixate the eyes restored her recognition of fearful expressions to control levels. These findings suggest that orienting foveal vision to regions of the face with diagnostic emotional content has an important role in emotion recognition. The presence of altered fixation patterns in clinical populations, and evidence that normal variation in fixation patterns can be related to social functioning (e.g. autistic traits are associated with less frequent fixations to the eyes (Chen & Yoon, 2011; Freeth, Foulsham, & Kingstone, 2013)) imply an important mechanistic link between fixation patterns and emotion recognition. Critically, abnormal fixation patterns in clinical populations do not disrupt detection of all emotions (Adolphs et al., 2005). Diagnostic emotional content of faces may not always reside in the eyes: The mouth, nose or brow may be important, depending on the emotion being examined (Blais, Roy, Fiset, Arguin, & Gosselin, 2012; Smith et al., 2005). The functional importance of the ‘normal’ pattern of fixations to face stimuli in emotion recognition is unknown. Eisenbarth and Alpers (2011) found a slightly greater preference for the mouth while viewing happy faces and an increased tendency to look toward the eyes of angry and sad faces in early fixations. This study suggested that fixations are somewhat biased to certain features, but did not assess if directing gaze to these regions benefitted emotion detection. Fixations may not necessarily reveal what visual information is being actively attended (Posner, 1980; Remington, 1980, although see Deubel & Schneider, 1996). Instead, we may fixate a point to maximize access to facial information using parafoveal vision rather than directly fixating the most informative

features (Hsiao & Cottrell, 2008; Peterson & Eckstein, 2012). These findings raise doubts about the importance of fixation patterns for emotion recognition, despite the circumstantial evidence from clinical populations. We suspect that task differences explain the inconsistent emphasis on fixation patterns in the literature: fixation to informative features might be more critical for recognizing subtle compared to extreme emotional expressions (Adolphs, 2002). While emotional changes in expression might require acute foveal vision when subtle, the emotional content of extreme expressions is visible at a parafoveal resolution.

In the current study we examined the role of fixations to facial features in emotion detection during free exploration of face stimuli by healthy participants. We developed a simple model to predict emotion detection scores using the weighted sum of participants' fixations to facial features. This model allowed us to examine how fixations to individual facial features contribute to emotion detection under different conditions. We hypothesized that directing fixations to features with greater diagnostic emotional content in high spatial frequencies would predict successful emotion recognition. We used an ideal observer analysis to determine the diagnostic emotional content of subtle and extreme emotional stimuli along a range of spatial frequencies. We predicted that the relevance of fixations to emotion detection would be greater in the recognition of subtle than extreme emotions, where the emotional content is present at lower frequencies, and might not require discrete foveal processing. We also tested secondary questions, examining the effects of varying task demands (instructing participants to search for a specific emotion, or simply to categorize faces by emotion), emotional content (neutral versus afraid, disgusted, happy or surprised) and the signal strength of the expression (subtle versus extreme emotions) on fixation patterns.

2. Methods

2.1. Participants

Thirty-two participants volunteered for this study. Four were excluded either because they met exclusion criteria (history of psychiatric or neurological disease, head trauma, regular use of psychoactive drugs) or because eye-tracking data of sufficient quality could not be acquired. Of the remaining 28 participants, 18 were female, with a mean age of 24.57 years, SD = 4.8 years. The McGill University Research Ethics Board approved the study protocol and all participants gave written informed consent.

2.2. Apparatus

The experiment was programmed using E-Prime 1.2 (Psychology Software Tools, Inc., Pittsburgh, PA, USA). Participants' heads were stabilized using a headrest and stimuli were presented on a 19-inch monitor (Dell Inc., Round


Rock, TX, USA) approximately 57 cm from participants' heads. Monocular recordings of participants' dominant eye were acquired using an Eyelink 1000 system with a desk-mounted camera (SR Research Ltd., Mississauga, Ontario, Canada). Participants' responses were recorded using a standard computer mouse (Dell Inc., Round Rock, TX, USA).

2.3. Emotion rating task

Participants completed an emotion rating task similar in design to a task used by Adolphs and Tranel (2004), as well as Heberlein, Padon, Gillihan, Farah, and Fellows (2008) and Tsuchida and Fellows (2012). The task was split into four randomized blocks where participants were directed to judge a series of face stimuli for a different emotion (happiness, fear, disgust and surprise) in each block. These four emotions were selected from the basic six (Ekman & Friesen, 1971), as they represent relatively distinct domains along dimensions of arousal and affect. Happiness and disgust share medium arousal, but are opposites in affect. Fear and surprise both involve high arousal, but surprise is relatively neutral in affect, whereas fear is more negative (Adolphs, 2002; Russell, 1980). Stimuli included neutral, subtle and extreme expressions for eight different actors (Fig. 1a). All subtle and extreme expressions for each actor and each emotion were shown once per block, while neutral stimuli were shown three times per block. The higher frequency of neutral stimuli was intended to provide a strong baseline for calculating participants' sensitivity to the expressions of each actor (see below). Thus, participants rated each stimulus for each emotion (e.g. rating fearful faces for happiness, disgust, etc.), with only a fraction of the stimuli congruent with the target emotion in each block (e.g. rating fearful faces for fear).

At the beginning of each block, the eye-tracker was calibrated using a 9-point calibration sequence covering the entire screen. At the beginning of each trial, participants were required to fixate an eccentric 0.6° × 0.6° fixation cross, 9.4° to the left or right of the center of the screen (randomly alternating to avoid effects of the starting point of fixation (Arizpe et al., 2012)), for a continuous 1000 ms. This process also served to ensure the quality of calibration throughout the block: Failure to maintain fixation in a 1.7° × 1.7° box around the fixation cross in a period of 4000 ms would cause the fixation slide to repeat. After

Fig. 1. Example stimuli. (a) Examples of neutral, subtle and extreme afraid stimuli. (b) Example of division of neutral stimulus into different facial features.


three consecutive failures, the eye-tracker would be recalibrated. The fixation cross was followed by central presentation of the face stimulus. The face would disappear 2000 ms after participants' eye position passed the border of the face stimulus, ensuring that the time to explore the face was not affected by the latency of the first saccade to the stimulus. After the face disappeared, participants were asked to rate the stimulus for the target emotion in that block on a 10-point Likert scale (1 = not at all, 10 = extremely happy, disgusted, etc.) using the computer mouse. As in a prior study (Tsuchida & Fellows, 2012), we converted Likert ratings to a proportional difference (pD) score that compensated for between-subject differences in baseline ratings and the range of the Likert scale utilized (see Section 3.1 for more details).

2.4. Emotion labeling task

We also aimed to compare fixation patterns when participants were instructed to search out a specific emotion, and when they were asked to simply categorize emotional expressions. Following four blocks of emotion rating, participants completed a labeling task that was identical in structure to the rating task, but did not have any directions regarding the emotion to be rated. Instead, participants were asked to select the emotional label that they felt best described the face stimulus ('happy', 'surprised', 'disgusted' or 'afraid') after the stimulus had disappeared from the screen. The position of these labels was randomized in each trial.

2.5. Stimuli

Extremely emotional afraid, angry, disgusted, happy, sad, surprised and neutral face stimuli were selected from the Karolinska Directed Emotional Faces stimulus set (Lundqvist, Flykt, & Öhman, 1998) for six female and six male actors. Subtle facial expressions were generated by morphing extreme and neutral facial expressions using the software Popims animator (http://meesoft.logicnet.dk/). This software carries out a continuous spatial transformation between two images by realigning image elements through warping pixel locations, and computing a weighted average of pixel luminance values (Steyvers, 1999). All stimuli were converted to grayscale, cropped around central facial features to fit a 17.2° × 22.4° ellipse, and corrected for mean luminance (Fig. 1a). Subtle emotional stimuli were selected from morphs of 20%, 30% and 40% between neutral (0%) and extreme (100%) stimuli created for each emotion and each actor. To select morphs for the main eye-tracking experiments, we asked a separate group of five participants (two females, mean age = 26.40 years, SD = 5.12 years, all meeting the same exclusion criteria applied to the larger group) to complete a longer version of the rating task described above, without eye-tracking. To ensure that subtle emotional stimuli were balanced with respect to the variation in actors' expressions, and of approximately equal difficulty across actors and emotions, we computed a target pD score that equally weighted these factors using the average pD scores of these five participants. This target


pD score was an average of three values: half the pD score for an actor's extreme emotional expression, half the pD score for that extreme emotion collapsed across actors, and half the pD score for that extreme emotion collapsed across actors and emotions. We selected morphed stimuli with pD scores that had the smallest absolute difference from this target pD score for the main eye-tracking experiments. Two actors (1 male and 1 female) were later removed entirely from the stimulus set due to hair entering the frame of the ellipse.

2.6. Ideal observer analysis

To characterize the information content of the stimuli used in this task, we submitted these images to an ideal observer analysis (Geisler, 2003). This analysis identified the pixels in each image that best discriminated subtle or extreme emotions from neutral faces. Our procedure was very similar to the ideal observer analysis used by Smith et al. (2005), testing the diagnostic content of faces along a spectrum of spatial frequencies. White Gaussian noise was added to the original image, and it was band-pass filtered into five discrete frequency bands of one octave (120–60, 60–30, 30–15, 15–7.5 and 7.5–3.75 cycles/image (here equivalent to cycles/face)). Images were iteratively sampled with a Gaussian window centered at random locations. The scale of this window was adjusted for the frequency band of the image to reveal 6 cycles per window. The ideal observer would then categorize the image using this sparse input by comparing the Pearson correlation of the sample with the same pixels in the original emotional and neutral faces and choosing the category with the higher correlation. Thus, the ideal observer was programmed to identify visual information that distinguished subtle and extreme emotions from neutral faces. This process was repeated until every pixel had been sampled a minimum of 100 times. After this sampling was completed, the accuracy of each pixel in categorizing faces as emotional or neutral was calculated by dividing the total number of accurate classifications by the number of samples.

To control the informational value of the ideal observer's accuracy across faces and spatial frequencies, we found the pixels that were most informative compared to the average accuracy by calculating the z-score accuracy of all pixels in each frequency band in each face (i.e. relative to the mean and standard deviation of ideal observer accuracy for all pixels in the same face and spatial frequency band). This process resulted in a normalized map of diagnostic visual information for each face, which was then averaged together across actors.

2.7. Eye-tracking analysis

Fixations were defined using the online parser of the Eyelink 1000: saccades were identified using a velocity threshold of 30°/s, an acceleration threshold of 9500°/s² and a distance threshold of more than 0.1°. In-house Matlab (Mathworks, Natick, MA, USA) tools were used to determine the number of fixations made to different regions of the face. Regions were defined for each stimulus

using horizontal bands that parcellated faces into forehead (top of stimulus to start of eyebrows), eyebrows (top of eyebrows to upper orbit), eyes (upper orbit to bottom orbit), nose and cheeks (from bottom of orbit to tip of nose), mouth (from tip of nose to bottom of lower lip), and chin (from bottom of lower lip to bottom of stimulus) (Fig. 1b). The number of fixations to different features was collapsed across actors, to give an average for subtle and extreme expressions in each block. The effects of emotional display on fixation patterns were assessed within the labeling task, as fixations here were not affected by instruction to search out a specific emotion. We assessed both the number of fixations and the frequency of first and second fixations to facial features. By taking the difference of these measures for emotional and neutral stimuli, we could test whether emotional stimuli drove increases in fixations to any facial features. We separately tested the effects of task instructions on gaze by taking the difference between the number and frequency of first and second fixations to features in the rating and labeling tasks. To avoid stimulus-derived effects, we only compared fixation data for neutral stimuli in these two tasks.

3. Results

3.1. Behavioral analysis

As in previous studies using this paradigm (Adolphs & Tranel, 2004; Heberlein et al., 2008), we first calculated the difference between ratings of emotional stimuli and the average rating of the neutral stimulus for each actor. This score reflects participants' ability to distinguish subtle and extreme emotions from a neutral baseline. To control for idiosyncratic differences in the range of the scale used for rating expressions, and for differences in the expressivity of different actors, we divided the difference score by the range of ratings (difference of maximum and minimum score) each participant used for each actor, as in Tsuchida and Fellows (2012). This proportional difference (pD) score was then averaged across actors to generate a score that reflected participants' detection of emotional expressions.

3.2. Behavioral data – rating task

We compared participants' pD scores in the emotion rating task to assess their ability to detect subtle and extreme facial emotions. pD scores were tested using a three-way repeated-measures ANOVA with stimulus expression (afraid, surprised, disgusted or happy), degree (extreme or subtle) and target emotion (the emotion rated) as separate factors. There was a significant three-way interaction between stimulus expression, emotion degree and target emotion, F(9, 243) = 346.35, p < 0.0001 (Fig. 2). To evaluate the effects of emotional degree on detection, we compared the pD scores for subtle and extreme emotions when target emotion and stimulus expression were congruent through post hoc Bonferroni t-tests. pD scores for subtle emotions were significantly lower than extreme emotions in all four cases


(p’s < 0.001), indicating that participants rated subtle morphs as less emotional than extreme expressions. To determine whether participants could distinguish between expressions, we contrasted the pD scores of different stimulus expressions for the same target emotion using post hoc Bonferroni t-tests. pD scores for subtle fear, disgust and happy expressions were higher than other subtle emotional stimuli when these stimuli were congruent with the target emotion for the block (p’s < 0.05). pD scores for subtle surprised stimuli were significantly higher than happy and disgusted stimuli (p’s < 0.001), though not subtle afraid stimuli (p = 0.4). In the case of extreme emotional stimuli, pD scores for all four emotions were significantly higher when stimuli were congruent with the target emotion (p’s < 0.001). Participants thus distinguished between different emotional expressions, even for most subtle expressions. We also assessed if participants could distinguish between emotional categories by comparing pD scores for the same expressions between target emotions. Posthoc Bonferroni t-tests showed that pD scores for subtle disgusted and happy stimuli were significantly higher when participants rated these expressions for congruent target emotions (p’s < 0.001). pD scores for subtle afraid stimuli were significantly higher when rated for fear than when rated for happiness or disgust (p < 0.001), though not different from when participants rated these stimuli for surprise (p = 0.7). pD scores for subtle surprised stimuli were significantly higher when rated for surprise than disgust or happiness (p’s < 0.001), but not significantly different than when rated for fear (p = 0.5). In the case of extreme


stimuli, pD scores were consistently higher when participants were asked to rate these expressions for congruent emotions, compared to blocks where they were asked to rate faces for any other emotion (p's < 0.001). Hence, participants could distinguish emotional categories for all extreme emotions and nearly all subtle emotions. However, participants had some difficulty distinguishing subtle fear and surprise, likely owing to the greater similarity of these emotions in terms of their physical attributes (Calder et al., 2001).

3.3. Labeling task

To determine whether participants selected appropriate labels for emotional faces, we also examined accuracy in the emotion-labeling task, summarized in Table 1. We found that performance in this task was affected by a two-way interaction between expression degree and the emotion of the stimulus, χ²(3) = 20.10, p < 0.001, two-way Friedman ANOVA. Post-hoc Wilcoxon signed-rank tests revealed that accuracy for extreme emotions was significantly higher than subtle emotions (p's < 0.002), with the exception of fear (p = 0.1). Accuracy for happy expressions was higher than other emotions at both extreme and subtle levels (p's < 0.01). Participants were less accurate in labeling extreme fear compared to extreme disgust or surprise (p's < 0.01), and tended to be worse at labeling subtle fear compared to subtle disgust (p = 0.08), though there was no difference in accuracy for subtle surprise (p = 0.4). Accuracy for extreme disgust and surprise stimuli was not different (p = 0.6), though accuracy for subtle disgust

Fig. 2. Proportional difference (pD) scores in the emotion rating task. Each panel shows scores for blocks with a different target emotion, and stimulus expression on the horizontal axis. pD scores for judgments of (a) fear, (b) disgust, (c) happiness and (d) surprise. Error bars represent 1 SD.


Table 1
Percentage of accurate responses in emotion labeling task. Values represent median accuracy with range in parentheses.

           Stimulus expression
           Afraid            Disgusted          Happy              Surprised
Subtle     50 (12.5–87.5)    75 (25–87.5)       87.5 (37.5–100)    37.5 (0–75)
Extreme    75 (12.5–100)     87.5 (62.5–100)    100 (75–100)       87.5 (12.5–100)

was significantly higher than subtle surprise (p = 0.0004). Altogether, participants were more accurate in labeling extreme emotions compared to subtle emotions, with the exception of fear. Accuracy was also generally higher for happiness and disgust compared to surprise and fear.
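The analyses in Sections 3.1–3.3 all reduce to simple per-participant summary scores; the central one is the pD score. As an illustration, the pD computation can be sketched as follows in Python. This is a minimal reconstruction, not the authors' code: the array layout, the pooling of trials used to define each participant's rating range, and all variable names are assumptions.

```python
import numpy as np

def pd_scores(ratings_emotional, ratings_neutral):
    """Proportional difference (pD) scores, in the spirit of Section 3.1.

    ratings_emotional: array (n_participants, n_actors) of Likert ratings
        given to one emotional stimulus per actor.
    ratings_neutral: array (n_participants, n_actors, n_neutral_trials)
        of Likert ratings given to the neutral stimuli of the same actors.
    Returns one pD score per participant, averaged across actors.
    """
    # Difference between the emotional rating and the mean neutral rating
    # for the same actor.
    diff = ratings_emotional - ratings_neutral.mean(axis=2)

    # Range of ratings (max - min) each participant used for each actor,
    # here pooled over that actor's emotional and neutral trials
    # (an assumption about which trials define the range).
    all_ratings = np.concatenate(
        [ratings_emotional[..., None], ratings_neutral], axis=2)
    rating_range = all_ratings.max(axis=2) - all_ratings.min(axis=2)

    # Guard against a zero range (participant used a single rating throughout).
    rating_range = np.where(rating_range == 0, np.nan, rating_range)

    # pD per actor, then averaged across actors for each participant.
    return np.nanmean(diff / rating_range, axis=1)
```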

3.4. Ideal observer analysis

We were interested in determining which facial features best distinguished subtle and extreme emotions from their neutral counterparts (similar to the task given to participants, and reflected in their pD scores). Thus, we used an ideal observer that categorized these faces as neutral or emotional using sparse inputs over a range of spatial frequencies, similar to Smith et al. (2005). This analysis created a filter function representing the most diagnostic information for each image in each frequency band. Fig. 3 shows the average z-score categorization accuracy of pixels in the eyes, nose and mouth regions of the face (defined in the same way as in the eye-tracking analysis), collapsed across actors. All four subtle emotions showed a similar pattern, with the eyes containing the strongest diagnostic information across high and low frequencies, the nose and cheeks containing diagnostic information at low, and very high, spatial frequencies, and the mouth at higher frequencies. However, there was relatively more diagnostic information in the nose and mouth for subtle happiness than for the other subtle emotions. In extreme emotional faces, there were larger differences in the distribution of diagnostic information between emotions. Extreme afraid faces showed a fairly similar pattern to subtle emotions, while there appeared to be more diagnostic information in the nose and cheeks at low frequencies (7.5–15 cycles/image) for extreme disgusted and happy faces. Interestingly, in extreme happiness, the mouth and eyes contained almost identical diagnostic information at high frequencies. Extreme surprise differed from all other emotions, in that information from the eyes only had diagnostic value at high spatial frequencies. Additionally, this was the only case where the nose and cheeks had relatively little diagnostic information, even at lower frequencies.

To visualize the diagnostic information in emotional faces, as assessed by the ideal observer, we produced a set of images showing the most informative pixels for each emotion. These images were created by setting a threshold on the diagnostic filter function for each emotion (averaged across actors), in each spatial frequency band, to include only pixels where the z-score accuracy was above 1. This map was then multiplied by the spatially filtered images from one actor in the stimulus set, and the resultant images were averaged together across spatial frequencies to produce the images shown in Fig. 4. These images show the features that best distinguished extreme and subtle emotional faces from their neutral counterparts.
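For concreteness, the per-pixel, per-band classification logic described in Section 2.6 can be sketched roughly as follows (Python/NumPy). This is a simplified reconstruction rather than the authors' implementation: the window scaling, noise level, sampling density, the square-image assumption, and the point at which noise is added are all placeholders.

```python
import numpy as np

def bandpass(img, lo, hi):
    """Keep spatial-frequency components between lo and hi cycles/image."""
    n = img.shape[0]                                 # assumes a square image
    f = np.fft.fftfreq(n) * n                        # frequencies in cycles/image
    fy, fx = np.meshgrid(f, f, indexing="ij")
    mask = (np.hypot(fx, fy) >= lo) & (np.hypot(fx, fy) < hi)
    return np.real(np.fft.ifft2(np.fft.fft2(img) * mask))

def ideal_observer_map(emotional, neutral, lo, hi,
                       n_samples=5000, noise_sd=0.5, seed=0):
    """Per-pixel accuracy of an ideal observer discriminating one emotional
    face from its neutral counterpart within a single frequency band,
    z-scored within that face and band (cf. Section 2.6)."""
    rng = np.random.default_rng(seed)
    n = emotional.shape[0]
    emo_f, neu_f = bandpass(emotional, lo, hi), bandpass(neutral, lo, hi)
    # Gaussian aperture spanning roughly 6 cycles of the band centre frequency
    # (placeholder scaling; the paper states 6 cycles per window).
    sigma = (6.0 * n / ((lo + hi) / 2.0)) / 4.0
    yy, xx = np.mgrid[0:n, 0:n]
    hits = np.zeros((n, n))
    counts = np.zeros((n, n))
    for _ in range(n_samples):
        cy, cx = rng.integers(0, n, size=2)
        win = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
        keep = win > 0.01                            # pixels effectively sampled
        # Randomly present the emotional or the neutral face on this trial.
        if rng.random() < 0.5:
            src, is_emotional = emo_f, True
        else:
            src, is_emotional = neu_f, False
        # Noisy, windowed sample (noise added after filtering here, a
        # simplification of the published pipeline).
        sample = ((src + rng.normal(0.0, noise_sd, src.shape)) * win)[keep]
        r_emo = np.corrcoef(sample, (emo_f * win)[keep])[0, 1]
        r_neu = np.corrcoef(sample, (neu_f * win)[keep])[0, 1]
        correct = (r_emo > r_neu) == is_emotional
        hits[keep] += float(correct)                 # credit the covered pixels
        counts[keep] += 1.0
    acc = hits / np.maximum(counts, 1.0)
    return (acc - acc.mean()) / acc.std()            # z-score within face and band
```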

3.5. Emotion recognition model

Fixations to face stimuli were almost exclusively directed to central features, namely the eyes, nose and cheeks, and mouth. There were very few fixations outside these areas (collapsed across all conditions: eyebrows: M = 0.018 fixations, SD = 0.013 fixations; chin: M = 0.011 fixations, SD = 0.014 fixations; forehead: M = 0.002 fixations, SD = 0.003 fixations). We therefore focused on only the eyes, mouth, and nose and cheeks in our analysis. We were primarily interested in determining whether fixations to specific facial features predicted detection of emotional content. Fixations provide a means for sampling the environment with high acuity vision, which is particularly important when visual information is present in high spatial frequencies (Robson & Graham, 1981; Tatler, Baddeley, & Gilchrist, 2005). Hence, greater sampling of certain regions of emotional faces may be key to extracting the emotional content of the expression. We expected that fixations to more informative features, as determined by the ideal observer analysis, would predict detection of subtle emotions.

Due to interdependencies between the number of fixations to facial features in a limited amount of time, simple correlations between the number of fixations to facial features and emotion detection would not produce meaningful results (Aitchison, 2003; Schilling, Oberdick, & Schilling, 2012). Instead, we created a model to predict emotion detection in the emotion rating task where detection was linearly dependent on the weighted sum of fixations to the eyes, mouth and nose (which were the predominant targets of participants' gaze, see results), according to D = w_eyes·f_eyes + w_mouth·f_mouth + w_nose·f_nose, where D is the emotion detection score, w is the weight of fixations and f is the average number of fixations made to each feature on trials where the stimulus and target emotion (the emotion for which participants were being asked to rate stimuli) were congruent. The relative values of the weights assigned to each feature in this model provide an estimate of the contributions of fixations to facial features in determining the emotion detection score. For example, fixations to the eyes might benefit detection (positive weight), harm detection (negative weight), or contribute nothing (weight of zero). By the same logic, if fixations to the nose make no contribution to detecting fear, we would expect nose fixations to be assigned a value of zero. Thus, the model would predict very low emotion detection scores for participants who devoted most of


Fig. 3. Ideal observer performance expressed as the average z-scored accuracy of pixels within facial features across spatial frequency bands. (a) Subtle emotional expressions, and (b) extreme emotional expressions. Error bars indicate SEM.

Fig. 4. Faces filtered for optimal information distinguishing emotional from neutral expressions, as determined by an ideal observer. The average diagnostic filter was thresholded to only include pixels where the z-score diagnostic accuracy was greater than 1. The filter was then smoothed before being multiplied by stimuli from one actor in the stimulus set ('F01'), for the purposes of visualization. The resulting images from each spatial frequency band were then averaged together to produce the faces shown here.

their fixations to the nose. This model assumes a linear relationship between the number of fixations and emotion detection, as we were interested in testing if

fixations had a simple cumulative benefit for social performance, as some literature suggests (Freeth et al., 2013; Klin et al., 2002).
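As a concrete illustration, the weighted-sum model and its rank-based evaluation can be sketched as follows. The data here are synthetic placeholders and the function and variable names are ours, not the authors'; only the form of the model (a linear combination of fixation counts, scored against pD by Spearman correlation) follows the description above.

```python
import numpy as np
from scipy.stats import spearmanr

def predict_detection(fix_counts, weights):
    """Predicted detection score D = w_eyes*f_eyes + w_mouth*f_mouth + w_nose*f_nose.

    fix_counts: array (n_participants, 3) of average fixation counts to the
    eyes, mouth and nose on congruent trials; weights: length-3 vector.
    """
    return fix_counts @ np.asarray(weights)

# Example with placeholder data standing in for real fixation counts and pD scores.
rng = np.random.default_rng(0)
fix_counts = rng.poisson(lam=[3.0, 1.5, 1.0], size=(28, 3)).astype(float)
pd_observed = 0.05 * fix_counts[:, 0] + rng.normal(0.0, 0.05, 28)

d_pred = predict_detection(fix_counts, [0.5, 0.2, 0.3])
rho, p = spearmanr(d_pred, pd_observed)   # monotonic agreement with pD scores
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```

Because the Spearman criterion only asks for a monotonic relationship, predicted scores are compared to observed pD scores up to rank order; this is why, in the text below, model values are rescaled to the range of participants' pD scores purely for visualization.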


To test this model, the dataset was divided in half, separating trials based on the actor shown on screen, so each half included two male and two female actors. The overall difficulty of emotion detection in each half of the dataset was matched by selecting the combination of actors with the minimum absolute difference in average pD score across emotions. The model was fit to one half of the dataset by testing all combinations of w values for each feature, ranging from −1 to 1, in steps of 0.1. The fit of each combination of weights was assessed using a Spearman rank correlation coefficient, which tested if the predicted emotion detection scores derived from participants' eye-tracking data had a monotonic relationship with participants' actual pD scores. The weights that gave the best fit were then applied to the 'test' half of the dataset to determine if the predictive power of these weights was generalizable. We can infer that the better the fit of the model-generated data, the more accurate was the estimate of the feature weights. For the purpose of visually comparing model data to actual pD scores, the values of D were normalized to the range of participants' pD scores without affecting the relative values of the model data. We found that fitting the model using weights along the range of −1 to 1 resulted in sets of weights that did not


Fig. 5. Scatterplots showing relationship between pD scores for subtle emotional stimuli for congruent target emotions and model generated data using participants’ fixation patterns from the same trials. By expression: (a) afraid, (b) disgusted, (c) happy and (d) surprised. Spearman correlation coefficient and associated p values are shown in the upper left-hand corner for each scatterplot.

Fig. 6. Scatterplots showing relationship between pD scores for extreme emotional stimuli for congruent target emotions and model generated data. By expression: (a) afraid, (b) disgusted, (c) happy and (d) surprised. Spearman correlation coefficient and associated p values are shown in the upper left-hand corner for each scatterplot.

strongly predict detection of any subtle or extreme emotions, with the exception of subtle and extreme afraid, where all features were given a positive weight (subtle and extreme disgusted, happy and surprised faces: ρ's ≤ 0.306, p's ≥ 0.1; subtle afraid: ρ = 0.425, p = 0.02, extreme afraid: ρ = 0.379, p = 0.05). We then fit the model using only positive weights (i.e. on the range of 0–1) to see if we could better predict emotion detection when the model was fit with a more restricted range of potential weights. Here, the model predictions of emotion detection scores were significantly correlated with participants' pD scores for subtle fear and disgust (ρ's ≥ 0.420, p's ≤ 0.03), and there was a trend toward significance with subtle surprise (ρ = 0.349, p = 0.07; Fig. 5). However, the model was not able to predict detection scores for subtle happiness (ρ = 0.249, p = 0.2), nor could the model predict participant pD scores for extreme disgust, happiness or surprise (ρ's ≤ 0.259, p's ≥ 0.2; Fig. 6). Overall, the model could predict detection of most subtle emotions relatively well, but predictions for most extreme expressions, and subtle happiness, were much weaker. Table 2 summarizes the weights of features that gave the best fit to the data for each emotion, when the data were fit with weights ranging from 0 to 1. The weights have been normalized to a range of 0–1 for easier comparison of values between emotions.
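The split-half grid search described above can be sketched as follows, under the same caveats as before: the exhaustive search over weight combinations and the Spearman objective follow the text, while the names and inputs are hypothetical.

```python
import itertools
import numpy as np
from scipy.stats import spearmanr

def fit_weights(fix_train, pd_train, step=0.1, w_min=0.0, w_max=1.0):
    """Exhaustive search over (w_eyes, w_mouth, w_nose) maximizing the Spearman
    correlation between predicted detection scores and observed pD scores."""
    grid = np.arange(w_min, w_max + 1e-9, step)
    best_rho, best_w = -np.inf, None
    for w in itertools.product(grid, repeat=3):
        rho, _ = spearmanr(fix_train @ np.array(w), pd_train)
        if np.isfinite(rho) and rho > best_rho:      # all-zero weights give nan
            best_rho, best_w = rho, np.array(w)
    return best_w, best_rho

def test_weights(fix_test, pd_test, w):
    """Apply the weights fit on the training half to the held-out 'test' half."""
    return spearmanr(fix_test @ w, pd_test)
```

One design point worth noting: the Spearman correlation is unchanged when all three weights are scaled by the same positive constant, so many weight combinations tie; normalizing the best-fitting weights (for example dividing by their sum, which is consistent with how the Table 2 values appear to be scaled) makes them comparable across emotions.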

Table 2
Weights applied to facial feature fixations that provided the best fit for participants' proportional difference scores in the emotion rating task. Values have been normalized to a range of 0–1.

         Subtle                                           Extreme
         Afraid*   Disgusted*   Happy   Surprised^        Afraid*   Disgusted   Happy   Surprised
Eyes     0.50      0.80         0.38    0.78              0.23      0.25        0.0     1.0
Mouth    0.22      0.20         0.62    0.00              0.38      0.42        0.0     0.0
Nose     0.28      0.00         0.00    0.22              0.38      0.33        1.0     0.0

Note: significance markers refer to the Spearman correlation coefficient for the relationship of model detection scores and participant proportional difference (pD) scores. * p ≤ .05; ^ p = .07.


Fixations to the eyes were given the greatest positive weight for subtle afraid expressions, though mouth and nose fixations were also given relatively strong weights. For subtle surprise and disgust, eye fixations were again given the greatest weight, while fixations to the mouth and nose carried less influence. In contrast, the mouth and nose were given higher weights in detecting extreme fear, with somewhat less weight given to the eyes. The model weights for detection of subtle happiness and other extreme emotions did not fit these data well, and hence the relationship between feature fixations and emotion detection in these conditions is not well characterized by these weights. To appreciate the relationship between pD scores and fixation data, we plotted pD scores as a function of individual differences in the distribution of fixations to the eyes, mouth and nose. Data from the 'test' half of the dataset are shown in ternary surface diagrams in Fig. 7, along with


the model's predicted emotion detection scores based on the same fixation data. These plots show how the detection scores of each subject varied according to the composition of fixation patterns by representing detection scores as an interpolated surface plotted over the percentage of fixations made to the eyes, nose or mouth. These plots show the same data used in testing the model (i.e. subject-level average pD scores of trials where the stimulus and target emotion were congruent). In the case of subtle fear, disgust and surprise (Fig. 7a), higher pD scores were concentrated in participants who fixated the eyes more frequently. However, pD scores tended to be lower in participants whose fixations were more frequently directed to the nose. In contrast, pD scores for subtle happiness appeared to be higher with more mouth fixations, while pD scores for extreme emotions (Fig. 7b) tended to be higher when the eyes, nose and mouth were sampled approximately equally.
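For readers wanting to reproduce this kind of plot, the compositional coordinates underlying Fig. 7 can be computed as below; the mapping is one standard barycentric-to-Cartesian convention and is not taken from the authors' plotting code.

```python
import numpy as np

def ternary_xy(fix_counts):
    """Map per-participant fixation counts to 2-D ternary-plot coordinates.

    fix_counts: array (n_participants, 3) ordered (eyes, nose, mouth).
    Returns points inside an equilateral triangle; a detection score can then
    be interpolated over these points (e.g. with scipy.interpolate.griddata).
    """
    p = fix_counts / fix_counts.sum(axis=1, keepdims=True)  # compositions sum to 1
    eyes, nose, mouth = p[:, 0], p[:, 1], p[:, 2]
    x = nose + 0.5 * mouth                 # horizontal position in the triangle
    y = (np.sqrt(3) / 2.0) * mouth         # vertical position in the triangle
    return np.column_stack([x, y])
```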

Fig. 7. Ternary plots showing relationship between participants’ fixation patterns and emotion detection (expressed as proportional difference (pD) scores), and model-generated emotion detection scores (arbitrary units) for (a) subtle and (b) extreme emotions. Each graph shows an interpolated topographic map of these emotion detection scores plotted over the relative proportion of fixations made to the eyes, nose and mouth. These maps represent the distribution of performance with respect to face sampling behavior, both as participants acted in reality, and as predicted using the optimal fit model weights.


Previous work has demonstrated that very few fixations are sufficient for facial recognition (Hsiao & Cottrell, 2008). In a similar vein, it is possible that our model failed to predict detection of extreme emotions because only the first few fixations were relevant to detection. To examine this possibility, we tested if our model could predict detection of extreme emotions using only participants' first three fixations. We found that the model's predictions were generally worse than when provided with all fixations (afraid: ρ = 0.042, p = 0.8, disgusted: ρ = 0.222, p = 0.2, happy: ρ = .0134, p = 0.5, surprised: ρ = 0.02, p = 0.9). Thus, the first few fixations did not have greater predictive value for detection of extreme emotions.

We chose to focus on the number of fixations to individual features because this provides a strong measure of the extent to which participants explored the stimulus. An alternative index, the percentage of total fixations, can produce misleading results due to individual differences in participants' willingness to explore stimuli. We also tested the model using dwell time on facial features as an alternative measure. The pattern of results was essentially identical to what was found using the number of fixations, though the fit of the model detection scores to participants' pD scores was worse overall (data not shown).

3.6. Effects of facial emotion on fixation patterns

As a secondary question, we also assessed how facial emotions affect gaze patterns by comparing fixations to emotional faces and neutral faces in the labeling task. This task was free of any instructions regarding which emotions

subjects were supposed to judge, making it well suited for evaluating stimulus-driven effects. We used fixations to neutral faces as a common baseline, asking if facial features of subtle or extreme emotional expressions were fixated more, using one-tailed paired t-tests, controlled for multiple comparisons using a Bonferroni correction. The number of fixations to facial features did not significantly differ between any extreme and neutral expressions (Fig. 8a), t's(27) ≤ 1.737, p's ≥ 0.1. However, participants made more fixations to the mouth region when inspecting subtle happy and disgusted faces, t's(27) ≥ 3.428, p's ≤ 0.007. A similar trend was observed for subtle afraid faces (t(27) = 2.243, p = 0.1), though this effect was not significant. There were no other significant increases in fixations toward features of subtle compared to neutral faces, t's(27) ≤ 1.850, p's ≥ 0.3. To gain a more detailed perspective on fixation patterns, we also examined the frequency with which first and second fixations were directed toward particular facial features in the labeling task. First fixations (Fig. 8b) were directed more frequently to the mouth of extreme surprised stimuli, t(27) = 3.474, p = 0.006, and to the mouth of extreme happy stimuli, though this trend did not reach significance, t(27) = 2.490, p = 0.07. There were no significant increases for any other subtle or extreme stimuli, t(27) ≤ 2.079, p's > 0.1. Second fixations were directed more frequently to the nose of extreme disgusted stimuli (Fig. 8c), t(27) = 3.143, p = 0.02, and to the mouth of subtle disgusted and happy stimuli, t's(27) ≥ 2.644, p's < 0.05. A similar increase was seen for second fixations to the mouth of subtle afraid stimuli, though this did not reach significance, t(27) = 2.366, p = 0.1.

Fig. 8. Effects of stimulus emotion on fixation patterns. (a) Average difference in number of total fixations to facial features of emotional and neutral stimuli. (b) Average difference in frequency of first fixations to facial features of emotional and neutral stimuli. (c) Average difference in frequency of second fixations to facial features of emotional and neutral stimuli. Error bars represent 1 SD. *p < .05, one-tailed t-test against zero, corrected for multiple comparisons.

A.R. Vaidya et al. / Cognition 133 (2014) 443–456

Otherwise, second fixations were directed to facial features with equivalent frequency compared to neutral stimuli, t(27) ≤ 1.698, p's ≥ 0.4. Thus, the mouth and nose of certain emotional stimuli tended to draw initial fixations.

3.7. Effects of instruction on fixation patterns

We also assessed how searching for a specific target emotion affected fixation patterns compared to examining the face to determine an emotional category. For this comparison, we tested whether participants fixated any specific facial features more in each block of the rating task (i.e. rating for fear, disgust, etc.) compared to the labeling task, using one-tailed t-tests with a Bonferroni correction for multiple comparisons. In this comparison, we only tested fixations to neutral faces, to avoid effects of emotional expression. The total number of fixations to the mouth, eyes and nose were not significantly different in the rating and labeling task for any target emotion (Fig. 9a), t's(27) ≤ 1.97, p's ≥ 0.1. We tested if first or second fixations were aimed toward specific facial features when searching for a specific target emotion. First fixations to the mouth were more frequent when searching for disgust, happiness (t(27) = 2.179, p = 0.07) or surprise (t(27) = 2.343, p = 0.06), though these differences did not reach significance (Fig. 9b). Other features were not targeted by first fixations significantly more frequently in the rating task, t's(27) ≤ 1.96, p's > 0.1. Second fixations were directed to the eyes more frequently when searching for fear (Fig. 9c), t(27) = 3.523, p = 0.003. However, second fixations were not directed toward any other features significantly more frequently when participants were asked to search for any other emotions, t's(27) ≤ 1.788, p's > 0.1. Altogether, there was only a modest tendency to direct gaze to specific features when searching out specific emotions.
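The feature-wise comparisons in Sections 3.6 and 3.7 amount to one-tailed paired tests of a difference against a baseline with a Bonferroni correction, which can be sketched as follows. The inputs and names are hypothetical, and scipy's two-tailed paired t-test is converted to a one-tailed p value by hand.

```python
import numpy as np
from scipy.stats import ttest_rel

def feature_gaze_increase(fix_condition, fix_baseline, n_comparisons):
    """One-tailed paired t-tests asking whether each facial feature drew more
    fixations in one condition than in a baseline, Bonferroni-corrected.

    fix_condition, fix_baseline: arrays (n_participants, n_features), e.g.
    fixation counts to (eyes, nose, mouth) for emotional vs. neutral faces.
    Returns {feature_index: (t statistic, corrected one-tailed p value)}.
    """
    results = {}
    for j in range(fix_condition.shape[1]):
        t, p_two = ttest_rel(fix_condition[:, j], fix_baseline[:, j])
        # One-tailed test of condition > baseline.
        p_one = p_two / 2.0 if t > 0 else 1.0 - p_two / 2.0
        results[j] = (t, min(p_one * n_comparisons, 1.0))  # Bonferroni correction
    return results
```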


4. Discussion

Here, we asked how fixations inform emotion recognition during the free inspection of faces, for different levels of emotional intensity. We expected that fixations to features that were most informative about a given emotional state at high spatial frequencies would lead to improved performance in detecting subtle expressions. To test this hypothesis, we first determined the information content of face stimuli using an ideal observer analysis. We then created a model that predicted emotion detection scores based on fixation patterns directly measured in healthy participants, in order to test how fixations to different regions of the face may have contributed to emotion detection performance.

4.1. Role of fixations in emotion recognition

The model used here predicted emotion detection scores using a weighted linear combination of fixations to facial features. These weights provide an estimate for the relative value of fixations to particular features in detecting different emotions. When we tried to use this model to predict the observed emotion detection scores using weights on a range of −1 to 1, we had very limited success. However, using the same model with a more restricted range of weights from 0 to 1, we were able to independently predict emotion detection from patterns of gaze alone. The strength of this simpler model implies that negative weights were not strongly predictive, and likely overfit the data. Thus, it appears that fixations to core facial features either benefit emotion detection or have no impact on it; they do not appear to have negative value for performance, at least in the context of this task. The emotion detection model was able to successfully predict detection of subtle fear and disgust from fixation data alone, with a somewhat worse prediction of subtle

Fig. 9. Effects of task instruction on fixation patterns. (a) Average difference in number of total fixations to facial features in emotion rating and labeling tasks. (b) Average difference in frequency of first fixations to facial features in emotion labeling and rating tasks. (c) Average difference in frequency of second fixations to facial features in emotion rating and labeling tasks. Error bars represent 1 SD. *p < .05, one-tailed t-test against zero, corrected for multiple comparisons.


surprise. In the case of the first two emotions, the weights assigned by the model appeared to be in line with the features assigned greater diagnostic value by the ideal observer, particularly at high spatial frequencies, as predicted. In particular, the eyes were given the greatest weight, with smaller weights assigned to the mouth and nose. However, the detection model gave zero weight to the mouth in detecting surprise, though the ideal observer assigned diagnostic value to this feature. This difference might explain the poorer fit of the detection model to participants' surprise detection scores.

Unlike other subtle expressions, we did not find a set of feature weights that predicted participants' detection of subtle happy emotions. The closest model fit gave greater weight to the mouth, in line with the ideal observer analysis, and in contrast to the other subtle emotions where fixations to the eyes had greater weight. The ideal observer analysis revealed that the mouth had more high frequency diagnostic information for both subtle and extreme happiness than other emotions, possibly explaining why fixations to this location were estimated as more informative. There was also conspicuously more low-frequency diagnostic information in the nose and cheeks, which may have sufficed for the detection of happiness, and could explain why fixation patterns were not predictive of happiness detection overall. Happiness is also categorically quite distinct from the other five cardinal emotions (Ekman & Friesen, 1971), and possibly easier to detect (pD scores and accuracy for happy faces were consistently higher than other emotions). Thus, happiness may have both a conceptual and perceptual advantage. Recognition of happy expressions is also relatively spared in several neurological and psychiatric conditions that impair detection of other emotions (Adolphs & Tranel, 2004; Adolphs et al., 2005; Heberlein et al., 2008; Pelphrey et al., 2002; Tsuchida & Fellows, 2012; Uljarevic & Hamilton, 2013); the present results suggest a possible explanation for this dissociation.

The model could not accurately predict detection of most extreme emotions from participants' fixation patterns. The signal strength of the emotional information in extreme expressions was much greater, perhaps making fine-grained analysis of these faces through multiple fixations unnecessary. Indeed, examination of the ternary plots in Fig. 7 indicates that participants performed best when sampling facial features more or less equally. The ideal observer analysis indicated that there was relatively more diagnostic information in the nose and cheeks at low spatial frequencies for extreme emotions, particularly for disgust and happiness. This low frequency information might have reduced the impact of foveal exploration of these stimuli. Importantly, this might explain why detection of some extreme emotions is sometimes preserved, even when fixation patterns are disrupted (Adolphs et al., 2005).

We did not find that the first few fixations were any more informative for predicting extreme emotion recognition. It is possible that participants could have recognized these extreme emotions using the first one or two fixations, as in face recognition (Hsiao & Cottrell, 2008). However, to optimally test this, we would need to more tightly

control participants' exposure to the stimulus. This idea is supported by recent work by Peterson and Eckstein (2012), who showed that forcing fixation to certain regions of the face can affect recognition of extreme emotions. Thus, the location of the first few fixations might only matter when participants are given very limited exposure to the stimulus and hence can only explore a fraction of the face, although more work will be needed to test this possibility.

It was possible to predict detection of extreme fear from fixation patterns. The ideal observer analysis showed that low frequency information had less relative diagnostic value in extreme fear compared to other extreme emotions. The greater importance of high frequency information, requiring foveation, may explain why detection of this emotion could be predicted from fixation patterns. Surprisingly, the model gave greater weight to the nose and mouth regions than the eyes, which are thought to be key in detecting fear (Adolphs et al., 2005; Smith et al., 2005). These weights also differed from the results of the ideal observer, which assigned greater diagnostic value to the eyes and mouth. It is possible that the more even distribution of high frequency diagnostic information between features in extreme fear made fixations to the nose and mouth more useful in detecting this emotion. Human subjects might have also been biased to use high-spatial frequency information gleaned from the mouth and nose more than the ideal observer, assigning more value to information from these regions.

What can these results tell us about the optimal face scanning strategy for detecting emotional expressions? The eyes appear to contain diagnostic emotional information at both high and low frequencies, making them important in both foveal and parafoveal processing of the face. Foveating the eyes may be particularly important for detection of subtle emotions, as these features carry rich high-frequency information about emotional state, particularly when these states are nuanced (Baron-Cohen, Wheelwright, & Jolliffe, 1997; Calder et al., 2001). Mouth and nose fixations similarly appeared to provide valuable information when fixated, though not for all subtle emotions. In particular, subtle emotional information in the nose and cheeks mostly occurred in lower spatial frequencies, and thus might be detected through parafoveal vision. Fixations to the eyes, or eye region, might allow acute processing of high-spatial frequency information in the eyes, as well as low-spatial frequency information below. Thus, eye fixations might serve as an optimal search strategy for detecting most subtle emotional facial expressions (Peterson & Eckstein, 2012). This may be particularly important in scenarios with greater time pressure than used here.

Our results also shed light on how variability in emotion recognition performance arises in the healthy, normal population. Participants who committed more fixations to the nose and mouth did so at the expense of fixating the eyes, leading to worse performance in detecting subtle emotions. This pattern of behavior could reflect avoidance of the eyes related to individual differences in social functioning (Chen & Yoon, 2011; Freeth et al., 2013), a strategic error where participants attempted to take in more information during their brief exposure to the face, or simply a lack of motivation to explore the whole face.


4.2. Modeling approach The modeling approach used in the current study allowed us to investigate the relationship between fixation patterns and detection while circumventing the problem of interdependency between the number of fixations to different facial features (Schilling et al., 2012). This problem often goes ignored in the literature (Klin et al., 2002), though the difficulty of interpreting correlations using compositional data has long been recognized (Pearson, 1897). Our model could be adapted to examine the independent contributions of a theoretically unlimited set of interdependent variables to an outcome. However, the model we chose made simplistic assumptions about the relationships between variables (i.e. a linear relationship between feature fixations and detection). Furthermore, this approach was limited by the parsing of fixations into discrete regions of interest on the face, which may not be the best representation of the visual information sampled by the retina. Importantly, future research can learn from both the successes and failures of this model’s assumptions to develop a more sophisticated understanding of the relationship between fixations and emotion detection. 4.3. Effects of display and instruction on fixation This dataset can also address the extent to which the emotional display of a face, and instructions to search out a specific emotion influence fixations. We found that some visually salient features of extreme emotions (like the ‘O’ shaped mouth in surprised faces) capture gaze initially, but total fixations were not different from neutral stimuli. It is possible that because the emotional content of these extreme faces was obvious within the first few fixations, participants did not explore any particular feature more. Fixations to subtly emotional faces tended to be drawn more to the mouth than in neutral expressions, particularly for happiness, similar to the results of Eisenbarth and Alpers (2011). Participants were thus not strongly influenced by the salient features in extreme emotional faces. This result is in line with work suggesting that visual saliency does not command a strong influence when participants inspect scenes with social elements (Birmingham, Bischof, & Kingstone, 2009). Instead, fixations may be biased slightly to features that are informative about subtle emotional displays. We found that instruction to search for a specific emotion only had a modest effect on fixation patterns. First fixations were directed slightly more often toward the mouth when participants were asked to rate faces for happiness, disgust or surprise, while second fixations were directed more toward the eyes when participants were asked to rate faces for fear. Thus, participants preferentially directed their early fixations only slightly more often to features that carried greater diagnostic information about the emotion of interest (Smith et al., 2005). Several studies have shown that distinct facial information is preferentially used depending on the task at hand (e.g. judging gender versus emotion) (Gosselin & Schyns, 2001; Schyns, Bonnar, & Gosselin, 2002; Schyns & Oliva, 1999). However, instructing subjects to search for a spe-

4.3. Effects of display and instruction on fixation

This dataset can also address the extent to which the emotional display of a face, and instructions to search for a specific emotion, influence fixations. We found that some visually salient features of extreme emotions (such as the 'O'-shaped mouth in surprised faces) captured gaze initially, but total fixations did not differ from those to neutral stimuli. It is possible that because the emotional content of these extreme faces was obvious within the first few fixations, participants did not explore any particular feature more. Fixations to subtly emotional faces tended to be drawn more to the mouth than in neutral expressions, particularly for happiness, similar to the results of Eisenbarth and Alpers (2011). Participants were thus not strongly influenced by the salient features in extreme emotional faces. This result is in line with work suggesting that visual saliency does not exert a strong influence when participants inspect scenes with social elements (Birmingham, Bischof, & Kingstone, 2009). Instead, fixations may be biased slightly toward features that are informative about subtle emotional displays.

We found that instruction to search for a specific emotion had only a modest effect on fixation patterns. First fixations were directed slightly more often toward the mouth when participants were asked to rate faces for happiness, disgust or surprise, while second fixations were directed more toward the eyes when participants were asked to rate faces for fear. Thus, participants preferentially directed their early fixations only slightly more often to features that carried greater diagnostic information about the emotion of interest (Smith et al., 2005). Several studies have shown that distinct facial information is preferentially used depending on the task at hand (e.g., judging gender versus emotion) (Gosselin & Schyns, 2001; Schyns, Bonnar, & Gosselin, 2002; Schyns & Oliva, 1999). However, instructing subjects to search for a specific emotion did not strongly influence their fixation patterns compared to asking subjects to categorize emotions. Although both tasks rely on the same visual information, instructions did not lead to strategic fixation of features that hold diagnostic information for the specified emotion. These results suggest that fixation patterns are driven more by the emotional content of the face than by the task instruction.

These distinctions may be clinically relevant: bottom-up and top-down abnormalities in gaze may be dissociable in patient groups that share abnormal fixation patterns to faces (Birmingham, Cerf, & Adolphs, 2011). Kennedy and Adolphs (2010) have suggested that bilateral amygdala damage impairs bottom-up effects of face stimuli on gaze, whereas top-down modulation of gaze might be impaired in autism spectrum disorders (Neumann, Spezio, Piven, & Adolphs, 2006).

4.4. Conclusions

Measurement of fixation patterns during observation of static face stimuli has been used frequently in emotion recognition research. However, there has been little investigation into when and how fixations inform recognition in healthy people. Here, we show that fixation patterns predict detection of most subtle emotional states. However, detection of subtle happiness and of most extreme emotions was not predicted by fixation patterns, indicating that detection in these conditions is less dependent on fixations. We provide evidence that this distinction results from differences in signal strength and in the spatial frequency of diagnostic information. This study sheds light on the mechanisms underlying recognition of emotions, with relevance for future research into the processes that guide fixations toward emotionally informative features, and into how these processes are disrupted by brain damage or psychiatric disorders.

Acknowledgements

This work was supported by an operating grant from CIHR (MOP 97821).

References

Adolphs, R. (2002). Recognizing emotion from facial expressions: Psychological and neurological mechanisms. Behavioral and Cognitive Neuroscience Reviews, 1(1), 21–62.
Adolphs, R., Gosselin, F., Buchanan, T. W., Tranel, D., Schyns, P., & Damasio, A. R. (2005). A mechanism for impaired fear recognition after amygdala damage. Nature, 433(7021), 68–72. http://dx.doi.org/10.1038/nature03086.
Adolphs, R., & Tranel, D. (2004). Impaired judgments of sadness but not happiness following bilateral amygdala damage. Journal of Cognitive Neuroscience, 16(3), 453–462. http://dx.doi.org/10.1162/089892904322926782.
Aitchison, J. (2003). A concise guide to compositional data analysis. Retrieved September, 2012.
Arizpe, J., Kravitz, D. J., Yovel, G., & Baker, C. I. (2012). Start position strongly influences fixation patterns during face processing: Difficulties with eye movements as a measure of information use. PLoS ONE, 7(2), e31106. http://dx.doi.org/10.1371/journal.pone.0031106.


Baron-Cohen, S., Wheelwright, S., & Jolliffe, T. (1997). Is there a "language of the eyes"? Evidence from normal adults, and adults with autism or Asperger syndrome. Visual Cognition, 4(3), 311–331. http://dx.doi.org/10.1080/713756761.
Barrett, L. F., Lindquist, K. A., & Gendron, M. (2007). Language as context for the perception of emotion. Trends in Cognitive Sciences, 11(8), 327–332. http://dx.doi.org/10.1016/j.tics.2007.06.003.
Bindemann, M., Scheepers, C., & Burton, A. M. (2009). Viewpoint and center of gravity affect eye movements to human faces. Journal of Vision, 9(2), 1–16. http://dx.doi.org/10.1167/9.2.7.
Birmingham, E., Bischof, W. F., & Kingstone, A. (2009). Saliency does not account for fixations to eyes within social scenes. Vision Research, 49(24), 2992–3000. http://dx.doi.org/10.1016/j.visres.2009.09.014.
Birmingham, E., Cerf, M., & Adolphs, R. (2011). Comparing social attention in autism and amygdala lesions: Effects of stimulus and task condition. Social Neuroscience, 6(5–6), 420–435. http://dx.doi.org/10.1080/17470919.2011.561547.
Blais, C., Roy, C., Fiset, D., Arguin, M., & Gosselin, F. (2012). The eyes are not the window to basic emotions. Neuropsychologia, 50(12), 2830–2838. http://dx.doi.org/10.1016/j.neuropsychologia.2012.08.010.
Calder, A. J., Burton, A. M., Miller, P., Young, A. W., & Akamatsu, S. (2001). A principal component analysis of facial expressions. Vision Research, 41(9), 1179–1208.
Chen, F. S., & Yoon, J. M. (2011). Brief report: Broader autism phenotype predicts spontaneous reciprocity of direct gaze. Journal of Autism and Developmental Disorders, 41(8), 1131–1134. http://dx.doi.org/10.1007/s10803-010-1136-2.
Darwin, C. (1896). The expression of the emotions in man and animals. New York: D. Appleton and Company.
Deubel, H., & Schneider, W. X. (1996). Saccade target selection and object recognition: Evidence for a common attentional mechanism. Vision Research, 36(12), 1827–1837.
Eisenbarth, H., & Alpers, G. W. (2011). Happy mouth and sad eyes: Scanning emotional facial expressions. Emotion, 11(4), 860–865. http://dx.doi.org/10.1037/a0022758.
Ekman, P., & Friesen, W. V. (1971). Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17(2), 124–129.
Freeth, M., Foulsham, T., & Kingstone, A. (2013). What affects social attention? Social presence, eye contact and autistic traits. PLoS ONE, 8(1), e53286. http://dx.doi.org/10.1371/journal.pone.0053286.
Geisler, W. S. (2003). Ideal observer analysis. In L. M. Chalupa & J. S. Werner (Eds.), The visual neurosciences (pp. 825–837). Cambridge, MA: MIT Press.
Gosselin, F., & Schyns, P. G. (2001). Bubbles: A technique to reveal the use of information in recognition tasks. Vision Research, 41(17), 2261–2271.
Haith, M. M., Bergman, T., & Moore, M. J. (1977). Eye contact and face scanning in early infancy. Science, 198(4319), 853–855.
Heberlein, A. S., Padon, A. A., Gillihan, S. J., Farah, M. J., & Fellows, L. K. (2008). Ventromedial frontal lobe plays a critical role in facial emotion recognition. Journal of Cognitive Neuroscience, 20(4), 721–733. http://dx.doi.org/10.1162/jocn.2008.20049.
Hsiao, J. H., & Cottrell, G. (2008). Two fixations suffice in face recognition. Psychological Science, 19(10), 998–1006. http://dx.doi.org/10.1111/j.1467-9280.2008.02191.x.
Janik, S. W., Wellens, A. R., Goldberg, M. L., & Dell'Osso, L. F. (1978). Eyes as the center of focus in the visual examination of human faces. Perceptual and Motor Skills, 47(3 Pt 1), 857–858.
Kennedy, D. P., & Adolphs, R. (2010). Impaired fixation to eyes following amygdala damage arises from abnormal bottom-up attention. Neuropsychologia, 48(12), 3392–3398. http://dx.doi.org/10.1016/j.neuropsychologia.2010.06.025.
Klin, A., Jones, W., Schultz, R., Volkmar, F., & Cohen, D. (2002). Visual fixation patterns during viewing of naturalistic social situations as predictors of social competence in individuals with autism. Archives of General Psychiatry, 59(9), 809–816.

Lundqvist, D., Flykt, A., & Öhman, A. (1998). The Karolinska Directed Emotional Faces (KDEF). Stockholm: Department of Neurosciences, Karolinska Hospital.
Meeren, H. K., van Heijnsbergen, C. C., & de Gelder, B. (2005). Rapid perceptual integration of facial expression and emotional body language. Proceedings of the National Academy of Sciences of the United States of America, 102(45), 16518–16523. http://dx.doi.org/10.1073/pnas.0507650102.
Neumann, D., Spezio, M. L., Piven, J., & Adolphs, R. (2006). Looking you in the mouth: Abnormal gaze in autism resulting from impaired top-down modulation of visual attention. Social Cognitive and Affective Neuroscience, 1(3), 194–202. http://dx.doi.org/10.1093/scan/nsl030.
Pearson, K. (1897). Mathematical contributions to the theory of evolution: On a form of spurious correlation which may arise when indices are used in the measurements of organs. Proceedings of the Royal Society, 60, 489–498.
Pelphrey, K. A., Sasson, N. J., Reznick, J. S., Paul, G., Goldman, B. D., & Piven, J. (2002). Visual scanning of faces in autism. Journal of Autism and Developmental Disorders, 32(4), 249–261.
Peterson, M. F., & Eckstein, M. P. (2012). Looking just below the eyes is optimal across face recognition tasks. Proceedings of the National Academy of Sciences of the United States of America, 109(48), E3314–E3323. http://dx.doi.org/10.1073/pnas.1214269109.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32(1), 3–25.
Remington, R. W. (1980). Attention and saccadic eye movements. Journal of Experimental Psychology: Human Perception and Performance, 6(4), 726–744.
Robson, J. G., & Graham, N. (1981). Probability summation and regional variation in contrast sensitivity across the visual field. Vision Research, 21(3), 409–418.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178. http://dx.doi.org/10.1037/h0077714.
Schilling, K., Oberdick, J., & Schilling, R. L. (2012). Toward an efficient and integrative analysis of limited-choice behavioral experiments. Journal of Neuroscience, 32(37), 12651–12656. http://dx.doi.org/10.1523/JNEUROSCI.1452-12.2012.
Schwarzer, G., Huber, S., Gruter, M., Gruter, T., Gross, C., Hipfel, M., et al. (2007). Gaze behaviour in hereditary prosopagnosia. Psychological Research, 71(5), 583–590. http://dx.doi.org/10.1007/s00426-006-0068-0.
Schyns, P. G., Bonnar, L., & Gosselin, F. (2002). Show me the features! Understanding recognition from the use of visual information. Psychological Science, 13(5), 402–409.
Schyns, P. G., & Oliva, A. (1999). Dr. Angry and Mr. Smile: When categorization flexibly modifies the perception of faces in rapid visual presentations. Cognition, 69(3), 243–265.
Smith, M. L., Cottrell, G. W., Gosselin, F., & Schyns, P. G. (2005). Transmitting and decoding facial expressions. Psychological Science, 16(3), 184–189. http://dx.doi.org/10.1111/j.0956-7976.2005.00801.x.
Steyvers, M. (1999). Morphing techniques for manipulating face images. Behavior Research Methods, Instruments, & Computers, 31(2), 359–369.
Streit, M., Wolwer, W., & Gaebel, W. (1997). Facial-affect recognition and visual scanning behaviour in the course of schizophrenia. Schizophrenia Research, 24(3), 311–317.
Tatler, B. W., Baddeley, R. J., & Gilchrist, I. D. (2005). Visual correlates of fixation selection: Effects of scale and time. Vision Research, 45(5), 643–659. http://dx.doi.org/10.1016/j.visres.2004.09.017.
Tsuchida, A., & Fellows, L. K. (2012). Are you upset? Distinct roles for orbitofrontal and lateral prefrontal cortex in detecting and distinguishing facial expressions of emotion. Cerebral Cortex, 22(12), 2904–2912. http://dx.doi.org/10.1093/cercor/bhr370.
Uljarevic, M., & Hamilton, A. (2013). Recognition of emotions in autism: A formal meta-analysis. Journal of Autism and Developmental Disorders, 43(7), 1517–1526. http://dx.doi.org/10.1007/s10803-012-1695-5.
Yarbus, A. (1967). Eye movements and vision. New York: Plenum.
