
Published in final edited form as: Int J Methods Psychiatr Res. 2015 December; 24(4): 275–286. doi: 10.1002/mpr.1490.

Validation of the NIMH-ChEFS adolescent face stimulus set in an adolescent, parent, and health professional sample

MARIKA C. COFFMAN1,†, ANDREA TRUBANOVA1,†, J. ANTHONY RICHEY1, SUSAN W. WHITE1, JUNGMEEN KIM-SPOON1, THOMAS H. OLLENDICK1, and DANIEL S. PINE2

1Department of Psychology, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA

2Section on Development and Affective Neuroscience, Mood, and Anxiety Programs, National Institute of Mental Health Intramural Research Program, Bethesda, MD, USA

Abstract


Attention to faces is a fundamental psychological process in humans, with atypical attention to faces noted across several clinical disorders. Although many clinical disorders onset in adolescence, there is a lack of well-validated stimulus sets containing adolescent faces available for experimental use. Further, the images comprising most available sets are not controlled for high- and low-level visual properties. Here, we present a cross-site validation of the National Institute of Mental Health Child Emotional Faces Picture Set (NIMH-ChEFS), comprising 257 photographs of adolescent faces displaying angry, fearful, happy, sad, and neutral expressions. All of the direct facial images from the NIMH-ChEFS set were adjusted in terms of location of facial features and standardized for luminance, size, and smoothness. Although overall agreement between raters in this study and the original development-site raters was high (89.52%), this differed by group such that agreement was lower for adolescents relative to mental health professionals in the current study. These results suggest that future research using this face set, or others of adolescent/child faces, should base comparisons on similarly aged validation data.

Keywords adolescent development; face processing; methodology; emotion perception; face stimulus set


Correspondence: Marika C. Coffman, Virginia Polytechnic Institute and State University, Department of Psychology, Blacksburg, VA, USA. [email protected]. †Marika C. Coffman and Andrea Trubanova contributed equally as lead authors on this work.

Supporting information: Additional supporting information may be found in the online version of this article at the publisher's web site.

Introduction

The faces of others hold special significance. Early in development, infants prefer human faces to patterned stimuli that do not resemble faces (e.g. Goren et al., 1975; Kleiner and Banks, 1987; Maurer and Young, 1983). Atypical attention to faces can be measured experimentally using eye-tracking methodology (Horley et al., 2004) and has been observed across various clinical disorders (Archer et al., 1992). For example, atypical eye contact is often noted clinically in children and adults with various psychological disorders, such as social anxiety disorder, autism spectrum disorder, specific phobia, generalized anxiety disorder, and depression (Greist, 1994; Cowart and Ollendick, 2010, 2011; Jones et al., 2008; Mathews and MacLeod, 2002; Pelphrey et al., 2002; Mogg et al., 2000; Gotlib et al., 2004; Hallion and Ruscio, 2011). Most of these clinical disorders typically onset in late childhood or early adolescence; however, nearly all studies of attention and face processing with child and adolescent samples utilize adult faces.


Several stimulus sets of adult faces are widely available and routinely used in behavioural and neuroscience research. These include the Ekman set (Ekman and Friesen, 1976), the Japanese and Caucasian Facial Expressions of Emotion (JACFEE; Biehl et al., 1997), the NimStim Set of Facial Expressions (Tottenham et al., 2009), the Karolinska Directed Emotional Faces (Lundqvist et al., 1998), and the Naples Computer Generated Face Stimulus set (Naples et al., 2014). Mazurski and Bond (1993) created a stimulus set that contains primarily adult faces, with an additional six child faces included. These stimulus sets range widely in terms of the number of actors, emotions, images provided per condition, posing characteristics of the actors, and the method used to validate the emotions depicted in the faces, with most relying on samples of convenience with little regard for the populations that may be viewing the images in clinical studies. To date, children and adolescents have not been involved in the validation of the images used in these studies and, to our knowledge, only one data set consists exclusively of images of children and adolescents: the recently developed National Institute of Mental Health Child Emotional Faces Picture Set (NIMH-ChEFS; Egger et al., 2011). However, even with this child stimulus set, the faces were validated only with adults.


It is critical to consider adolescent responses to age-matched stimuli because of documented developmental differences in patterns of facial emotion perception. From a neuroscientific perspective, children and adolescents demonstrate altered activation when viewing faces of children compared to adults (Hoehl et al., 2010; Marusak et al., 2013), and children and adolescents show differential activation on functional magnetic resonance imaging (fMRI) tasks when viewing adult emotional faces compared to adults (Blakemore, 2008). Developmental improvements in the N170, a face-sensitive component observed in electroencephalography (EEG), occur steadily from childhood into adulthood, as indexed by reduced latency to faces with age (Leppänen et al., 2007; Taylor et al., 2004; Taylor et al., 1999). Children and adults also display differences when viewing adult faces compared to child and infant faces (Macchi Cassia, 2011; Macchi Cassia et al., 2012). Together, these findings indicate that face perception changes throughout development. In addition, children demonstrate increased responsivity, or faster reaction times, to child faces compared to adult faces (Benoit et al., 2007). These findings clearly illustrate the importance of utilizing peer emotional faces when conducting research with children and adolescents. Furthermore, much of what we currently know of child and adolescent responses to facial stimuli may be incomplete and may not fully capture the developmental effects of facial emotion recognition, given the frequent use of adult stimuli (Marusak et al., 2013). Collectively, these findings highlight the importance of validating adolescent stimuli in an adolescent sample.


The NIMH-ChEFS stimulus set provides a valuable resource for investigators conducting clinical research with children and adolescents. These stimuli have already been used in recent studies, including work on attention retraining in adolescents with anxiety (De Voogd et al., 2014; Ferri et al., 2013) and neuroimaging studies of face processing in typical development (Marusak et al., 2013). However, it is important to consider features within the image sets themselves that may potentially bias results, and to acknowledge that some improvements can be applied to make this stimulus set even more useful. For example, low-level visual properties of the stimuli, such as variability in brightness (which produced visual "hot spots," or areas of potentially increased salience due to luminance or spatial frequency), were not controlled, and such limitations could complicate research with neuroimaging tasks and pupillometric assessment (Porter et al., 2007; Rousselet and Pernet, 2011). Further, head tilt and rotation varied across actors, resulting in a mismatch between the angles at which the actors' faces were oriented with respect to the gaze of a viewer. Previous studies suggest that face identification accuracy is sensitive to variation in such features (Bachmann, 1991; Costen et al., 1994; Nasanen, 1999). As previously mentioned, this stimulus set has only been validated by adults, and therefore validation by adolescents themselves might add to its usefulness. Just as it is important to standardize evidence-based assessment tools with the clinical population for which they are intended (see McLeod et al., 2013), it is essential to validate facial stimulus sets with children and adolescents for clinical research.


Accordingly, our aims here are two-fold: (1) to standardize the images in this set along various parameters (i.e. colour, luminance, frequency, and other low-level visual properties), and (2) to further validate the set of adolescent facial expressions with regard to inter-rater agreement on perceived emotion and on representativeness (the degree to which the stimuli appeared to be good representations of the emotions they were intended to depict) in samples of adolescents, parents of adolescents, and mental health professionals. With the subsample of mental health professionals, we wished to replicate the Egger et al. (2011) findings, which were based on ratings from mental health professionals. Parents of teens were selected to rate the stimuli because of their exposure to teen facial expressions, to provide a validated sample for researchers who wish to use the stimuli for caregiver research, and to address our exploratory aim of investigating whether there are group differences between adolescents, parents, and mental health professionals.

Methods

Stimuli development and image processing


All stimuli were generated using procedures described by Egger et al. (2011; see this article for original image development steps1). The final image set from Egger et al. (2011) contained 482 images of 59 children and adolescents aged 10 to 17 years (20 males and 39 females) displaying five different expressions (happiness, anger, sadness, fear, and neutral) with two gaze conditions (direct and averted). As noted by Egger et al. (2011), most actors were Caucasian, with only one boy and four girls of non-Caucasian ethnicities. The processed image set included direct gaze only (N = 257 out of the 482 images).

1The stimuli were developed by Ellen Leibenluft and Danny Pine at NIMH and first validated by Egger et al. (2011) with mental health professionals.

Adjustment of low-level visual properties


To facilitate comparability among image metrics at later steps, all of the images were first resampled into a common pixel space (Figure 1a; 1960 × 3008) while preserving the original resolution of the Egger et al. (2011) images. To create visual landmarks, each face was manually bisected vertically and horizontally, such that the line from the centre of the bridge of the nose to the philtrum lay on the vertical axis and the area just below the pupils lay on the horizontal axis, resulting in a four-quadrant space whose origin lay at the midline of the face just above the bridge of the nose (Figure 1b). In cases where the actor had tilted his/her head, the image was rotated until the orientation of facial landmarks derived from the drawn vectors matched the horizontal and vertical grid of the background. An oval cutout of a standardized size (182 × 267 pixels) was then placed over the face, thus masking additional identifying information (clothes, hair; Figure 1c). Due to variability in the physical size of different faces, some images were isotropically scaled, by either increasing the degree of visual angle occupied by the face or decreasing the size of the face to ensure that critical features (e.g. the eyebrows) were visible. In either case, the resultant image occupied the entire oval cutout so that non-face pixels were minimal. Finally, pixels not falling into the oval were filled in with black (see Figure 1c), resulting in a high-resolution "face-only" image.
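As an illustration of the masking step, a minimal sketch in Python is given below, assuming the Pillow fork of the Python Image Library that was used elsewhere in the pipeline; the file names and the centred placement of the oval are hypothetical assumptions, and only the 182 × 267 pixel oval size follows the text.

from PIL import Image, ImageDraw

def apply_oval_mask(img, oval_w=182, oval_h=267):
    # Mask everything outside a centred oval of the standardized size
    # (182 x 267 pixels per the text) with black pixels.
    w, h = img.size
    mask = Image.new("L", (w, h), 0)  # black mask = excluded region
    draw = ImageDraw.Draw(mask)
    left, top = (w - oval_w) // 2, (h - oval_h) // 2
    draw.ellipse([left, top, left + oval_w, top + oval_h], fill=255)
    black = Image.new("RGB", (w, h), (0, 0, 0))
    # Keep the face inside the oval; fill everything outside with black.
    return Image.composite(img, black, mask)

face = Image.open("actor01_happy_direct.png").convert("RGB")  # hypothetical name
apply_oval_mask(face).save("actor01_happy_direct_masked.png")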


Some applications may require extreme precision in the consistency of low-level visual properties across the image set. Accordingly, we employed additional steps to further standardize visual features (brightness, spatial frequency). First, to correct variability in the luminance (brightness), we adjusted all the images to within ±0.0001% brightness using a combination of custom and publicly available code from the Python Image Library (see Figure 1d; PIL; version 1.1.7). In our approach, the pixel-wise brightness along each red–green–blue (RGB) vector was computed and aggregated within each image to a scaled global value of 80% of the original brightness (to allow some images to be adjusted upward, and some downward if necessary), constrained to a tuning factor of ±0.0001 variability.2 We then minimized the influence of visual "hot spots" and high frequency data in the images by calibrating images to each other based on overall spatial frequency (Figures 1e and 1f). This procedure had the dual purpose of eliminating bright pixels as well as edges in the images (which tend to attract the eyes during unrestricted viewing; Einhäuser and König, 2003), and thus eliminated unevenly distributed high frequency data as a potential source of noise in patterns of visual attention. We removed high frequency data using a two-dimensional (2D) convolution, weighting each pixel by a moving 3 × 3 Gaussian kernel over its local neighbourhood for spatial smoothing.3 The ultimate goal of this process was to ensure that studies employing our "adjusted" images could rule out high frequency noise and unevenly distributed brightness as sources of noise in visual attention, leaving only the lower frequency components of the face (e.g. gross anatomy) to drive visual attention (see Figure 1g for the final image). We have made the total image set available at the stages corresponding to images c, d, e, and g of Figure 1.4

2For more detail on the validation of our processing steps, please see Supporting Information.

3Processing within each window uses the original pixel value, not the previously calculated values.
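As an illustration of the brightness scaling and 3 × 3 smoothing described above, a minimal sketch follows, assuming NumPy/SciPy stand-ins for the authors' custom PIL code; the 80% global brightness target and the 3 × 3 window follow the text, while the linear rescaling, the sigma value, and the file names are illustrative assumptions.

import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

def scale_brightness(arr, target_mean):
    # Linearly rescale RGB values so the image mean matches the target.
    return np.clip(arr * (target_mean / arr.mean()), 0, 255)

def smooth_3x3(arr, sigma=0.8):
    # truncate=1.0 limits the Gaussian support to a one-pixel radius,
    # i.e. an effective 3 x 3 spatial kernel; the channel axis is untouched.
    return gaussian_filter(arr, sigma=(sigma, sigma, 0), truncate=1.0)

arr = np.asarray(Image.open("face_masked.png").convert("RGB"), dtype=float)
target = arr.mean() * 0.80  # 80% target per the text; a per-set global value in practice
adjusted = smooth_3x3(scale_brightness(arr, target))
Image.fromarray(adjusted.astype(np.uint8)).save("face_adjusted.png")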

Validation of stimuli

Online surveys—The total image set (k = 257 direct gaze faces; Neutral = 56; Angry = 52; Happy = 50; Sad = 48; Fear = 51) was randomly split within gender into two separate surveys (Survey 1: N = 125 faces; Survey 2: N = 132 faces) and uploaded to a secure web-based survey platform. We split the total image set into two parts because of the estimated time required for a single rater to validate the entire set (approximately 50 minutes), which we considered prohibitively long, especially for our adolescent sample. Each participant completed only one of the two surveys. The surveys were identical in format (i.e. each survey contained 8–10 male and 15–18 female faces from each expression group) and differed only in the specific faces that were portrayed. Within each survey, all participants saw the images in the same fixed order; this order was generated randomly under the constraint that no emotion and no actor was presented more than twice in a row. The survey also included questions about the rater's sex, age category, race/ethnicity, and, for the mental health professionals, the level of training in the practice of clinical or counselling psychology. The survey informed the raters that they would see pictures of faces portraying various expressions of emotions. Similar to Egger et al. (2011), raters were asked to answer two questions: (1) which emotion does this picture represent, and (2) how accurately does the picture represent the emotion you selected in response to the first question (see Figure 2 for a sample page). For the first question, raters could select one of five expressions (afraid, angry, happy, neutral, or sad), always provided in the same alphabetical order. For the second question, regarding representativeness of the expression of emotions, raters selected a level on a scale from 1 (poorly) to 10 (very well).
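The paper does not describe how the fixed presentation order was generated; a minimal rejection-sampling sketch in Python, with hypothetical (actor, emotion) tuples, could look like this.

import random

def constrained_order(images, max_run=2, max_tries=10000):
    # images: list of (actor_id, emotion) tuples. Returns a shuffled order
    # in which neither actor nor emotion repeats more than max_run in a row.
    def run_ok(seq):
        for key in (0, 1):  # index 0 = actor_id, index 1 = emotion
            run = 1
            for prev, cur in zip(seq, seq[1:]):
                run = run + 1 if prev[key] == cur[key] else 1
                if run > max_run:
                    return False
        return True

    for _ in range(max_tries):
        candidate = random.sample(images, len(images))
        if run_ok(candidate):
            return candidate
    raise RuntimeError("No valid ordering found; relax the constraint.")

stimuli = [(actor, emotion) for actor in range(25)
           for emotion in ("afraid", "angry", "happy", "neutral", "sad")]
order = constrained_order(stimuli)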


Recruitment of participants for image validation


Recruitment criteria for adolescents included being between the ages of 12 and 17, inclusive. There were no age restrictions for the other two groups. To be included in the parents of adolescents group, participants had to endorse having at least one child between the ages of 12 and 17 years living in the household.5 Mental health professionals included participants who reported having formal training in the practice of clinical or counselling psychology. Their level of training ranged from pre-master's (in graduate school) to completed terminal degree (e.g. PhD). To recruit adolescents and parents of adolescents, we used a variety of databases within our department and issued advertisements on our department's website, as well as flyers posted in the community. Recruitment of mental health professionals was done via emails to the faculty, graduate students, and alumni of the Department of Psychology and to mental health professionals from the community not affiliated with our university. For each survey, one rater was chosen at random from each group to receive a $20 cash prize.

4The image set is available at each of the processing steps (i.e. unadjusted colour, unadjusted grey scale, and all smoothing steps) at www.scanlab.org/downloads.html

5Notably, the parents of adolescents did not necessarily include parents of the adolescents who participated in the survey; parents whose teenagers did not participate in the survey were included, as were teenagers whose parents did not participate in the survey. The anonymity of the survey did not allow for knowledge of parent–child dyads.


A total of 129 raters completed the surveys: 41 adolescents [58.5% female, 90.2% Caucasian, mean age = 14.54, standard deviation (SD) = 1.70], 54 parents (83.33% female, 87.03% Caucasian, modal age range = 45–47), and 34 mental health professionals (82.4% female, 88.2% Caucasian, modal age range = 50 and above). Modal age is reported for parents and mental health professionals because the survey asked for age within a range: 33.3% of parents were between the ages of 24 and 41, 50.0% were between the ages of 42 and 50, and 16.6% were above the age of 50 years; for the professionals, 67.6% were between the ages of 24 and 41, 5.8% were between the ages of 42 and 50, and 26.5% were above the age of 50 years. The level of clinical training for the mental health professionals varied from pre-master's degree (n = 2; 5.9%) to post-master's degree (n = 12; 35.3%) to completed doctoral degree (n = 20; 58.8%).

Analytic approach


We computed Fleiss' kappa for multiple raters from these data to estimate overall agreement, as well as agreement for each emotion type and agreement within groups. In addition, each image was scored in terms of the percentage of agreement between the rater and the previously reported classification of the expression it was intended to convey (Egger et al., 2011). These analyses were performed for each expression, as well as for each rater group. IBM SPSS Statistics 21 and R (R Development Core Team, 2010) were used for data analyses. An α level of 0.05 was used for all statistical tests.
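As an illustration, Fleiss' kappa can be computed from a table of per-image label counts; the sketch below assumes the statsmodels implementation (the analyses here used SPSS and R) and uses fabricated counts, not study data.

import numpy as np
from statsmodels.stats.inter_rater import fleiss_kappa

# Rows = images; columns = counts of raters choosing each of the five labels
# (afraid, angry, happy, neutral, sad). Each row sums to the number of raters.
counts = np.array([
    [18,  1,  0,  1,  0],   # mostly "afraid"
    [ 0, 19,  0,  1,  0],   # mostly "angry"
    [ 0,  0, 20,  0,  0],   # unanimous "happy"
    [ 1,  2,  0, 15,  2],   # mostly "neutral"
    [ 0,  6,  0,  3, 11],   # "sad", often confused with "angry"
])

print(f"Fleiss' kappa = {fleiss_kappa(counts, method='fleiss'):.2f}")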


To examine differences between the two surveys, a one-way analysis of variance (ANOVA) was conducted with survey (Survey 1 versus Survey 2) as a factor, and overall agreement with previously reported classification of emotions (based on Egger et al., 2011) as the dependent variable. To examine potential differences in participant characteristics between the two surveys, we conducted a series of t-tests comparing means across participant characteristics.6 To examine differences between groups in agreement with the classification from Egger et al. (2011), across and within emotions, as well as in the representativeness of the emotions, a one-way ANOVA was conducted with group (adolescent, parent, mental health professional) as a factor, and accuracy (or representativeness) ratings for emotion as the dependent variable. If a significant group difference was found, pairwise multiple comparisons using Tukey's post hoc test were used to identify the specific groups differing on accuracy and representativeness.
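A minimal sketch of this group comparison, assuming SciPy and statsmodels; the per-rater accuracy values below are fabricated placeholders, not study data.

import numpy as np
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
adolescents   = rng.normal(88, 6, 41)   # illustrative per-rater % agreement
parents       = rng.normal(90, 4, 54)
professionals = rng.normal(91, 3, 34)

# One-way ANOVA with group as the factor.
f_stat, p_val = stats.f_oneway(adolescents, parents, professionals)
print(f"F = {f_stat:.2f}, p = {p_val:.3f}")

# Tukey HSD post hoc comparisons if the omnibus test is significant.
scores = np.concatenate([adolescents, parents, professionals])
groups = ["adolescent"] * 41 + ["parent"] * 54 + ["professional"] * 34
print(pairwise_tukeyhsd(scores, groups))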


As a secondary analysis, to examine gender differences in accuracy for rating emotional faces in adolescents, we conducted a series of t-tests comparing means between the two gender groups for total accuracy as well as the five emotions, for a total of six t-tests. Additionally, in order to examine whether accuracy for identification of emotions differed by age, we conducted a Spearman correlation between the adolescent ages and percentage of accuracy for all emotions.
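The secondary analyses can be sketched in the same way, again assuming SciPy; the gender split (17 male, 24 female of 41 adolescents) follows the sample description, while the accuracy and age arrays are fabricated placeholders.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
acc_male, acc_female = rng.normal(88, 6, 17), rng.normal(88, 6, 24)
ages = rng.integers(12, 18, 41)       # adolescent ages in years
accuracy = rng.normal(88, 6, 41)      # overall % accuracy per adolescent

# Independent-samples t-test for gender differences in accuracy.
t_stat, p_t = stats.ttest_ind(acc_male, acc_female)

# Spearman rank correlation between age and accuracy.
rho, p_rho = stats.spearmanr(ages, accuracy)
print(f"t = {t_stat:.2f} (p = {p_t:.2f}); rho = {rho:.2f} (p = {p_rho:.2f})")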

6For parents and mental health professionals, the difference in age was calculated using the median age in each age group (e.g. for individuals between 24 and 26 years, age 25 was used for comparison analyses). The age ranges were equally spaced allowing for a parametric comparison.


Results

Survey differences

As indicated earlier, the survey was split into two sub-sections to reduce subject burden. Overall kappa for Survey 1 was κ = 0.81, and for Survey 2 it was κ = 0.84. Kappa for adolescents was κ = 0.81 for Survey 1 and κ = 0.79 for Survey 2. Kappa for parents of adolescents was κ = 0.83 and κ = 0.84 for Surveys 1 and 2, respectively. Kappa for mental health professionals was κ = 0.81 for Survey 1 and κ = 0.89 for Survey 2. All kappa values indicated substantial agreement, and all p values were < 0.01. Further analyses of kappa by emotion type are displayed in Table 1.


Aside from the age of mental health professionals, no differences across surveys in the types of images shown or in rater characteristics were noted, so the results for both surveys were combined for analytic purposes. There was no statistically significant difference in overall agreement with previously reported classifications between the two surveys [F(1, 127) = 2.21, p = 0.14]. See Supporting Information Table S1 for a breakdown of participant characteristics between the two surveys.

Agreement with previously reported classification


Table 2 indicates the percentage agreement between the respondents and the previously reported classifications established by Egger et al. (2011). The mean agreement rate across emotions for all three groups was 89.52%: mean (M) = 87.97% for adolescents, M = 89.74% for parents of adolescents, and M = 91.02% for mental health professionals. However, the overall agreement with previously reported classifications differed significantly between groups [F(2, 128) = 4.31, p = 0.02]. For total agreement, adolescents and the mental health professionals differed (Tukey test, p = 0.01) such that mental health professionals had higher overall agreement with previously reported classifications than adolescents. There was no significant difference between the parents of adolescents and the mental health professionals (Tukey test, p = 0.40) or adolescents and parents of adolescents (Tukey test, p = 0.15) in accuracy ratings.


For 72 of the 257 stimuli (28.02%), there was complete agreement among all raters with the previously reported emotion classification. Additionally, for 194 images (75.48%), there was at least 90% agreement with the previously reported classification, and for 220 stimuli (85.60%) at least 75% agreement. Therefore, only 37 of the 257 images (14.40% of stimuli) did not reach 75% agreement – a criterion suggested by Egger et al. (2011) to establish accuracy. See Supporting Information Table S2 for the agreement percentages for each of the specific images.

Agreement by emotion condition

Table 3 indicates the percentage agreement between the respondents and previously reported classification for each emotion type. Stimuli expressing happiness showed the highest agreement across all raters, followed by fear, anger, and neutral stimuli. Images displaying sadness showed the lowest agreement across rater groups. See Table 3 and Figure 3 for a breakdown of agreement by emotion type for each of the rater groups.


The groups showed differences in rating agreement for anger [F(2, 128) = 6.46; p < 0.01]. Parents of adolescents agreed with previously reported classification of images more than the adolescents for stimuli expressing anger (Tukey test, p < 0.01). No other differences among groups were noted. See Figure 3 for accuracy by rater group. In addition, mental health professionals did not differ from adolescents on their rating agreement for any of the expressions (Tukey test, p = 0.07–0.59).


Happiness was the emotion most often identified correctly across all three rater groups. There was at least 97% agreement among raters with previously reported classification for all 50 images expressing happiness. Sadness, however, was the emotion least likely to be correctly identified. There was perfect agreement among raters for only three of the 48 images of sadness. For images of anger, all raters agreed on the previously reported classification for 12 of the 52 images. For fear, complete agreement among raters occurred for nine of the 51 images. For neutral expressions, complete agreement was reached for five of the 56 images. See Table 4 for summary percentages of correctly identified stimuli for each emotion.

Emotion misattribution


In order to evaluate the nature of misattributions (incorrectly identified expressions) in finer detail, we evaluated the types of errors in emotional attributions for misidentified faces. Table 5 indicates the total number of times each group misidentified an expression, which expression was incorrectly chosen, and the proportion of times each expression was incorrectly chosen. Misattributed fearful faces were most commonly identified as angry, followed by neutral, sad, and happy. Misinterpreted angry faces were inaccurately labelled as neutral, sad, fearful, and happy. Misinterpreted happy faces were misidentified as angry, sad, fearful, and neutral. Misinterpreted neutral faces were identified as angry most often, followed by sad, happy, and fearful. Misidentified sad faces were classified as angry, followed by neutral, fearful, and happy.
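A misattribution table of this kind can be tabulated as a confusion matrix; the sketch below assumes pandas and uses illustrative response records rather than the study data.

import pandas as pd

# Each record pairs the intended emotion with the emotion the rater chose.
responses = [
    ("sad", "sad"), ("sad", "angry"), ("sad", "neutral"),
    ("fear", "fear"), ("fear", "angry"), ("happy", "happy"),
]
df = pd.DataFrame(responses, columns=["intended", "chosen"])

# Keep only misattributions, then show each error type as a proportion
# of all errors for that intended emotion (rows sum to 1).
errors = df[df["intended"] != df["chosen"]]
print(pd.crosstab(errors["intended"], errors["chosen"], normalize="index"))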

Representativeness ratings

Overall, the mean representativeness rating across all five emotions and all raters was 75.61% (SD = 11.29). The ratings did not differ significantly across the three groups [F(2, 128) = 0.24, p = 0.78]. In addition, the ratings did not differ for individual expressions [F(2, 128) = 0.48–1.41, p values = 0.25–0.62]. Our findings were consistent with the results reported in Egger et al. (2011), such that images of happy faces had the highest representativeness rating, followed by neutral faces, angry faces, and then fearful faces. Sad faces had the lowest representativeness rating. See Table 6 for a breakdown of representativeness by emotion type for each of the three rater groups.

Gender and age differences

Male and female adolescents did not differ significantly on accuracy across the different emotion types [t(39) = 0.61, p = 0.55] or for specific emotions [t(39) = 0.38–1.31, p values = 0.20–0.71].


There was no significant correlation of accuracies with age in the adolescent group for any of the expressions [Anger: r(39) = 0.21, p = 0.19; Happiness: r(39) = 0.06, p = 0.70; Sadness: r(39) = 0.23, p = 0.16; Fear: r(39) = −0.15, p = 0.35; Neutral: r(39) = −0.02, p = 0.91].

Discussion


In this study, we refined the child and adolescent facial stimulus set presented in Egger et al. (2011) and validated the revised set with a sample of adolescents, parents of adolescents, and mental health professionals. We found high accuracy ratings overall across expressions, consistent with previously reported ratings, although with somewhat diminished accuracy for faces expressing sadness and slightly lower accuracy ratings among adolescents. We standardized the faces from Egger et al. (2011) for low- and high-level visual properties, adjusting the stimuli for luminance, size, and smoothness, as well as for the location of facial features. Visual properties are one of the many factors that influence the choice of a given stimulus set for clinical and research purposes. For instance, when measuring covert attention, low-level visual properties within an image may contribute to spurious findings in neuroimaging or attention retraining research (Naples et al., 2014). This extends to proper control of luminance, smoothness, and size of the image. Although experiments that allow for free viewing, such as eye-tracking studies, are less susceptible to this type of noise and thus do not require the same amount of smoothing or luminance matching, these properties are nonetheless desirable (Einhäuser and König, 2003).


To our knowledge, this is the only facial stimulus set comprised solely of adolescent faces that also accounts for visual properties of the stimuli. From a neuroscience perspective, adolescents show differential activation on fMRI tasks when viewing adult faces compared to adolescent faces (Hoehl et al., 2010; Marusak et al., 2013) and when viewing adult emotional faces relative to adult viewers (Blakemore, 2008), and developmental improvements in the temporal ordering of face processing occur predictably from childhood into adulthood (Taylor et al., 2004; Taylor et al., 1999). Thus, a carefully developed and standardized set of adolescent stimuli is needed to advance research in this domain. We offer such a set in this study.


Our efforts to validate the stimulus set with adolescents, parents of adolescents, and mental health professionals yielded encouraging results. Agreement between our three groups of respondents and the previously reported emotion classifications established by Egger et al. (2011) was approximately 90%. Across respondents, agreement was highest for the happy emotion faces (almost 100%) and lowest for the sad emotion faces (about 73%), with agreement for fear, anger, and neutral emotion faces falling in between. It is important to highlight that a neutral expression is not always perceived as emotionally neutral (Thomas et al., 2001; Lobaugh et al., 2006), which is why some researchers opt to include calm faces in addition to neutral faces (e.g. Tottenham et al., 2009). This should be taken into account when interpreting the ratings of neutral expressions. Our three groups of respondents differed minimally in accuracy, and the representativeness ratings paralleled the accuracy ratings. Although parents of adolescents and mental health professionals did not differ in accuracy for any of the expressions, parents of adolescents were more accurate than adolescents in recognizing faces expressing anger. It should be noted, however, that although these differences were obtained, the adolescents still accurately identified approximately 91% of the angry faces (versus 95% for parents). Our finding that parents were more accurate at identifying angry faces of adolescents suggests that parents may find angry faces of adolescents more salient. This finding is somewhat supported by Hoehl et al. (2010), who reported that adults show greater amygdala activation than children when viewing angry adolescent faces, and by Marusak et al. (2013), who reported reduced amygdala activation in children when viewing adolescent angry versus happy faces. Still, the differences we did find do not appear to be clinically meaningful, given that over 90% of the adolescents correctly identified the angry and happy faces.


Sad emotion faces were the least accurately identified by all three respondent groups: approximately 30% of adolescents and 26% of parents of adolescents and mental health professionals misclassified the sad emotion faces. Relatively poor rates of agreement for sad emotion faces were also reported by Egger et al. (2011) and by other research groups using different face sets (e.g. Mazurski and Bond, 1993; Tottenham et al., 2009). Thus, the relatively poor recognition of sad faces in the current study likely represents a general feature of sad face-emotion displays, rather than a unique feature of this data set. Here, misidentified sad faces were classified as angry faces about 60% of the time, followed by neutral faces (23%), fearful faces (16%), and happy faces (1%). It should be recalled that accuracy estimates were determined based on the accuracy of the 20 mental health professionals in identifying the emotions that the adolescent actors were portraying in the original Egger et al. (2011) study; no other criterion of accuracy was used. In that study, 76% of the sad direct gaze faces were correctly identified by 75% of the raters, and 23 sad faces had to be eliminated from the final pool of 38 sad faces. In our study we used only those faces found to be accurate by the mental health professionals in the study by Egger et al. (2011). Nevertheless, as noted, we found agreement rates for our respondents to be the lowest for the sad faces. For ease of use in research projects, we have included specific accuracy and representativeness ratings of each image by group in the Supporting Information (Table S2) so that all images, regardless of expression, might be selected based on an acceptable percentage agreement criterion.


Further research is needed to determine the sources of inaccuracy for the sad stimuli and to determine whether a more objective criterion by which to gauge accuracy, other than the judgements by mental health professionals provided by Egger et al. (2011), can be established. Most importantly, validation of the happy, angry, fear, and neutral emotion faces seems acceptable from our findings across adolescents, parents of adolescents, and mental health professionals. Further, the representativeness of the emotion faces was viewed as satisfactory in the judgement of our various respondent groups. Overall, the stimuli were successfully validated in the three populations examined (adolescents, parents of adolescents, and mental health professionals). This study set out to test differences in face perception in terms of accuracy ratings across our three groups of raters. Differences across groups were only partially observed, which might be due to the generally high accuracy across the groups. The developmental differences we did observe in anger ratings between adolescents and parents may be accounted for by age effects observed in EEG, which highlight improvements in face perception with age (Taylor et al., 2004), although this was not ascertained in the current study.


Several limitations exist in our study, some of which were also evident in the prior study by Egger et al. (2011). First, neither study included disgust and surprise emotion faces. Both of these emotions have been identified as basic emotions (Ekman and Friesen, 1975, 1976) and need to be examined in future studies. Second, the facial stimuli were derived predominantly from Caucasian actors (over 95%), and nearly all of our respondents were Caucasian (over 95%). Thus, our findings are limited to this racial/ethnic group and may not generalize to other ethnicities and races. This is a notable limitation and one that is in need of careful and systematic inquiry. Third, the sample size in this study for each group was comparatively small. This limitation in the number of raters should be kept in mind when interpreting the results of this study.


These limitations notwithstanding, the current study provides useful data on a carefully refined set of facial emotion stimuli for basic and clinical research, and it has validated this set of stimuli across adolescent, parent, and mental health professional groups. Given noted differences in developmental markers of face perception and expertise (e.g. Benoit et al., 2007), we suggest it is important for researchers to use age-matched stimuli for the population under study. Our research group is currently using this stimulus set in an NIMH-sponsored study with socially anxious adolescents who are receiving attention retraining via a dot-probe paradigm to address their implicit attentional biases. Future studies will also need to examine the utility of these stimuli to explore the potentially enhanced effects of viewing similar-age emotional faces in brain-based tasks, as suggested by Blakemore (2008), Hoehl et al. (2010), Marusak et al. (2013), and Taylor and colleagues (Taylor et al., 2004; Taylor et al., 1999).

Supplementary Material

Refer to Web version on PubMed Central for supplementary material.

Acknowledgments

This work was supported in part by the National Institute of Mental Health (NIMH) Grant # R34 MH096915 awarded to Thomas H. Ollendick. The authors gratefully acknowledge the NIMH for its support and the many colleagues who assisted them with various aspects of the present research, including Kathleen Driscoll, who helped create the surveys. Finally, the authors thank the parents, teenagers, and colleagues who participated in the validation ratings.


References

Archer J, Hay DC, Young AW. Face processing in psychiatric conditions. British Journal of Clinical Psychology. 1992; 31(1):45–61. [PubMed: 1559117]

Bachmann T. Identification of spatially quantized tachistoscopic images of faces: How many pixels does it take to carry identity? European Journal of Cognitive Psychology. 1991; 3(1):87–103.

Benoit KE, McNally RJ, Rapee RM, Gamble AL, Wiseman AL. Processing of emotional faces in children and adolescents with anxiety disorders. Behaviour Change. 2007; 24(4):183–194.


Biehl M, Matsumoto D, Ekman P, Hearn V, Heider K, Kudoh T, Ton V. Matsumoto and Ekman's Japanese and Caucasian Facial Expressions of Emotion (JACFEE): Reliability data and cross-national differences. Journal of Nonverbal Behavior. 1997; 21(1):3–21.

Blakemore SJ. The social brain in adolescence. Nature Reviews Neuroscience. 2008; 9(4):267–277. [PubMed: 18354399]

Costen NP, Parker DM, Craw I. Spatial content and spatial quantization effects in face recognition. Perception. 1994; 23:129–146. [PubMed: 7971093]

Cowart MJW, Ollendick TH. Attentional biases in children: Implication for treatment. In: Hadwin JA, Field AP, editors. Information Processing Biases and Anxiety: A Developmental Perspective. Oxford: Oxford University Press; 2010. p. 297–319.

Cowart MJW, Ollendick TH. Attention training in socially anxious children: A multiple baseline design analysis. Journal of Anxiety Disorders. 2011; 25(7):972–977. [PubMed: 21763102]

De Voogd EL, Wiers RW, Prins PJM, Salemink E. Visual search attentional bias modification reduced social phobia in adolescents. Journal of Behavior Therapy and Experimental Psychiatry. 2014; 45(2):252–259. [PubMed: 24361543]

Egger HL, Pine DS, Nelson E, Leibenluft E, Ernst M, Towbin KE, Angold A. The NIMH Child Emotional Faces Picture Set (NIMH-ChEFS): A new set of children's facial emotion stimuli. International Journal of Methods in Psychiatric Research. 2011; 20(3):145–156. [PubMed: 22547297]

Ekman P, Friesen WV. Unmasking the Face: A Guide to Recognizing Emotions from Facial Cues. Palo Alto, CA: Consulting Psychologists Press; 1975.

Ekman P, Friesen WV. Pictures of Facial Affect. Palo Alto, CA: Consulting Psychologists Press; 1976.

Einhäuser W, König P. Does luminance-contrast contribute to a saliency map for overt visual attention? European Journal of Neuroscience. 2003; 17(5):1089–1097. [PubMed: 12653985]

Ferri J, Bress JN, Eaton NR, Proudfit GH. The impact of puberty and social anxiety on amygdala activation to faces in adolescence. Developmental Neuroscience. 2013; 36(3–4):239–249.

Goren CC, Sarty M, Wu PY. Visual following and pattern discrimination of face-like stimuli by newborn infants. Pediatrics. 1975; 56(4):544–549. [PubMed: 1165958]

Gotlib IH, Krasnoperova E, Yue DN, Joormann J. Attentional biases for negative interpersonal stimuli in clinical depression. Journal of Abnormal Psychology. 2004; 113(1):127.

Greist JH. The diagnosis of social phobia. The Journal of Clinical Psychiatry. 1994; 56(5):5–12.

Hallion LS, Ruscio AM. A meta-analysis of the effect of cognitive bias modification on anxiety and depression. Psychological Bulletin. 2011; 137(6):940. [PubMed: 21728399]

Hoehl S, Brauer J, Brasse G, Striano T, Friederici AD. Children's processing of emotions expressed by peers and adults: An fMRI study. Social Neuroscience. 2010; 5(5–6):543–559. [PubMed: 20486013]

Horley K, Williams LM, Gonsalvez C, Gordon E. Face to face: Visual scanpath evidence for abnormal processing of facial expressions in social phobia. Psychiatry Research. 2004; 127(1):43–53. [PubMed: 15261704]

Jones W, Carr K, Klin A. Absence of preferential looking to the eyes of approaching adults predicts level of social disability in 2-year-old toddlers with autism spectrum disorder. Archives of General Psychiatry. 2008; 65(8):946–954. [PubMed: 18678799]

Kleiner KA, Banks MS. Stimulus energy does not account for 2-month-olds' face preferences. Journal of Experimental Psychology: Human Perception and Performance. 1987; 13(4):594. [PubMed: 2965751]

Leppänen JM, Moulson MC, Vogel-Farley VK, Nelson CA. An ERP study of emotional face processing in the adult and infant brain. Child Development. 2007; 78(1):232–245. [PubMed: 17328702]

Lobaugh NJ, Gibson E, Taylor MJ. Children recruit distinct neural systems for implicit emotional face processing. Neuroreport. 2006; 17(2):215–219. [PubMed: 16407774]

Lundqvist D, Flykt A, Öhman A. The Karolinska Directed Emotional Faces (KDEF). CD-ROM, Department of Clinical Neuroscience, Psychology Section, Karolinska Institutet; 1998.

Int J Methods Psychiatr Res. Author manuscript; available in PMC 2016 November 10.

COFFMAN et al.

Page 13

Author Manuscript Author Manuscript Author Manuscript Author Manuscript

Macchi Cassia V. Age biases in face processing: The effects of experience across development. British Journal of Psychology. 2011; 102(4):816–829. [PubMed: 21988386]

Macchi Cassia V, Pisacane A, Gava L. No own-age bias in 3-year-old children: More evidence for the role of early experience in building face-processing biases. Journal of Experimental Child Psychology. 2012; 113(3):372–382. [PubMed: 22857798]

McLeod BD, Jensen-Doss A, Ollendick TH, editors. Diagnostic and Behavioral Assessment in Children and Adolescents: A Clinical Guide. New York: Guilford Press; 2013.

Marusak HA, Carré JM, Thomason ME. The stimuli drive the response: An fMRI study of youth processing adult or child emotional face stimuli. NeuroImage. 2013; 83:679–689. [PubMed: 23851324]

Mathews A, MacLeod C. Induced processing biases have causal effects on anxiety. Cognition and Emotion. 2002; 16(3):331–354.

Maurer D, Young RE. Newborn's following of natural and distorted arrangements of facial features. Infant Behavior and Development. 1983; 6(1):127–131.

Mazurski EJ, Bond NW. A new series of slides depicting facial expressions of affect: A comparison with the pictures of facial affect series. Australian Journal of Psychology. 1993; 45(1):41–47.

Mogg K, Millar N, Bradley BP. Biases in eye movements to threatening facial expressions in generalized anxiety disorder and depressive disorder. Journal of Abnormal Psychology. 2000; 109(4):695. [PubMed: 11195993]

Naples AN, Nguyen-Phuc A, Coffman MC, Kresse A, Bernier R, McPartland JC. A computer-generated animated face stimulus set for psychophysiological research. Behavior Research Methods. 2014; 47(2):562–570. DOI: 10.3758/s13428-014-0491-x

Nasanen R. Spatial frequency bandwidth used in the recognition of facial images. Vision Research. 1999; 39(23):3824–3833. [PubMed: 10748918]

Pelphrey KA, Sasson NJ, Reznick JS, Paul G, Goldman BD, Piven J. Visual scanning of faces in autism. Journal of Autism and Developmental Disorders. 2002; 32(4):249–261. [PubMed: 12199131]

Porter G, Troscianko T, Gilchrist ID. Effort during visual search and counting: Insights from pupillometry. The Quarterly Journal of Experimental Psychology. 2007; 60(2):211–229. [PubMed: 17455055]

R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing; 2010. Retrieved from http://www.R-project.org [28 January 2014]

Rousselet GA, Pernet CR. Quantifying the time course of visual object processing using ERPs: It's time to up the game. Frontiers in Psychology. 2011; 2(107):2–6. DOI: 10.3389/fpsyg.2011.00107 [PubMed: 21738514]

Taylor MJ, Batty M, Itier RJ. The faces of development: A review of early face processing over childhood. Journal of Cognitive Neuroscience. 2004; 16(8):1426–1442. [PubMed: 15509388]

Taylor MJ, McCarthy G, Saliba E, Degiovanni E. ERP evidence of developmental changes in processing of faces. Clinical Neurophysiology. 1999; 110(5):910–915. [PubMed: 10400205]

Thomas KM, Drevets WC, Whalen PJ, Eccard CH, Dahl RE, Ryan ND, Casey BJ. Amygdala response to facial expressions in children and adults. Biological Psychiatry. 2001; 49(2):309–316. [PubMed: 11239901]

Tottenham N, Tanaka JW, Leon AC, McCarry T, Nurse M, Hare TA, Marcus DJ, Westerlund A, Casey B, Nelson C. The NimStim set of facial expressions: Judgments from untrained research participants. Psychiatry Research. 2009; 168(3):242–249. [PubMed: 19564050]


Figure 1.


Depicts each of the processing steps undertaken to standardize the stimulus set: (a) the original image; (b) the rotated image, placing the eyes and nose in the exact centre of the screen; (c) an oval cutout placed over the face to obscure hair and clothing; (d) the figure matched to the average brightness of the other images in the respective condition (e.g. all "happy" faces were matched to the mean brightness for happy); (e) a 3 × 3 kernel used to locate visual "hot spots" and remove them from the image; (f) the high frequency content isolated and removed from the image; (g) the final resultant image.


Figure 2.

This is a sample of the webpage that raters viewed to perform the ratings.


Figure 3.

A bar graph displaying mean accuracy rates by group. The black bars represent the professional group, the hatched bars represent parents, and the grey bars represent adolescents.


Table 1

Kappa scores for Surveys 1 and 2 by group across each expression (all kappa values p < 0.01)

Expression  Survey  Adolescents  Parents  Professionals  Total
Fear        1       0.79         0.75     0.83           0.64
Fear        2       0.40         0.51     0.50           0.50
Anger       1       0.51         0.52     0.57           0.53
Anger       2       0.43         0.48     0.50           0.46
Happy       1       0.87         0.90     0.87           0.89
Happy       2       0.85         0.90     0.97           0.91
Neutral     1       0.60         0.28     0.66           0.40
Neutral     2       0.62         0.37     0.45           0.35
Sad         1       0.15         0.45     0.15           0.15
Sad         2       0.20         0.38     0.55           0.42


Table 2


Mean accuracy, standard deviation (SD), and range of accuracy by group across all expressions

Group          Mean % (SD)    Range %
Adolescents    87.97 (5.87)   67.79–96.07
Parents        89.74 (4.00)   77.16–97.63
Professionals  91.02 (3.39)   80.75–95.22
Total          89.52 (4.66)   67.79–97.63



Table 3


Accuracy for each emotion by group

Expression        Adolescents                 Parents                      Professionals                Total
(N images)        Mean % (SD), Range          Mean % (SD), Range           Mean % (SD), Range           Mean % (SD), Range
Fear (N = 51)     94.18 (10.10), 50.00–100    95.09 (7.60), 58.33–100      97.53 (4.16), 83.33–100      95.44 (7.87), 50.00–100
Anger (N = 52)    90.68 (6.49), 73.08–100     94.97 (5.18), 80.77–100      93.66 (5.98), 76.92–100      93.26 (6.04), 73.08–100
Happy (N = 50)    99.44 (1.62), 92.31–100     99.72 (1.64), 88.46–100      99.89 (0.67), 96.15–100      99.67 (1.44), 88.46–100
Neutral (N = 56)  86.37 (12.53), 43.33–100    85.16 (12.44), 37.04–100     88.91 (7.14), 70.00–100      86.54 (11.34), 37.04–100
Sad (N = 48)      69.21 (13.50), 37.50–95.83  73.77 (10.29), 41.66–95.83   75.12 (11.63), 37.50–91.67   72.67 (11.90), 37.50–95.83


Table 4

Percentage agreement across raters for individual faces: percentage of images within each emotion for which at least 75%, at least 85%, or 100% of raters agreed with the previously reported classification

Agreement  Fear   Anger  Happy   Neutral  Sad
75%        96.08  90.38  100.00  80.36    60.42
85%        94.12  86.54  100.00  69.64    47.92
100%       17.65  23.08  86.00   8.93     6.25

Table 5

Overall emotion misattribution: number of times (N) each group chose each incorrect expression, and the percentage of that group's misattributions of the depicted emotion

Expression depicted  Expression chosen  Adolescents (N) %  Parents (N) %  Clinicians (N) %  Total (N) %
Fear                 Anger              (23) 34.33         (29) 41.43     (11) 52.38        (63) 39.87
Fear                 Neutral            (20) 29.85         (20) 28.57     (3) 14.28         (43) 27.22
Fear                 Sad                (12) 17.91         (17) 24.29     (6) 28.57         (35) 22.15
Fear                 Happy              (12) 17.91         (4) 5.71       (1) 4.76          (17) 10.76
Anger                Neutral            (48) 48.98         (36) 40.91     (28) 51.85        (112) 46.67
Anger                Sad                (38) 38.76         (34) 38.64     (22) 40.74        (94) 39.17
Anger                Fear               (11) 11.22         (16) 18.18     (4) 7.41          (31) 12.92
Anger                Happy              (1) 1.02           (2) 2.27       (0) 0             (3) 1.25
Happy                Anger              (4) 66.67          (0) 0          (1) 100           (5) 45.45
Happy                Sad                (1) 16.67          (2) 50.00      (0) 0             (3) 27.27
Happy                Fear               (1) 16.67          (1) 25.00      (0) 0             (2) 18.18
Happy                Neutral            (0) 0              (1) 25.00      (0) 0             (1) 9.09
Neutral              Anger              (77) 46.95         (131) 55.74    (52) 47.71        (260) 51.18
Neutral              Sad                (59) 35.96         (80) 34.04     (47) 43.12        (186) 36.61
Neutral              Happy              (18) 10.98         (19) 8.09      (7) 6.42          (44) 8.66
Neutral              Fear               (10) 6.10          (5) 2.13       (3) 2.75          (18) 3.54
Sad                  Anger              (145) 47.85        (242) 71.18    (125) 61.58       (512) 60.52
Sad                  Neutral            (99) 32.67         (56) 16.47     (36) 17.73        (191) 22.58
Sad                  Fear               (51) 16.83         (41) 12.10     (41) 20.20        (133) 15.72
Sad                  Happy              (8) 2.64           (1) 0.29       (1) 0.50          (10) 1.18


Table 6


Representativeness rating for each emotion by group

Expression        Adolescents                 Parents                      Professionals                Total
(N images)        Mean % (SD), Range          Mean % (SD), Range           Mean % (SD), Range           Mean % (SD), Range
Fear (N = 51)     69.92 (16.33), 25.71–100    69.69 (16.14), 26.43–99.29   73.62 (21.11), 49.64–96.07   70.80 (15.23), 25.71–100
Anger (N = 52)    73.26 (14.95), 28.85–100    76.03 (14.78), 24.62–100     74.06 (12.04), 52.69–97.69   74.63 (14.11), 24.61–100
Happy (N = 50)    89.15 (11.81), 58.85–100    91.11 (9.48), 68.08–100      92.66 (7.92), 70.77–100      90.89 (9.94), 58.85–100
Neutral (N = 56)  75.17 (14.09), 40.67–100    75.52 (13.84), 43.33–100     72.35 (9.52), 53.67–95.56    74.70 (12.92), 40.67–100
Sad (N = 48)      66.43 (15.04), 25.00–100    69.33 (15.16), 36.25–100     64.13 (12.27), 37.92–88.33   67.04 (14.47), 25.00–100

