teaching effectiveness

How do medical students form impressions of the effectiveness of classroom teachers?

Luke Rannelli, Sylvain Coderre, Michael Paget, Wayne Woloschuk, Bruce Wright & Kevin McLaughlin

CONTEXT Teaching effectiveness ratings (TERs) are used to provide feedback to teachers on their performance and to guide decisions on academic promotion. However, exactly how raters make decisions on teaching effectiveness is unclear. OBJECTIVES The objectives of this study were to identify variables that medical students appraise when rating the effectiveness of a classroom teacher, and to explore whether the relationships among these variables and TERs are modified by the physical attractiveness of the teacher. METHODS We asked 48 Year 1 medical students to listen to 2-minute audio clips of 10 teachers and to describe their impressions of these teachers and rate their teaching effectiveness. During each clip, we displayed either an attractive or an unattractive photograph of an unrelated third party. We used qualitative analysis followed by factor analysis to identify the principal components of teaching effectiveness, and multiple linear regression to study the associations among these components, type of photograph displayed, and TER.

RESULTS We identified two principal components of teaching effectiveness: charisma and intellect. There was no association between rating of intellect and TER. Rating of charisma and the display of an attractive photograph were both positively associated with TER and a significant interaction between these two variables was apparent (p < 0.001). The regression coefficient for the association between charisma and TER was 0.26 (95% confidence interval [CI] 0.10–0.41) when an attractive picture was displayed and 0.83 (95% CI 0.66–1.00) when an unattractive picture was displayed (p < 0.001). CONCLUSIONS When medical students rate classroom teachers, they consider the degree to which the teacher is charismatic, although the relationship between this attribute and TER appears to be modified by the perceived physical attractiveness of the teacher. Further studies are needed to identify other variables that may influence subjective ratings of teaching effectiveness and to evaluate alternative strategies for rating teaching effectiveness.

Medical Education 2014; 48: 831–837 doi: 10.1111/medu.12420

Office of Undergraduate Medical Education, University of Calgary, Calgary, Alberta, Canada

Correspondence: Dr Kevin McLaughlin, Office of Undergraduate Medical Education, University of Calgary, 3330 Hospital Drive, Calgary, Alberta T2N 4N1, Canada. Tel: 00 1 403 220 4252; E-mail: [email protected]

© 2014 John Wiley & Sons Ltd. MEDICAL EDUCATION 2014; 48: 831–837


INTRODUCTION

Most medical schools have a process for gathering and disseminating data on the effectiveness of teachers; in North American schools, this is an explicit accreditation standard (ED-47: ‘In evaluating program quality, a medical education program must consider medical student evaluations of their courses, clerkships [or, in Canada, clerkship rotations], and teachers, as well as a variety of other measures’1). In addition to meeting this accreditation standard, teaching effectiveness ratings (TERs) are used to provide feedback to teachers, and to facilitate decisions on academic promotion and the allocation of teaching awards.

How do we decide on the effectiveness of a teacher? Which attributes are considered and how are these attributes integrated? Prior to studies on teaching effectiveness in medical education, reports in the psychology literature suggested that certain qualities of a teacher, such as ‘skill’, ‘feedback’, ‘interaction’ and ‘rapport’ (or ‘warmth’), were associated with better achievement in learners.2,3 Unfortunately, the findings across different studies were inconsistent and the effects of these variables depended upon learner characteristics (e.g. ability and gender) and the learning outcome targeted (e.g. changes in knowledge, skills or attitudes).3

With the notable exception of a study by Anderson et al.,4 in which students at the teaching hospital rated most highly for the quality of teaching outperformed students at other hospitals, most studies on teaching effectiveness in medical education have considered learners’ subjective ratings of teaching effectiveness as the outcome variable, and have used qualitative analysis to identify attributes of effective teachers.5,6 The results from this body of literature suggest that, among other things, teachers should be knowledgeable,7,8 value teaching,7 communicate effectively with learners,7,9 demonstrate concern for the well-being of learners,7,9 provide effective feedback8,9 and inspire learners.9 Indeed, when reviewing the medical education literature on what makes a good clinical teacher, Sutkin and colleagues5 identified almost 500 descriptors of effective teachers, the majority of which referred to noncognitive attributes. From this we can infer that although there are many potential qualities of an effective teacher, the construct of teaching effectiveness lacks a precise definition.5,6

To further complicate this issue, data from the non-medical education literature suggest that ratings may also be influenced by attributes that are not directly related to teaching performance, such as physical attractiveness.10–13 These variables may affect ratings subconsciously, in which case they are unlikely to be identified by qualitative analysis.14,15


In this study, we asked Year 1 medical students to listen to audio clips of teachers they had not previously encountered, after which they described the attributes of each teacher and provided a rating of his or her teaching effectiveness. While our participants were listening to the audio clips, we projected photographs of unrelated individuals whom we considered to be either attractive or unattractive. Our first objective was to identify the principal components of teaching effectiveness. Our second objective was to explore whether the relationship between these components and TER is modified by physical attractiveness. We predicted that if physical attractiveness impacts TER, there should be an independent association between the type of picture displayed and TER, or an interaction between ratings of the principal components of teaching effectiveness and TER.

METHODS

Participants

Forty-eight (of 180) Year 1 students from the class of 2014 at the University of Calgary participated in this study. This institution runs a 3-year undergraduate curriculum, the first 2 years of which comprise seven systems-based courses. The third year represents a clinical clerkship. Learning experiences in the first 2 years include a combination of didactic and small-group teaching, in addition to clinical skills teaching using simulators, standardised patients and real patients. The students in this study had just entered medical school and had not previously encountered the teachers they were asked to rate. We obtained ethics approval from the University of Calgary Conjoint Research Ethics Board, in addition to written informed consent from participating teachers and students prior to the start of the study.

Materials

During the 2008/2009 academic year, teachers delivering didactic sessions to the class of 2011 were asked if their sessions could be recorded and made available as podcasts. We contacted 10 of these teachers to discuss the objectives of our study and asked for permission to create an anonymous audio clip of their teaching from the podcasts. We created 2-minute clips of each teacher’s first didactic session to the class of 2011, beginning from the time the teacher first began speaking. We removed any identifying content, such as when the teacher introduced him- or herself, and randomly ordered the clips on a CD.

As part of our educational quality assurance programme, at the end of each systems-based course we ask all students to provide TERs for all teachers who taught on this course. This online assessment includes a 5-point global rating scale (GRS) for the overall rating of teaching effectiveness, on which scores of 1 to 5 represent ‘unacceptable’, ‘below expectations’, ‘good’, ‘very good’ and ‘outstanding’, respectively. From the class of 2011, five of our teachers achieved a mean TER of > 4 on the GRS and the remaining five teachers obtained a mean rating of < 3.

To allow us to manipulate the variable ‘physical attractiveness’ in relation to our teachers, we asked six colleagues (three women and three men) to view a series of photographs (of unknown individuals not affiliated with our medical school) and to rate these as either attractive or unattractive. Photographs for which we did not achieve consensus were discarded. From those on which we had achieved consensus, we selected the five considered to be the most attractive and most unattractive, respectively. We revised our final selection to ensure that the gender mix of our photograph collection matched that of our teachers.

Procedure

This was a mixed-methods, cross-sectional study in which we randomly allocated attractive or unattractive photographs to the audio clips of our male and female teachers. Our participants listened to 10 audio clips, each lasting 2 minutes, during which we displayed a photograph on a large projection screen. We did not tell the participants that the photograph corresponded to the teacher on the audio clip, and gave no instruction on whether they should incorporate their impression of the photograph into the TER.
After each audio clip, our participants were given 30 seconds to describe their impression of the teacher on the audio clip and to provide a TER using the GRS used by the class of 2011 students to rate teaching effectiveness at the end of each systems-based course. Students rated each teacher independently and were instructed not to communicate with fellow students during the study.

Statistical analysis

We used Pearson’s correlation coefficient to evaluate the correlation between the audio clip TER (class of 2014) and end-of-course TER (class of 2011) for each preceptor.

In our qualitative analysis, we considered the adjectives used to describe the teachers as our inductive codes. We then categorised the codes and used Spradley’s universal semantic relationship to describe the relationships among codes and categories.16 Two investigators (LR and KM) came to a consensus on dividing the codes into categories, and the relationships among codes, categories and the outcome of effective teaching. We then generated a score for each teacher on each category identified in the qualitative analysis. For example, whenever a teacher was described as approachable, caring, friendly, interested, understanding or warm, he or she was given one point for the category of ‘caring’. Conversely, one point was subtracted from this category each time the teacher was described as either arrogant or sarcastic.

Prior to performing factor analysis, we used the Kaiser–Meyer–Olkin (KMO) test to assess the appropriateness of performing this analysis on these data (i.e. to ensure a KMO statistic of > 0.5).17 We performed exploratory factor analysis on the categorical variables by first creating a Pearson product–moment correlation matrix for these categories and then using principal component analysis to extract factors. We used a cut-off threshold for factor extraction of eigenvalues of ≥ 1 (Kaiser rule).
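The category scoring rule described above (one point added for each positive descriptor, one point subtracted for each negative descriptor) can be illustrated with a minimal sketch. Only the 'caring' category is spelled out in the text, so only that category is coded here; the names `CARING_POSITIVE`, `CARING_NEGATIVE` and `caring_score`, and the example descriptors, are hypothetical and are not taken from the study.

```python
# Hypothetical sketch of the category scoring rule for 'caring'.
# The descriptor lists follow the example given in the text; the
# function and variable names are illustrative only.
CARING_POSITIVE = {"approachable", "caring", "friendly",
                   "interested", "understanding", "warm"}
CARING_NEGATIVE = {"arrogant", "sarcastic"}


def caring_score(descriptors):
    """Return +1 per positive and -1 per negative 'caring' descriptor."""
    score = 0
    for word in descriptors:
        word = word.strip().lower()
        if word in CARING_POSITIVE:
            score += 1
        elif word in CARING_NEGATIVE:
            score -= 1
    return score


# Invented descriptors for one teacher; 'organised' belongs to another
# category and therefore does not change the 'caring' score.
example = ["Warm", "friendly", "sarcastic", "organised"]
print(caring_score(example))  # 1 + 1 - 1 + 0 = 1
```

Under this reading, descriptors that belong to other categories leave the 'caring' score unchanged, and the same pattern would be repeated for each of the remaining categories.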
We then performed factor loading on extracted factors, followed by factor rotation using the Varimax method with Kaiser normalisation.18 Finally, to ensure that the items with the highest loadings had the largest effects on the factor score, we created a weighted sum score for the principal components of teaching effectiveness by combining the rating scale items that loaded on this component after multiplying each by its factor score.19 To identify the variables associated with TER, we used multiple linear regression where our outcome variable was audio clip TER, and our explanatory variables were the weighted scores for the principal components of teaching effectiveness and the type of photograph displayed (attractive versus unattractive). We considered interaction between explanatory variables in our regression model and used backward elimination to remove variables, beginning with interaction terms. If we found a significant interaction, we reported the stratified analyses. We performed our statistical analyses using STATA Version 11.0 (StataCorp LP, College Station, TX, USA).
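As a rough illustration of the weighted sum score, the sketch below multiplies each category score by a weight and adds the products. It assumes that the weights are the Factor 1 loadings reported in Table 1; the text says only that each item was multiplied by 'its factor score', so the exact weights used in the study may differ. The category scores and the names `CHARISMA_WEIGHTS` and `weighted_sum_score` are invented for the example.

```python
# Assumption: the weights are the Factor 1 (charisma) loadings from Table 1.
CHARISMA_WEIGHTS = {
    "caring": 0.47,
    "engaging": 0.63,
    "entertaining": 0.55,
    "organised": 0.52,
    "confident": 0.47,
}


def weighted_sum_score(category_scores, weights):
    """Sum each category score multiplied by its weight."""
    return sum(weights[name] * category_scores.get(name, 0)
               for name in weights)


# Invented category scores for one rater and one teacher.
scores = {"caring": 1, "engaging": 2, "entertaining": 0,
          "organised": -1, "confident": 1}
charisma = weighted_sum_score(scores, CHARISMA_WEIGHTS)
print(round(charisma, 2))  # 0.47 + 1.26 + 0 - 0.52 + 0.47 = 1.68
```

Weighting in this way lets categories with higher loadings, such as 'engaging', contribute more to the component score than categories with lower loadings.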


RESULTS

Association between audio clip TER and end-of-course TER

The mean ± standard deviation (SD) GRS rating of teaching performance based on audio clips was 3.56 ± 0.65. The mean ± SD end-of-course rating for the same teachers was 3.74 ± 0.92. There was a significant correlation between audio clip and end-of-course ratings of the same teacher (r = 0.78, p = 0.01).

Categories of teaching attributes

In our qualitative analysis, we identified 30 separate codes, which we grouped into six categories: caring; engaging; entertaining; organised; confident, and knowledgeable. We were unable to categorise three codes (‘young’, ‘business-like’ and ‘well-spoken’), each of which was used by a single rater. We grouped these codes into a category named other, and did not include this category in our subsequent analyses in view of the infrequent use of these codes and their unclear relationship to teaching effectiveness. The semantic relationships among codes, categories and the outcome of effective teaching are shown in Fig. 1.

[Figure 1: diagram with three columns (Code, Category, Outcome) and a key giving the relations ‘is an attribute of’ and ‘contributes to’. Codes, in the order shown: Approachable, Caring, Friendly, Interested, Understanding, Warm, Arrogant, Sarcastic, Dynamic, Engaging, Enthusiastic, Interactive, Boring, Monotonous, Fun, Funny, Humourous, Clear, Organised, Rushed, Unorganised, Confident, Happy, Relaxed, Nervous, Quiet, Unsure, Knowledgeable, Young, Business-like, Well-spoken. Categories: Caring, Engaging, Entertaining, Organised, Confident, Knowledgeable, Other. Outcome: Effective teaching.]

Figure 1 Semantic relationships among codes, categories and outcome. Italicised codes are negatively associated with the corresponding category

Principal components of teaching effectiveness

The KMO statistic for our factor analysis was 0.60. We identified two principal components with eigenvalues of 1.46 and 1.04, respectively. Five categories of teaching attributes loaded on the first component; based upon this loading pattern, we felt that this component was synonymous with charisma. Based upon the loading of a single category, ‘knowledgeable’, we interpreted the second component as ‘intellect’. The factor loading pattern is shown in Table 1.

Variables associated with audio clip TER

There was no interaction between intellect and the other explanatory variables, and no independent association between intellect and TER (p = 0.5). The display of an attractive photograph (regression coefficient [b] = 0.70, 95% confidence interval [CI] 0.60–0.79; p < 0.001) and rating of charisma (b = 0.90, 95% CI 0.79–1.02; p < 0.001) were both positively associated with TER, and there was a significant interaction between these two variables (p < 0.001). The regression coefficient for the association between charisma and TER was 0.26 (95% CI 0.10–0.41; p < 0.001) when an attractive photograph was displayed and 0.83 (95% CI 0.66–1.00; p < 0.001) when an unattractive photograph was displayed. The relationships among photograph type, rating of charisma and TER are shown in Fig. 2.
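To make the stratified coefficients concrete, the following sketch fits a simple least-squares line of TER on charisma within each photograph stratum. The data are synthetic and noise-free: the slopes 0.26 and 0.83 are the reported stratified coefficients, whereas the intercepts (3.9 and 3.2), the charisma values and the function name `ols_slope` are invented for illustration. It also assumes a dummy coding of photograph type (1 = attractive, 0 = unattractive), which the text does not specify.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data: reported slopes, invented intercepts and charisma values.
charisma = [-1.0, -0.5, 0.0, 0.5, 1.0]
ter_attractive = [3.9 + 0.26 * c for c in charisma]
ter_unattractive = [3.2 + 0.83 * c for c in charisma]

slope_attractive = ols_slope(charisma, ter_attractive)
slope_unattractive = ols_slope(charisma, ter_unattractive)

# Under the assumed dummy coding (1 = attractive, 0 = unattractive), the
# interaction coefficient is the difference between the two stratum slopes.
interaction = slope_attractive - slope_unattractive
print(round(slope_attractive, 2), round(slope_unattractive, 2),
      round(interaction, 2))  # 0.26 0.83 -0.57
```

Read this way, a one-unit increase in charisma rating corresponds to a TER increase of about 0.26 when an attractive photograph is shown and about 0.83 when an unattractive photograph is shown, a difference in slope of about 0.57.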

Table 1 Factor loading on the principal components of effective teaching

Category         Factor 1: charisma    Factor 2: intellect
Caring           0.47
Engaging         0.63
Entertaining     0.55
Organised        0.52
Confident        0.47
Knowledgeable                          0.78

[Figure 2: two panels, ‘Attractive picture displayed’ and ‘Unattractive picture displayed’, each plotting teaching effectiveness rating (TER; y-axis, 2.5 to 4) against charisma rating (x-axis, –1 to 1).]

Figure 2 Relationships among the type of picture displayed, charisma and the audio clip-based rating of teaching effectiveness

DISCUSSION

At the University of Calgary, as at most medical schools, our evaluation of the effectiveness of our teachers is based solely upon the subjective ratings of students. Although this rating process meets the standard for accreditation,1 and is consistent with our learner-centred philosophy, we have a limited understanding of which variables medical students attend to when rating their teachers and, consequently, which attributes we are rewarding when we allocate teaching awards and make promotion decisions based upon these ratings.

In this study, our goal was to identify variables that impact medical students’ ratings of teaching effectiveness. Recognising that some variables may impact ratings subconsciously14,15 – and are, therefore, unlikely to be volunteered by raters – we combined qualitative analysis with a manipulation of one variable that might affect ratings subconsciously: physical attractiveness.10–13

When we compared TERs based upon brief audio clips of our 10 teachers with end-of-course ratings of the same teachers (given by a previous class), we found that approximately 60% of the variance (i.e. the R² value) in end-of-course ratings could be explained by the students’ ratings of the first 2 minutes of teaching, which is consistent with the findings of prior studies suggesting that the initial impression of a teacher is durable.20,21

Based upon our participants’ ratings of brief audio clips, we identified two principal components of teaching effectiveness: charisma and intellect. Of these, only the rating of charisma was associated with TER. The effect of charisma rating on TER was, however, modified by the type of photograph displayed while the participant rated the teacher on the audio clip. For teachers lacking charisma, an attractive photograph attenuated the negative effect of a low charisma rating on TER, whereas the most charismatic teachers were able to achieve high TER despite the presence of an unattractive photograph. As our participants were given no instructions on incorporating their impression of the photograph into their ratings of teaching effectiveness, and no descriptions of physical appearance were identified on qualitative analysis, our results suggest that physical attractiveness exerts its effect on TER subconsciously.
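The 'approximately 60%' figure quoted in this discussion can be checked by squaring the reported correlation of r = 0.78. The short sketch below does this and also computes the usual t statistic for a Pearson correlation. It assumes that the correlation was calculated across the 10 teachers (n = 10), which the text implies but does not state explicitly.

```python
import math

r = 0.78   # reported correlation, audio clip versus end-of-course TER
n = 10     # assumption: one pair of mean ratings per teacher

r_squared = r ** 2
# t statistic for testing r against zero, with n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(round(r_squared, 2))  # 0.61
print(round(t_stat, 2))     # 3.53
```

Under that assumption, the t statistic of about 3.53 on 8 degrees of freedom exceeds the two-sided 1% critical value of roughly 3.36, which is in line with the reported p = 0.01.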
Although it seems logical that the quality of the interaction between a teacher and learner (which we believe is captured in the principal component of charisma) should impact TER, it is unclear why learners rate teachers both consciously and subconsciously, or why physical attractiveness should impact TER. Our results are, however, consistent with findings from the psychology literature on impression formation, which suggest that we use dual processing to rate individuals whom we encounter in social settings.22,23 In a proposed model of dual processing, Brewer suggests that ratings begin with a priori expectations based upon stereotypes (e.g. age, gender and other data that are conveyed by physical appearance).23 As raters then appraise the individual who performs the task to be rated, they may modify their ratings based upon these behaviours. The duality of processing may be explained by the existence of two types of memory: implicit and explicit.24,25 In implicit memory we store a large sample of our prior experiences, which allows us to form average expectations of any given environment or context, and which can be accessed rapidly and subconsciously.14,15 By contrast, explicit memory contains symbolic representations of knowledge, typically in the form of rules, which we can apply consciously when rating appraised behaviours, assuming that we have sufficient time and motivation to do so.14,15 As for the a priori expectation that physically attractive teachers are better teachers, it is unclear whether this consistent finding reflects a systematic rater bias, such as the halo effect, or whether encountering a physically attractive teacher does actually improve learning by, for example, making learners more attentive or motivating them to perform well.10–13,26,27

Our study has some important limitations that we should acknowledge. Our experimental set-up was somewhat artificial as we asked our participants to rate teaching effectiveness after a brief exposure to visual and verbal data, rather than after encountering a teacher trying to explain a difficult concept (in which case the impact of intellect on TER might have been stronger). However, the strength of the association between ratings of teachers in this setting and ratings delivered after 3 months of teaching exposure suggests that TERs based upon audio clips do have predictive validity. Our study was conducted in a single centre and included a relatively small number of participants, all of whom were novice learners. In addition, we considered only subjective ratings of teaching effectiveness rather than looking at the learning outcomes of students, and limited our TERs to those in the classroom setting. These factors limit our ability to generalise our findings to other groups of learners and other learning environments.
Further, when studying the impact of variables that might subconsciously impact TER, we considered only physical attractiveness. There are several other potential variables, such as gender, race and accent, that may also impact TER and that should be explored in future studies.

Implications for medical education

The findings of this study suggest that medical students use dual processing to rate the effectiveness of classroom teachers, and that there is an interaction between the conscious appraisal of teaching attributes – specifically, the perceived charisma of the teacher – and the subconscious rating of variables that portray stereotypes, such as physical appearance. Important questions arise from these findings, including whether we are rewarding teachers for physical attractiveness when we allocate teaching awards and make promotion decisions based upon subjective TERs. Further studies are needed to explore which variables influence subjective ratings of teaching effectiveness and to evaluate alternative strategies for rating teaching effectiveness.

Contributors: LR contributed to the study design and data acquisition. SC, MP, WW and BW contributed to the study design. KM contributed to the conception and design of the study, and to data collection and analysis, and drafted the manuscript. All authors contributed to the revision of the paper and approved the final manuscript for publication.
Acknowledgements: the authors wish to thank Shaqil Peermohamed for his involvement in the pilot study that preceded this study, and Shirley Marsh, Sybil Tai, and Melanie Stopper for their help in selecting photographs used in the study.
Funding: none.
Conflicts of interest: none.
Ethical approval: this study was approved by the University of Calgary Conjoint Research Ethics Board and written informed consent was obtained from participating teachers and students prior to the start of the study.

REFERENCES

1 Liaison Committee on Medical Education. Functions and structure of a medical school. Washington, DC: LCME. https://www.lcme.org/publications/functions.pdf. [Accessed 9 August 2013.]
2 Isaacson RL, McKeachie WJ, Milholland JE, Lin YG, Hofeller M, Baerwaldt JW, Zinn KL. Dimensions of student evaluations of teaching. J Educ Psychol 1964;55:344–51.
3 McKeachie WJ, Lin YG, Mann W. Student ratings of teacher effectiveness: validity studies. Am Educ Res J 1971;8:435–45.
4 Anderson DB, Harris IB, Allen S, Satran L, Bland CJ, Davis-Feickert JA, Poland GA, Miller WJ. Comparing students’ feedback about clinical instruction with their performances. Acad Med 1991;66:29–34.
5 Sutkin G, Wagner E, Harris I, Schiffer R. What makes a good clinical teacher in medicine? A review of literature. Acad Med 2008;83:452–66.
6 Berk RA. Top five flashpoints in the assessment of teaching effectiveness. Med Teach 2013;35:15–26.
7 Silber C, Novielli K, Paskin D, Brigham T, Kairys J, Kane G, Veloski J. Use of critical incidents to develop a rating form for resident evaluation of faculty teaching. Med Educ 2006;40:1201–8.
8 Torre DM, Simpson D, Sebastian JL, Elnicki DM. Learning/feedback activities and high-quality teaching: perceptions of third-year medical students during an inpatient rotation. Acad Med 2005;80:950–4.
9 Boendermaker PM, Schuling J, Meyboom-de-Jong BM, Zwierstra RP, Metz JC. What are the characteristics of the competent general practitioner trainer? Fam Pract 2000;17:547–53.
10 Eagly AH, Ashmore RD, Makhijiani MD, Longo LC. What is beautiful is good, but: a meta-analytic review of research on the physical attractiveness stereotype. Psychol Bull 1991;110:109–28.
11 Freng S, Webber D. Turning up the heat on online teaching evaluation: does ‘hotness’ matter? Teach Psychol 2009;36:189–93.
12 Hosoda M, Stone-Romero EF, Coats G. The effects of physical attractiveness on job-related outcomes: a meta-analysis of experimental studies. Pers Psychol 2003;56:431–62.
13 Shevlin M, Banyard P, Davies M, Griffiths M. The validity of student evaluation of teaching in higher education: love me, love my lectures? Assess Eval Higher Educ 2000;25:397–405.
14 Sloman SA. The empirical case for two systems of reasoning. Psychol Bull 1996;119:3–22.
15 Kahneman D. Thinking Fast and Slow. Toronto, ON: Doubleday 2011.
16 Parfitt BA. Using Spradley: an ethnosemantic approach to research. J Adv Nurs 1996;24:341–9.
17 Kaiser HF, Rice J. Little jiffy, mark IV. Educ Psychol Meas 1974;34:111–7.
18 Kerlinger FN, Lee HB. Foundations of Behavioral Research, 4th edn. Toronto, ON: Nelson Thomson Learning 2000.
19 Di Stefano C, Zhu M, Mîndrilă D. Understanding and using factor scores: considerations for the applied researcher. Pract Assess Res Eval 2009;14:1–11.
20 Ambady N, Rosenthal R. Thin slices of expressive behaviour as predictors of interpersonal consequences: a meta-analysis. Psychol Bull 1992;111:256–74.
21 Ambady N, Rosenthal R. Half a minute: predicting teacher evaluations from thin slices of non-verbal behaviour and physical attractiveness. J Pers Soc Psychol 1993;64:431–41.
22 Asch SE, Zukier H. Thinking about persons. J Pers Soc Psychol 1984;46:1230–40.
23 Brewer MB. A dual process model of impression formation. In: Srull TK, Wyer RS, eds. Advances in Social Cognition. Hillsdale, NJ: Lawrence Erlbaum Associates 1988; 1–36.
24 Smith ER, DeCoster J. Dual-process models in social and cognitive psychology: conceptual integration and links to underlying memory systems. Pers Soc Psychol Rev 2000;4:108–31.
25 Sherry DF, Schacter DL. The evolution of multiple memory systems. Psychol Rev 1987;94:439–54.
26 Thorndike EL. A constant error on psychological rating. J Appl Psychol 1920;4:25–9.
27 Hamermesh DS, Parker A. Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity. Econ Educ Rev 2005;24:369–76.

Received 7 May 2013; editorial comments to author 28 June 2013; accepted for publication 19 December 2013

