
The trouble with validity: what is part of it and what is not?

Mirjana Knorr & Dietrich Klusmann

From psychometrics, we have learned that a good assessment should be objective, reliable and valid. Validation is the most troublesome aspect. In theory it is difficult to define the point at which we can claim that a test is valid, and in practice it is a struggle to find good measures for validation.

Hamburg, Germany

Correspondence: Mirjana Knorr, Institut für Biochemie und Molekulare Zellbiologie, Universitätsklinikum Hamburg-Eppendorf, Martinistrasse 52, N30, Hamburg 20246, Germany. Tel: 00 49 40 7410 58 279; E-mail: [email protected]
doi: 10.1111/medu.12738


Over decades, researchers in psychometrics have defined and refined the theoretical conception of validity. The latest development in validity theory is Kane’s argument-based approach to validation.1,2 From a practitioner’s point of view, Kane’s extensive treatise on validity is lengthy and at times repetitive in style, which makes it difficult to identify and structure his key assumptions. In this issue, Cook et al.3 summarise the important points of Kane’s framework with a focus on practical implications. As the authors3 point out, Kane1,2 frames validation as an evaluation process involving several steps. The starting point is a statement of the intended interpretations and uses of test scores (the claim), from which key assumptions are derived. These assumptions are then empirically tested.

We can find old and new aspects in Kane’s framework. Firstly, we are reminded to put the statement of clear hypotheses at the beginning of the validation process. However, Kane’s approach now considers the evaluation of consequences as part of validity. We wish to discuss both points critically in the light of education research.

Technically, the argument-based approach to validation takes us back to the very core of scientific research: the researcher will propose a priori hypotheses based on a theoretical background and will then collect evidence that will either support or refute these hypotheses.


Although this principle is fundamental in science, both Kane1,2 and Cook et al.3 repeatedly place emphasis on it in their writing. There seems to be a need to remind us that any validity analysis should begin with the statement of clear claims. In our view, current research on assessment does not suffer from a lack of claims per se. However, for pragmatic reasons we are often satisfied with less ambitious claims or even superficial conclusions.

In the context of admission, one of the main concerns is to show that a test has predictive value. Simply stated, the goal is to demonstrate to various stakeholders (primarily faculty staff and candidates) that the test works, and thereby to justify its use. The most pragmatic approach would be to take the first measure at hand. Thus, for example, performance in a multiple-choice test on prior academic knowledge might be correlated with performance measures in the first year of medical school. The underlying hypothesis will usually be rather vague (A is relevant for B and therefore A will correlate with B), but perfectly adequate for the purpose (to demonstrate that the test is relevant). In Kane’s1,2 terminology, this could be considered evidence for a less ambitious claim.
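To make concrete what such a pragmatic validation amounts to, the following minimal sketch (Python, with entirely hypothetical scores invented for illustration) reduces the claim ‘A is relevant for B and therefore A will correlate with B’ to a single coefficient:

```python
# Illustrative sketch only: hypothetical admission test scores and
# first-year performance measures for ten fictitious candidates.
from statistics import correlation  # Pearson's r; Python 3.10+

admission_scores = [62, 71, 55, 80, 68, 74, 59, 85, 66, 77]   # multiple-choice test
first_year_marks = [2.1, 2.7, 1.8, 3.4, 2.5, 2.9, 2.0, 3.6, 2.4, 3.1]

# The vague hypothesis "A is relevant for B, so A correlates with B"
# is tested with a single coefficient.
r = correlation(admission_scores, first_year_marks)
print(f"Pearson r = {r:.2f}")
```

A sizeable positive coefficient supports the modest claim that the test is relevant, but it says nothing about why the test predicts performance.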

In order to gain a better understanding of why a test in a specific form works, education research would benefit from more ambitious claims about the underlying theoretical constructs of a test. One such example concerns the validation of the perceptual ability test (PAT) based on claims derived from Ackerman’s theory of ability determinants of skilled performance.4 However, the demand for more ambitious claims is far from trivial. In the case of observational assessments such as multiple mini-interviews (MMIs), the measure is rather complex (multiple attributes, multiple contexts or stations, and multiple observers) and therefore requires many assumptions and, in consequence, much more detailed research.

The new aspect in Kane’s approach to validation involves considering validation as a process and explicitly including implications in the conception of validity. In their conclusion, Cook et al.3 even emphasise implications as the most important inference in the validity argument. We agree with the authors that implications are important for assessing the usefulness of a test. However, should implications be regarded as part of a validity argument? Implications are not a property of a test, but a property of its use. Implications may depend on the validity of a test, but the test’s validity does not depend on its implications.

One of the examples given in the article is a fictional test designed to establish ‘whether a resident is ready to operate on real patients under supervision’.3 Compared with a no-test situation, the administration of such a test may delay operating privileges and consequently burden those who have been rejected and delay their entrance into patient care. This may outweigh the benefits of a higher level of competence assured by the test. However, the validity of such a test does not depend on its use. It is enough to define what it means to ‘operate on real patients under supervision’, to conceive a measure for this variable and to determine the test’s sensitivity and specificity with respect to it. Whether the beneficial consequences of such a test outweigh its drawbacks depends on the situation at hand: is there a need to respond to demands quickly? Are there better ways to assure competence? And so on. Making such deliberations part of the description of a test would stretch the scope of investigation indefinitely. Of course, experiences with the uses of a test should be part of its description, but they are distinct from its psychometric properties.
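To illustrate how validity can be established without weighing consequences, here is a minimal sketch (entirely hypothetical counts, and assuming some independent criterion measure of readiness exists) of how the test’s sensitivity and specificity would be computed:

```python
# Hypothetical cross-tabulation of test decisions against an independent
# criterion measure of 'ready to operate on real patients under supervision'.
true_pos  = 40  # test passes, criterion judges ready
false_neg = 10  # test fails, criterion judges ready
true_neg  = 35  # test fails, criterion judges not ready
false_pos = 15  # test passes, criterion judges not ready

sensitivity = true_pos / (true_pos + false_neg)  # share of ready residents the test passes
specificity = true_neg / (true_neg + false_pos)  # share of not-ready residents the test fails
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

These two numbers characterise the test itself; whether the delays and rejections the test produces are acceptable remains a separate, situational judgement.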

The demand to include practical implications in the scope of scientific deliberation reminds us of the debate about the responsibility of physicists for the application of their science to, for example, nuclear weaponry. As in physics, there is no way to build a guarantee against misuse into psychometrics.

REFERENCES




1 Kane MT. Validation. In: Brennan RL, ed. Educational Measurement, 4th edn. Westport, CT: Praeger 2006;17–64.


2 Kane MT. Validating the interpretations and uses of test scores. J Educ Meas 2013;50 (1):1–73.

3 Cook DA, Brydges R, Ginsburg S, Hatala R. A contemporary approach to validity arguments: a practical guide to Kane’s framework. Med Educ 2015;49:560–75.

4 Gray SA, Deem LP. Predicting student performance in preclinical technique courses using the theory of ability determinants of skilled performance. J Dent Educ 2002;66 (6):721–6.

Manipulating practice variables to maximise learning

Joe Causer

The last few decades have seen considerable advances in knowledge and technologies in the medical domain. These advances include the development of high-fidelity simulators, which provide enhanced possibilities for medical educators, as well as potentially expediting trainee development. Hence, there is a requirement for a more structured, systematic and empirically informed approach to medical education in general, especially in order to maximise the potential of these simulation technologies.1 Identifying the critical characteristics associated with expert performance enables practitioners to design and implement training interventions that enhance the specific skills and adaptations that are key to expertise in medicine. Subsequently, this will improve the quality of patient treatment and reduce the costs associated with health care training.2

Liverpool, UK

Correspondence: Joe Causer, Expert Performance and Learning Unit, Research Institute of Sport and Exercise Sciences, Faculty of Science, Liverpool John Moores University, Liverpool L3 3AF, UK. Tel: 00 44 151 904 6242; E-mail: [email protected]
doi: 10.1111/medu.12735


However, how we structure these practice activities in order to ensure long-term learning is still subject to debate.

In the current issue of Medical Education, Haji et al.3 discuss how elaboration theory may provide a basis for a more appropriate, expertise-specific training structure that can expedite skill development. Elaboration theory posits that training should begin with a simplified version of the task and progress to more complex versions as the simple task is mastered.4 These ideas share similarities with research from the motor skills literature, namely the challenge point framework.5 The principal idea is that ‘tasks represent different challenges for performers of different abilities’.5 The long-term learning to be derived from a task of a given level of difficulty will vary based on the task environment, task complexity and the level of skill of the performer. Crucially, although increases in task difficulty may increase potential learning, they may also decrease performance. Therefore, an optimal challenge point occurs when learning is maximised and any performance detriments in practice are minimised.6 This raises challenges in how to differentiate the performance and learning of a task.
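As a purely numerical illustration of an optimal challenge point (the functional forms below are invented for this sketch and are not part of the framework itself), one can let potential learning rise with task difficulty while immediate performance falls, and search for the difficulty that best trades the two off:

```python
# Toy model: hypothetical curves, not derived from the challenge point framework.
def learning_benefit(difficulty: float, skill: float) -> float:
    # Potential learning grows with difficulty but saturates.
    return difficulty / (difficulty + skill)

def performance_cost(difficulty: float, skill: float) -> float:
    # Performance suffers once difficulty outstrips current skill.
    return max(0.0, difficulty - skill) ** 2 / 100

def net_value(difficulty: float, skill: float) -> float:
    return learning_benefit(difficulty, skill) - performance_cost(difficulty, skill)

skill = 5.0  # a more skilled performer shifts the optimum towards harder tasks
difficulties = [d / 10 for d in range(1, 201)]
optimum = max(difficulties, key=lambda d: net_value(d, skill))
print(f"optimal challenge point for skill {skill}: difficulty {optimum:.1f}")
```

In such a model the optimum sits above the performer’s current skill but not far above it, which mirrors the framework’s claim that tasks should be difficult enough to drive learning without overwhelming performance.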

With regard to elaboration theory, there are certain considerations around the complexity of the initial task and what information is removed in order to simplify it. Indeed, the removal or delay of certain information has been shown to significantly decrease diagnostic accuracy in emergency medicine scenarios.7 Furthermore, there is evidence that practice of simple tasks does not develop the same memory structures, attentional adaptations and fundamental knowledge base as practice of the fully complex task with real-world contextual information and emotions.2 It is also critical that there is a simultaneous development of procedural skills and conceptual knowledge, rather than an over-reliance on isolated education, which may occur if a task is overly simplified. Other practice-related factors are also important to consider, such as practice schedule, feedback and instruction, all of which can significantly


