Surg Endosc (Surgical Endoscopy and Other Interventional Techniques)
DOI 10.1007/s00464-014-3975-y

Development and validation of a theoretical test of proficiency for video-assisted thoracoscopic surgery (VATS) lobectomy

Mona Meral Savran • Henrik Jessen Hansen • René Horsleben Petersen • William Walker • Thomas Schmid • Signe Rolskov Bojsen • Lars Konge



Received: 18 August 2014 / Accepted: 24 October 2014
© Springer Science+Business Media New York 2014

Abstract

Background Testing stimulates learning, improves long-term retention, and promotes technical performance. No purpose-orientated test of competence in the theoretical aspects of VATS lobectomy has previously been presented. The purpose of this study was, therefore, to develop and gather validity evidence for a theoretical test on VATS lobectomy consisting of multiple-choice questions.

Methods Four European VATS lobectomy experts were interviewed to explore their views on important theoretical VATS lobectomy knowledge (step 1). This information was used to construct the test items in compliance with existing guidelines for multiple-choice questions (step 2). The experts rated the relevance of the items to confirm content validity in a modified Delphi approach (step 3). Finally, the test was administered to physicians, who were categorised into different experience levels based on their experience in VATS procedures overall and in VATS lobectomies specifically. Their answers were used to gather evidence of construct validity (step 4).

Results Initially, 81 items were constructed, and two Delphi iterations reduced the test to 50 items. Item analysis led to the exclusion of 19 items, and the mean discrimination index of the 31 final items was 0.26. Cronbach's alpha for internal consistency was 0.75. The mean item difficulty was 0.63. When the participants were grouped by performed VATS procedures, the group performances differed significantly (p = 0.002), and the experts performed significantly better than the novices (p < 0.001) and the intermediates (p = 0.01). When the participants were grouped by performed VATS lobectomies, the group performances also differed significantly. In this category, the experts were also significantly better than the novices (p < 0.001), the trainees (p = 0.002), and the intermediates (p = 0.01).

Conclusions This study led to the development of a theoretical test on VATS lobectomy consisting of multiple-choice questions. Both content and construct validity evidence were established.

Keywords Video-assisted thoracoscopic surgery lobectomy · Education · Theoretical test · Validity

M. M. Savran (✉) · S. R. Bojsen · L. Konge
Centre for Clinical Education, University of Copenhagen and the Capital Region of Denmark, Blegdamsvej 9, 2100 Copenhagen, Denmark
e-mail: [email protected]

H. J. Hansen · R. H. Petersen
Department of Cardiothoracic Surgery, Rigshospitalet, University of Copenhagen, Copenhagen, Denmark

W. Walker
Department of Cardiothoracic Surgery, Royal Infirmary of Edinburgh, Edinburgh, Scotland

T. Schmid
Department of Visceral, Transplant and Thoracic Surgery, Centre of Operative Medicine, Innsbruck, Austria

Lung cancer is the leading cause of cancer-related death worldwide, with about 1.18 million deaths per year, which underlines the importance of appropriate lung cancer treatment [1]. In the management of non-small cell lung cancer (NSCLC), video-assisted thoracoscopic surgery (VATS) lobectomy is emerging as a substitute for the conventional thoracotomic approach [2–7]. The 2013 guidelines from the American College of Chest Physicians (ACCP) support this development: they acknowledge the preference for VATS lobectomy over thoracotomy and advocate this minimally invasive procedure for clinical stage I NSCLC [8]. In addition, the International Society for Minimally Invasive Cardiothoracic Surgery (ISMICS) recommends VATS lobectomy for the surgical management of stage II NSCLC [9].

Compared with thoracotomy, VATS lobectomy has several potential advantages: reduced postoperative thoracic pain and analgesic needs; shorter chest tube duration; shorter hospitalisation, which also makes the procedure more economical; lower overall complication rates, including a minimal risk of intraoperative bleeding, which indicates the safety of this approach; a decreased postoperative inflammatory response, measured as cytokine release; increased chemotherapy tolerance; better postoperative preservation of pulmonary function; and superior quality-of-life scores and physical functioning with earlier return to preoperative activities [2, 6, 7, 9–14]. Most strikingly, a meta-analysis indicates potentially improved 5-year mortality rates after VATS lobectomy compared with open lobectomy in patients with NSCLC [5].

Several articles emphasise the pivotal role of achieving the necessary technical competence to manage this technique [15–23]. Currently, no studies have explored the acquisition of theoretical knowledge for VATS lobectomy, although it has been shown that cognitive skills gained through theoretical testing support technical performance [24]. Besides functioning as an assessment tool, theoretical testing stimulates learning and improves information retention (the 'testing effect') [25, 26]. Other benefits are encouragement of self-study, superior long-term retention compared with studying alone, and emphasis on the main curriculum [27, 28].

In the process of test development, decisions regarding type, cost-effectiveness, reliability, format, content, and validity need to be made [28–30]. Compared with other assessment types, written tests are more cost-effective and reliable [31]. Written tests can have various formats, but multiple-choice questions (MCQs) are the most suitable because an MCQ test can assess a large swathe of the curriculum, can be used repeatedly, has high reliability, and requires less time to answer and score [28–31]. When well constructed, the central asset of MCQs is that they oblige test takers to apply knowledge [28].

At the moment, no validated theoretical assessment tool for VATS lobectomy exists, although the current literature reports insecurity and a lack of adequate proficiency in this field among thoracic surgeons [2, 5, 15, 32, 33]. The objective of this study was therefore to develop and gather validity evidence for a theoretical test in VATS lobectomy consisting of MCQs.


Materials and methods

This study, which drew on several earlier articles, involved two parts: MCQ development and validity evidence gathering [29, 34–38]. The study was divided into four steps (Fig. 1).

Step 1: expert interviews

Informal conversational interviews were conducted by MMS with four European experts in VATS lobectomy (RHP, HJH, WW, and TS) from three different centres in three countries (Denmark, United Kingdom, and Austria). The interviews were recorded as written notes and were guided by the following question: "What is your opinion on relevant theoretical knowledge in VATS lobectomy targeting thoracic surgeons?"

Step 2: development of multiple-choice questions

Following the guidance of Case and Swanson, Downing, and Haladyna, the MCQs were written after the four interviews [28, 30, 39]. We formed the items as one-best-answer items, each containing a stem with the information needed to answer the question, followed by a lead-in question and three options (a, b, and c). Only one option was the most correct; the other two were less correct, although they were content related. The decision to construct MCQs with only three options was supported by research [39].

Step 3: content validity

Subsequently, the MCQ test was handed out to the four experts, who rated the items in a Delphi-like process on a scale from 1 (completely irrelevant) to 5 (extremely relevant). Each Delphi-like iteration ended with scrutiny and collation of the expert judgements, followed by a new iteration until unanimity was reached. Exclusion criteria for the test items were defined as follows: an item rated "1" was discarded, and an item with a mean rating below "4" was redistributed to the experts for re-evaluation. In the first round, the experts were also asked to edit the wording for clarity.
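The exclusion rule in step 3 can be illustrated with a short sketch. It assumes that a single rating of 1 from any expert discards an item (the text does not say whether a single rating or the mean was meant) and that items with a mean of 4 or more are retained. The function name, item labels, and ratings are hypothetical and are not data from the study.

```python
from statistics import mean


def triage_item(ratings):
    """Classify one item from its expert relevance ratings (1-5).

    Assumptions (not stated in this detail in the paper):
    - any single rating of 1 discards the item;
    - a mean rating below 4 sends the item back for re-evaluation;
    - otherwise the item is retained.
    """
    if any(r == 1 for r in ratings):
        return "discard"
    if mean(ratings) < 4:
        return "re-evaluate"
    return "retain"


# Hypothetical ratings from four experts for three items.
example_ratings = {
    "item_01": [5, 4, 5, 4],
    "item_02": [3, 4, 4, 3],
    "item_03": [1, 4, 5, 5],
}

decisions = {item: triage_item(r) for item, r in example_ratings.items()}
print(decisions)
```

In a real Delphi round the "re-evaluate" items would be sent back to the experts with the collated judgements, and the loop would repeat until the panel agreed.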
Fig. 1 The four steps of developing and gathering validity evidence for an MCQ test. The outcomes of this study are shown to the right

Step 4: construct validity

In the final step, the MCQ test was investigated for its ability to discriminate between groups with different levels of experience. The MCQ test was administered as an e-testing file during the 22nd European Conference on General Thoracic Surgery held by the European Society of Thoracic Surgeons (ESTS). A supervisor (MMS) was present while the test was answered to ensure that participants did not use any support (for instance handbooks or the Internet). No time limit was imposed. A binary, unweighted scoring method (that is, 'zero–one') was used, merely counting the number of correctly answered test items [40, 41].

Before taking the test, the physicians filled out a questionnaire on their surgical experience so that the experience groups could be distinguished. They were categorised according to the number of performed VATS procedures in general and the number of performed VATS lobectomies. Three intervals defined the participants according to performed VATS procedures: novices had performed 0–99 VATS procedures; intermediates 100–999; and experts at least 1,000. Four intervals defined the participants according to performed VATS lobectomies: novices had not performed any; trainees had performed 1–49; intermediates 50–499; and experts at least 500.
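The scoring and grouping rules of step 4 are simple enough to write down directly. The following is a minimal sketch of zero-one scoring and the two experience classifications using the cut-offs given above. The function names, the answer key, and the example participant are hypothetical.

```python
def score_test(answers, key):
    """Zero-one scoring: one point per correct answer, no weighting."""
    return sum(1 for given, correct in zip(answers, key) if given == correct)


def vats_procedure_group(n_procedures):
    """Three groups by number of performed VATS procedures."""
    if n_procedures < 100:
        return "novice"
    if n_procedures < 1000:
        return "intermediate"
    return "expert"


def vats_lobectomy_group(n_lobectomies):
    """Four groups by number of performed VATS lobectomies."""
    if n_lobectomies == 0:
        return "novice"
    if n_lobectomies < 50:
        return "trainee"
    if n_lobectomies < 500:
        return "intermediate"
    return "expert"


# Hypothetical participant: 250 VATS procedures, 30 of them lobectomies,
# answering a five-item test with options a, b, and c.
key = "abcba"
answers = "abcab"
score = score_test(answers, key)
print(score, vats_procedure_group(250), vats_lobectomy_group(30))
```

The same participant can fall into different groups under the two classifications, which is why the study analyses the group performances separately for each.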

Statistical analysis

Item analysis was performed for the MCQ test by calculating the item discrimination index (point biserial) and the item difficulty [40]. Items with a point-biserial discrimination index below 0.2 were discarded. The remaining items were categorised into four levels (level I–IV) according to their item difficulty. Internal consistency reliability was calculated using Cronbach's alpha. The group performances were analysed in two categories, using the total number of performed VATS procedures and the total number of performed VATS lobectomies, by one-way analysis of variance (ANOVA) and one-sided t tests. A statistical significance level of p < 0.05 was predefined. Statistical analysis was carried out in a statistical software package (PASW, version 19.0; SPSS Inc.).
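The item analysis described above can be sketched in a few lines. This is an illustration, not the authors' PASW procedure. It assumes that item difficulty is the proportion of correct answers, that the point-biserial index is the Pearson correlation between the 0/1 item score and the total score including that item (the paper does not say whether a corrected total was used), and that Cronbach's alpha uses the usual variance formula. The response matrix is made up.

```python
from statistics import mean, pvariance


def pearson(x, y):
    """Pearson correlation; with a 0/1 variable this is the point biserial."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def item_analysis(responses):
    """Difficulty and discrimination per item.

    responses: one row per participant, 1 = correct, 0 = incorrect.
    """
    totals = [sum(row) for row in responses]
    stats = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        stats.append({
            "item": j,
            "difficulty": mean(item),                 # proportion correct
            "discrimination": pearson(item, totals),  # uncorrected total
        })
    return stats


def cronbach_alpha(responses):
    """Cronbach's alpha; the variance type cancels out in the ratio."""
    k = len(responses[0])
    item_vars = [pvariance([row[j] for row in responses]) for j in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 8 participants x 5 items.
responses = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
]

stats = item_analysis(responses)
kept = [s["item"] for s in stats if s["discrimination"] >= 0.2]
reduced = [[row[j] for j in kept] for row in responses]

alpha_before = cronbach_alpha(responses)
alpha_after = cronbach_alpha(reduced)
print(kept, round(alpha_before, 2), round(alpha_after, 2))
```

In this toy matrix the last item is unrelated to the total score, so it falls below the 0.2 cut-off, and removing it raises alpha. This mirrors the logic of discarding poorly discriminating items before reporting internal consistency.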

Ethics Test participation was voluntary; hence, according to the local Danish ethics committee, approval was not necessary (H-2-2014-FSP60).

Results

Content validity (step 3)

After the four expert interviews, the MCQ test contained 81 items (Fig. 1). The first Delphi iteration reduced the test to 66 items, and the second and final iteration excluded another 16 items. The items included in the final test after the second Delphi iteration had the agreement of all or more than half of the experts. The response rate in both iterations was 100 %. Subsequently, a test with 50 MCQ items was administered to the participants.

Construct validity (step 4)

Item analysis led to the exclusion of 19 items because of a discrimination index (point biserial) below 0.20. The remaining 31 items in the final test version had a mean item discrimination of 0.26. Cronbach's alpha for internal consistency was 0.75. The mean item difficulty was 0.63, and distributing the test items into four difficulty levels resulted in: level I (middle difficulty) 51.6 %, level II (easy) 29.0 %, and level III (difficult) 19.4 %. No items were allocated to level IV (extremely difficult or easy).

When the group performances were compared according to performed VATS procedures, ANOVA showed significantly different test performances for the three groups, with an F-statistic of 6.95 and p = 0.002 (Table 1). The mean performance difference between the intermediates and the novices was 1.76, and between the intermediates and the experts it was -4.09 (Fig. 2). One-sided t tests revealed significantly better expert performances compared with both novices (p < 0.001) and intermediates (p = 0.01). The intermediates performed better than the novices, but not significantly so (p = 0.06).

Comparison of the group performances according to performed VATS lobectomies also gave a significant ANOVA result, with an F-statistic of 5.61 and p = 0.002 (Table 2). The mean difference between the novices and the trainees was -1.4; between the trainees and the intermediates -1.3; and between the intermediates and the experts -5.4 (Fig. 3). One-sided t tests showed significantly better performances for the experts compared with the novices (p < 0.001), the trainees (p = 0.002), and the intermediates (p = 0.01). The intermediates performed significantly better than the novices (p = 0.03) but not significantly better than the trainees (p = 0.184). The trainees did not perform significantly better than the novices (p = 0.13).

Table 1 The three group test performances according to performed VATS procedures

Experience level                 Total, n   Mean score   SD    95 % CI for the mean        Minimum score   Maximum score
(n performed VATS procedures)                                  Lower limit   Upper limit
Novices (0–99)                   23         17.7         3.9   15.6          19.4          7               23
Intermediates (100–999)          34         19.4         4.4   17.9          21.0          10              27
Experts (≥1,000)                 10         23.5         3.6   20.9          26.1          18              30

Table 2 The four group test performances according to performed VATS lobectomies

Experience level                 Total, n   Mean score   SD    95 % CI for the mean        Minimum score   Maximum score
(n performed VATS lobectomies)                                 Lower limit   Upper limit
Novices (0)                      20         17.7         3.9   15.8          19.5          12              25
Trainees (1–49)                  28         19.1         4.5   17.3          20.8          7               26
Intermediates (50–499)           14         20.4         3.8   18.2          22.5          13              27
Experts (≥500)                   5          25.8         2.6   22.6          29.0          23              30

Fig. 2 Box-and-whiskers plot showing the distribution of test scores for the three experience levels. The median, the minimum scores, and the maximum scores are also shown. Significantly different test performances were detected for the three experience levels (p = 0.002)

Fig. 3 Mean plot presenting the total score mean of the four experience levels. The performance of the four levels was significantly different (p < 0.002)

Discussion

Guided by the four steps depicted in Fig. 1, the present study in VATS lobectomy resulted in the construction of 31 MCQ test items with gathered validity evidence.

In step 1, the test content was derived from experts' knowledge, yet expertise is somewhat arbitrarily defined in the literature. First, Gruber highlights 'proceduralised knowledge' (that is, know-how) as an expert disposition [42]. Next, Patel et al. focus on 'socially-sanctioned criteria' for medical expertise, defining an expert as a board-certified medical specialist in a subdomain with specialised and narrowed knowledge [43]. Finally, Streiner et al. state that no clear-cut expert definitions exist, and at the same time they stress the importance of consulting individuals with specific knowledge of interest when the test developer's aim is to construct as many items as possible [44]. In this study, we selected four thoracic surgeons specialised in VATS lobectomy to participate in the interviews in order to reduce uncertainties concerning the definition of an expert.

In step 2, a chief concern was developing items that can test higher-order cognitive skills such as knowledge application instead of mere regurgitation of what has been learned by rote. Opponents of MCQs typically point to the risk that this testing modality only tests recall and recognition; proponents nonetheless claim that MCQs can demonstrate clinical reasoning and problem solving when the stem contains high-authenticity vignettes [28, 39, 45]. In addition, given the three homogeneous and plausible options in each item, the test takers display clinical judgement when choosing the most correct option [45].

In step 3, expert opinions were used in a modified 'Delphi' to exclude irrelevant content. The pros of this consultative method are anonymity, which prevents peer group pressure and thereby eliminates potential subject bias, and broad opinion sampling without geographical limitations [46]. The cons are indefinite sample size, absence of a representative sample, and subjectivity in achieving unanimity [46, 47]. To reduce the negative influence of the cons, we incorporated a research-supported number of experts [44]; we selected experts from three different countries; and we benefited from the purposive sampling nature of the 'Delphi', based on the assumption that purposively targeting experts in VATS lobectomy will meet the aim of developing a test on this topic [47]. Even so, we cannot guarantee representativeness or rule out the possibility of selection bias, which are limitations of our study. Another limitation is deficient reliability, since the foundation of the 'Delphi' is subjectivity [46, 47]. Still, gaining content validity requires the use of expert opinions [37, 44].

In step 4, item analysis led to the exclusion of the items with poor ability to discriminate between high- and low-achieving participants. The remaining items were on average suitable, with discrimination above +0.20 [40]. Item difficulty calculations allocated the test items predominantly to level I, which holds the best item characteristics [40]. Somewhat easier items were allocated to level II, which in our test contains fewer items than level I. Because of the partially discriminating power of level II, these items were also incorporated in the final test [40]. Level III items, which are described as problematic, were only included because the four experts judged their content relevant in the Delphi iterations [40]. A strength of this test is that none of the items belonged to level IV, which contains the extremely difficult or easy items, emphasising the efficacy of the 'Delphi'.

Another strength of the test was the high internal consistency reliability. Although some controversy exists concerning the most desirable value of Cronbach's alpha, a value of at least 0.70 is accepted [48]. Greater reliability requires more items [48]. However, feasibility is also of major concern in test development [30]. The majority of the participants at the ESTS Conference, who took the test of 50 items (that is, before item analysis), considered it too lengthy. Therefore, we believe that we have taken both matters into account by developing a test of 31 items with a Cronbach's alpha of 0.75. In addition, a recently published study has demonstrated that too many test items are redundant; learning does not continue to progress with additional items [49].

The analysis of the group performances delineated two noteworthy outcomes. First, by grouping the participants not only by performed VATS procedures but also by performed VATS lobectomies, we were able to detect significant differences between the experience levels, indicating that our test has context specificity for VATS lobectomy and thus has the potential to be implemented in the training of thoracic surgeons on this specific matter. Context specificity is indeed imperative in improving learning [50, 51].
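The reported ANOVA results can be approximately re-derived from the summary statistics in Tables 1 and 2 alone. The sketch below is not the authors' PASW analysis. It assumes the standard one-way ANOVA decomposition into between-group and within-group sums of squares and uses the rounded group sizes, means, and standard deviations from the tables, so the resulting F-statistics should land close to, but not exactly on, the reported 6.95 and 5.61. The function name is illustrative.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F-statistic from (n, mean, sd) tuples, one per group."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares, recovered from the sample SDs.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean score, SD) as read from Table 1 (VATS procedures).
table1 = [(23, 17.7, 3.9), (34, 19.4, 4.4), (10, 23.5, 3.6)]
# (n, mean score, SD) as read from Table 2 (VATS lobectomies);
# the novice mean is taken as 17.7.
table2 = [(20, 17.7, 3.9), (28, 19.1, 4.5), (14, 20.4, 3.8), (5, 25.8, 2.6)]

f1, df1_between, df1_within = anova_f_from_summary(table1)
f2, df2_between, df2_within = anova_f_from_summary(table2)
print(f"VATS procedures:  F({df1_between}, {df1_within}) = {f1:.2f}")
print(f"VATS lobectomies: F({df2_between}, {df2_within}) = {f2:.2f}")
```

Any remaining gap to the published values would be consistent with rounding in the tabulated means and standard deviations.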
Next, in both categories, the experts are separated from the other experience levels in that they score significantly higher than the other participants, demonstrating the high standard of the test (Figs. 2, 3). Again, the educational agenda is evident: sound knowledge of VATS lobectomy is required to master the test. This agenda reflects the idea of mastery learning [52]. Consequently, test takers should receive theoretical training prior to testing.

The decision to categorise the participants into three and four experience levels, respectively, was based on research on learning curves: Petersen and Hansen state that 50 VATS lobectomy procedures is an acceptable number for a surgeon in training compared with an experienced surgeon, and Li et al. note that 100–200 VATS lobectomies are ample to achieve efficiency, whilst consistency requires more procedures [15, 22].

As an assessment tool, this theoretical test cannot stand alone in the training of thoracic surgeons in VATS lobectomy [53]. Gaining theoretical knowledge through lectures and theoretical testing supports the achievement of technical skills and is therefore a required step towards gaining full competency [24]. In the field of VATS lobectomy, technical skills learning, including simulation-based training and an assessment tool for thoracoscopic performance, has already been explored [17, 18]. A multimodal training programme for acquiring the skills needed to perform VATS lobectomies is therefore of major importance, and the implementation of this theoretical test can be regarded as the first step.

In conclusion, the current study presents a theoretical test on VATS lobectomy consisting of multiple-choice questions. The test has demonstrated both content and construct validity, and we advise its application in standardised evidence-based training programmes.

Disclosures Savran, Hansen, Petersen, Walker, Schmid, Bojsen, and Konge have no conflicts of interest or financial ties to disclose.

References

1. Youlden DR, Cramb SM, Baade PD (2008) The international epidemiology of lung cancer: geographical distribution and secular trends. J Thorac Oncol 3:819–831
2. McKenna RJ Jr, Houck W, Fuller CB (2006) Video-assisted thoracic surgery lobectomy: experience with 1,100 cases. Ann Thorac Surg 81:421–425; discussion 425–426
3. Walker WS, Codispoti M, Soon SY, Stamenkovic S, Carnochan F, Pugh G (2003) Long-term outcomes following VATS lobectomy for non-small cell bronchogenic carcinoma. Eur J Cardiothorac Surg 23:397–402
4. Hansen HJ, Petersen RH, Christensen M (2011) Video-assisted thoracoscopic surgery (VATS) lobectomy using a standardized anterior approach. Surg Endosc 25:1263–1269
5. Yan TD, Black D, Bannon PG, McCaughan BC (2009) Systematic review and meta-analysis of randomized and nonrandomized trials on safety and efficacy of video-assisted thoracic surgery lobectomy for early-stage non-small-cell lung cancer. J Clin Oncol 27:2553–2562
6. Whitson BA, Groth SS, Duval SJ, Swanson SJ, Maddaus MA (2008) Surgery for early-stage non-small cell lung cancer: a systematic review of the video-assisted thoracoscopic surgery versus thoracotomy approaches to lobectomy. Ann Thorac Surg 86:2008–2016; discussion 2016–2018
7. Daniels LJ, Balderson SS, Onaitis MW, D'Amico TA (2002) Thoracoscopic lobectomy: a safe and effective strategy for patients with stage I lung cancer. Ann Thorac Surg 74:860–864
8. Detterbeck FC, Lewis SZ, Diekemper R, Addrizzo-Harris D, Alberts WM (2013) Executive Summary: diagnosis and management of lung cancer, 3rd ed: American College of Chest Physicians evidence-based clinical practice guidelines. Chest 143:7S–37S
9. Downey RJ, Cheng D, Kernstine K, Stanbridge R, Shennib H, Wolf R, Ohtsuka T, Schmid R, Waller D, Fernando H, Yim A, Martin J (2007) Video-assisted thoracic surgery for lung cancer resection: a consensus statement of the International Society Of Minimally Invasive Cardiothoracic Surgery (ISMICS) 2007. Innovations (Phila) 2:293–302
10. Kaseda S, Aoki T, Hangai N, Shimizu K (2000) Better pulmonary function and prognosis with video-assisted thoracic surgery than with thoracotomy. Ann Thorac Surg 70:1644–1646
11. Balduyck B, Hendriks J, Lauwers P, Van Schil P (2007) Quality of life evolution after lung cancer surgery: a prospective study in 100 patients. Lung Cancer 56:423–431
12. Yim AP, Wan S, Lee TW, Arifi AA (2000) VATS lobectomy reduces cytokine responses compared with conventional surgery. Ann Thorac Surg 70:243–247
13. Petersen RP, Pham D, Burfeind WR, Hanish SI, Toloza EM, Harpole DH Jr, D'Amico TA (2007) Thoracoscopic lobectomy facilitates the delivery of chemotherapy after resection for lung cancer. Ann Thorac Surg 83:1245–1249; discussion 1250
14. Casali G, Walker WS (2009) Video-assisted thoracic surgery lobectomy: can we afford it? Eur J Cardiothorac Surg 35:423–428
15. Petersen RH, Hansen HJ (2010) Learning thoracoscopic lobectomy. Eur J Cardiothorac Surg 37:516–520
16. Konge L, Petersen RH, Hansen HJ, Ringsted C (2012) No extensive experience in open procedures is needed to learn lobectomy by video-assisted thoracic surgery. Interact CardioVasc Thorac Surg 15:961–965
17. Konge L, Lehnert P, Hansen HJ, Petersen RH, Ringsted C (2012) Reliable and valid assessment of performance in thoracoscopy. Surg Endosc 26:1624–1628
18. Jensen K, Ringsted C, Hansen HJ, Petersen RH, Konge L (2014) Simulation-based training for thoracoscopic lobectomy: a randomized controlled trial: virtual-reality versus black-box simulation. Surg Endosc 28:1821–1829
19. Bjurstrom JM, Konge L, Lehnert P, Krogh CL, Hansen HJ, Petersen RH, Ringsted C (2013) Simulation-based training for thoracoscopy. Simul Healthc 8:317–323
20. Ferguson J, Walker W (2006) Developing a VATS lobectomy programme - can VATS lobectomy be taught? Eur J Cardiothorac Surg 29:806–809
21. Wan IY, Thung KH, Hsin MK, Underwood MJ, Yim AP (2008) Video-assisted thoracic surgery major lung resection can be safely taught to trainees. Ann Thorac Surg 85:416–419
22. Li X, Wang J, Ferguson MK (2014) Competence versus mastery: the time course for developing proficiency in video-assisted thoracoscopic lobectomy. J Thorac Cardiovasc Surg 147:1150–1154
23. Carrott PW Jr, Jones DR (2013) Teaching video-assisted thoracic surgery (VATS) lobectomy. J Thorac Dis 5:S207–S211
24. Kohls-Gatzoulis JA, Regehr G, Hutchison C (2004) Teaching cognitive skills improves learning in surgical skills courses: a blinded, prospective, randomized study. Can J Surg 47:277–283
25. Roediger HL, Karpicke JD (2006) Test-enhanced learning: taking memory tests improves long-term retention. Psychol Sci 17:249–255
26. Kromann CB, Jensen ML, Ringsted C (2009) The effect of testing on skills learning. Med Educ 43:21–27
27. Karpicke JD, Roediger HL 3rd (2008) The critical importance of retrieval for learning. Science 319:966–968
28. Case SM, Swanson DB (2001) Constructing written test questions for the basic and clinical sciences, 3rd edn. National Board of Medical Examiners, Philadelphia
29. Schuwirth LW, van der Vleuten CP (2003) ABC of learning and teaching in medicine: written assessment. BMJ 326:643–645
30. Downing SM (2009) Written Tests. In: Downing SM, Yudkowsky R (eds) Assessment in health professions education. Routledge, New York, pp 149–181
31. van der Vleuten CPM, Schuwirth LWT (2009) Written Assessments. In: Dent JA, Harden RM (eds) A practical guide for medical teachers. Churchill Livingstone, London, pp 323–331
32. McKenna RJ Jr (2008) Complications and learning curves for video-assisted thoracic surgery lobectomy. Thorac Surg Clin 18:275–280
33. Boffa DJ, Gangadharan S, Kent M, Kerendi F, Onaitis M, Verrier E, Roselli E (2012) Self-perceived video-assisted thoracic surgery lobectomy proficiency by recent graduates of North American thoracic residencies. Interact CardioVasc Thorac Surg 14:797–800
34. Savran MM, Clementsen PF, Annema JT, Minddal V, Larsen KR, Park YS, Konge L (2014) Development and validation of a theoretical test in endosonography for pulmonary diseases. Respiration 88:67–73
35. Schubert S, Ortwein H, Dumitsch A, Schwantes U, Wilhelm O, Kiessling C (2008) A situational judgement test of professional behaviour: development and validation. Med Teach 30:528–533
36. Strandbygaard J, Maagaard M, Larsen CR, Schouenborg L, Ottosen C, Ringsted C, Grantcharov T, Ottesen B, Sorensen JL (2013) Development and validation of a theoretical test in basic laparoscopy. Surg Endosc 27:1353–1359
37. Streiner DL, Norman GR (2008) Validity. In: Streiner DL, Norman GR (eds) Health Measurement Scales: a practical guide to their development and use. Oxford University Press, Oxford, pp 247–276
38. Graham B, Regehr G, Wright JG (2003) Delphi as a method to establish consensus for diagnostic criteria. J Clin Epidemiol 56:1150–1156
39. Haladyna TM (2004) Guidelines for developing MC items. In: Haladyna TM (ed) Developing and validating multiple-choice test items. Routledge, London and New York, pp 97–126
40. Downing SM (2009) Statistics of testing. In: Downing SM, Yudkowsky R (eds) Assessment in health professions education. Routledge, New York, pp 93–117
41. Haladyna TM (2004) Validity evidence coming from statistical study of item responses. In: Haladyna TM (ed) Developing and validating multiple-choice test items. Routledge, London, pp 202–229
42. Gruber H (2001) Acquisition of expertise. In: Smelser NJ, Baltes PB (eds) International encyclopedia of the social & behavioral sciences. Elsevier, Amsterdam, pp 5145–5150
43. Patel VL, Kaufman DR (2001) Cognitive psychology of medical expertise. In: Smelser NJ, Baltes PB (eds) International encyclopedia of the social & behavioral sciences, pp 9515–9517
44. Streiner DL, Norman GR (2008) Devising the items. In: Streiner DL, Norman GR (eds) Health Measurement Scales: a practical guide to their development and use. Oxford University Press, Oxford, pp 18–38
45. Hawkins RE, Swanson DB (2008) Using written examinations to assess medical knowledge and its application. In: Holmboe ES, Hawkins RE (eds) Practical guide to the evaluation of clinical competence. Mosby, Philadelphia, pp 42–59
46. Keeney S, Hasson F, McKenna HP (2001) A critical review of the Delphi technique as a research methodology for nursing. Int J Nurs Stud 38:195–200
47. Hasson F, Keeney S, McKenna H (2000) Research guidelines for the Delphi survey technique. J Adv Nurs 32:1008–1015
48. Streiner DL, Norman GR (2008) Selecting the items. In: Streiner DL, Norman GR (eds) Health Measurement Scales: a practical guide to their development and use. Oxford University Press, Oxford, pp 78–104
49. Cook DA, Thompson WG, Thomas KG (2014) Test-enhanced web-based learning: optimizing the number of questions (a randomized crossover trial). Acad Med 89:169–175
50. Ashley EA (2000) Medical education - beyond tomorrow? The new doctor - Asclepiad or Logiatros? Med Educ 34:455–459
51. Van der Vleuten CPM, Dolmans DHJM, Scherpbier AJJA (2000) The need for evidence in education. Med Teach 22:246–250
52. McGaghie WC, Issenberg SB, Barsuk JH, Wayne DB (2014) A critical review of simulation-based mastery learning with translational outcomes. Med Educ 48:375–385
53. Miller GE (1990) The assessment of clinical skills/competence/performance. Acad Med 65:S63–S67
