ORIGINAL ARTICLE

Simulation of Tangential Excision: A Test for Construct Validity

James J. Gallagher, MD, FACS, Ian M. Goldin, BA, Geoffrey M. O’Sullivan, PA-C, Elliott L. Silverman, PA-C, MSHS, Katrina B. Mitchell, MD, Roger W. Yurt, MD, FACS

A foundational skill in burn surgery is tangential excision (TE). The purpose of this study was to develop a simulation model for TE, hypothesizing that simulation could be used in surgical training. The TE simulation was created using the TE knife, foam, mineral oil, and a base. Subjects (surgeons or surgeons in training) were given a pre- and post-task questionnaire about experience with TE. Subjects were divided into three TE experience groups: novice (none), intermediate (some), and expert (TE in current or past practice). The task was to excise pre-marked rectangles, generating four excisional products (EPs). Evaluators blindly assessed performance by EP analysis using a novel scoring tool and reviewed videos using a modified objective structured assessment of technical skill (OSATS) rubric. Inter-rater reliabilities and P values were obtained, comparing Novice and Intermediate with Expert scores. Forty subjects completed the study: 16 were identified as TE novices, 17 as intermediates, and seven as experts. All EPs and videos were reviewed blindly by two evaluators using the EP scoring tool and OSATS methodology, respectively. Intraclass correlation coefficients were calculated to measure inter-rater reliabilities, which were acceptable (ICC ≥ 0.42) for OSATS, time, and EP analysis: border and texture. Statistical differences between Novice and Expert scores were found (P < .0100, P < .0200, P < .0025, and P < .0005, respectively). Statistical differences between Intermediate and Expert scores were also found (P < .0100, P < .0200, P < .0100, and P < .0025, respectively). Post-simulation survey results showed that 86% of experts agreed or strongly agreed that the simulation was similar to the clinical skill, and 100% felt it would be useful for training before clinical performance. Simulation for TE was successfully created to blindly discern level of TE experience. Participants agreed that simulation could play an essential role in burn surgical training.
(J Burn Care Res 2015;36:558–564)

Simulation increasingly has been recognized as essential to general surgery training, though simulation in burn care remains limited.1 Effective burn surgical simulation has the potential to provide any surgeon with the basic skill set necessary to operate on simple burn patients. Devastating complications of non-fatal burn injury can be avoided by early wound closure through proper, fundamental burn

From the Weill Cornell Medical College, New York. Dr. Paul Christos was partially supported by the following grant: Clinical Translational Science Center (CTSC; UL1-TR000457-06). Address correspondence to James J. Gallagher, MD, FACS, 525 East 68th St., Floor 7, New York, New York 10065. E-mail: [email protected] Copyright © 2014 by the American Burn Association 1559-047X/2015 DOI: 10.1097/BCR.0000000000000166


surgery techniques that can be learned and applied in a variety of clinical settings across the world. In 1999, Watt et al produced a low-cost technique using water and Microfoam Tape (3M Health Care, St. Paul, MN) for the simulation of harvesting split-thickness skin for grafting; however, minimal performance metrics were discussed.2 Similarly, in 2002, Cubison and Clare used lasagna noodles and water or oil to simulate donor site skin for harvest (split thickness skin graft).3 In 2004, Simon et al assembled an acrylic pole (simulating bone), circumferentially wrapped in composite foam (simulating deep tissue structures), wrapped in porcine skin (simulating human skin) to represent a limb for split thickness skin graft.4 The simulation was described as being a realistic approach to learning the skill ex vivo; however, none of these studies produced quantitative data. There have been no reported studies

Copyright © American Burn Association. Unauthorized reproduction of this article is prohibited.

Journal of Burn Care & Research Volume 36, Number 5

that simulate tangential excision (TE) with the Goulian knife, nor studies that validate a construct of burn surgical simulations for TE. The purpose of this study was to test previously validated fundamentals of laparoscopic skills (FLS) metrics and novel task-specific metrics as measures of subject performance on an innovative, inorganic bench-model simulation for TE with the Goulian knife. We hypothesized that our simulation construct would be able to blindly discern expertise in TE. In parallel to this study, a second simulation was developed for TE and skin graft harvest with the Watson knife. The results of the Watson knife simulation study will be reported separately.

METHODS

All active members of the Department of Surgery, including attending surgeons and physicians-in-training, were invited to participate in an institutional review board-approved, prospective, comparative, blinded protocol. Additionally, visiting burn surgery attendings were invited to participate. Written consent was obtained from the 40 subjects who volunteered. Subjects were surveyed for their handedness, training in surgery (number of years), and level of experience with TE. For analysis, subjects were divided into three study groups based on their level of experience with TE: of the 40 subjects who volunteered, 16 had never performed TE (Novice), 17 had performed TE at some point in their training or as a surgeon but not exclusively (Intermediate), and seven had performed or do perform TE exclusively in their practice (Expert). Subjects performed the simulation on an individual basis. Before performing the simulation, all subjects in all groups watched a 3-minute introductory video providing the goals of the simulation, a display of the proper performance of the simulation by an expert, and safety instructions. Additionally, they were instructed to perform the simulation using only the hand that was holding the knife, and to keep the other hand behind the back at all times to ensure safety. The same proctor ran the simulation for all subjects.

The primary components of the bench-model TE simulation were the Goulian knife with 0.30 guard and new blade, and a new foam wound suture pad (Limbs&Things.com, Savannah, GA) marked with line parameters by template. The foam pad was placed in a modified suture pad holder (Limbs&Things.com, Savannah, GA), which was fixed to a wooden tray. The tray was secured to a sturdy desk of standard height using grip-liner and a C-clamp. The


coverage of subjects with gown and gloves and complete draping of the background was assured before the start of the simulation to maintain anonymity of the subjects in the video recording. All component parts, with the exception of the surgical instruments held by the subject, were immobile during simulation. The surface of the foam was uniformly lubricated with mineral oil by the proctor using a foam painter’s brush. The amount of mineral oil used was quantified by measuring the mass of the brush before and after the application of oil to the surface. Subjects were then asked to excise four rectangles with the Goulian knife in the subject’s primary hand. For each rectangle, subjects were asked to begin on the first vertical line, remove both inner horizontal (black) lines, and come off on the second vertical line, while avoiding the outer horizontal (orange) lines (Figure 1). To further protect participants from using an unsafe amount of force, the second and fourth rectangles were drawn slightly smaller than the first and third rectangles. The smaller rectangles required the subject to lighten pressure with the knife on the foam to achieve the goal. After completion of the simulation, a three-question post-simulation survey was administered to all subjects. An additional three questions were posed to the Expert group to judge the similarity of the simulation to the clinical experience of TE and the potential value of the simulation for education (Table 1). The evaluators were two experts in burn surgery who perform TE exclusively in their clinical practice. The

Figure 1. Bird's-eye view of simulation for TE with Goulian knife. TE, tangential excision.




Journal of Burn Care & Research September/October 2015


Table 1. Post-simulation survey. Subjects were asked to respond using a 1 to 5 Likert scale (1 = strongly disagree, 3 = somewhat agree, 5 = strongly agree)

Post-Simulation Survey Statement | Study Group(s) Asked
1. The instructions for completing the simulation were clear | Novice, Intermediate, Expert
2. I found the simulation difficult to perform | Novice, Intermediate, Expert
3. I am satisfied with my performance on the simulator | Novice, Intermediate, Expert
4. This simulation is similar to the clinical skill | Expert
5. This simulation may be a teaching tool for skill acquisition | Expert
6. Proficiency at this simulation would be beneficial before clinical performance of TE | Expert

TE, tangential excision.

evaluators separately and blindly assessed subject performance. Metrics for evaluation were divided into two major categories: video analysis and excisional product (EP) analysis. All data were coded for anonymity. The video analysis included objective structured assessment of technical skill (OSATS) and time. Evaluators rated the subjects independently using a modified OSATS global rating scale.5 The standard OSATS tool generally involves the assessment, through recorded observation, of global performance according to seven validated metrics. Only three of the seven validated metrics were applicable to TE simulation: respect for tissue, time and motion, and instrument handling. Each of the three OSATS metrics used for this study was judged on an increasing 1 to 5 Likert scale. Evaluators were given the opportunity to replay videos as needed throughout the reviewing process. Analysis of time to complete the simulation was performed by the proctor in the following manner: the proctor reviewed the video, noting the time-counter within, and then recorded the number of seconds that the knife was in contact with the foam. Only the time for which the knife made contact with the foam was recorded. The EP analysis included seven metrics developed to assess characteristics of the four pieces that were cut

away from the foam suture pad during the simulation. Unlike the OSATS rubric, there is no precedent for the analysis of foam pieces excised with a Goulian knife; therefore, all EP metrics were developed through pilot testing, and their validity as assessment tools was determined in this study. EP metrics included four objective and three subjective measurement tools (Table 2). All EP analysis metrics were in the form of penalty points. Objective measurement tools included black line accuracy (measured in mm), orange line accuracy (measured in mm), number of holes, and number of breaks (times the EP was broken during excision). Subjective measurement tools included border (evenness of the cut edge), thickness, and texture (an assessment of the EP surface which made contact with the blade). With the exception of time scores, all scores generated by the two evaluators were checked for interrater reliability (IRR) by the intraclass correlation coefficient (ICC), using a two-way random effects model (ie, assumes subjects and raters are selected at random from a population of subjects and raters, respectively). IRRs are considered “excellent” for ICC greater than 0.75, “good” for ICC between 0.61 and 0.75, “fair” for ICC between 0.40 and 0.60, and “poor” for ICC less than 0.40. The raw evaluator

Table 2. EP analysis metrics used by evaluators to assess subject performance

Metric Type | Metric | Measurement Recorded (for each of EP No. 1, 2, 3, and 4)
Objective | Black lines | mm absent on EP
Objective | Orange lines | mm on EP
Objective | Holes | 0/1/2/3+
Objective | Breaks | 0/1/2+
Subjective | Border | 0 = even, 1 = some uneven, 2 = uneven
Subjective | Thickness | 0 = just right, 1 = too thin, 2 = too thick
Subjective | Texture | 0 = smooth, 1 = smooth with one or two ridges, 2 = about half is inconsistent, 3 = mostly inconsistent, 4 = completely inconsistent

EP, excisional products.


scores for each metric were also averaged and compiled into the study groups based on TE experience. Novice and Intermediate study groups were then compared to each other and then separately against the Expert study group using the two-sample t-test. Wilcoxon rank-sum tests were also performed and confirmed the t-test findings (data not shown). All P values are two-sided with statistical significance evaluated at the 0.05 alpha level. Ninety-five percent confidence intervals (95% CIs) were calculated to assess the precision of the obtained estimates. All ICC and Wilcoxon rank-sum analyses were performed in SPSS Version 21.0 (SPSS Inc., Chicago, IL). All t-tests were calculated and all graphs were produced in Microsoft Excel 2010 Version 14.0.6 (Microsoft Corporation, Redmond, WA).
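The scoring and reliability steps described above can be sketched in a few lines of Python. This is an illustration only, not the authors' SPSS procedure: the function names and the sample ratings are hypothetical, the source does not state whether the ICC was computed on per-EP scores or per-subject totals (totals are assumed here), and the single-rater, absolute-agreement form ICC(2,1) is an assumed choice among the two-way random effects variants.

```python
from statistics import mean

# Maximum penalty per EP for each subjective metric (scales from Table 2).
MAX_PER_EP = {"border": 2, "thickness": 2, "texture": 4}
EPS_PER_SUBJECT = 4  # each subject produced four EPs


def subject_total(per_ep_points, metric):
    """Sum one evaluator's penalty points over one subject's four EPs."""
    if len(per_ep_points) != EPS_PER_SUBJECT:
        raise ValueError("expected one score per EP")
    if any(p < 0 or p > MAX_PER_EP[metric] for p in per_ep_points):
        raise ValueError("score outside the scale for " + metric)
    return sum(per_ep_points)


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores holds one row per subject and one column per rater.
    Assumption: the article does not say which ICC form was used.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of raters
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subject mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def irr_label(icc):
    """Map an ICC to the bands quoted in the text.

    The text leaves 0.60 to 0.61 unassigned; this sketch treats it as fair.
    """
    if icc > 0.75:
        return "excellent"
    if icc >= 0.61:
        return "good"
    if icc >= 0.40:
        return "fair"
    return "poor"


if __name__ == "__main__":
    # Hypothetical texture ratings: three subjects, two evaluators, four EPs each.
    ratings = [
        ([3, 2, 4, 3], [2, 2, 3, 3]),
        ([1, 0, 1, 2], [1, 1, 1, 2]),
        ([4, 4, 3, 4], [4, 3, 3, 4]),
    ]
    totals = [[subject_total(a, "texture"), subject_total(b, "texture")]
              for a, b in ratings]
    icc = icc_2_1(totals)
    print("per-evaluator totals:", totals)
    print("ICC(2,1) = %.2f (%s)" % (icc, irr_label(icc)))
    print("averaged scores:", [mean(row) for row in totals])
```

With four EPs per subject, the totals are bounded by 8 for border and thickness and by 16 for texture, which matches the maxima reported in the Results.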

RESULTS

All 40 subjects completed the entire study including the pre- and post-simulation surveys. For all assessment tools, no significant differences were found between the Novice and Intermediate study group scores. The remainder of this section will focus on comparing Novice vs Expert and Intermediate vs Expert scores.

Video Analysis: OSATS and Time

IRRs were found to be good to excellent for individual OSATS metrics (ICC = 0.64, 0.78, 0.78) and good for the total OSATS score (ICC = 0.74). Figure 2 shows a box plot of each TE experience category against the total OSATS score received. A strong stepwise correlation is observed between the OSATS performance and reported TE experience. Using the modified OSATS assessment tool, subjects in the Expert group (11.57 ± 3.06; mean ± SD) scored significantly higher than both the Novice (5.28 ± 2.04) and Intermediate (6.94 ± 3.14) groups (P < .0025, P < .0100, respectively). Time was then compared to reported TE experience (Figure 3). Subjects in the Expert group (41.14 ± 12.88 seconds) took significantly less time to complete the simulation than both the Novice (63.56 ± 14.37 seconds) and Intermediate (62.00 ± 22.57 seconds) groups (P < .0100, P < .0200, respectively). As experience with TE increased, time to complete simulation decreased.
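As a worked illustration of the two-sample comparison, the sketch below computes a t statistic directly from the reported group means, standard deviations, and group sizes. It is not a reproduction of the authors' Excel analysis: the article does not say whether equal variances were assumed, so both the pooled (Student) and unequal-variance (Welch) forms are shown, the function names are hypothetical, and because the published summary values are rounded the results are approximate. No P value is computed here.

```python
from math import sqrt


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Student two-sample t statistic and degrees of freedom (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    t = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Welch two-sample t statistic and approximate degrees of freedom."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    t = (m1 - m2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # Total OSATS score, Expert (n = 7) vs Novice (n = 16), from the text above.
    expert = (11.57, 3.06, 7)
    novice = (5.28, 2.04, 16)
    t_pooled, df_pooled = pooled_t(*expert, *novice)
    t_welch, df_welch = welch_t(*expert, *novice)
    print("pooled: t = %.2f, df = %d" % (t_pooled, df_pooled))
    print("Welch:  t = %.2f, df = %.1f" % (t_welch, df_welch))
```

The same functions can be applied to the time, border, and texture summaries by substituting the corresponding means and standard deviations; a negative t then simply indicates that the first group scored lower.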

EP Analysis

The presence of black and absence of orange horizontal lines on the EP, as outlined in the goals of the simulation, were assessed by both evaluators

Figure 2. On the x-axis of this box plot, TE experience category is presented. Total OSATS score (3–15) is given on the y-axis. The boxes represent the interquartile distances. The upper and lower limits of the boxes indicate the 75th and 25th percentile, respectively. The horizontal lines in the boxes represent the median. The box and the whiskers together indicate the area in which all observations are found, unless outliers are present. When a given observation is located more than 1.5 times the interquartile distance (ie, above the 75th or below the 25th percentile), then this observation is called an outlier. TE, tangential excision.

objectively. As expected, IRR for this objective measure was found to be near perfect (ICC = 0.99); however, the respective presence or absence of black and orange horizontal lines was not found to be significantly different between TE experience categories. Penalty points for border were generated subjectively by each evaluator. As shown in Table 2, up to two penalty points were given to each EP. A maximum of eight penalty points was possible, given that each subject produced four EPs. The IRR was found to be fair (ICC = 0.58). Evaluator border scores were then averaged and compared to level of TE experience. Subjects in the Expert group (3.86 ± 1.44) received significantly fewer border penalty points than both the Novice (6.84 ± 1.09) and Intermediate (6.03 ± 1.39) groups (P < .0025, P < .0100, respectively). Penalty points for thickness were generated subjectively by each evaluator. As shown in Table 2, up to two points were given to each EP. A maximum


of eight penalty points was possible, given that four EPs were produced for each subject. The IRR was found to be poor (ICC = 0.26). Evaluator thickness scores were then averaged and compared to level of TE experience. Subjects in the Expert group (1.71 ± 2.04) received significantly fewer thickness penalty points than both the Novice (4.38 ± 1.71) and Intermediate (3.76 ± 2.13) groups (P < .0200, P < .0500, respectively).

Penalty points for texture were generated subjectively by each evaluator. As shown in Table 2, up to four points were given to each EP. A maximum of 16 penalty points was possible, given that four EPs were produced for each subject. The IRR was found to be excellent (ICC = 0.76). Evaluator texture scores were then averaged and compared to level of TE experience. Subjects in the Expert group (4.86 ± 2.17) received significantly fewer texture penalty points than both the Novice (11.78 ± 2.06) and Intermediate (10.03 ± 3.47) groups (P < .0005, P < .0025, respectively; Figure 4).

The objective EP metrics of holes and breaks were found to occur too infrequently for evaluation.

In the post-simulation survey, Experts “Agreed” or “Strongly Agreed” that the simulation is similar to the clinical skill (86%), effective as a potential teaching tool (86%), and useful for proficiency-based training before clinical performance of TE (100%; Table 1). All study participants were asked to evaluate the simulation and their performance. According to the post-simulation survey, 92% of the subject population “agreed” or “strongly agreed” that the instructions for completing the simulation were clear. According to the survey, 100% of Experts but 64% of non-Experts found that the simulation was not difficult. Additionally, 71% of Experts but only 42% of non-Experts “agreed” or “strongly agreed” that they performed well.

Figure 3. On the x-axis of this box plot, TE experience category is presented. Time in seconds is given on the y-axis. See Figure 2 for box plot interpretation. TE, tangential excision.

DISCUSSION

TE is a unique and defining surgical skill necessary to treat burns. This study sought to validate the construct of an innovative burn surgical simulation. No previous attempts have been made at creating an ex vivo surgical model for the acquisition or testing of TE using the Goulian knife. Previous attempts at TE simulation involved anecdotal reports of low-tech simulation for split-thickness skin graft harvest using the Watson knife. This study represents the first attempt at validating a

Figure 4.  On the x-axis of this box plot, TE experience category is presented. Texture penalty points (0–16) are given on the y-axis. See Figure 2 for box plot interpretation. TE, tangential excision.


surgical simulation for TE using the Goulian knife. Based on the study, this construct for TE is a valid way of testing TE skill. Knowledge gained from its innovation will inform the creation of a training device for TE. This device was not tested for its ability to train the skill of TE.

Our evaluation system is largely based on the subjective assessment of two evaluators. Therefore, foundational to the interpretation of data from this construct is strong IRR between our two evaluators. This was present for all statistically significant metrics except for thickness. Thickness between our Expert and Novice or Intermediate groups was found to be statistically different; however, the IRR was poor, and thickness was therefore not relied on for evaluation of subjects. Although there was a trend toward improved performance between the Intermediate and Novice study groups, the differences were not significant. However, there were significant differences between the Expert and Novice and the Expert and Intermediate study groups, supporting our hypothesis that our TE simulation construct is an effective tool for testing expertise in this surgical skill.

The IRRs for the three OSATS metrics were strong. It is expected that the highest OSATS scores are achieved by those subjects with the greatest clinical burn surgical skill. A strong positive correlation exists between OSATS scores and experience with TE. It has been recognized that measures of competency in simulation that are more objective than OSATS and not video-based are desirable. We therefore developed a novel system of evaluation of the unique EPs yielded by this experiment. During development of this evaluation system, efforts were made to further characterize the EPs via objective criteria. Some qualities of the EPs remain dependent on a subjective assessment by the evaluators. During pilot testing, on rare occasion a hole or a break was produced in an EP by a Novice. These obvious and objective measures were included in the evaluation tool. However, during the study, holes and breaks occurred too infrequently, thus rendering those metrics inadequate for evaluation of expertise in TE.

The design of the simulation required the subject to excise black lines while avoiding orange lines, varying the knife pressure to achieve two EP sizes indicated on the surface of the simulation. Given that the goals of the simulation necessitated control of the knife in order to achieve success, intuitively one would expect that the Experts would achieve higher overall scores in line accuracy. Acquisition of line data was objective, as supported by the near-perfect IRR (ICC = 0.99). Surprisingly, the best scores for line accuracy were distributed equally among all three TE experience


groups. Therefore, the EP metric of presence and absence of black and orange lines, respectively, could not be used to determine the level of experience. The remaining two parameters in the EP evaluation tool were subjective. Border and texture were scored on scales of 0 to 2 and 0 to 4, respectively. Strong IRR was present for both subjective metrics. Texture was found to be the strongest EP analysis metric with the smallest P values. Strong significance was found for both metrics between our Expert and Novice or Intermediate groups.

A goal in simulation evaluation is to move away from the video-based OSATS scoring method to an entirely objective assessment of the byproduct of the experiment, as is possible with the fundamentals of laparoscopic skills construct. Ongoing efforts are aimed at identification of a technology that can quantify the subjective metric of texture for our EPs. We believe that, in the future, this potentially more specific measurement would even more accurately delineate the textural differences between TE experience groups observed by our evaluators.

According to the post-simulation survey, most subjects indicated that they understood the goals of the tasks, suggesting that the instructions for completing the simulation are not a source of confounding variables in this experiment. Also in the post-simulation survey results, we were encouraged that Experts strongly supported the simulation's similarity to the operative experience of TE, known as face validity. Additionally, many of our subjects commented on the usefulness of this construct as a trainer.

Strengths of our study include that we were able to acquire seven Experts in the small subspecialty of burn surgery as participants in this prospective, blinded study. Moreover, though some of the novel metrics were unable to distinguish between Experts and non-Experts, OSATS, border, time, and texture metrics were successful and supported by good IRRs. The feasibility of the simulation is evident from the portability of the small simulation unit and the ease with which one can acquire the accessory items (found in the clinical setting or in most hardware stores). Feasibility is also improved as the evaluators must be blinded from subject identity and can therefore assess the videos and EPs on their own time.

We were unable to accurately subdivide the TE experience categories further than Novice, Intermediate, and Expert based on our sample size. Moreover, the Intermediate category included non-Expert subjects who had performed TE in the past, most of whom had a clinical experience limited to a single rotation during residency, which only marginally impacted technical performance. A larger sample size may allow for more subdivisions in the


intermediate group, which may result in gradation of subject performance based on TE experience. This is a goal of ongoing studies. This study is also limited by the subjective bias inherent in the statistically significant metrics (OSATS, texture, and border), with the exception of time. To eliminate subjectivity with respect to texture, we continue to search for an appropriate instrument for objective quantification, thereby strengthening the nature of the evaluation tool. Finally, two of the investigators for this study, both Experts in burn surgery and key in the development of the simulation and assessment tools, were the two evaluators in this study. Extreme care was taken to blind the evaluators from subject identity, and the evaluators were not eligible to be study subjects.

This study successfully validates a collection of modified and novel metrics for an innovative bench-model of TE with the Goulian knife. Additional studies may include: 1) the full evaluation of subject performance by a lay evaluator (a person with no experience in burn surgery) to determine if burn experience is required to be an effective evaluator, 2) a construct validation across educational borders in a multi-centered study, thereby increasing our sample sizes, 3) the establishment of learning curves for each TE experience group, 4) the development of a pass or fail scoring policy, in which a single number represents the overall simulation performance, and 5) a predictive validation of proficiency with the simulation on intra-operative performance.

REFERENCES

1. Shahrokhi S, Jindal K, Jeschke MG. Three components of education in burn care: surgical education, inter-professional education, and mentorship. Burns 2012;38:783–9.
2. Watt DA, Majumder S, Southern SJ. Simulating split-skin graft harvest. Br J Plast Surg 1999;52:329.
3. Cubison TC, Clare T. Lasagne: a simple model to assess the practical skills of split-skin graft harvesting and meshing. Br J Plast Surg 2002;55:703–4.
4. Bennett SP, Velander P, McArthur PA, McPhail J, Alvi R, Graham KE. A novel model for skin graft harvesting. Plast Reconstr Surg 2004;114:1660–1.
5. Edwards MA, Asmann B. Methods of assessment. In: Tsuda ST, Scott DJ, Jones DB, editors. Textbook of simulation. 1st ed. Woodbury, CT: Cine-Med, Inc.; 2012 March. p. 15–34.

