Clinical Neurophysiology 126 (2015) 857–865


Review

Somatosensory and motor evoked potentials as biomarkers for post-operative neurological status

R.N. Holdefer a,⇑, D.B. MacDonald b, S.A. Skinner c

a Department of Rehabilitation Medicine, University of Washington School of Medicine, Box 359740, Seattle, WA 98104-2499, USA
b Section of Clinical Neurophysiology, Department of Neurosciences, King Faisal Specialist Hospital & Research Center, MBC 76, PO Box 3354, Riyadh, Saudi Arabia
c Intraoperative Monitoring, Department of Neurophysiology, Abbott Northwestern Hospital, 800 E 28th Street, Minneapolis, MN 55407, USA

Article info

Article history:
Accepted 12 November 2014
Available online 20 November 2014

Keywords:
Somatosensory evoked potentials
Motor evoked potentials
Intraoperative neurophysiological monitoring
Evidence-based medicine
A.B. Hill
Causality guidelines
Surrogate endpoint

Highlights

• Intraoperative evoked potentials (EPs) are often used during surgery as surrogates for true clinical endpoints.
• A three step framework recently proposed by the Institute of Medicine was used to evaluate EP biomarkers as surrogate endpoints.
• Causality guidelines and contingency analysis provided partial validation of EP surrogates.

Abstract

SEPs and MEPs (EPs) are often used as surrogates for postoperative clinical endpoints of muscle strength and sensory status, as these true endpoints are not available during surgery. EPs as surrogate endpoints were evaluated using a three step framework (Analytical Validation, Qualification, Utilization) recently proposed by the Institute of Medicine (USA). EP performance on Analytical Validation may surpass that of some other biomarkers used in medicine (tumor size, cardiac troponin). Qualification of EP surrogates was evaluated with guidelines for causation proposed by A.B. Hill, which supported causal links between surgical events and EP changes and revised estimates of EP diagnostic test performance for three illustrative studies. Qualification was also addressed with a 3 × 2 contingency analysis which demonstrated decreased deficit proportions for EP declines which recovered after surgeon intervention. Utilization of EP surrogates will depend on surgical procedure and alert criteria. EPs are often used as surrogate endpoints to avoid new postoperative deficits. Although not fully validated, their continued use as surrogates during surgical procedures with the potential for significant morbidity is justified by their potential to help avoid injury and the absence of “second best options.”

© 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

Contents

1. Introduction
2. Methods
   2.1. Literature search
   2.2. Assessing causality
   2.3. Diagnostic statistics

Abbreviations: EPs, somatosensory evoked potentials (SEP) and/or transcranial electrical motor evoked potentials (MEP) as a group; RSC, reversible signal change (lost and recovered EP); IONM, intraoperative neurophysiological monitoring; EBM, evidence based medicine; EMG, electromyography.
⇑ Corresponding author at: Rehabilitation Medicine, University of Washington, Harborview Medical Center, 325-9th Ave, Box 359740, Seattle, WA 98104-2499, USA. Tel.: +1 206 744 5465; fax: +1 206 744 8580.
E-mail address: [email protected] (R.N. Holdefer).
http://dx.doi.org/10.1016/j.clinph.2014.11.009
1388-2457/© 2014 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

3. EPs as biomarkers and surrogate endpoints
4. A framework for evaluating EP biomarkers and surrogate endpoints
   4.1. Application of a three step framework
   4.2. Analytical validation: can EPs be accurately measured?
   4.3. Utilization: in what context are EPs used?
   4.4. Qualification: associations between surgical events, EPs, and postoperative outcomes
        4.4.1. First component of Qualification
        4.4.2. Second component of Qualification
5. Concluding remarks
Conflict of interest
References

1. Introduction

Over the past several years increasing attention has been given to the evidence base of intraoperative neurophysiological monitoring (IONM) for improving surgical outcomes. The prominence of the evidence-based medicine (EBM) movement and related literature has continued to grow, and there is an increased recognition, at times reluctant, that its guidelines may have practical value for distinguishing good from more precarious evidence. Ongoing changes in healthcare policy and administration have also prompted a more critical look at IONM and outcomes. At the same time, there is a growing appreciation of the need to avoid an oversimplified application of EBM methods to individual patient values and different medical practice contexts (Greenhalgh et al., 2014). IONM is an example, where evaluation of outcomes requires a thoughtful integration of the empirical methods of EBM with clinical expertise (Straus, 2005).

Somatosensory evoked potentials (SEPs) and transcranial electrical motor evoked potentials (MEPs) are used to reduce new postoperative neurological deficits involving the dorsal column somatosensory and corticospinal motor pathways. Despite general agreement on the prognostic (predictive) value of IONM for many surgical procedures, its efficacy in improving surgical outcomes remains contested (Resnick et al., 2009; Fehlings et al., 2010; Nuwer et al., 2012). This is in large part due to the rarity of randomized controlled trials and controlled observational studies. Surgeons who use IONM will typically not withhold an intervention in response to an EP alert, or do without IONM for a controlled study, for fear of potentially harming the patient. Medical studies often make use of biomarkers and surrogate endpoints when controlled research designs are not ethical or practical (Aronson, 2008; Institute of Medicine, 2010; Bell et al., 2014).
A surrogate endpoint is “a biomarker that is intended to substitute for a clinical endpoint.” A biomarker is simply “a characteristic that is objectively measured and evaluated as an indication of normal biologic processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention” (Institute of Medicine, 2010). In this paper we propose that EPs are usefully conceptualized as biomarkers and surrogate endpoints and may be evaluated by a recent framework recommended by the Institute of Medicine of the National Academy of Sciences (USA). In fact, with the exception of the “wake-up” test or awake cranial surgeries, true clinical endpoints for neurological status during surgery have never been used. EPs are biomarkers by definition, as are the blood pressure and pulse oximetry monitors of tissue perfusion and oxygenation, respectively, during surgery.


2. Methods

2.1. Literature search

The Web of Science database (Thomson Reuters) was queried August 13, 2014 for EP surgical monitoring during 1970–2014. There was no restriction on language. Titles were reviewed, and data from appropriate articles were compiled. The average number of annual citations was determined.1

1 The search strategy identified two sets. Set 1 with the fields (TITLE: (intraoperative*) OR TITLE: (IONM) OR TITLE: (monitor*) OR TOPIC: (surg*) OR TITLE: (IOM)). Set 2 with the fields (TITLE: (somato* evoke* potential*) OR TITLE: (mot* evoke* potential*) OR TITLE: (EMG) OR TITLE: (electromyog*) OR TITLE: (evoke* potential*)). Final set for analysis: Set 1 AND Set 2.

2.2. Assessing causality

Causal links between surgical events and EP changes were investigated using the guidelines for causation proposed by A.B. Hill. Hill described nine guidelines of evidence for causation when an association is observed between two variables (Hill, 1965) (Table 1). They are most useful when controlled observations are not practical or ethical. These guidelines are increasingly used in medicine, epidemiology, and environmental health, and have recently been incorporated in evidence assessments used by the Grading of Recommendations, Assessment, Development and Evidence (GRADE) working group (Guyatt et al., 2011).

Rate ratios have been proposed for assessing strength of association when there is a rapid response against a stable background and are defined as the rate of progression during treatment divided by the rate of progression during no treatment (Glasziou et al., 2007). Rate ratios were used to compare the rate of EP change following a surgical event with the rate of EP change immediately preceding the event. When EPs were stable before the event, 0 was replaced with 0.5 for a more robust estimate and to avoid division by zero. Rate ratios beyond 10 for strength of association may indicate causation, even in the presence of confounding variables (Glasziou et al., 2007). Rate ratios as a quantifiable metric of the strength of association between surgical events and EP changes are a topic for future research.

2.3. Diagnostic statistics

Sensitivity and specificity estimates of EP performance were revised using causality guidelines as described in Section 4.4.1.3. Confidence intervals (95%) and forest plots were obtained using RevMan (Review Manager, 2012). Likelihood ratios (LR) were calculated from sensitivity and specificity using the following equation: LR = sensitivity/(1 − specificity). Unlike predictive values, LR can adjust posttest outcome probabilities for pretest risk factors (Grimes and Schulz, 2005; Bhandari et al., 2003). To calculate posttest probability:

1. Pretest probability was converted to pretest odds (probability/(1 − probability)).
2. Pretest odds were multiplied by the LR to obtain posttest odds.

3. Posttest odds were converted to posttest probability (probability = odds/(1 + odds)).

Table 1
Hill’s guidelines for causation when an association is observed between two variables (e.g. surgical event and EP change).

1. Strength of association
2. Consistency of association
3. Specificity of association
4. Temporal relationship of the association
5. Biological gradient revealed by the association
6. Biologically plausible causation
7. Coherence with generally known facts
8. Experimental evidence
9. Analogy

3. EPs as biomarkers and surrogate endpoints

Surrogate endpoints and biomarkers are used when it is not ethical, possible, or practical to use the clinical outcomes. For example, it would be unethical after acetaminophen overdose to wait for evidence of liver damage before deciding whether or not to treat a patient. The plasma concentration of acetaminophen is used as a biomarker to predict whether treatment is required (Aronson, 2008). Blood pressure is considered an exemplar surrogate endpoint for cardiovascular mortality and morbidity (Fleming and Powers, 2012). Tumor size is used as a biomarker for the efficacy of cancer therapeutics. Although not always linked to clinical benefit, its continued use is justified by the seriousness of many malignancies (Institute of Medicine, 2010). Cardiac troponin is a useful biomarker for myocardial infarction. Recently, morphometric analysis of corticospinal tract imaging has been proposed as a biomarker for acute spinal cord injury (Cadotte and Fehlings, 2013; Freund et al., 2013).

EPs are of necessity used as surrogate endpoints during surgery because the clinical endpoints cannot be obtained in the anesthetized patient. These patient-relevant clinical endpoints are vibration and position sense for SEPs and voluntary muscle strength for MEPs. An example of EPs as surrogates is reduction of the correction by the surgeon during scoliosis surgery in response to reported loss of EPs. Often surgical intervention will continue until the EPs have recovered, using them as surrogates for postoperative muscle strength and sensory status.

4. A framework for evaluating EP biomarkers and surrogate endpoints

4.1. Application of a three step framework

In view of the importance and usefulness of biomarkers in medicine, the Institute of Medicine of the National Academies of Science recently described a three step framework for evaluating good biomarkers and surrogates (Institute of Medicine, 2010, 2011). These steps in biomarker evaluation are interrelated and may occur concurrently.

1. Analytical validation: can the biomarker be accurately measured?
2. Utilization: what is the specific context of the proposed use?
3. Qualification: is the biomarker associated with the clinical endpoint of concern?

4.2. Analytical validation: can EPs be accurately measured?

Tumor size continues to be used as a biomarker for the efficacy of cancer therapeutics despite poor performance on analytical validation. Lack of standardization in how tumor size is defined (area, mass or volume) and in image analysis platforms across groups limits accuracy of measurement. Despite this it continues to be used as a surrogate endpoint because of the seriousness of cancer. Although cardiac troponin analytical validation is limited by the sensitivity of its assays and reference standards, it is the preferred biomarker to diagnose myocardial infarction (Institute of Medicine, 2010).

EPs outperform tumor size and cardiac troponin on analytical validation. They can be accurately measured by suitably trained personnel following the specific guidelines of professional societies (ASNM, ACNS, IFCN, and ASET). The signal to noise ratio is typically good or excellent, and EPs are responsive to diverse surgical interventions during different surgical procedures. Amplitudes and latencies can be quantified over a large dynamic range and are typically compared to earlier baselines from the same patient to detect changes. SEP amplitudes and latencies show little variability under stable surgical and anesthetic conditions, while MEP amplitudes, although inherently more variable, are still consistent enough to detect marked reduction or disappearance.

4.3. Utilization: in what context are EPs used?

Utilization assesses the specific context of biomarker and surrogate endpoint use. Biomarkers may succeed in some contexts but fail in others. For example, tumor size is predictive of clinical benefit from therapeutics for some cancers, but may not correlate with long-term clinical benefit in locally advanced breast cancer (Institute of Medicine, 2010). Likewise, EP surrogates may perform better in some surgical contexts than in others. MEPs are probably less sensitive to compromise of individual nerve roots during lumbar surgeries than to corticospinal tract or anterior horn gray matter damage during surgeries at the cervical and thoracic levels (MacDonald et al., 2012).
EP alert criteria also depend on surgical context. An overly conservative (specific) MEP alert criterion may not be sufficiently sensitive during some cranial and spinal surgeries. In this instance, an opportunity to intervene may be missed and a new postoperative deficit may result (Zhou and Kelly, 2001; Dong et al., 2005; Kobayashi et al., 2014). On the other hand, an overly sensitive criterion for some procedures could produce too many false alarms and compromise surgical treatment, resulting in incomplete tumor removal or suboptimal deformity correction.

4.4. Qualification: associations between surgical events, EPs, and postoperative outcomes

In its recommendations on biomarker Qualification, the Institute of Medicine (2010) emphasized two components in assessing the association of a biomarker with a clinical endpoint. The first component demonstrates that a biomarker is on the causal pathway linking disease (injury) and clinical endpoints. Mechanistic associations between disease (injury), biomarker, and clinical endpoints are emphasized, as is the predictive value of the biomarker for patient outcomes. The second component examines causal relationships between interventions targeting the biomarker and the clinical endpoints. Probabilistic approaches are stressed given the complex mechanistic causal pathways typically present between diverse interventions and patient outcomes.

EPs are measurements of the causal pathways linking intraoperative neural injury, surgical intervention, and clinical endpoint. Vibration and position sense deficits following compromised conduction of the dorsal column somatosensory pathway (lost SEP) and voluntary motor weakness following compromised corticospinal tract conduction or lower motor neuron function (lost MEP) define the clinical endpoints. EP amplitude decrements that follow


Fig. 1. Common EP changes associated with surgical events. (A) BAEPs to stimulation of the left ear rapidly declined after temporary clipping of the basilar artery, and quickly recovered with removal of the clip during an AICA aneurysm surgery. At the asterisk peak V latency had increased by 1.0 ms and the amplitude decreased by more than 50%. (B) MEPs from the lower extremities were lost during correction and quickly recovered after removal of both rods during surgery for severe thoracic scoliosis. In both of these examples, strength of association and biological plausibility of causal links between surgical events and EP changes is very strong or strong, and temporality is satisfied by the presumed cause (surgical event) preceding the EP change.

a surgical event may represent compromised conduction in these neural pathways.

4.4.1. First component of Qualification

4.4.1.1. Mechanistic associations. The pathophysiologic mechanisms resulting in EP amplitude decrements are relatively well understood. For example, overdistraction during scoliosis correction can stretch the spinal cord and its blood vessels, causing mechanical distortion and/or ischemia (Pastorelli et al., 2011; Lall et al., 2012; Ferguson et al., 2014). Similarly, misdirected pedicle screws and sublaminar hooks or wires can compress the cord or roots, causing mechanical distortion and secondary ischemia (Schwartz et al., 2007; Thuet et al., 2010). EP amplitudes reflect the number of functionally intact axons and neurons in the tested pathways. Therefore, as affected axons and/or lower motor neurons begin to fail, EP amplitudes decrease accordingly. Both mechanical distortion and ischemia can produce time-dependent structural damage. However, if the pathophysiology is relieved quickly enough by undoing the offending surgical maneuver, then the affected neural elements and their EPs can recover (Macdonald et al., 2007; Skinner et al., 2009).

4.4.1.2. Causal links. Causal links between adverse events, EP decline, surgeon intervention, and EP recovery can be evaluated using the guidelines for causation proposed by Bradford Hill (Aronson, 2008; Institute of Medicine, 2010). Rigorous application of these causality guidelines has been proposed for the first component of Qualification. Rapid EP changes (against a stable background) with application and removal of clips during aneurysm

surgery are a compelling example (Fig. 1A). Four minutes after temporary clipping of the basilar artery, the latency of peak V increased by 1.0 ms and the amplitude decreased by more than 50%. The rate of these changes contrasts sharply with more than an hour of stable BAEP waveforms, only a portion of which is shown. The rate ratio comparing the rate of change after clipping with that before clipping is greater than 30. Strength of association is also strongly supported by the reversibility of peak V changes with removal of the clip and spatial proximity of the clipping to the EP change. No changes to right BAEPs were observed during this surgery with a left sided approach. Biological plausibility of compromised perfusion of lateral lemniscal pathways and cochlea with basilar artery clipping is also strong. Spatial proximity, reversibility and biological plausibility also provide the primary support for causal links between aneurysm clipping and indocyanine green (ICG) angiography and Doppler ultrasonography. Like EPs, these flow oriented biomarkers are used as surrogates for preserved neurologic status.

Reduction of correction during spinal deformity surgery is a credible context which favors a causative relationship between this event and MEP recovery (Fig. 1B). After correction, MEPs were lost for 20 min, and then recovered 3 min after removing both rods. The ratio comparing rate of effect for rod removal after the 20 min MEP loss is 13.3. Strength of association is also found in the reversibility of MEP changes associated with rod placement and removal, and the spatial proximity of the thoracic correction to MEP decline in the legs and not the hands. Although causal links between the surgical events and EP changes are obvious in these examples, this is often not the case. Making explicit the


aspects of association which support causation can be helpful when the links between EP changes and surgical events are less clear. Good evidence rules out or makes less likely competing hypotheses (Howick, 2011). The competing hypothesis here is that the intervention in response to an EP alert could have been withheld and the EP would have recovered spontaneously. Hill’s guidelines for causation make this less likely, and can provide strong support for causal links between surgical intervention and EP changes, as in these two examples (Table 2). Causality guidelines can also improve estimates of EP prognostic performance, in support of the first component of Qualification.

4.4.1.3. EP prognostic performance. The first component of Qualification requires biomarkers to accurately predict the disease or injury. If EPs are good biomarkers, they should accurately predict new or evolving injury during surgery for therapeutic intervention. Although this has been supported by several excellent systematic reviews, estimates of EP diagnostic accuracy are difficult to determine and can vary widely (Resnick et al., 2009; Fehlings et al., 2010; Nuwer et al., 2012). Reversible signal changes (RSCs) where the surgeon intervenes in response to an EP decline are largely responsible for this difficulty (Skinner and Holdefer, 2014). If the patient wakes up with no new deficits, these RSCs are commonly scored as true positives in the literature (Resnick et al., 2009; Fehlings et al., 2010). However, the literature also demonstrates that some fraction of EP alerts are falsely positive (e.g. Lee et al., 2006; Kim et al., 2007; Ferguson et al., 2014; Tamkus et al., 2014), raising the possibility that surgical intervention for some RSCs could have been withheld with no adverse consequences. This is made less likely by using causality guidelines to improve estimates of EP test performance. Rigorous application of causality guidelines can support causation of EP changes by surgical events, going beyond a mere

association between the two. The premise here is that EP changes caused by surgical events are more likely to be predictive of new deficits than those changes with either no association, or an association that is not directly causative.

Improved estimates of EP diagnostic performance using causality guidelines are illustrated with three studies. Hilibrand et al. (2004) was one of the two most widely cited studies of MEP test performance during spine surgeries in a recent literature search, with 8.36 average citations per year. Thirumala et al. (2014) examined SEP performance in a large retrospective series of surgeries for idiopathic scoliosis. Kobayashi et al. (2014) examined MEP performance during spine surgeries with diverse patient diagnoses. They recognized that the neurological status of the patient during an RSC is unknown, and the initial decline could be either a true or false positive.

Hilibrand et al. (2004) compared MEP and SEP diagnostic performance in 427 patients undergoing cervical spine surgeries. Their MEP results are shown in Fig. 2. There were a total of 12 true positives: two new motor deficits predicted by irreversible MEP declines, and 10 RSCs, where MEP amplitudes declined ≥60% and then recovered with intervention. All 10 patients with RSCs awoke with no new deficits. Perfect sensitivity and specificity were reported. The calculated LR (sensitivity/(1 − specificity)) equals infinity. Four of the 10 RSCs were associated with insertion of a strut graft and the remaining 6 with hypotension. Strength of association between graft insertion and MEP decline was demonstrated by the rapid change in MEP amplitudes with graft insertion against a stable background of MEPs unchanged from baseline values. A similar contrast in rate of change of MEP amplitudes following hypotension is often not present. The biological plausibility of an oversize strut graft leading to an EP decline from local stretch or compression of vasculature is strong.
An EP decline from systemic hypotension may be less compelling in many cases where autoregulatory mechanisms in the spinal cord normally maintain constant blood flow despite variations in arterial pressure.

Table 2
Causality guidelines applied to the surgical events and EP changes in Fig. 1. Bolded guidelines are those emphasized for these examples.

Guideline                                          Clipping during aneurysm surgery    Spinal correction during deformity surgery
Strength of association                            Very strong                         Strong
Consistency across settings                        Very strong                         Strong
Specificity                                        No                                  No
Temporality (does treatment precede effect)        Yes                                 Yes
Biological gradient (dose response)                Yes                                 Yes
Biological plausibility (mechanistic reasoning)    Very strong                         Strong
Coherence with existing data                       Yes                                 Yes
Experiment                                         No                                  No
Analogy                                            NA                                  NA
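The strength-of-association entries in Table 2 rest partly on rate ratios. As a rough illustration, the sketch below computes a rate ratio as defined in Section 2.2: the rate of EP change after a surgical event divided by the rate immediately before it, with 0.5 substituted for a zero count before the event. The function name and the framing of EP change as an event count per observation window are assumptions made for this example. The 20 min and 3 min windows are those reported for the rod removal example (Fig. 1B), and under this reading the result agrees with the ratio of 13.3 quoted in the text.

```python
def rate_ratio(changes_after, minutes_after, changes_before, minutes_before):
    """Rate of EP change after an event divided by the rate before it.

    A zero count before the event is replaced with 0.5, as described in
    Section 2.2, to avoid division by zero.
    """
    if minutes_after <= 0 or minutes_before <= 0:
        raise ValueError("observation windows must be positive")
    if changes_before == 0:
        changes_before = 0.5  # substitution for a stable (unchanging) baseline
    rate_after = changes_after / minutes_after
    rate_before = changes_before / minutes_before
    return rate_after / rate_before


# Assumed reading of Fig. 1B: no MEP recovery during 20 min of loss,
# then one recovery within 3 min of removing both rods.
ratio = rate_ratio(changes_after=1, minutes_after=3,
                   changes_before=0, minutes_before=20)
print(round(ratio, 1))  # prints 13.3
```

A ratio above 10 would, on the criterion cited from Glasziou et al. (2007), count toward causation even in the presence of confounding variables.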

Fig. 2. Causality guidelines were used to revise diagnostic statistics. The original and revised (asterisked) statistics and confidence intervals are shown for each study. RSCs with strong support for causal links with a preceding surgical event are kept as true positives, and those without strong support moved to the false positive category. These revised estimates of EP performance correspond more closely with those of other diagnostic tests used in medicine, and accommodate an incidence of EP false positives. See text for details.
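The reclassification summarized in this caption can be sketched numerically. The example below recomputes sensitivity, specificity, and the positive likelihood ratio, LR = sensitivity/(1 − specificity), from a 2 × 2 table before and after RSCs without strong causal support are moved from the true positive to the false positive category. The counts follow the Hilibrand et al. (2004) example discussed in the text (12 original true positives, 6 of them hypotension-related RSCs). The true negative count of 415 is an assumption: it supposes that all 427 patients enter the table, which the text does not state, so the printed values need not match those plotted in Fig. 2. The function name is illustrative only.

```python
import math


def diagnostic_stats(tp, fp, fn, tn):
    """Return (sensitivity, specificity, positive likelihood ratio)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if specificity == 1.0:
        lr_positive = math.inf  # no false positives: division by zero
    else:
        lr_positive = sensitivity / (1.0 - specificity)
    return sensitivity, specificity, lr_positive


# Original scoring: every RSC counted as a true positive.
# tn=415 is an assumed value (427 patients minus 12 positives).
original = diagnostic_stats(tp=12, fp=0, fn=0, tn=415)

# Revised scoring: the 6 hypotension-related RSCs become false positives.
revised = diagnostic_stats(tp=6, fp=6, fn=0, tn=415)

for label, (sens, spec, lr) in (("original", original), ("revised", revised)):
    print(f"{label}: sensitivity={sens:.3f} specificity={spec:.3f} LR+={lr:.1f}")
```

With no false positives the likelihood ratio is infinite, which is why the revised table, with its less than perfect specificity, is the one that yields a usable ratio.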


Hill’s guidelines support causal links beyond simple association between strut graft insertion and MEP decline more strongly than between hypotension and MEP decline. The four RSCs caused by graft insertion, and the two irreversible MEP losses with new deficits are scored as six true positives in the diagnostic statistics revised with causality guidelines. The six RSCs associated with hypotension are moved from the true positive category of the original study to the false positive category of the revised statistics (Fig. 2). These revised statistics now include a less than perfect specificity and a likelihood ratio made possible by avoiding division by zero. Thirumala et al. (2014) recently re-examined the performance of SEP monitoring in 477 patients during surgery for idiopathic scoliosis. There were 18 RSCs.2 One RSC was in a patient who woke up with a new deficit for a false negative.3 Eight RSCs were associated with rod placement, wire tightening, or hook placement and no new deficits. The rapidity of SEP changes against a stable background often associated with these events, and the specific changes to the lower, and not the upper, extremities strongly support strength of association. As described earlier in Section 4.4.1.1, the biological plausibility of these changes associated with instrumentation is also strong. The remaining 9 RSCs associated with the pressure cuff, hypotension, and positioning receive less support for causal links with these events. For cases with graded EP changes associated with adjustments to positioning (‘‘dose response’’), strength of association would be supported, but for many EP changes attributed to positioning clear temporal proximity of these events may not be present. Diagnostic statistics revised with causality guidelines places the 8 RSCs associated with hardware and no new deficits in the true positive category. There was also an irreversible SEP decline with new deficits, for a total of 9 true positives. 
Nine RSCs associated with the pressure cuff, hypotension and positioning are moved from the true positive to false positive category (Fig. 2). Sensitivity and specificity point estimates and confidence intervals appear more credible, and overall diagnostic performance of SEPs remains strong.

The multi-center study by Kobayashi et al. (2014) examined MEPs in 959 patients with diverse diagnoses which included spinal deformity and extra- and intramedullary tumors. They avoided the bias introduced by RSCs to diagnostic statistics by omitting them from the analysis as “indeterminate (rescue)” cases, an approach also taken by Kim et al. (2007). Point estimates of sensitivity and specificity with narrow confidence intervals were obtained, made possible by the large sample size and inclusion of diagnoses with relatively high incidence of new deficit (Fig. 2). A disadvantage of this approach is the loss of information when RSCs are omitted from an analysis of EP test performance. EPs as surrogate endpoints are expected to capture the effects of interventions on the true endpoints (Prentice, 1989). EP declines which can indicate injury and the need for intervention, and then by their recovery demonstrate its effectiveness, are a primary reason for EP monitoring during surgery. How well EPs capture the efficacy of these

2 The authors reported 19 RSCs in their Table 1. One patient (#3) had new biceps and deltoid weakness, with no changes in median or ulnar SEPs, and a reversible change in tibial nerve SEPs. The authors classified this patient as both a FN and a FP, which creates unit of analysis issues. In the present analysis this patient is considered a false negative for no change in upper extremity SEPs with new deficits in the upper extremity. 3 Peroneal nerve SEPs recovered with loosening of the instrumentation wires and the patient experienced new, bilateral leg weakness. This case (#14) was classified as a true positive in Table 1 of Thirumala et al., 2014. The present analysis examines the prognostic value of EPs for Qualification as surrogate endpoints. Here the patient experienced new deficits despite EP recovery to preoperative baseline levels, and accordingly is classified as a false negative.

Fig. 3. Likelihood ratios adjust posttest probabilities for pretest risk factors. The probability of new deficit after EP alarm is greater for pediatric spondylolisthesis because this diagnosis carries a greater risk of new deficits than adult degenerative disease (Hamilton et al., 2011). Posttest probabilities for the likelihood ratios in Fig. 2 are shown.

interventions is difficult to determine when RSCs are excluded from analysis. There are three points here pertaining to the prognostic performance of EP biomarkers. The first is that causality guidelines give estimates of EP diagnostic test performance more in line with studies which have demonstrated an incidence of false positive IONM reports (Lee et al., 2006; Kim et al., 2007; Ferguson et al., 2014; Tamkus et al., 2014). Although more credible, these estimates are still subject to interpretation and bias. For example, there will be differences of opinion regarding the biological plausibility of causal links between surgical interventions in diverse surgical procedures and EP recovery. The second point is that these revised estimates of sensitivity and specificity can yield meaningful changes in likelihood ratios and the probability of a new deficit with an EP alarm (Fig. 3). For example, the probabilities of a new deficit with an EP alarm are 0.36, 0.19 and 0.06 for the likelihood ratios in Fig. 2 and a preoperative risk of 0.0057, which was described as the prevalence for new neurologic deficits for adult degenerative disease in a recent Scoliosis Research Society survey (Hamilton et al., 2011). These post EP alarm probabilities increase to 0.86, 0.72, and 0.40 for pediatric spondylolisthesis, assuming a preoperative risk of 0.059 (Hamilton et al., 2011). Pre-operative probability of iatrogenic injury will depend on patient diagnosis and surgical events, and likelihood ratios revised with causality guidelines can result in significant changes in prognosis associated with an EP alert. The third point is that the likelihood ratios of the IONM studies in Fig. 2 compare favorably with those of other diagnostic tests used in medicine (Table 3). Likelihood ratios greater than 10 give large and often conclusive changes in probability with a positive diagnostic test (Bhandari et al., 2003). The revised diagnostic statistics of Fig. 
2 and several systematic reviews (Resnick et al., 2009; Fehlings et al., 2010; Nuwer et al., 2012) support the prognostic value of EP biomarkers and the first step of Qualification.
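The arithmetic linking sensitivity, specificity, likelihood ratio and posttest probability can be sketched as follows. This is a minimal illustration, not the authors' computation: the function names are hypothetical, and the likelihood ratio of 10 is an assumed value chosen only for demonstration (it is not one of the Fig. 2 values). The two pretest risks are those quoted in the text from Hamilton et al. (2011).

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity); undefined at perfect specificity."""
    if specificity >= 1.0:
        raise ValueError("LR+ is undefined at perfect specificity (division by zero)")
    return sensitivity / (1.0 - specificity)


def posttest_probability(pretest_probability, likelihood_ratio):
    """Convert pretest probability to odds, scale by the LR, convert back."""
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


# Pretest risks quoted in the text (Hamilton et al., 2011).
ADULT_DEGENERATIVE = 0.0057
PEDIATRIC_SPONDYLOLISTHESIS = 0.059

# Hypothetical likelihood ratio for illustration only; NOT a value from Fig. 2.
ILLUSTRATIVE_LR = 10.0

for label, risk in [
    ("adult degenerative disease", ADULT_DEGENERATIVE),
    ("pediatric spondylolisthesis", PEDIATRIC_SPONDYLOLISTHESIS),
]:
    p = posttest_probability(risk, ILLUSTRATIVE_LR)
    print(f"{label}: pretest {risk:.4f} -> posttest {p:.3f}")
```

The same likelihood ratio produces a much larger posttest probability when the pretest risk is higher, which is the pattern described for Fig. 3. The guard in `positive_likelihood_ratio` reflects why a specificity of exactly 1 leaves the ratio undefined.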

Table 3
Likelihood ratios for commonly used diagnostic tests in medicine.

| Target disorder | Sign/test | Patient setting | LR |
|---|---|---|---|
| Proximal DVT | Ultrasonography | Symptomatic patients | 47.5 |
| DVT | Ultrasonography | Asymptomatic patients | 7.8 |
| Myocardial infarction | Any ST-segment ECG elevation | Adults with chest pain | 11.2 |
| Myocardial infarction | Radiation of pain to both arms | Adults with chest pain | 7.1 |
| Myocardial infarction | Cardiac-specific troponin T | Patients with chest pain | 6.3 (at 0–2 h) |
| Urinary tract infection in children | Urinalysis in hospital with automated analyzer | Random sample of children presenting to ER with symptoms of UTI | 7.6 |
| Chronic obstructive airway disease (COAD) in adults | Spirometry | 1/3 confirmed OAD, 1/3 suspected OAD, 1/3 no evidence of OAD | 8.3 (smoking >40 pack/year) |
| Dementia | Clock drawing test | Elderly patients | 9.6 |

Abstracted from: http://ktclearinghouse.ca/cebm/glossary/lr, accessed 5/07/2014.

4.4.2. Second component of Qualification

4.4.2.1. Surgical interventions, EPs and outcomes. The second component of Qualification for surrogate endpoints requires that they capture all effects of interventions or therapies on clinical endpoints (Institute of Medicine, 2010; Fleming and Powers, 2012; Prentice, 1989). Controlled designs and probabilistic approaches are often necessary due to the complexity of mechanistic pathways linking the two. Blood pressure is an exemplar of a surrogate endpoint. Different classes of antihypertensive agents which decrease systolic blood pressure (surrogate) have also been shown in controlled trials to reduce the likelihood of stroke (endpoint). The mechanistic pathways linking surgical interventions and EP recordings with postoperative neurological assessments include changes in patient physiology with recovery from anesthesia that may be no less complex. Probabilistic approaches and controlled research designs may also be necessary to show that surgeon interventions in response to EP alerts impact outcomes.

Extending the 2 × 2 contingency analysis used for diagnostic test accuracy calculations to an n × n contingency analysis is a step in this direction. RSCs are key events that may be classified separately without making any assumptions. There are not just positive and negative EP monitoring results, but three fundamental types of EP deterioration (none, reversible, and irreversible) and two basic neurologic outcomes (no new deficit and new deficit), which implies a 3 × 2 contingency table:

| EP deterioration | New postoperative deficit | |
|---|---|---|
| None | No | Yes |
| Reversible | No | Yes |
| Irreversible | No | Yes |

This approach does not calculate sensitivity and specificity. Instead, proportional differences can be evaluated with chi-square or Fisher’s exact statistics, and the proportion of postoperative deficits associated with each EP result can be determined. For example, if it were shown that the deficit proportion is significantly higher with irreversible EP deterioration than with reversible deterioration, then measures to reverse EP deterioration encountered during surgery could be justified. This might be particularly true when causality seems likely.

Table 4 illustrates a 3 × 2 contingency table re-analysis of an orthopedic spine surgery monitoring series (Macdonald et al., 2007). The alert criteria consisted of MEP disappearance or unequivocal SEP amplitude reduction clearly exceeding trial-to-trial variability. The re-analysis considers all observed postoperative deficits (early or delayed; spinal cord, root or peripheral nerve). It shows a significant difference in deficit proportions, with a very low incidence of poorly predictable deficits (lumbar radiculopathy, delayed postoperative paraparesis) in the case of no EP deterioration, a modest incidence of congruent deficits with reversible deterioration, particularly if protracted, and a very high incidence of congruent deficits with irreversible deterioration. Thus, efforts to make EP deterioration reversible seem justifiable. The approach can be extended to n × n contingency tables analyzing more types of EP deterioration or neurologic outcome.

Table 4
Example of a 3 × 2 re-analysis of an orthopedic spine surgery EP monitoring series (Macdonald et al., 2007).

| EP deterioration | No new deficit | New deficit | Deficit proportion* |
|---|---|---|---|
| None | 182 | 3ᵃ | 0.016 |
| Reversible | 14 | 4ᵇ | 0.22 |
| Irreversible | 0 | 3ᶜ | 1.0 |

* Chi-square difference in proportions: P < 0.001.
ᵃ L4 or L5 radiculopathy (n = 2), postoperative day 2 paraparesis (n = 1).
ᵇ Radial nerve palsy after reversible median nerve SEP deterioration and thenar MEP disappearance (n = 1), bilateral leg dysesthesiae after reversible bilateral tibial nerve SEP deterioration (n = 1), unilateral leg paresis after protracted (>40 min) but eventually reversible leg MEP disappearance (n = 2).
ᶜ Brown-Sequard after irreversible ipsilateral tibial nerve SEP deterioration and leg MEP disappearance (n = 1), postoperative day 2 paraparesis after irreversible MEP disappearance in one leg muscle and reversible MEP loss in other leg muscles (n = 1), L5 radiculopathy after irreversible tibialis anterior MEP disappearance (n = 1).
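The deficit proportions and the chi-square comparison reported for Table 4 can be reproduced with a short sketch. The counts are taken from Table 4; the function names are hypothetical, and the Pearson statistic is computed by hand with the standard library rather than with whatever software the authors used. Several expected counts here are small, so the chi-square approximation is rough and an exact test may be preferable; the sketch only illustrates the mechanics.

```python
import math

# Counts from Table 4: (no new deficit, new deficit) for each EP result.
table4 = {
    "None": (182, 3),
    "Reversible": (14, 4),
    "Irreversible": (0, 3),
}


def deficit_proportions(table):
    """Proportion of patients with a new deficit for each EP result."""
    return {name: deficit / (no_deficit + deficit)
            for name, (no_deficit, deficit) in table.items()}


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, dof


def chi_square_p_value_2dof(statistic):
    """Upper-tail probability for 2 degrees of freedom: exp(-x / 2)."""
    return math.exp(-statistic / 2.0)


proportions = deficit_proportions(table4)
statistic, dof = pearson_chi_square(table4)
p_value = chi_square_p_value_2dof(statistic)  # valid because dof == 2 for a 3 x 2 table
for name, value in proportions.items():
    print(f"{name}: deficit proportion {value:.3f}")
print(f"chi-square = {statistic:.1f}, dof = {dof}, P < 0.001: {p_value < 0.001}")
```

The closed-form tail probability applies only because a 3 × 2 table has two degrees of freedom; a larger n × n table would need a general chi-square distribution function or an exact method.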

Table 5
A 4 × 3 analysis of intracranial surgery MEP monitoring (Neuloh et al., 2008).

| MEP deterioration | New postoperative motor deficit: None | Temporary | Permanent |
|---|---|---|---|
| None | Always | Never | Never |
| Reversible deterioration or loss | Frequent | Frequent | Rare |
| Irreversible deterioration | Rare | Frequent | Frequent |
| Irreversible loss | Never | Never | Always |

For example, Table 5 summarizes a graduated and clinically meaningful 4 × 3 analysis of IONM results during intracranial surgery, correlating reversible and irreversible MEP deterioration or loss (disappearance) with temporary or permanent postoperative motor deficits (Neuloh et al., 2008).

The impact of surgical interventions on postoperative outcomes may also be related to the duration of EP decline. For example, Calancie et al. (1998) found that MEP deterioration (≥100 V threshold elevation) lasting more than 1 h correlated with postoperative motor deficits. Also, EP deterioration lasting more than 30–40 min appears to increase the likelihood of spinal cord infarction during descending aortic surgery (MacDonald and Dong, 2008). This matches the time-dependent structural damage of spinal cord compression or ischemia. Even after some damage has occurred, surviving neural elements and their evoked potentials may at least partially recover once the offending process is relieved. Thus, one could define quickly reversible (e.g.,
