American Journal of Epidemiology © The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: [email protected].

Vol. 180, No. 9 DOI: 10.1093/aje/kwu206 Advance Access publication: September 25, 2014

Practice of Epidemiology

Methodological Considerations in Observational Comparative Effectiveness Research for Implantable Medical Devices: An Epidemiologic Perspective

* Correspondence to Dr. Soko Setoguchi, Duke Clinical Research Institute, P.O. Box 17969, Durham, NC 27715 (e-mail: [email protected]).

Initially submitted June 19, 2013; accepted for publication July 10, 2014.

Medical devices play a vital role in diagnosing, treating, and preventing diseases and are an integral part of the health-care system. Many devices, including implantable medical devices, enter the market through a regulatory pathway that was not designed to assure safety and effectiveness. Several recent studies and high-profile device recalls have demonstrated the need for well-designed, valid postmarketing studies of medical devices. Medical device epidemiology is a relatively new field compared with pharmacoepidemiology, which for decades has been developed to assess the safety and effectiveness of medications. Many methodological considerations in pharmacoepidemiology apply to medical device epidemiology. Fundamental differences in mechanisms of action and use and in how exposure data are captured mean that comparative effectiveness studies of medical devices often necessitate additional and different considerations. In this paper, we discuss some of the most salient issues encountered in conducting comparative effectiveness research on implantable devices. We discuss special methodological considerations regarding the use of data sources, exposure and outcome definitions, timing of exposure, and sources of bias.

comparative effectiveness; epidemiologic methods; medical device epidemiology; pharmacoepidemiology; prostheses and implants; United States Food and Drug Administration

Abbreviations: EHR, electronic health record; FDA, Food and Drug Administration; ICD, implantable cardioverter-defibrillator; LVAD, left ventricular assist device; UDI, unique device identifier.

Prompted by several high-profile recalls of medical devices, the FDA recently asked the Institute of Medicine to conduct an independent review of the clearance process (4). The Institute of Medicine concluded that the process was insufficient to assure safety and effectiveness and also detailed weaknesses in the FDA’s postmarketing surveillance system. Thus, not only are there gaps in the understanding of device safety and effectiveness at the time of market entry, but there is also a need for high-quality data to fill those gaps in the postmarketing setting. Although the objectives of pharmacoepidemiology and medical device epidemiology are similar, differences between devices and medications have important implications for how epidemiologic methods are developed and applied. In this article, we discuss special methodological considerations in device comparative effectiveness research with regard to data sources, exposures, outcomes, time, and bias.

Implantable medical devices are used to restore body functions, prevent life-threatening events, prolong life, and improve quality of life. Use of implantable devices has increased steadily (1), and with their use being concentrated in older populations, the trend is likely to continue. The Food and Drug Administration (FDA) classifies medical devices into 3 classes according to the level of control necessary to assure safety and effectiveness (2). Class III devices include most implantable devices (e.g., hip implants, pacemakers, stents), pose the highest risks, and are intended to undergo premarket approval, analogous to a new drug application. However, the dominant mechanism used for market entry is Section 510(k) of the Food, Drug and Cosmetic Act, which requires that the device be “substantially equivalent” to one already being marketed (3, 4). Devices entering the market through this process are “cleared” rather than “approved” by the FDA.

Am J Epidemiol. 2014;180(9):949–958

Downloaded from http://aje.oxfordjournals.org/ at University of California, San Francisco on March 14, 2015

Jessica J. Jalbert, Mary Elizabeth Ritchey, Xiaojuan Mi, Chih-Ying Chen, Bradley G. Hammill, Lesley H. Curtis, and Soko Setoguchi*

950 Jalbert et al.

DATA SOURCES FOR EVALUATING DEVICE EFFECTIVENESS

Device registries

Device registries can be used to obtain detailed information on devices and clinical conditions (see Table 1 for a list of example registries). In our experience, a major limitation of device registries for comparative effectiveness research is the limited follow-up information. Registries may only collect data during the encounter for the implantation up to the time of discharge or may have significant loss to follow-up. For example, the Centers for Medicare & Medicaid Services created an implantable cardioverter-defibrillator (ICD) registry and a carotid artery stenting database in response to changes in the National Coverage Determinations for these devices. Neither data source, however, contains follow-up information beyond the index hospitalization. In the Society for Vascular Surgery’s Vascular Registry (2005–2008), follow-up information was available for 59.0% of carotid artery stenting and 54.3% of carotid endarterectomy patients, while those who were followed had mean and median follow-up durations of less than 1 year. For identification of long-term outcomes, device registries have been linked to administrative databases such as Medicare (7, 8).

Medical records

The pros and cons of using medical records for research purposes in pharmacoepidemiology are largely the same as those for device comparative effectiveness research. Adoption of electronic health records (EHRs) by US hospitals has been slow, with fewer than 10% having a basic EHR system by 2008 (10). In facilities with EHRs, much of the information may be in the form of unstructured free text, which is difficult to search, summarize, and analyze. Text mining may be used to “numericize” and extract the information but can be labor-intensive. Many EHRs are also institution-specific, such that services rendered outside the institution are not captured, and the lack of standardization also hinders efforts to combine EHRs across multiple institutions.

Administrative claims databases

Administrative claims databases are derived from claims submitted for payment for clinical services and treatments. Within these databases, there exist neither unique device identifiers (UDIs) nor a claims-based system permitting identification of medical devices with sufficient granularity. In claims data, implantable devices are captured using Current Procedural Terminology or International Classification of Diseases procedure codes. Different health-care settings may use different coding systems or a mix of coding systems, as is the case with Medicare data. Implantation of a device in an inpatient setting will be coded using both International Classification of Diseases codes (inpatient file) and Current Procedural Terminology codes (physician/carrier file), whereas an outpatient procedure will be coded using Current Procedural Terminology codes. Although Current Procedural Terminology codes tend to be more specific than International Classification of Diseases codes, both types of codes relate to the implantation rather than the device itself and often convey little information about the actual device.

Sampled national data sources

National surveys such as the Nationwide Inpatient Sample (5) and the Nationwide Emergency Department Sample (6) provide data on representative samples of inpatient and emergency department visits, respectively, and can be used to derive national estimates. However, data on implantable medical devices are also captured using procedure codes and thus are subject to the same limitations as those detailed above for administrative data. Moreover, device performance can only be assessed during the encounter, because data are not captured longitudinally.

IDENTIFYING DEVICES

Unique device identifiers

Devices and their components are often not consistently identifiable with great specificity, such as by brand, model,

Many device registries collect information on patients receiving a device but not on patients receiving “standard-of-care” or alternative treatments. In this situation, other disease, drug, or device registries have been used to identify comparison groups (8, 9). Because registries are often focused on a specific procedure or device, comparative effectiveness analyses using multiple registries and/or data sources may be needed, particularly if more than 1 comparator is possible. Published literature may also be used to provide aggregate data for a historical or objective performance criterion for comparison. The aggregate comparison afforded by published literature, however, may be insufficient for comparative effectiveness analyses, because information supplied within a published article may not be granular to the level of the individual, and thus accounting for differences between the intervention and comparator groups may not be possible.

Data quality and granularity can be heterogeneous across and within registries and can change over time as data collection forms are updated or reporting requirements change. Whether the data are complete and accurate depends highly on the procedures in place to ensure that the highest-quality data are recorded at the time of collection (e.g., clear data element definitions and training of data collectors) and to identify and correct sources of error after collection (e.g., data checks and imputation). Within a device registry, data quality may vary highly depending on differences in regulatory structures. For example, the Improving Pediatric and Adult Congenital Treatment (IMPACT) Registry is a voluntary national registry, but the contribution of states like Massachusetts and New York is mandatory with data adjudication. Systematic ascertainment biases across individual centers, states, or systems contributing to a registry could result in treatment effect heterogeneity across centers, states, or systems. Both data quality and granularity can limit the analyses that can be performed to certain time periods, certain data contributors, or certain groupings of devices.

Medical Device Comparative Effectiveness 951

Table 1. Characteristics of Cardiovascular and Orthopedic Device Registries in the United States

| Data Source and Registry | Exposure(s) | Type of Information Collected | Participation | Catchment Area |
|---|---|---|---|---|
| National Cardiovascular Data Registry(a) (45) | | | | |
| CARE Registry | Carotid artery stenting, carotid endarterectomy | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | National |
| ICD Registry | Implantable cardioverter-defibrillators | Clinical, procedural, implant, and in-hospital outcomes | Mandatory | National |
| IMPACT Registry | Cardiac catheterization | Clinical, procedural, implant, and in-hospital outcomes | Voluntary | National |
| TVT Registry | Transcatheter aortic valve replacement | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | National |
| Veterans Administration data sources | | | | |
| CART-CL | Percutaneous coronary intervention with or without stenting | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | National |
| VHA National ICD Surveillance Center | Implantable cardioverter defibrillators | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | National |
| Pacemaker Surveillance Program | Pacemaker | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | National |
| Joint registries (46) | | | | |
| GLORY | Total hip arthroplasty, total knee arthroplasty | Clinical, procedural, implant, in-hospital, and follow-up outcomes | Voluntary | International |
| Harris Joint Registry | Total hip surgery, total knee surgery | Clinical, procedural, implant, in-hospital, and follow-up outcomes | | Massachusetts General Hospital (Boston, Massachusetts) |
| HealthEast Joint Replacement Registry | Hip arthroplasty, knee arthroplasty, spine arthroplasty, shoulder arthroplasty | Clinical, procedural, implant, and in-hospital outcomes | | HealthEast Care System Hospitals (St. Paul, Minnesota) |
| HSS/CERT Total Joint Arthroplasty Registries | Hip arthroplasty, hip revision, knee arthroplasty, knee revision, shoulder arthroplasty, shoulder revision | Clinical, procedural, implant, and in-hospital outcomes | Voluntary | 1 US hospital |
| Kaiser Permanente Total Joint Replacement Registry | Total hip arthroplasty, total knee arthroplasty | Clinical, procedural, implant, and in-hospital outcomes; follow-up data for revision only | | Kaiser Permanente health-care plan |
| Mayo Clinic Joint Replacement Database | Total hip arthroplasty, hip revision, total knee arthroplasty, knee revision | Clinical, procedural, implant, in-hospital, and follow-up outcomes | | Mayo Clinic (Rochester, Minnesota) |
| American Joint Replacement Registry | Hip replacement, knee replacement | Clinical, procedural, implant, in-hospital, and follow-up outcomes | | National, 165 hospitals |
| Virginia Joint Registry, Inc. (Glen Allen, Virginia) | Hip arthroplasty, knee arthroplasty | Clinical, procedural, implant, and in-hospital outcomes | | Virginia |

Abbreviations: CARE, Carotid Artery Revascularization and Endarterectomy; CART-CL, Cardiac Assessment, Reporting, and Tracking System for Cardiac Catheterization Laboratories; CERT, Centers for Education & Research on Therapeutics; GLORY, Global Orthopaedic Registry; HSS, Hospital for Special Surgery; ICD, implantable cardioverter-defibrillator; IMPACT, Improving Pediatric and Adult Congenital Treatments; TVT, Transcatheter Valve Therapy; VHA, Veterans Health Administration. a American College of Cardiology, Washington, DC.

version, or design. The FDA Amendments Act of 2007 charged the agency with developing and instituting a comprehensive UDI system for medical devices. The final rule was released in September 2013 and required that labels and packages for class III medical devices include UDIs by

September 2014 and that the remainder of implantable devices include UDIs by September 2015 (11). The UDI system is a requirement for manufacturers and is expected to assist with the identification of medical devices for adverse event reporting and with the reduction of errors at the point of


CARE Registry


care. The UDI requirements do not extend to administrative billing data or EHRs. However, capture of UDI data would provide detailed information about a device which would be useful for payers and researchers (12). As the UDI is implemented, it will probably be incorporated differentially by data source and region, and even across facilities. Researchers should treat UDI-based variables with the same rigor and skepticism as they would other variables that are subject to differential reporting and missingness. Given the limitations involved in assessing the performance of specific devices and models, we have limited our discussion to methodological issues associated with assessing device types or categories.

Medical devices tend to have shorter life cycles, or time on the market before being replaced, than medications. Incremental changes to devices are common, and if the modifications are not considered to significantly impact safety or effectiveness, the sponsor may receive clearance for newer models without submitting data to the FDA. This regulatory environment allows marketed devices to evolve continually. Over time, a device may become substantially different from the one that was originally approved or cleared (13, 14). Minor changes in design could also impact the device’s safety or effectiveness profile, as was the case for the femoral component of a hip replacement device which was found to fracture prematurely on the left side when the location of the etched icon of the company was changed to the left side (15). Short life cycles complicate the study of the long-term safety and effectiveness of a specific device design or feature. Even in data sets with detailed device information, too few people may have received a given device to meaningfully assess performance. On the other hand, when available information fails to distinguish between device types or characteristics, an overall estimate may reflect associations with the outcome that are uniform across device subtypes, or it may mask nonuniform associations stemming from differences in performance or preferential use of certain subtypes of the device. Stratifying by model, brand, or design may be particularly informative in cases where device mechanics (e.g., pulsatile vs. continuous-flow ventricular assist devices), components (e.g., synthetically derived vs. non-synthetically derived urogynecological mesh), or design (e.g., tapered vs. untapered stents) differ radically within a class.

Distinguishing the anatomical side of a device

Some devices may be implanted in 1 anatomical structure on either side of the body. The same device-related procedure codes could signify a device revision, an implantation on the opposite side of the body, or a bilateral or staged procedure. Distinguishing between revisions and new implantations can be accomplished if laterality is known and if information about previous procedures and/or indication is available. In claims databases, modifier codes can indicate implantation side and bilateral or staged procedures. The degree of missingness in claims is procedure-dependent, as reimbursement is not affected by whether or not information on laterality is

Identifying “true” exposure

Medical devices regulating biological processes may have more than 1 exposure type. For example, ICDs and pacemakers deliver electrical impulses to maintain adequate heart rhythm or rate. “True” exposure can be defined as either permanent implantation of the device or the intermittent shocks it administers. Data may be collected and stored in devices like ICDs (analogous to having a medication event monitoring system) and transmitted through networks to treating physicians (22). When such data are available, the number and strength of shock therapy treatments can be quantified, the appropriateness of shock therapy can be determined, and different exposure categories can be created. While, at a minimum, it is generally possible to identify medical device class, information on the frequency or strength of electrical impulses is not collected in most observational data sources. When data on the activity of such devices are lacking, “true” exposure is unknown. This is analogous to the situation in pharmacoepidemiology, where it is not possible to infer from dispensing dates in claims data precisely how patients took their medications. In medical device epidemiology, as in pharmacoepidemiology, it may be necessary to make assumptions about exposure based on available information and to conduct sensitivity analyses.

IDENTIFYING COMPARATORS

At least 2 potential issues need to be considered when using comparison groups from different databases. First, bias due to noncomparability is possible, because exposed and unexposed patients may have different characteristics as a result of being sampled differently and/or because they come from different source populations (23). In a study


Device life cycles

provided (16). For example, among Medicare beneficiaries undergoing total knee replacement between 2000 and 2009, modifier codes indicating the procedural side were complete in 52% of nonbilateral procedures (17), while information on laterality was available for fewer than 10% of Medicare beneficiaries undergoing carotid artery stenting (J. J. Jalbert, Brigham and Women’s Hospital, personal communication, 2012) but for close to 90% for age-related macular degeneration procedures performed between 2006 and 2008 (L. H. Curtis, Duke University, personal communication, 2014) (18). In medical records, the quality of data on laterality is likely to vary across institutions, but 1 study found that it was incomplete or inaccurate in 34% of operative notes (19). Registries for devices implanted in body parts with bilaterality generally collect information on the implantation side, with few or no missing data. For example, the National Cardiovascular Data Registry’s Carotid Artery Revascularization and Endarterectomy (CARE) Registry (2005–2009) and Kaiser Permanente’s National Total Joint Replacement Registry (2001–2005) have minimal missing data on laterality (20, 21). Complete information on laterality may stem from quality control practices, imputation, or selective release of records meeting completeness thresholds. Researchers should be aware of quality control standards used by the keepers of the data repository.
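The logic described above for separating revisions from implantations on the opposite side can be sketched in Python. This is a minimal illustration rather than a validated algorithm: the function names and the classification labels are hypothetical, and the sketch assumes that laterality is carried by the claim modifiers LT (left), RT (right), and 50 (bilateral). Claims with a missing modifier are left unclassified rather than guessed.

```python
from typing import Optional

# Assumed laterality modifiers on a claim line: LT = left, RT = right, 50 = bilateral.
SIDE_BY_MODIFIER = {"LT": "left", "RT": "right", "50": "bilateral"}


def side_from_modifier(modifier: Optional[str]) -> Optional[str]:
    """Map a claim modifier to an anatomical side; None when missing or uninformative."""
    if modifier is None:
        return None
    return SIDE_BY_MODIFIER.get(modifier.strip().upper())


def classify_repeat_procedure(first_modifier: Optional[str],
                              second_modifier: Optional[str]) -> str:
    """Classify a second device procedure relative to an earlier one (hypothetical rule)."""
    first = side_from_modifier(first_modifier)
    second = side_from_modifier(second_modifier)
    if first is None or second is None:
        # Laterality missing: a revision cannot be separated from a new implantation.
        return "unknown"
    if first == second or "bilateral" in (first, second):
        # At least one side has been operated on twice.
        return "possible revision"
    return "contralateral implantation"
```

Given the degree of missingness reported for some procedures, the proportion of records falling into the "unknown" category is itself worth reporting.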


IDENTIFYING OUTCOMES OF INTEREST

Periprocedural outcomes

Although most medications are administered and studied in outpatient settings, device implantations frequently require a brief hospitalization. Establishing when an outcome occurs relative to device implantation during this hospitalization can be challenging in administrative databases where procedure dates are recorded but dates of diagnosis are not, particularly when the outcome is also a risk factor. For instance, carotid artery stenting is indicated for stroke prevention, but stroke is both a risk factor for the procedure and a potential complication resulting from the procedure (Figure 1). Variables collected in administrative data may be used to develop algorithms to disentangle preprocedural events from postprocedural events. The admitting diagnosis may be used to identify conditions preceding the implantation but is rarely validated. The Centers for Medicare & Medicaid Services recently implemented a “present on admission” variable that applies to all listed diagnoses and indicates whether the condition was present at the time of inpatient admission. Discharge diagnostic codes exist for some postoperative complications such as stroke but may have low sensitivity, particularly for patients with multiple comorbid conditions, and they are also rarely validated.
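One way such an algorithm might use the “present on admission” variable is sketched below in Python. The function name, the record layout, and the classification labels are hypothetical, and the set of stroke codes is left to the analyst. The sketch assumes the indicator takes values such as Y (present on admission) and N (not present on admission), and it treats every other value as undetermined. It is not a validated outcome definition.

```python
def classify_stroke_timing(diagnoses, stroke_codes):
    """Classify a hospitalization by whether a coded stroke preceded admission.

    diagnoses: list of (diagnosis_code, poa_indicator) pairs for one hospitalization.
    stroke_codes: set of diagnosis codes the analyst treats as stroke (an assumption).
    """
    flags = {(poa or "").strip().upper()
             for code, poa in diagnoses if code in stroke_codes}
    if not flags:
        return "no stroke coded"
    if "N" in flags:
        # At least one stroke diagnosis arose after admission. This separates
        # preadmission from in-hospital events, but not events before the
        # procedure from events after it within the same stay.
        return "in-hospital stroke"
    if flags == {"Y"}:
        return "stroke present on admission"
    return "timing undetermined"  # missing or any value other than Y or N
```

Because the indicator is anchored to admission rather than to the procedure date, a stroke occurring in hospital before the implantation would still be labeled as in-hospital, so the rule narrows the ambiguity without removing it.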

Length of hospital stay may offer some additional insights, but attributing a lengthy hospitalization to a certain complication should be done on a case-by-case basis and by examining all diagnoses and procedures listed for that hospitalization, as well as patterns of care. While registries may not suffer from this problem, lack of follow-up data may limit analyses of periprocedural outcomes to in-hospital complications.

Long-term outcomes

Medications consist of chemicals that must be administered repeatedly or released in a sustained fashion to maintain their therapeutic effects. While medical devices such as dermal fillers may require repeat administration, many devices maintain their therapeutic effects through an unchanging vehicle consisting of mechanical and electronic components. Thus, questions about the long-term safety or effectiveness of implantable devices mainly concern the durability of the devices. The definition of “long-term” depends on the expected life span of the device, with some intended to last for the duration of the patient’s life. Assessing long-term outcomes using registries is difficult, since many registries include 30 or fewer days of follow-up. In the United States, no single data source covers patients throughout their lives, such that all observations are left- and/or right-censored. Once patients reach 65 years of age, most can be followed in Medicare data for the remainder of their lives. However, the issues of device durability and long-term outcome assessment are more relevant for younger subjects with longer life expectancies. A hybrid approach combining exposure and baseline covariate assessment in a registry or database with prospective collection of data on the outcome variable and time-varying covariates from another database could provide a better understanding of long-term outcomes, especially in younger subjects. As was noted above, when using multiple databases, differential ascertainment of variables between the 2 data sources is possible. Thus, when using a hybrid approach, assessment of the differences in ascertainment is needed.
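To make the censoring point concrete, the following Python sketch computes, for a single patient, the portion of post-implantation time that a data source actually observes. The function name, its arguments, and any dates passed to it are hypothetical. The sketch only illustrates left- and right-censoring relative to a coverage window, such as Medicare enrollment beginning at age 65.

```python
from datetime import date
from typing import Optional, Tuple


def observed_follow_up(implant: date, coverage_start: date, coverage_end: date,
                       event: Optional[date] = None) -> Tuple[int, bool, bool]:
    """Return (observed days, event observed, time before coverage unobserved)."""
    # Follow-up can only begin once the patient is both implanted and covered.
    start = max(implant, coverage_start)
    # Follow-up stops at the event or at the end of coverage, whichever is first.
    stop = coverage_end if event is None else min(event, coverage_end)
    days = max((stop - start).days, 0)
    event_observed = event is not None and start <= event <= coverage_end
    left_censored = implant < coverage_start
    return days, event_observed, left_censored
```

A patient implanted years before entering the data source contributes only the covered portion of follow-up, and an event after coverage ends is never seen, which is the situation the hybrid approach described above is meant to mitigate.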

Figure 1. Possible timing of stroke occurrence relative to a carotid artery stenting (CAS) procedure when a Medicare patient has a discharge diagnosis of stroke. The time axis is marked by hospital admission, CAS date, and hospital discharge; the scenarios depicted are a stroke that is the principal reason for hospital admission, a stroke while hospitalized, and a stroke during the high-risk postprocedural period.


linking a device and a disease registry to Medicare data, it is possible that patients from the registry will differ with respect to demographic characteristics, comorbidity burden, disease severity, and/or geographical location even though they are all Medicare beneficiaries. Since information on some of these factors may be unmeasured, comparisons across different databases could be biased due to noncomparability of exposed and unexposed populations. Bias may be minimized through consideration of hospital clusters and controlling for proxies of the potential confounders. Second, even if a comparable group is identified, different data collection forms, definitions, and instructions may be used. This noncomparability of available information for confounder adjustment may result in residual confounding.


BIAS-RELATED CONSIDERATIONS

this propensity score technique may not be sufficient to remove all confounding factors (25).

Confounding by indication or severity

Healthy-user bias

Surgical and other procedures pose short-term risks in exchange for long-term benefits. Patients with severe target disease or comorbid conditions are less likely to be considered for invasive procedures. The selection of healthier patients into more invasive treatments for the same indication can be problematic when comparing different treatment modalities, such as devices versus medical management. Healthy-user bias is not only a function of disease severity but also of better overall physiological health, functional status, psychological well-being, and more health-seeking behaviors (26, 27), factors that are either difficult to quantify or not routinely measured in administrative, registry, or EHR data. Controlling for healthy-user bias may only be achieved in observational studies when information on health behaviors or their surrogates is available or a good instrument for performing an instrumental-variable analysis exists. Sensitivity analyses assessing the impact of healthy-user bias are necessary, and more research is needed on the factors associated with patient selection in order to understand the magnitude of healthy-user bias in comparative effectiveness research on medical devices.

Provider characteristics

Characteristics such as physician specialty (28, 29), hospital specialty (30–32), hospital ownership (30), hospital size

Table 2. Characteristics of Non-ICD Patients Versus ICD Patients Before Application of Primary Prevention ICD Indications, 2005–2008

| Characteristic | Non-ICD Patients (n = 535,885): No. | % | Median (IQR) | ICD Patients (n = 4,990): No. | % | Median (IQR) |
|---|---|---|---|---|---|---|
| Male sex | 278,344 | 52 | | 3,469 | 70 | |
| White race | 354,414 | 66 | | 3,528 | 71 | |
| Age, years(a) | | | 69 (20) | | | 66 (19) |
| Ejection fraction, % | | | 48 (33–59) | | | 25 (20–30) |
| Admission | | | | | | |
| Systolic blood pressure, mm Hg | | | 138 (119–158) | | | 127 (112–144) |
| Diastolic blood pressure, mm Hg | | | 75 (64–88) | | | 73 (64–84) |
| Serum sodium, mEq/L | | | 138 (136–141) | | | 139 (136–141) |
| Serum B-type natriuretic peptide, pg/mL | | | 761 (375–1,546) | | | 822 (330–1,707) |
| Serum creatinine, mg/dL | | | 1.2 (1.0–1.7) | | | 1.2 (1.0–1.5) |
| Discharge | | | | | | |
| Systolic blood pressure, mm Hg | | | 121 (109–137) | | | 118 (105–131) |
| Diastolic blood pressure, mm Hg | | | 67 (59–75) | | | 68 (60–76) |
| Serum sodium, mEq/L | | | 138 (136–140) | | | 137 (135–139) |
| Serum B-type natriuretic peptide, pg/mL | | | 524 (250–1,114) | | | 561 (227–1,300) |
| Serum creatinine, mg/dL | | | 1.3 (1.0–1.9) | | | 1.2 (1.0–1.6) |

Abbreviations: ICD, implantable cardioverter-defibrillator; IQR, interquartile range.
(a) Values are expressed as mean (standard deviation).


Depending on the condition, medications may be used for patients with milder disease and device implantation may be reserved for those with more severe disease (or vice versa). A stepwise approach to treatment might also be recommended; for instance, a medical device may be considered if drug treatment has failed. Although knowledge of the guidelines and standards of practice is necessary to assess the direction, magnitude, and potential for confounding by indication or severity, this is not different from within-drug or within-procedure comparisons. With sufficiently detailed information on disease severity or indication, confounding by indication or severity may be handled through study design considerations or analytically. In a recent study of the comparative effectiveness of ICDs using Medicare-linked registry data (2005–2008), we compared baseline characteristics of patients receiving ICDs with the baseline characteristics of those who did not, before and after restricting the population to patients who were eligible for ICDs for primary prevention (see Tables 2 and 3). The difference in ejection fraction between the 2 groups (48% in non-ICD patients vs. 25% in ICD patients) was significantly reduced after restricting the data to patients who were eligible for an ICD for primary prevention (29% in non-ICD patients vs. 25% in ICD patients). Adjustment for confounding by severity or indication may also be done analytically. High-dimensional propensity score modeling, which uses codes as proxies for confounding adjustment, may minimize confounding by severity or indication when administrative data alone are available (24), but, depending on the comparison,


Table 3. Characteristics of Non-ICD Patients Versus ICD Patients After Application of Primary Prevention ICD Indications, 2005–2008

| Characteristic | Non-ICD Patients (n = 19,687): No. | % | Median (IQR) | ICD Patients (n = 1,087): No. | % | Median (IQR) |
|---|---|---|---|---|---|---|
| Male sex | 11,168 | 57 | | 816 | 75 | |
| White race | 17,679 | 90 | | 989 | 91 | |
| Age, years(a) | | | 80 (8) | | | 76 (6) |
| Ejection fraction, % | | | 29 (21–33) | | | 25 (20–30) |
| Admission | | | | | | |
| Systolic blood pressure, mm Hg | | | 132 (115–151) | | | 132 (116–147) |
| Diastolic blood pressure, mm Hg | | | 73 (62–85) | | | 72 (64–81) |
| Serum sodium, mEq/L | | | 138 (135–141) | | | 139 (137–141) |
| Serum B-type natriuretic peptide, pg/mL | | | 1,240 (658–2,240) | | | 850 (334–1,443) |
| Serum creatinine, mg/dL | | | 1.3 (1.0–1.8) | | | 1.2 (1.0–1.7) |
| Discharge | | | | | | |
| Systolic blood pressure, mm Hg | | | 120 (108–134) | | | 117 (104–131) |
| Diastolic blood pressure, mm Hg | | | 64 (56–72) | | | 66 (59.5–75) |
| Serum sodium, mEq/L | | | 138 (135–140) | | | 137 (135–139) |
| Serum B-type natriuretic peptide, pg/mL | | | 797 (414–1,527) | | | 648 (280–1,276) |
| Serum creatinine, mg/dL | | | 1.4 (1.1–1.9) | | | 1.2 (1.0–1.6) |

Abbreviations: ICD, implantable cardioverter-defibrillator; IQR, interquartile range.
(a) Values are expressed as mean (standard deviation).

(31), and teaching status (32) affect patient outcomes during the periprocedural period. Both the lead-in phase of the Carotid Revascularization Endarterectomy vs. Stenting Trial (CREST) (33) and the trial itself (34) suggested that physician specialty may affect outcomes following carotid artery stenting. Physician specialty or provider preference may exert a strong influence on therapeutic modality selection (devices vs. drugs) and on what types of devices the patient ultimately receives. A volume-outcome relationship, in which higher physician procedural volume (14, 35–37) and higher hospital procedural volume (35, 38–42) are associated with lower periprocedural outcome risks, has also been documented for a number of devices and procedures. A recent study found a strong inverse association between past-year physician and hospital carotid artery stenting volume and 30-day mortality risk among Medicare beneficiaries (2005–2009), after adjustment for patient and provider characteristics (43). More technically complex device deliveries and implantations will have significant learning curves associated with their use that are likely to change over time, particularly for first-in-class devices or for those devices with significant changes in their use. Exploring provider characteristics to account for them appropriately is essential to conducting valid comparative effectiveness research on medical devices. Provider characteristics that may affect exposure selection and outcomes should be evaluated as potential confounders and effect-measure modifiers. In the presence of a learning curve, device performance and, concomitantly, comparative effectiveness estimates may change over time. To obtain valid inferences, clustering of patients should also be taken into account by introducing random or fixed effects or by using the sandwich variance estimator.

Am J Epidemiol. 2014;180(9):949–958
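To make the clustering point concrete, the following minimal sketch contrasts a naive standard error with a cluster-robust (sandwich) standard error for the simplest possible model, an intercept-only mean. Everything in it is an assumption for illustration: the outcomes are simulated, the hospital identifiers and effect sizes are invented, and no small-sample correction is applied. It covers only the sandwich option named above, not random or fixed effects.

```python
import math
import random
from collections import defaultdict


def simulate_outcomes(n_hospitals=50, patients_per_hospital=40, seed=1):
    """Return (hospital_id, outcome) pairs with a shared hospital-level effect."""
    rng = random.Random(seed)
    data = []
    for hospital in range(n_hospitals):
        hospital_effect = rng.gauss(0.0, 1.0)  # common to all patients in this hospital
        for _ in range(patients_per_hospital):
            data.append((hospital, hospital_effect + rng.gauss(0.0, 1.0)))
    return data


def naive_and_cluster_se(data):
    """Standard errors of the overall mean, ignoring and respecting clustering.

    For an intercept-only linear model the sandwich variance reduces to
    sum(meat terms) / n**2, where the meat terms are squared residuals
    (naive) or squared within-cluster residual sums (cluster-robust).
    """
    n = len(data)
    mean = sum(y for _, y in data) / n
    squared_residuals = 0.0
    cluster_residual_sum = defaultdict(float)
    for cluster, y in data:
        residual = y - mean
        squared_residuals += residual * residual
        cluster_residual_sum[cluster] += residual
    naive_se = math.sqrt(squared_residuals) / n
    cluster_se = math.sqrt(sum(s * s for s in cluster_residual_sum.values())) / n
    return mean, naive_se, cluster_se


mean, naive_se, cluster_se = naive_and_cluster_se(simulate_outcomes())
print(f"mean={mean:.3f}  naive SE={naive_se:.3f}  cluster-robust SE={cluster_se:.3f}")
```

When outcomes within a hospital are positively correlated, as simulated here, the cluster-robust standard error is larger than the naive one, so ignoring clustering would overstate precision.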

Device failure or removal (sick stopper bias)

Patients taking medications can have various degrees of adherence, from completely stopping the medication to skipping doses to taking the medication as prescribed. In most data sources, measuring medication adherence requires making strong assumptions. While adherence issues do not generally apply to implantable medical devices, devices can fail, be turned off, and/or be removed. Whether or not to take this into account depends on the study question. If the goal is to assess real-world effectiveness, where some degree of device failure or removal is expected, it is appropriate to compare the different modalities without adjusting for device failure or removal, while reporting the failure or removal rates. If the goal is to compare conditional effectiveness assuming no device failure or removal, device failure or removal must be taken into account in the analysis.

Bias related to treatment assignment: immortal time bias and time-varying confounding by severity

Comparative effectiveness studies of devices are potentially susceptible to immortal person-time, a period of follow-up time during which, by design, the outcome(s) of interest cannot occur (44, 45). It arises when survival time (i.e., time from the beginning of a defined follow-up period to the development of the study outcome) is a study endpoint and when the start of the follow-up period varies for different treatment groups. This arises from the fact that treatment assignment is not decided at the time that observation of the subjects begins. In comparing devices with medications, immortal person-time may arise when treatments are used in a

Downloaded from http://aje.oxfordjournals.org/ at University of California, San Francisco on March 14, 2015


956 Jalbert et al.

CONCLUSION

In this article, we have discussed some of the most pertinent methodological issues facing researchers conducting comparative effectiveness studies of medical devices; we have discussed special considerations regarding the use of data sources, exposure and outcome identification, exposure timing, and sources of bias. For the continued advancement of the field, we envision a UDI system implemented in administrative databases, registries, and EHRs enabling the assessment of product-specific performance in the real world. In addition, the development of methods for medical device epidemiology, built on current understanding of the similarities and differences between pharmacoepidemiology and medical device epidemiology, would greatly facilitate comparative effectiveness research for implantable medical devices.

ACKNOWLEDGMENTS

Author affiliations: Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women’s Hospital and Harvard Medical School, Boston, Massachusetts (Jessica J. Jalbert, Chih-Ying Chen); LA-SER Analytica, New York, New York (Jessica J. Jalbert); Procter & Gamble, Mason, Ohio (Mary Elizabeth Ritchey); Duke Clinical Research Institute, Durham, North Carolina (Bradley G. Hammill, Lesley H. Curtis, Xiaojuan Mi, Soko Setoguchi); and Department of Medicine, Duke University School of Medicine, Durham, North Carolina (Lesley H. Curtis, Soko Setoguchi).

S.S. was supported by midcareer development award K02HS017731 from the Agency for Healthcare Research and Quality, US Department of Health and Human Services. The project was also supported by contract HHSA29020050016I from the Agency for Healthcare Research and Quality as part of the Developing Evidence to Inform Decisions about Effectiveness (DEcIDE) Network and by contract HHSM500201000001I from the Centers for Medicare & Medicaid Services, US Department of Health and Human Services.

We thank Damon M. Seils for editorial assistance.

Dr. Mary Elizabeth Ritchey is a full-time employee of Procter & Gamble (Cincinnati, Ohio). Dr. Soko Setoguchi reported receiving research support from Johnson & Johnson (New Brunswick, New Jersey) and has made available online a detailed listing of financial disclosures (http://www.dcri.duke.edu/about-us/conflict-of-interest/). Dr. Lesley H. Curtis reported receiving grants from Medtronic, Inc. (Minneapolis, Minnesota) and Boston Scientific Corporation (Natick, Massachusetts) and receiving research and salary support from Johnson & Johnson, GlaxoSmithKline (Brentford, United Kingdom), the Agency for Healthcare Research and Quality (Rockville, Maryland), and the National Heart, Lung, and Blood Institute (Bethesda, Maryland). Dr. Curtis also has made a detailed listing of financial disclosures available online (http://www.dcri.org/research/coi.jsp).

REFERENCES

1. Zhan C, Baine WB, Sedrakyan A, et al. Cardiac device implantation in the United States from 1997 through 2004: a population-based analysis. J Gen Intern Med. 2008;23(suppl 1):13–19.
2. Food and Drug Administration, US Department of Health and Human Services. Classify your medical device. http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/Overview/ClassifyYourDevice/. Updated June 6, 2014. Accessed June 12, 2014.
3. Government Accountability Office. Medical Devices: FDA Should Take Steps to Ensure That High-Risk Device Types Are Approved Through the Most Stringent Premarket Review Process. (Publication no. GAO-09-190). Washington, DC: Government Accountability Office; 2009. http://www.gao.gov/products/GAO-09-190. Published January 15, 2009. Accessed June 12, 2014.
4. Institute of Medicine. Medical Devices and the Public’s Health: the FDA 510(k) Clearance Process at 35 Years. Washington, DC: The National Academies Press; 2011. http://www.iom.edu/Reports/2011/Medical-Devices-and-the-Publics-Health-The-FDA-510k-Clearance-Process-at-35-Years.aspx. Published July 29, 2011. Accessed June 12, 2014.
5. Healthcare Cost and Utilization Project (HCUP). Overview of the National (Nationwide) Inpatient Sample (NIS).


stepwise manner (e.g., drug first and then the device if the drug was not effective). For the device user, time spent on medication may be a period during which, by design, the outcomes of interest cannot occur. Either misclassifying or omitting immortal time from the analysis can result in immortal person-time bias (46, 47). This bias is described extensively in pharmacoepidemiology but less so in studies of medical devices (45, 47–50). Immortal person-time bias was initially described in the context of assessing the survival benefit of heart transplants in the 1970s (51). To illustrate this type of bias, consider a study comparing the effectiveness of left ventricular assist devices (LVADs) with heart transplantation in the era of the Randomized Evaluation of Mechanical Assistance for the Treatment of Congestive Heart Failure (REMATCH) Study (52), which demonstrated the efficacy of long-term LVAD as destination therapy. Most patients awaiting heart transplantation receive an LVAD as bridge therapy. However, given that patients must survive long enough to undergo transplantation, comparing survival after LVAD placement with survival after transplantation will introduce immortal person-time bias if the treatment sequence is not considered and the immortal person-time for heart transplant patients while on LVAD is not appropriately allocated. Immortal time bias can be corrected by using appropriate study designs and statistical analyses. The performances of different methods for handling immortal time bias were recently compared (53, 54). However, immortal person-time bias can be further complicated if treatment groups are not comparable in terms of disease severity and comorbidity burden. Therefore, how immortal time is handled can have a complex and significant impact on the estimates. 
Careful attention to the time from initial diagnosis to the sequence of treatments and their modalities, as well as use of an appropriate method for handling immortal time, is needed to avoid introducing this type of bias.
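The mechanism of immortal person-time bias can be shown with a short simulation. The sketch below is illustrative only and is not one of the methods compared in references 53 and 54: the cohort is simulated, the hazard, waiting times, and follow-up window are invented, and the device is given no effect at all. It compares a naive analysis that classifies patients as ever-treated from the start of follow-up with an analysis that allocates pre-implant person-time to the unexposed group.

```python
import random


def simulate(n=20000, hazard=0.1, max_wait=12.0, seed=2):
    """Return (death_time, planned_implant_time) pairs, in months.

    Death times are exponential with the same hazard for everyone, so the
    true rate ratio for the (hypothetical) device is 1.
    """
    rng = random.Random(seed)
    return [(rng.expovariate(hazard), rng.uniform(0.0, max_wait)) for _ in range(n)]


def rate_ratios(patients, end=60.0):
    """Return (naive_rr, correct_rr) for treated vs. untreated mortality."""
    naive_events = {True: 0, False: 0}
    naive_time = {True: 0.0, False: 0.0}
    exposed_events = unexposed_events = 0
    exposed_time = unexposed_time = 0.0

    for death, implant in patients:
        follow_up = min(death, end)   # administrative censoring at `end`
        died = death <= end
        treated = death > implant     # must survive the wait to receive the device

        # Naive: ever-treated patients contribute all time from baseline as treated,
        # including the waiting period during which they could not have died.
        naive_events[treated] += died
        naive_time[treated] += follow_up

        # Time-dependent allocation: waiting time counts as unexposed person-time.
        if treated:
            unexposed_time += implant
            exposed_time += follow_up - implant
            exposed_events += died
        else:
            unexposed_time += follow_up
            unexposed_events += died

    naive_rr = (naive_events[True] / naive_time[True]) / (naive_events[False] / naive_time[False])
    correct_rr = (exposed_events / exposed_time) / (unexposed_events / unexposed_time)
    return naive_rr, correct_rr


naive_rr, correct_rr = rate_ratios(simulate())
print(f"naive rate ratio:   {naive_rr:.2f}")
print(f"correct rate ratio: {correct_rr:.2f}")
```

Under these assumptions the naive rate ratio falls well below 1 even though the device does nothing, while the time-dependent allocation returns a rate ratio close to 1. The sketch does not address the additional complication raised above, namely treatment groups that differ in disease severity or comorbidity burden.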


http://www.hcup-us.ahrq.gov/nisoverview.jsp. Updated December 11, 2013. Accessed June 12, 2014.
6. Healthcare Cost and Utilization Project (HCUP). Overview of the Nationwide Emergency Department Sample (NEDS). http://www.hcup-us.ahrq.gov/nedsoverview.jsp. Updated January 14, 2014. Accessed June 12, 2014.
7. Hernandez AF, Fonarow GC, Hammill BG, et al. Clinical effectiveness of implantable cardioverter-defibrillators among Medicare beneficiaries with heart failure. Circ Heart Fail. 2010;3(1):7–13.
8. Effective Health Care Program, Agency for Healthcare Research and Quality. Real world effectiveness of implantable cardioverter defibrillators (ICDs) in Medicare patients [abstract]. Rockville, MD: Agency for Healthcare Research and Quality; 2010. http://www.effectivehealthcare.ahrq.gov/index.cfm/search-for-guides-reviews-and-reports/?pageaction=displayproduct&productid=431. Published April 14, 2010. Accessed June 12, 2014.
9. Askling J, van Vollenhoven RF, Granath F, et al. Cancer risk in patients with rheumatoid arthritis treated with anti-tumor necrosis factor α therapies: does the risk change with the time since start of treatment? Arthritis Rheum. 2009;60(11):3180–3189.
10. Jha AK, DesRoches CM, Campbell EG, et al. Use of electronic health records in U.S. hospitals. N Engl J Med. 2009;360(16):1628–1638.
11. Food and Drug Administration, US Department of Health and Human Services. Unique device identification (UDI). http://www.fda.gov/MedicalDevices/DeviceRegulationandGuidance/UniqueDeviceIdentification/default.htm. Published 2013. Updated June 11, 2014. Accessed June 12, 2014.
12. Food and Drug Administration, US Department of Health and Human Services. Unique device identification system. Final rule. Fed Regist. 2013;78(185):58785–58828.
13. Ardaugh BM, Graves SE, Redberg RF. The 510(k) ancestry of a metal-on-metal hip implant. N Engl J Med. 2013;368(2):97–100.
14. McGrath PD, Wennberg DE, Dickens JD Jr, et al. Relation between operator and hospital volume and outcomes following percutaneous coronary interventions in the era of the coronary stent. JAMA. 2000;284(24):3139–3144.
15. Food and Drug Administration, US Department of Health and Human Services. Unique Device Identification (UDI) for Postmarket Surveillance and Compliance Public Workshop, September 12–13, 2011—transcripts. http://www.fda.gov/MedicalDevices/NewsEvents/WorkshopsConferences/ucm275182.htm. Updated March 13, 2014. Accessed June 12, 2014.
16. Cahaba Government Benefit Administrators, LLC. Modifiers for Medicare billing. https://www.cahabagba.com/part_b/education_and_outreach/general_billing_info/modifers.htm. Published May 2, 2012. Accessed June 12, 2014.
17. Bolognesi MP, Greiner MA, Attarian DE, et al. Unicompartmental knee arthroplasty and total knee arthroplasty among Medicare beneficiaries, 2000 to 2009. J Bone Joint Surg Am. 2013;95(22):e174.
18. Curtis LH, Hammill BG, Qualls LG, et al. Treatment patterns for neovascular age-related macular degeneration: analysis of 284 380 Medicare beneficiaries. Am J Ophthalmol. 2012;153(6):1116–1124.e1.
19. Flynn MB, Allen DA. The operative note as billing documentation: a preliminary report. Am Surg. 2004;70(7):570–574.
20. Yeh RW, Kennedy K, Spertus JA, et al. Do postmarketing surveillance studies represent real-world populations? A comparison of patient characteristics and outcomes after carotid artery stenting. Circulation. 2011;123(13):1384–1390.
21. Namba RS, Inacio M. Early and late manipulation improve flexion after total knee arthroplasty. J Arthroplasty. 2007;22(6 suppl 2):58–61.
22. Saxon LA, Hayes DL, Gilliam FR, et al. Long-term outcome after ICD and CRT implantation and influence of remote device follow-up: the ALTITUDE survival study. Circulation. 2010;122(23):2359–2367.
23. Hammill BG, Curtis LH, Setoguchi S. Performance of propensity score methods when comparison groups originate from different data sources. Pharmacoepidemiol Drug Saf. 2012;21(suppl 2):81–89.
24. Schneeweiss S, Rassen JA, Glynn RJ, et al. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009;20(4):512–522.
25. Jalbert JJ, Seeger JD, Williams LA, et al. Confounding by indication in comparative effectiveness research: does adding registry to claims data make a difference? [abstract]. Pharmacoepidemiol Drug Saf. 2013;22(suppl 1):129.
26. Glynn RJ, Schneeweiss S, Wang PS, et al. Selective prescribing led to overestimation of the benefits of lipid-lowering drugs. J Clin Epidemiol. 2006;59(8):819–828.
27. Brookhart MA, Patrick AR, Dormuth C, et al. Adherence to lipid-lowering therapy and the use of preventive health services: an investigation of the healthy user effect. Am J Epidemiol. 2007;166(3):348–354.
28. Curtis JP, Luebbert JJ, Wang Y, et al. Association of physician certification and outcomes among patients receiving an implantable cardioverter-defibrillator. JAMA. 2009;301(16):1661–1670.
29. Hollenbeak CS, Bowman AR, Harbaugh RE, et al. The impact of surgical specialty on outcomes for carotid endarterectomy. J Surg Res. 2010;159(1):595–602.
30. Cram P, House JA, Messenger JC, et al. Percutaneous coronary intervention outcomes in US hospitals with varying structural characteristics: analysis of the NCDR®. Am Heart J. 2012;163(2):222–229.e1.
31. Cram P, Vaughan-Sarrazin MS, Wolf B, et al. A comparison of total hip and knee replacement in specialty and general hospitals. J Bone Joint Surg Am. 2007;89(8):1675–1684.
32. Katz JN, Bierbaum BE, Losina E. Case mix and outcomes of total knee replacement in orthopaedic specialty hospitals. Med Care. 2008;46(5):476–480.
33. Hopkins LN, Roubin GS, Chakhtoura EY, et al. The Carotid Revascularization Endarterectomy versus Stenting Trial: credentialing of interventionalists and final results of lead-in phase. J Stroke Cerebrovasc Dis. 2010;19(2):153–162.
34. Brott TG, Hobson RW 2nd, Howard G, et al. Stenting versus endarterectomy for treatment of carotid-artery stenosis. N Engl J Med. 2010;363(1):11–23.
35. Jollis JG, Peterson ED, Nelson CL, et al. Relationship between physician and hospital coronary angioplasty volume and outcome in elderly patients. Circulation. 1997;95(11):2485–2491.
36. Hannan EL, Racz M, Ryan TJ, et al. Coronary angioplasty volume-outcome relationships for hospitals and cardiologists. JAMA. 1997;277(11):892–898.
37. Birkmeyer JD, Stukel TA, Siewers AE, et al. Surgeon volume and operative mortality in the United States. N Engl J Med. 2003;349(22):2117–2127.
38. Luft HS, Bunker JP, Enthoven AC. Should operations be regionalized? The empirical relation between surgical volume and mortality. N Engl J Med. 1979;301(25):1364–1369.

39. Showstack JA, Rosenfeld KE, Garnick DW, et al. Association of volume with outcome of coronary artery bypass graft surgery. Scheduled vs. nonscheduled operations. JAMA. 1987;257(6):785–789.
40. Cebul RD, Snow RJ, Pine R, et al. Indications, outcomes, and provider volumes for carotid endarterectomy. JAMA. 1998;279(16):1282–1287.
41. Urbach DR, Baxter NN. Does it matter what a hospital is “high volume” for? Specificity of hospital volume-outcome associations for surgical procedures: analysis of administrative data. Qual Saf Health Care. 2004;13(5):379–383.
42. Birkmeyer JD, Siewers AE, Finlayson EV, et al. Hospital volume and surgical mortality in the United States. N Engl J Med. 2002;346(15):1128–1137.
43. Jalbert JJ, Gerhard-Herman MD, Nguyen LL, et al. Effect of physician and hospital experience on outcomes following carotid artery stenting (CAS) [abstract]. Pharmacoepidemiol Drug Saf. 2012;21(suppl 3):313–314.
44. Walker AM. Observation and Inference: An Introduction to the Methods of Epidemiology. Newton Lower Falls, MA: Epidemiology Resources, Inc.; 1991.
45. Suissa S. Immortal time bias in pharmaco-epidemiology. Am J Epidemiol. 2008;167(4):492–499.
46. Mantel N, Byar DP. Evaluation of response-time data involving transient states: an illustration using heart-transplant data. J Am Stat Assoc. 1974;69(345):81–86.
47. Suissa S. Effectiveness of inhaled corticosteroids in chronic obstructive pulmonary disease: immortal time bias in observational studies. Am J Respir Crit Care Med. 2003;168(1):49–53.
48. Suissa S. Immortal time bias in observational studies of drug effects. Pharmacoepidemiol Drug Saf. 2007;16(3):241–249.
49. Lash TL, Cole SR. Immortal person-time in studies of cancer outcomes. J Clin Oncol. 2009;27(23):e55–e56.
50. Hoffmann F, Andersohn F. Immortal time bias and survival in patients who self-monitor blood glucose in the Retrolective Study: Self-monitoring of Blood Glucose and Outcome in Patients with Type 2 Diabetes (ROSSO). Diabetologia. 2011;54(2):308–311.
51. Gail MH. Does cardiac transplantation prolong life? A reassessment. Ann Intern Med. 1972;76(5):815–817.
52. Rose EA, Gelijns AC, Moskowitz AJ, et al. Long-term use of a left ventricular assist device for end-stage heart failure. N Engl J Med. 2001;345(20):1435–1443.
53. Mi X, Hammill BG, Curtis LH, et al. Impact of immortal person-time and time scale in comparative effectiveness research for medical devices: a case for implantable cardioverter-defibrillators. J Clin Epidemiol. 2013;66(8 suppl):S138–S144.
54. Mi X, Hammill BG, Curtis LH, et al. Relative performance of approaches handling immortal person-time in comparative effectiveness research: a simulation study. Pharmacoepidemiol Drug Saf. In press.
