ARTICLE IN PRESS

International Journal of Medical Informatics xxx (2015) xxx–xxx
Journal homepage: www.ijmijournal.com

CER Hub: An informatics platform for conducting comparative effectiveness research using multi-institutional, heterogeneous, electronic clinical data

Brian L. Hazlehurst a,∗, Stephen E. Kurtz a, Andrew Masica b, Victor J. Stevens a, Mary Ann McBurnie a, Jon E. Puro c, Vinutha Vijayadeva d, David H. Au e, Elissa D. Brannon f, Dean F. Sittig g

a Kaiser Permanente Northwest, Center for Health Research, Portland, OR, USA
b Baylor Scott & White Health, Center for Clinical Effectiveness, Dallas, TX, USA
c OCHIN Inc, Portland, OR, USA
d Kaiser Permanente Hawaii, Center for Health Research, Honolulu, HI, USA
e VA Puget Sound Health Care System, Seattle, WA, USA
f Kaiser Permanente Southeast, Atlanta, GA, USA
g University of Texas Health Science Center, School of Biomedical Informatics, Houston, TX, USA

Article info

Article history:
Received 10 December 2014
Received in revised form 17 February 2015
Accepted 2 June 2015
Available online xxx

Keywords:
Comparative effectiveness research
Natural language processing
Electronic health records

Abstract

Objectives: Comparative effectiveness research (CER) requires the capture and analysis of data from disparate sources, often from a variety of institutions with diverse electronic health record (EHR) implementations. In this paper we describe the CER Hub, a web-based informatics platform for developing and conducting research studies that combine comprehensive electronic clinical data from multiple health care organizations.

Methods: The CER Hub platform implements a data processing pipeline that employs informatics standards for data representation and web-based tools for developing study-specific data processing applications, providing standardized access to the patient-centric EHR across organizations.

Results: The CER Hub is being used to conduct two CER studies utilizing data from six geographically distributed and demographically diverse health systems. These foundational studies address the effectiveness of medications for controlling asthma and the effectiveness of smoking cessation services delivered in primary care.

Discussion: The CER Hub includes four key capabilities: the ability to process and analyze both free-text and coded clinical data in the EHR; a data processing environment supported by distributed data and study governance processes; a clinical data-interchange format for facilitating standardized extraction of clinical data from EHRs; and a library of shareable clinical data processing applications.

Conclusion: CER requires coordinated and scalable methods for extracting, aggregating, and analyzing complex, multi-institutional clinical data. By offering a range of informatics tools integrated into a framework for conducting studies using EHR data, the CER Hub provides a solution to the challenges of multi-institutional research using electronic medical record data.

© 2015 Published by Elsevier Ireland Ltd.

1. Introduction

☆ The CER Hub project (www.cerhub.org) reported here is funded by grant R01HS019828 (Hazlehurst, PI) from the Agency for Healthcare Research and Quality (AHRQ), US Department of Health and Human Services.
∗ Corresponding author at: Kaiser Permanente Northwest, Center for Health Research, 3800 N. Interstate Avenue, Portland, OR 97227, USA.
E-mail address: [email protected] (B.L. Hazlehurst).

The primary goal of comparative effectiveness research (CER) is to generate new evidence on the effectiveness, benefits, and harms of treatments, diagnostic tests, disease prevention methods, and on ways to deliver care under “real world” conditions [1–7]. To accomplish this goal using electronic data, CER requires the capture and analysis of disparate data sources held by different institutions with

http://dx.doi.org/10.1016/j.ijmedinf.2015.06.002 1386-5056/© 2015 Published by Elsevier Ireland Ltd.

Please cite this article in press as: B.L. Hazlehurst, et al., CER Hub: An informatics platform for conducting comparative effectiveness research using multi-institutional, heterogeneous, electronic clinical data, Int. J. Med. Inform. (2015), http://dx.doi.org/10.1016/j.ijmedinf.2015.06.002


Fig. 1. Requirements of informatics platform for CER (adapted from Sittig et al. [15]).

diverse electronic health record (EHR) implementations (Fig. 1). We report on the development and use of an informatics platform designed to overcome these challenges, the CER Hub.

1.1. Background and significance

The expanding use of EHR systems is generating a “tsunami of data” [8]. These large databases offer new opportunities to measure and improve health and health care delivery [9]. The scale of the data sets is illustrated by the Kaiser Permanente (KP) health system. KP providers care for approximately 9 million people in the U.S., documenting about 100,000 care encounters per day—or more than 36 million encounters per year. Encounter data are recorded in both structured (coded) and unstructured (non-coded and narrative text) fields in patients’ electronic records, providing a rich source for understanding details of health status, behaviors, care processes, and outcomes reported by both patients and clinicians.

The American Recovery and Reinvestment Act of 2009 (ARRA) provided more than $1 billion to advance CER [10] through development of infrastructure to expand adoption of EHRs in health care and enhance capacities to utilize them for research to improve care. To carry out CER, researchers must identify, capture, aggregate, integrate, and analyze disparate data sources held by different institutions with diverse electronic record systems. Furthermore, researchers seek to carry out CER with increased speed and efficiency, by using data already collected in the course of patient care. The following are key requirements for health information technology to realize this vision [11]:

1.1.1. CER requires comprehensive, patient-level data

Creating a complete view of a patient’s health and medical history requires different kinds of data. These include structured data

created by clinical transaction applications such as computer-based provider order entry (CPOE), clinical billing, laboratory, radiology, and patient registration systems. They also include free-text clinical data and patient-reported data collected from narrative visit notes and patient-education materials, or from standalone, computer-based instruments.

1.1.2. High-quality CER requires data from large, diverse populations at multiple organizations

Data must be aggregated across multiple organizations to yield enough information to identify statistically meaningful differences between groups, reduce bias, allow for subgroup analyses, improve generalizability of study results, and study rare events. To conduct CER, researchers must be able to aggregate this heterogeneous, disparate data and metadata.

1.1.3. CER requires standardized data extraction and analysis

Researchers must be able to efficiently extract data from various electronic data systems, including EHRs as well as administrative and billing systems. To enable aggregation, these data should conform to standardized clinical representations and hold the information needed to answer specific research questions. Creating standardized health care datasets across a multi-institutional analysis is a formidable challenge, since different organizations often refer to the same activity, condition, or procedure by different names, and the same names often mean different things across institutions and care practices.

1.1.4. CER must conform to each local organization’s governance and Institutional Review Board (IRB) rules

To protect patient privacy while providing researchers with secure, auditable, and efficient access to data, methods for conforming to local IRB and compliance rules will be essential. Combining data from disparate organizations requires robust technology solutions coupled with sound clinical data-governance processes.

The substantial amount of electronic clinical data being generated through increasing adoption of EHRs is not directly available for aggregation and analysis for CER. Clinical practices vary widely across the U.S., and diverse care and organizational priorities affect the capture and meaning of clinical events. The events of interest to all CER studies cannot be anticipated fully or designed into the workflow of busy frontline clinicians, even with sophisticated charting tools. Patient diversity, as well as variation in EHR vendor products and implementations, introduces additional heterogeneity in the clinical events recorded both across and within institutions. Furthermore, data relevant to understanding many clinical situations reside in the non-coded text notes of EHRs—by some estimates, more than 50% of useful clinical information [11]. The data in these notes do not lend themselves to automated summarization of clinical events. This has led to increased interest in using natural language processing (NLP) to automatically code text portions of the medical record [12–14], and this technology’s potential to enable CER across institutions is becoming increasingly apparent.

Given the complexity of EHR data, there is a need for scalable informatics solutions for assigning consistent, specific meanings to highly heterogeneous data across diverse institutions, EHR implementations, and care practices. This need exists not only for routine healthcare and CER but also for multi-institution studies in quality improvement, pragmatic clinical trials, public health, and other types of research.
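To make the role of NLP concrete, the dictionary-style concept matching that underlies many such systems can be sketched as follows. This is a minimal, hypothetical illustration; the concept names and term lists are invented for this example, and production systems draw their terms from vocabularies such as the UMLS rather than hand-built lists.

```python
import re

# Hypothetical equivalence classes mapping surface terms to clinical concepts.
# Real systems derive these from standard vocabularies (e.g., the UMLS
# Metathesaurus), not from short hand-curated lists like this one.
CONCEPT_TERMS = {
    "TOBACCO_USE": ["smoker", "smokes", "tobacco use", "cigarettes"],
    "CESSATION_COUNSELING": ["advised to quit", "cessation counseling"],
    "WHEEZING": ["wheezing", "wheeze"],
}

def code_note(text):
    """Return the set of concept codes whose terms appear in a free-text note."""
    lowered = text.lower()
    found = set()
    for concept, terms in CONCEPT_TERMS.items():
        if any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in terms):
            found.add(concept)
    return found

note = "Patient smokes 1 ppd; wheezing on exam. Advised to quit."
print(sorted(code_note(note)))
# → ['CESSATION_COUNSELING', 'TOBACCO_USE', 'WHEEZING']
```

A sketch like this ignores negation, misspellings, and context, which is precisely why the dedicated NLP systems cited above are needed at scale.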
Informatics solutions meeting this need require the ability to integrate data in multiple components of the EHRs of multiple institutions, while ensuring patient confidentiality and compliance with organizational, state, and federal policies.

In a previous study, we compared and contrasted six large-scale projects that were developing or extending informatics solutions for using EHR data for CER, including the CER Hub project that is the focus of this paper [15]. Table 1 provides an overview of these projects, to which we have also added references to the work of (1) the Mayo Clinic-led Strategic Health IT Advanced Research Project on Secondary Use of Clinical Data (SHARP-n), (2) the University of California, San Diego (UCSD)-led Scalable National Network for Effectiveness Research (SCANNER) project, (3) the Shared Health Research Information Network (SHRINE) project at Harvard and Partners Healthcare, and (4) PCORnet, led by the Patient-Centered Outcomes Research Institute. Many of these projects were developed to address widely differing health care, organizational, and research objectives.

All projects conduct the six generic data processing steps necessary for distributed, multi-institutional CER projects using EHR data, as shown in Fig. 1. However, there were wide variations in how the data were extracted (real-time aggregation of HL-7 transactions vs. nightly or as-needed extraction, transformation, and loading), where the processing took place (local site vs. central coordinating center), and how users interacted with the systems (web-based query interfaces for researchers vs. tools to develop natural language processing modules). In addition, there were many differences in the projects’ governance and organizational structures [15].

In what follows, we describe how the CER Hub platform meets the design requirements set out above.
Results from CER Hub studies in smoking cessation and management of asthma are in preparation.

1.2. The CER Hub platform

With support from the Agency for Healthcare Research and Quality (AHRQ), we built the CER Hub to provide clinical data infrastructure and advance CER. Currently, most data processing


systems supporting CER utilize non-coordinated applications, data, and processes. The result is inconsistent data specificity and availability, which limits scalability. New informatics platforms (see Table 1) provide efficient access to electronic clinical data as well as the methods and governance required for inter-institutional CER [15,25]. Briefly, a “platform” is a suite of interconnected, coordinated applications, together with the operational environment that hosts those applications and regulates their use. Applications and their operational environments are implicated in any solution to the problem of inter-institutional CER [15].

The CER Hub, a web-based informatics platform, supports collaborative research studies that use comprehensive electronic clinical data from multiple institutions. The CER Hub offers tools and methods for developing and applying study-specific, standardized processors of data that are distributed to participating “data providers” (typically, health care delivery organizations), enabling accurate aggregation and analysis of EHR data. Patient confidentiality is ensured by disclosing only study-specific coded data outside of the data providers’ organizations. Each CER Hub study team receives its own website and access to tools and methods to develop EHR-based measures. The CER Hub helps researchers to build, test, and share processors of heterogeneous clinical data to answer study-specific questions using diverse clinical data systems. This platform surmounts challenges to multi-institutional CER by providing access to the entire medical record for efficient data organization, aggregation, and analysis.

2. Material and methods

A CER Hub study involves two potentially independent types of participants: (1) data providers that own the data and perform local work in accordance with the framework described below, and (2) researchers who design and conduct studies using EHR data to answer questions.
In the current CER Hub consortium, all investigators are employees of the six organizations providing data for the initial two CER studies (see Section 3). However, there is no requirement that science participation and data provision be coupled on CER Hub studies.

2.1. CER Hub data processing pipeline: EHR encounters to study results

Fig. 2 displays the CER Hub data processing pipeline. Activities unique to a study – those that require data or metadata derived from a specific study question – are shown as shaded objects. In the lower half of the figure, a series of data transformations generates a flow of data from an EHR data warehouse on the left (only one is shown) to study results on the right. The pipeline provides for extraction, modeling, aggregation, and analysis of EHR data to address study questions across EHR implementations, care providers, settings, and institutions.

Sittig et al. [15] have described six common processes across emerging CER informatics platforms (Fig. 1). Fig. 2 shows how activities in the CER Hub data processing pipeline provide a scalable solution to the requirements of multi-institutional CER described by Sittig et al. The six common processes as implemented by CER Hub are described next (see Sections 2.2 through 2.7).

2.2. The identification of applicable data within health care transaction systems

CER Hub studies begin with a protocol that defines the study population, describes the research questions and analysis methods, and specifies the data to be used in the analyses. The structure of data extracted from the EHR for studies is based on a common data model that does not change across studies. The study protocol



Table 1. Large-scale projects implementing informatics platforms addressing various aspects of comparative effectiveness research (CER).

- The Comparative Effectiveness Research Hub (CER Hub): A web-based platform for implementing multi-institutional studies using the MediClass system for processing comprehensive electronic medical records, including both coded and free-text data elements.
- Washington Heights/Inwood Informatics Infrastructure for Comparative Effectiveness Research (WICER) [16]: Creating infrastructure to facilitate patient-centered, comprehensive analysis of populations in New York City, NY by leveraging data from existing EHRs, and combining data from institutions representing various health care processes.
- Scalable PArtnering Network for comparative effectiveness research: across lifespan, conditions, and settings (SPAN) [17]: Uses its virtual data warehouse (VDW) to provide a standardized, federated data system across 11 partners spread out across the nation.
- The Partners Research Patient Data Registry (RPDR) [18] and Shared Health Research Information Network (SHRINE) [19]: An enterprise data warehouse combined with a multi-faceted user interface that enables clinical research and CER across Partners Healthcare and Harvard in Boston, MA.
- The Indiana Network for Patient Care (INPC) Comparative Effectiveness Research Trial of Alzheimer’s Disease Drugs (COMET-AD) [20]: Started in 1994 as an experiment in community-wide health information exchange serving five major hospitals in Indianapolis, IN. Currently using data from hospitals and payers statewide to monitor various health care processes and outcomes.
- The Surgical Care Outcomes Assessment Program Comparative Effectiveness Research Translation Network (SCOAP-CERTAIN) [21]: Assessing how well an existing statewide quality assurance and quality improvement registry (i.e., the Surgical Care Outcomes Assessment Program) can be leveraged to perform CER.
- Strategic Health IT Advanced Research Project on Secondary Use of Clinical Data (SHARP-n) [22]: Developing open source services and components to support the ubiquitous exchange, sharing and reuse or ‘liquidity’ of operational clinical data stored in electronic health records.
- Scalable National Network for Effectiveness Research (SCANNER) [23]: A consortium of UCSD, Tennessee VA, and three federally qualified health systems in the Los Angeles area supplemented with claims and health information exchange data, led by the University of Southern California.
- Patient-Centered Outcomes Research Network (PCORnet) [24]: An initiative to establish an effective, sustainable national research infrastructure that will advance the use of electronic health data in comparative effectiveness research.

Fig. 2. CER Hub data processing pipeline. Data status is indicated as one of “PHI” (data records contain Protected Health Information), “No PHI” (data absent the 18 personal identifiers per the HIPAA safe harbor method), and “LDS” (data shareable as a Limited Data Set for specific research purposes under HIPAA).
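The three data statuses in Fig. 2 hinge on which identifiers a record still carries. As a rough illustration of the safe harbor idea, a de-identification step can be sketched as below; the field names are hypothetical, and only a handful of the 18 identifier classes are shown.

```python
# Hypothetical field names for an extracted record. The HIPAA safe harbor
# method removes 18 classes of identifiers (names, addresses, contact details,
# record numbers, most date elements, etc.); this sketch drops a few
# illustrative fields and coarsens the birth date to a year.
SAFE_HARBOR_DROP = {"name", "street_address", "phone", "email", "mrn", "ssn"}

def deidentify(record):
    """Return a copy of the record with example identifier fields removed."""
    out = {k: v for k, v in record.items() if k not in SAFE_HARBOR_DROP}
    if "birth_date" in out:                      # retain only the year
        out["birth_year"] = out.pop("birth_date")[:4]
    return out

rec = {"name": "Jane Doe", "mrn": "12345", "birth_date": "1960-07-04",
       "encounter_type": "outpatient"}
print(deidentify(rec))
# → {'encounter_type': 'outpatient', 'birth_year': '1960'}
```

In the actual pipeline this step runs locally at each data provider, so identified data never leave the provider's environment.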



sets the parameters for which patients and what periods of data to include for each project. The CER Hub provides a dedicated, secure website for each study, with standard collaboration tools such as a team calendar, project membership roster, links to relevant resources, and internal document management. In addition, the website offers informatics tools for developing and implementing the study, including those to develop a standardized study-specific data processor (described in Section 2.5).

Key early activities of every CER Hub study are developing the measures and deciding on the clinical events. Clinical events are indications evidenced within an encounter record (e.g., wheezing reported as persistent), whereas measures aggregate clinical events over time to address study questions (e.g., wheezing that persists across multiple encounters over a one-year period might be used as a measure of persistent asthma). The definition of measures and clinical events for a study generates requirements for the datasets and the data processor used to extract clinical events from those data.

When developing and testing the data processor for a given study, two datasets (one for development and one for validation of data processor components) are extracted from each provider’s data warehouse, de-identified locally (i.e., stripped of the 18 personal identifiers per HIPAA’s “safe harbor method”), and shared with the CER Hub. These datasets are made available to the study team through secure web applications that provide the capacity to annotate the records, develop a gold standard, and evaluate the capacity to automatically extract clinical events from encounter records. Fig. 2 shows the extraction of datasets for quality assurance (second from bottom) and for data processor development in the two data-flow branches at the bottom. Fig. 3 shows the workflow associated with study development.

2.3. Extraction to a local data store

All data are extracted in accord with a user-friendly model we developed, the Clinical Research Document (CRD). The CRD schema (CRDS) defines an encounter-based view of a patient’s contact with their health care provider. This schema is designed to capture all transactional data generated in the EHR for a specific contact, including vital signs, medications, procedures ordered or performed, diagnoses, progress notes, and after-visit summaries (Table 2). Data generated between visits with the health system (e.g., medication dispenses, delayed updates to patient problems, and other elements that are time-stamped but not otherwise linked to a patient visit) are captured in the same format using an “encounter type” attribute on the record that allows the same structure to represent these between-visit data elements. The CRDS specifies both the structure of data elements generated in the EHR as part of the encounter, and the allowed vocabularies and values for each component.

Generation of data from the EHR is a two-step process. First, an extraction step produces XML documents that conform to the structure and data types specified by the CRDS. Second, the content of discrete (coded) data fields is mapped to prescribed codes using the emrAdapter tool, customized for and managed by each data provider [26]. These steps are described in greater detail next.

2.4. Modeling data to enable common representations across multiple health systems

The CER Hub utilizes data extracted from EHRs with the aid of industry-standard formats provided by the Health Level 7 (HL7) Clinical Document Architecture (CDA) [27,28]. While EHR systems are becoming increasingly capable of generating CDA documents directly, the CRDS format gives data providers a simplified “on-ramp” to participate in the CER Hub in the absence of a


mechanism to generate CDA documents directly from their EHRs, providing an intermediate representation that can be translated into HL7’s standard CDA formats with CER Hub tools.

The CER Hub uses an early implementation of HL7’s CDA to represent encounters as clinical contacts between the health care system and patient, generalized (as described above in Section 2.3) to include between-contact patient data generated by the health system. To create encounter records in CDA format, data providers initiate them in CRD format. They next employ the emrAdapter tool to implement semantic constraints on each field of the CRD (e.g., the medication coding system must be “NDC” or “RxNorm”). This two-step process allows for standardization in how data are extracted, providing for coordinated, managed, and standardized semantics across multiple data providers and across time. Extracted and normalized study data are transformed from the CRD structure into a format compatible with HL7’s CDA, using a second implementation of the emrAdapter tool. Past work has utilized CDA 1.0 as the target format in our studies. We are also working to develop support for an implementation of CDA 2.0 derived from the Continuity of Care Document (CCD) [29].

The CCD is an implementation of CDA 2.0 that contains a selective “snapshot” of patient information relevant to a referral, and typically does not contain the patient’s entire history or a representation of all clinical events that may be relevant to research. It has been noted that CCDs generated by current (2011) Office of the National Coordinator – Authorized Testing and Certification Body (ONC–ATCB)-certified EHRs are not sufficient for many data uses, including many types of research [30]. For research purposes, a more granular and more complete representation of the patient’s medical record is often required.
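The two-step generation process described in Sections 2.3 and 2.4 can be sketched in miniature: build an XML document shaped like an encounter record, then apply an emrAdapter-style semantic constraint to a coded field. The element names and the check below are illustrative assumptions only; the actual CRDS and emrAdapter configurations are far richer.

```python
import xml.etree.ElementTree as ET

# Step 1 (sketch): extraction produces an XML document shaped like the CRDS.
# Element and attribute names here are hypothetical, not the real schema.
doc = ET.Element("ClinicalResearchDocument")
enc = ET.SubElement(doc, "Encounter", {"type": "1"})  # 1 = outpatient, scheduled
ET.SubElement(enc, "Medication", {"codingSystem": "RXNORM", "code": "435"})
ET.SubElement(enc, "ProgressNote").text = "Patient counseled on inhaler use."

# Step 2 (sketch): an emrAdapter-style semantic constraint on a coded field,
# mirroring the paper's example that medication coding must be NDC or RxNorm.
ALLOWED_MED_SYSTEMS = {"NDC", "RXNORM"}

def medications_conform(root):
    """True if every Medication element uses an allowed coding system."""
    return all(m.get("codingSystem") in ALLOWED_MED_SYSTEMS
               for m in root.iter("Medication"))

print(medications_conform(doc))  # → True
```

Keeping the structural extraction and the vocabulary constraints as separate steps is what lets each data provider customize its local mappings while all providers converge on the same normalized representation.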
Therefore, we are developing a CDA 2.0-compliant format we call the CRD–CCD, which will serve as our common data model for future CER Hub studies. In this format, a single encounter is represented in one document, and all EHR data generated by care operations for the encounter are included. For a given study, all such encounters relevant to the study protocol are extracted from the EHR or data warehouse and provided as input to a study-specific data processor (described next). The CRD–CCD implementation of CDA 2.0 incorporates the constraints required of the CCD and documents any CRD-required deviations from the CCD. The goal is to create an automated pathway to include CCD documents that are generated from mechanisms other than the CER Hub, such as those generated by an EHR directly for care operations or data exchange as mandated by the federal government’s meaningful use criteria [31].

2.5. Aggregation of data according to this common data model

Many events of interest to CER studies reside in providers’ progress notes and other text clinical notes. To identify study-specific events in the complete encounter record, including both coded and free-text data, a “data processor” is developed (using CER Hub tools) that extends and specializes a proven medical record classifier (MediClass), enabling access to the entire encounter record [12]. MediClass applications use classification rules developed specifically for each study or re-used and extended from prior studies. These classification rules entail Boolean combinations of clinical concepts. A large database of clinical concepts, defined by equivalence classes of terms from combined clinical vocabularies of more than 120 source terminologies, is provided by the Unified Medical Language System’s (UMLS) Metathesaurus [32]. Tools on the CER Hub allow for rule and concept development, testing, extension, and refinement to enable study-specific applications of MediClass.
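One simple way to picture such rules is as Boolean combinations over the set of concepts detected in an encounter. The rule encoding and concept names below are invented for illustration and are not the actual MediClass rule language.

```python
# Rules expressed as (all_of, none_of) concept sets: one simple encoding of
# Boolean combinations of clinical concepts. Concept and event names are
# hypothetical, chosen to echo the asthma and smoking cessation studies.
RULES = {
    "PERSISTENT_ASTHMA_EVENT": ({"WHEEZING", "ASTHMA_DX"}, {"SYMPTOMS_DENIED"}),
    "CESSATION_DELIVERED": ({"TOBACCO_USE", "QUIT_ADVICE"}, set()),
}

def classify(concepts):
    """Return the clinical events whose rules are satisfied by the concept set."""
    return {event for event, (all_of, none_of) in RULES.items()
            if all_of <= concepts and not (none_of & concepts)}

found = classify({"WHEEZING", "ASTHMA_DX", "ALBUTEROL_ORDER"})
print(sorted(found))  # → ['PERSISTENT_ASTHMA_EVENT']
```

Because the concepts themselves are grounded in shared vocabularies, the same rule set can be applied unchanged to encounter records from every participating data provider.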
Other tools hosted on the CER Hub enable comparison of automated classification performed by the MediClass application to manual coding of development and validation datasets from each data provider. Altogether these processes, performed by study team



Table 2. Fields of the Clinical Research Document schema (CRDS), highlighting those discrete variables that are constrained by controlled vocabularies (other variables not shown are either numeric or free-text).

Patient detail (8 variables)
- Gender: M = Male, F = Female, UN = Unknown or other
- Race primary, Race secondary, Ethnicity: CDC-defined Race & Ethnicity coding system adopted by HL7 and the Health Information Technology Standards Panel (HITSP) and supported by the PHIN Vocabulary Access and Distribution System (PHIN VADS)

Encounter detail (6 variables)
- Encounter type: 1 = outpatient – scheduled care, 2 = outpatient – emergency department care, 3 = outpatient – urgent care/unscheduled or same day visit, 4 = outpatient – ancillary, 5 = outpatient – observation, 6 = inpatient, 7 = lab, 8 = pharmacy, 9 = telephone, 10 = assist, 11 = e-mail, 12 = other
- Service department: NUCC Health Care Provider Taxonomy Code Set: v11.0, 1/1/11

Providers (3 variables)
- Provider type: 1 = attending physician, 2 = physician trainee, 3 = other
- Provider dept: NUCC Health Care Provider Taxonomy Code Set: v11.0, 1/1/11

Payers (3 variables)
- Ins coverage: 1 = commercial/private, 2 = veterans affairs, 3 = CHAMPUS, 4 = medicare traditional, 5 = medicare managed care plan, 6 = medicaid traditional, 7 = medicaid managed care plan, 8 = managed care HMO/pre-paid, 9 = managed care PPO/fee-for-service, 10 = managed care capitated, 11 = self pay/charity, 12 = workers compensation, 13 = other

Vital Signs (12 variables; none mapped to controlled vocabularies)

Visit Diagnosis (5 variables)
- Diag code: (see Diag coding system)
- Diag coding system: ICD9CM, SNOMEDCT, Other (give name)
- Diag order: 1 = Primary, 2 = Secondary

Problems (6 variables)
- Prob code: (see Prob coding system)
- Prob coding system: ICD9CM, SNOMEDCT, Other (give name)
- Problem status: Active, Inactive, Chronic, Intermittent, Recurrent, Rule out, Ruled out, Resolved

Medications (25 variables)
- Med code: (see Med coding system)
- Med coding system: NDC, RXNORM, Other (give name)
- Med event type: Order, Discontinue order, Dispense, Administer, Medication review taking, Medication review discontinued, Administrative cancellation
- Strength units: 1 = MEQ/MG, 2 = MEQ/ML, 3 = MG/ACTUAT, 4 = MG, 5 = MG/ML, 6 = ML, 7 = PNU/ML, 8 = UNT/MG, 9 = UNT/ML
- Dose units: 1 = tablets, 2 = capsules, 3 = vials, 4 = packs, 5 = ML, 6 = MG, 7 = ACTUAT
- Route: FDA Route of Administration; NCI Thesaurus OID: 2.16.840.1.113883.3.26.1.1; NCI concept code for route of administration: C38114
- Freq units: seconds, minutes, hours, days, weeks, months

Spirometry (21 variables; none mapped to controlled vocabularies)

Tobacco (6 variables)
- Tobacco status: 1 = current, 2 = former, 3 = never, 4 = unknown
- Tobacco type: 1 = cigarettes, 2 = pipes, 3 = cigars, 4 = smokeless, 5 = unknown
- Smoking packs per day: 0 = 0 packs a day (non-smoker would have this value), 1 = 1/2 pack a day, 2 = 1 pack a day, 3 = 2 packs a day, 4 = 3 packs a day, 5 = 4 packs a day, 6 = 5 or more packs a day, 7 = Unknown

Immunizations (5 variables)
- Code: (see Coding system)
- Coding system: CPT, RXNORM, Other (give name)

Reasons (4 variables)
- Coding system: Other (give name)

Allergies (7 variables)
- Code: (see Coding system)
- Coding system: CPT, RXNORM, Other (give name)
- Severity: High, Medium, Low
- Status: Active, Prior History, No Longer Active

Health maintenance alerts (5 variables)
- Code: (see Coding system)
- Coding system: Other (give name)
- Resolution code: Done, Deferred, Pt Refused, Cancelled/NA

Procedures (5 variables)
- Code: (see Coding system)
- Coding system: CPT, SNOMEDCT, LOINC, Other (give name)
- Status: Cancelled, Held, Aborted, Active, Completed

Please cite this article in press as: B.L. Hazlehurst, et al., CER Hub: An informatics platform for conducting comparative effectiveness research using multi-institutional, heterogeneous, electronic clinical data, Int. J. Med. Inform. (2015), http://dx.doi.org/10.1016/j.ijmedinf.2015.06.002


Table 2 (Continued). CRDS sections, total # of variables, variables mapped to controlled vocabularies, and controlled vocabularies.

Referrals (8 variables):
  Code: (see CodingSystem)
  Coding system: CPT, SNOMEDCT, LOINC, Other (give name)
  Status: Cancelled, Held, Aborted, Active, Completed
  Med specialty code: NUCC Health Care Provider Taxonomy Code Set: v11.0, 1/1/11

Progress notes (6 variables):
  Code: (see CodingSystem)
  Coding system: Other (give name)
  Note status: Addendum, Signed, Retracted

Patient instructions (5 variables):
  Code: (see CodingSystem)
  Coding system: Other (give name)
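To illustrate how a data provider might normalize local EHR values into the CRDS controlled vocabularies above, the following is a hypothetical sketch using the Tobacco section codes from Table 2. The function and variable names are illustrative assumptions, not part of the CER Hub tooling.

```python
from typing import Optional

# CRDS Tobacco status codes (Table 2): 1 = current, 2 = former,
# 3 = never, 4 = unknown. The local EHR string keys are assumptions.
TOBACCO_STATUS = {"current": 1, "former": 2, "never": 3, "unknown": 4}

def packs_per_day_code(packs: Optional[float]) -> int:
    """Map a numeric packs-per-day value to the CRDS category code:
    0 = 0 packs (non-smoker), 1 = 1/2 pack, 2 = 1 pack, 3 = 2 packs,
    4 = 3 packs, 5 = 4 packs, 6 = 5 or more packs, 7 = unknown."""
    if packs is None:
        return 7
    if packs == 0:
        return 0
    # Assign the smallest category whose upper bound covers the value.
    for limit, code in [(0.5, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
        if packs <= limit:
            return code
    return 6  # 5 or more packs per day
```

A site-local extract would apply such mappings before emitting CRD records, so that downstream data processors see only the controlled-vocabulary codes.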

members, yield a "data processor tuning loop" ensuring that automated clinical event identification meets study goals (see Fig. 4). At the start of each CER Hub study, each participating data provider sends a de-identified dataset of encounter records (per the HIPAA safe harbor method, as described in Section 2.2) in CRD format to the CER Hub coordinating center. Using a secure project website, the study team develops and tests a study-specific data processor using CER Hub tools (see Figs. 2–4). Once the data processor performs satisfactorily, it is downloaded into data providers' respective secure environments to produce study data from encounter records, as shown in the main processing flow of Fig. 2.

2.6. Analysis of data to address research questions

Each data provider runs the study-specific data processor on extracted encounter records, augmenting the source records with identified clinical events of interest. For example, patient-reported symptoms or history, clinical interpretations, patient counseling, and other study-specific data that would only be found in text notes can be included to answer study questions [33,34]. Other relevant data are provided in coded fields (e.g., medication or procedure orders); these can be identified with the same data processor rules based on UMLS concepts. Because study data may involve hundreds of thousands of confidential medical encounter records, the study-specific data processor is distributed to data providers, which run the processor within secured local data systems. The resultant augmented encounter records are then filtered via relatively simple extraction rules to produce a shareable limited data set (LDS, per HIPAA rules) called the eventstream file (Fig. 2). This simple text file includes one comma-delimited row per clinical event, identifying (with keys held only by the data provider) the patient, care

provider, location of care, and clinical event detail. The eventstream file includes only coded data addressing the clinical events needed to answer the study questions, and conforms to HIPAA rules for research data sharing. Consistent with data use agreements, data providers share eventstream files with a centralized analysis datacenter, where a design for aggregating clinical events into study measures is applied. The collaborating study team reviews and refines results as necessary.

2.7. Dissemination of study results

Each CER Hub study team operates under the usual model of investigator-led, collaborative research. The CER Hub framework provides an efficient method to streamline the research process and thereby compress the time to study results. Study leaders agree to make resources available to the broader CER Hub membership once a study has produced its main finding. In particular, the protocol defining study measures, events, rules, concepts, and terms becomes part of a "library" that is open to the broader CER Hub membership. In this manner, the CER Hub can accelerate publication of study results and ensure that knowledge resources are productively reused.

Fig. 3. Workflow associated with a CER Hub study. Boxes shown in orange represent activities that take place at the local sites of data providers. Boxes shown in blue represent activities that utilize centralized CER Hub facilities. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 4. The CER Hub informatics tools that support study workflow. Yellow ovals represent specific tools of the CER Hub platform. Boxes show a sequence of tasks performed using these tools to generate study data. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3. Results

The CER Hub is being used to conduct retrospective research studies utilizing data from six diverse health systems. Researchers and data providers for these studies come from three Kaiser Permanente health plans (Northwest, Hawaii, and Georgia regions), a consortium of community health centers located primarily along the West Coast (OCHIN, Inc.), a Veterans Administration service region (VA Puget Sound in Washington), and an integrated network of hospitals and physicians in the greater Dallas/Fort Worth area (Baylor Scott & White Health). Table 3 provides an overview of the health systems and the data provided for these studies, which together represent more than 2.5 million patients. Researchers associated with these health systems are using the CER Hub to conduct studies addressing the effectiveness of (1) medications for controlling asthma and (2) smoking cessation services. The power of the platform is evident in these studies' capacity to make efficient secondary use of large volumes of EHR data that have been collected in the course of patient care. All six health systems have built out their capacity to generate data records in accordance with the prescribed standardized formats. Each study site was equipped with a single server or workstation hosting at least 500 GB of local disk storage and 8 GB of memory.

The CER Hub studies in smoking cessation services and asthma control therapy have defined EHR-based measures relevant to each study, based on established clinical guidelines [35–37]. These studies include hundreds of thousands of smokers and diagnosed asthmatics (Table 3). For each study, a dataset of 500 representative encounter records was generated at each project site, purged of PHI, and manually coded for the necessary components. This coding took place using annotation tools provided on the centralized CER Hub website. The de-identification of these records, including the text notes, proved to be time-consuming and costly; however, these data provided the information necessary to develop and validate each study-specific data processor. Several

Table 3
Participating data providers and patient populations included in the Asthma Control and Smoking Cessation CER studies of the CER Hub.

| Site                        | EMR                 | Asthma population* 2006–2012 | Smoker population** 2006–2012 | Full health system population*** 2006–2012 |
|-----------------------------|---------------------|------------------------------|-------------------------------|--------------------------------------------|
| Baylor Scott & White Health | GE Centricity       | 38,940 (6.69%)               | 68,465 (11.77%)               | 581,759                                    |
| KP Georgia                  | Health Connect/Epic | 38,554 (8.91%)               | 68,912 (15.92%)               | 432,762                                    |
| KP Hawaii                   | Health Connect/Epic | 39,084 (11.02%)              | 65,117 (18.36%)               | 354,599                                    |
| KP Northwest                | Health Connect/Epic | 74,103 (10.53%)              | 148,588 (21.12%)              | 703,400                                    |
| OCHIN, Inc.                 | Epic                | 42,604 (10.09%)              | 93,651 (22.19%)               | 422,124                                    |
| VA Puget Sound (Seattle)    | VISTA               | –––                          | 42,988 (25.69%)               | 167,348                                    |
| Total distinct patients     |                     | 233,285 (8.76%)              | 487,721 (18.32%)              | 2,661,992                                  |

Notes: *, ** Patients shown are ages 12 years and older on 01/01/2006 with either an asthma diagnosis code (*) or a smoker diagnosis (or social history) code (**) applied to a visit during the 2006–2012 time period. *** Patients shown are ages 12 years and older on 01/01/2006 with at least one encounter during the 2006–2012 time period. The VA study site did not participate in the Asthma Control study.
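As a quick consistency check on Table 3, each prevalence percentage is the site's condition population divided by its full system population, and the totals are column sums (the VA site contributes no asthma count). The arithmetic can be verified directly:

```python
# Table 3 arithmetic check: totals are column sums, and each prevalence
# percentage is (condition population) / (full system population).
asthma = [38940, 38554, 39084, 74103, 42604]          # 5 asthma-study sites
smoker = [68465, 68912, 65117, 148588, 93651, 42988]  # all 6 sites
full   = [581759, 432762, 354599, 703400, 422124, 167348]

assert sum(asthma) == 233285   # total distinct asthma patients
assert sum(smoker) == 487721   # total distinct smokers
assert sum(full) == 2661992    # total distinct patients

# Baylor Scott & White asthma prevalence: 38,940 / 581,759 = 6.69%
assert round(100 * asthma[0] / full[0], 2) == 6.69
# Overall smoker prevalence: 487,721 / 2,661,992 = 18.32%
assert round(100 * sum(smoker) / sum(full), 2) == 18.32
```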


rounds of data quality assurance activities were conducted across the network, using samples of the study data [38]. Data processors for each study were built, validated, and deployed to generate CER data across a six-year period (only 5 of the 6 study sites participated in the Asthma Control study). Application of the data processor to the full study datasets took from one to several weeks per study. Challenges encountered included insufficient disk space and contention for processing resources created by local virus-scanning software. Manuscripts reporting on the data QA process and on study results are in preparation.

The use of the CER Hub is expanding to additional areas, with three studies evaluating: (1) exercise counseling in primary care as documented in the EHR; (2) use of the EHR to enhance detection and characterization of preconception health; and (3) a method to identify COPD exacerbations and the quality of care addressing exacerbations.

4. Discussion

The CER Hub demonstrates that a cross-institutional informatics platform can be built atop a widely available, generic clinical data interchange format (i.e., the CDA). To meet the complex challenges of CER, our platform offers several key capabilities:

4.1. Ability to process and analyze both the free-text and coded clinical data portions of the patient's entire health record

Failure to include patient data contained within the free-text portions of EHRs will lead to erroneous estimates of the effectiveness of many clinical interventions. For many CER questions, the data reside only in the EHR text notes.

4.2. A distributed data processing environment supported by a distributed data and study governance process that enables each organization to maintain control of its data

Failure to ensure patient, and even organizational, confidentiality will greatly reduce researchers' access to the large, cross-organizational data sets required to address many CER questions. Furthermore, to gain efficiencies, we need to create governed independence between data creators and data users. In other words, we need methods allowing researchers to analyze data that they were not personally, or even organizationally, involved in collecting. Likewise, we must enable patients and organizations without the time, money, expertise, or even interest to participate in the generation of new knowledge by sharing their data in ways that allow them to maintain physical control.

4.3. Use of a standard, generic clinical data interchange format (consistent with data standards required of EHRs for certification by the Office of the National Coordinator for Health IT) to facilitate standardized extraction of patient-specific clinical research data from any EHR [39]

Aggregating, organizing, reducing, and transforming clinical data from the transactional data systems used by modern EHRs into a longitudinal format that researchers can interpret and synthesize is important, yet difficult [40]. Taking advantage of data interchange standards creates efficiencies and promises to compress the time needed to generate health care research results.

4.4. Development of a library of shareable clinical data processing applications that anyone in the CER Hub community can use

Failure to reuse the time-consuming work of extracting and modeling the clinical events necessary to conduct CER will significantly delay our understanding of which interventions work and which do not. The CER Hub library offers a mechanism for accelerating knowledge development related to using EHRs for CER.

Several other large-scale informatics platforms are currently under development or in use for various CER-related projects (Table 1) and address many of the issues we have discussed in this paper. We believe the CER Hub is distinguished by its focus on these four capabilities, which allow CER Hub users to collaborate efficiently on large, multi-institutional CER projects and sustainably generate knowledge to improve health care.

4.5. Limitations

The CER Hub provides infrastructure for collaborations of scientists conducting studies that aggregate diverse, heterogeneous EHR data. Many challenges remain in making EHR information interoperable for research purposes across health care institutions. The CER Hub provides a common data model defining content constraints on the HL7 Clinical Document Architecture (CDA), together with a method and tools for extracting standardized study variables from CDA records, including from the free-text clinical notes contained in those records. Our solution provides a framework to aggregate data across EHRs as needed for the studies described above, but it will undoubtedly need ongoing refinement to accommodate new EHRs, new data content areas, and new studies. The HL7 CDA provides guidelines, templates, methods, tools, and some specific implementation choices (including clinical content choices) that form a foundation for achieving data interoperability in research studies. Our approach has been to (a) utilize this framework to capture EHR source data, and (b) make use of adopted implementations of the CDA that can be leveraged to provide research data "out of the box" from vendors' EHR products. As such, we have utilized early versions of the CDA and have ongoing work to utilize the Continuity of Care Document (CCD) implementation, a more recent and highly refined implementation of CDA used in practice for patient referral to external care services [41]. To ensure we are not creating a "data model silo", we have attempted to clearly document the requirements of each field in our common data model, and our ongoing work strives to define all deviations from the CCD using HL7 CDA methodology and conformance rules [29]. Such characterizations might also be captured using other meta-models, such as IHE Profiles [42].

Our methods have been guided by HL7 standards and methodology in developing the CER Hub infrastructure, yet some reviewers have noted that we are not doing the work of defining the "one global standard" that is surely needed for regional and national exchange of clinical data for research purposes. The "chicken-or-egg" challenge is that before standards can be successfully developed or utilized globally, infrastructure to use those standards productively in limited ways (and to refine them so that they actually work in practice) must be developed and shared. To that end, the CER Hub represents a pilot implementation that bites off a large enough piece of the problem to produce real and tangible outcomes (multi-institution CER study results) while serving as a learning platform upon which future standards can be developed and refined.

In this paper, we barely touched upon the many issues in governance, organizational policies, data security, and other matters essential to operating a research network. Instead, we focused here

on the data representation, manipulation, and integration challenges and their solutions, in an attempt to achieve clinical data interoperability for research. Ours is one pilot implementation that builds on prior work to explore the relevant issues and develop solutions. The job is not complete, but this work will hopefully inform future efforts. The current state of knowledge does not permit optimal design of a national, or even regional, research data interchange platform today; much more work must be done to clearly identify problems and potential solutions [11,43].

It is also worth noting that the alternative to piloting research networks such as the CER Hub, which learn and generate trial solutions, is to drive the implementation of standards through top-down mandates. Meaningful Use is the most relevant example here: it attempts to drive standards implementation through EHR vendors responding to use-case needs that the government establishes for EHR users (health care systems). This is a massive effort to use the HL7 CDA framework to achieve clinical data interoperability at the national level. Definitive evaluation of the outcome of that effort is premature, but the road has clearly been bumpy [44].

Summary points

• Comparative effectiveness research (CER) requires coordinated and scalable methods for extracting, aggregating, and analyzing complex, multi-institutional clinical data.
• The CER Hub is a web-based informatics platform for developing and conducting CER studies that combine comprehensive electronic clinical data from multiple health care organizations.
• The CER Hub platform implements an internet-accessible, informatics standards-based data processing pipeline for developing study-specific applications to access both coded and free-text, patient-centric electronic health record (EHR) data across organizations.
• The CER Hub is being used to conduct two CER studies utilizing data from six geographically distributed and demographically diverse health systems.

5. Conclusion

Although the growing adoption of EHRs generates substantial amounts of electronic clinical data, these data are not directly available for CER, for various social and technical reasons. CER requires coordinated and scalable methods for extraction, aggregation, and analysis of complex, multi-institutional clinical data. By offering a range of informatics tools integrated into a framework for conducting studies using EHR data, the CER Hub provides a solution to the challenges of multi-institutional comparative effectiveness research.

Author contributions

Brian L. Hazlehurst and Stephen E. Kurtz: manuscript writing, methods development and application, oversight and management of the entire project. Dean F. Sittig: manuscript writing, review, and editing. Andrew Masica, Jon E. Puro, and Vinutha Vijayadeva: oversight and management of one project site, data analysis review, manuscript review and editing. Victor J. Stevens and Mary Ann McBurnie: data analysis review, manuscript review and editing. David H. Au and Elissa D. Brannon: oversight and management of one project site, data analysis review, manuscript review and editing.

Acknowledgements

The CER Hub project (www.cerhub.org) is funded by grant R01HS019828 (Hazlehurst, PI) from the Agency for Healthcare Research and Quality (AHRQ), Department of Health and Human Services. The funders had no role in protocol development, data collection, analysis, interpretation, or manuscript production. The authors of this report are solely responsible for its content.

References

[1] Institute of Medicine, Initial National Priorities for Comparative Effectiveness Research, The National Academies Press, Washington, DC, 2009.
[2] Federal Coordinating Council for Comparative Effectiveness Research, Report to the President and the Congress, June 30, 2009 (cited 19.03.10).
[3] AHRQ, Effective Health Care Program (cited 09.08.13).
[4] J.K. Iglehart, Prioritizing comparative-effectiveness research – IOM recommendations, N. Engl. J. Med. 361 (4) (2009) 325–328.
[5] C. Clancy, F.S. Collins, Patient-Centered Outcomes Research Institute: the intersection of science and health care, Sci. Transl. Med. 2 (37) (2010) 37cm18.
[6] M.S. Lauer, F.S. Collins, Using science to improve the nation's health system: NIH's commitment to comparative effectiveness research, JAMA 303 (21) (2010) 2182–2183.
[7] H.P. Selker, B.L. Strom, D.E. Ford, et al., White paper on CTSA Consortium role in facilitating comparative effectiveness research: September 23, 2009, CTSA Consortium Strategic Goal Committee on Comparative Effectiveness Research, Clin. Transl. Sci. 3 (1) (2010) 29–37.
[8] L.W. D'Avolio, W.R. Farwell, L.D. Fiore, Comparative effectiveness research and medical informatics, Am. J. Med. 123 (12 Suppl. 1) (2010) e32–e37.
[9] G. Hripcsak, D.J. Albers, Next-generation phenotyping of electronic health records, J. Am. Med. Inform. Assoc. 20 (1) (2013) 117–121.
[10] The American Recovery and Reinvestment Act of 2009, Public Law 111-5, February 17, 2009 (cited 12.08.12).
[11] D.F. Sittig, B.L. Hazlehurst, Informatics grand challenges in multi-institutional comparative effectiveness research, J. Comp. Eff. Res. 1 (5) (2012) 373–376.
[12] B.L. Hazlehurst, H.R. Frost, D.F. Sittig, V.J. Stevens, MediClass: a system for detecting and classifying encounter-based clinical events in any electronic medical record, J. Am. Med. Inform. Assoc. 12 (5) (2005) 517–529.
[13] C. Friedman, L. Shagina, Y. Lussier, G. Hripcsak, Automated encoding of clinical documents based on natural language processing, J. Am. Med. Inform. Assoc. 11 (5) (2004) 392–402.
[14] P.M. Nadkarni, L. Ohno-Machado, W.W. Chapman, Natural language processing: an introduction, J. Am. Med. Inform. Assoc. 18 (5) (2011) 544–551.
[15] D.F. Sittig, B.L. Hazlehurst, J. Brown, et al., A survey of informatics platforms that enable distributed comparative effectiveness research using multi-institutional heterogenous clinical data, Med. Care 50 (Suppl.) (2012) S49–S59.
[16] EDM Forum, Informatics Tools and Approaches to Facilitate the Use of Electronic Data for CER, PCOR, and QI: Resources Developed by the PROSPECT, DRN, and Enhanced Registry Projects, Issue Briefs and Reports, Paper 11, 2013.
[17] J.S. Brown, J.H. Holmes, K. Shah, K. Hall, R. Lazarus, R. Platt, Distributed health data networks: a practical and preferred approach to multi-institutional evaluations of comparative effectiveness, safety, and quality of care, Med. Care 48 (6 Suppl. 1) (2010) S45–S51.
[18] S.N. Murphy, G. Weber, M. Mendis, V. Gainer, H.C. Chueh, S. Churchill, et al., Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2), J. Am. Med. Inform. Assoc. 17 (2) (2010) 124–130.
[19] G.M. Weber, S.N. Murphy, A.J. McMurry, D. Macfadden, D.J. Nigrin, S. Churchill, I.S. Kohane, The Shared Health Research Information Network (SHRINE): a prototype federated query tool for clinical data repositories, J. Am. Med. Inform. Assoc. 16 (5) (2009) 624–630.
[20] C.J. McDonald, J.M. Overhage, M. Barnes, G. Schadow, L. Blevins, P.R. Dexter, B. Mamlin, INPC Management Committee, The Indiana Network for Patient Care: a working local health information infrastructure. An example of a working infrastructure collaboration that links data from five health systems and hundreds of millions of entries, Health Aff. (Millwood) 24 (2005) 1214–1220.
[21] D.R. Flum, R. Alfonso-Cristancho, E.B. Devine, A. Devlin, E. Farrokhi, P. Tarczy-Hornoch, L. Kessler, D. Lavallee, D.L. Patrick, J.L. Gore, S.D. Sullivan, CERTAIN Collaborative, Implementation of a real-world learning health care system: Washington State's Comparative Effectiveness Research Translation Network (CERTAIN), Surgery 155 (5) (2014) 860–866.


[22] S. Rea, J. Pathak, G. Savova, et al., Building a robust, scalable and standards-driven infrastructure for secondary use of EHR data: the SHARPn project, J. Biomed. Inform. 45 (4) (2012) 763–771.
[23] K.K. Kim, D. McGraw, L. Mamo, L. Ohno-Machado, Development of a privacy and security policy framework for a multistate comparative effectiveness research network, Med. Care 51 (8 Suppl. 3) (2013) S66–S72.
[24] R.L. Fleurence, L.H. Curtis, R.M. Califf, R. Platt, J.V. Selby, J.S. Brown, Launching PCORnet, a national patient-centered clinical research network, J. Am. Med. Inform. Assoc. 21 (4) (2014) 578–582.
[25] P. Payne, D. Ervin, R. Dhaval, T. Borlawsky, A. Lai, TRIAD: the Translational Research Informatics and Data management grid, Appl. Clin. Inform. 2 (3) (2011) 331–344.
[26] R. Frasier, A. Allisany, B.L. Hazlehurst, The EMR Adapter Tool: a general-purpose translator for electronic clinical data, Proceedings of the AMIA Annual Symposium 2012 (1740).
[27] R.H. Dolin, L. Alschuler, C. Beebe, et al., The HL7 Clinical Document Architecture, J. Am. Med. Inform. Assoc. 8 (6) (2001) 552–569.
[28] R.H. Dolin, L. Alschuler, S. Boyer, et al., HL7 Clinical Document Architecture, release 2, J. Am. Med. Inform. Assoc. 13 (1) (2006) 30–39.
[29] HL7/ASTM Implementation Guide for CDA Release 2: Continuity of Care Document (CCD) Release 1, 2013 (cited 15.08.13).
[30] J.D. D'Amore, D.F. Sittig, A. Wright, M.S. Iyengar, R.B. Ness, The promise of the CCD: challenges and opportunity for quality improvement and population health, AMIA Annu. Symp. Proc. 2011 (2011) 285–294.
[31] Meaningful Use EHR Certification Criteria, 2013 (cited 15.08.13).
[32] B.L. Humphreys, D.A. Lindberg, H.M. Schoolman, G.O. Barnett, The Unified Medical Language System: an informatics research collaboration, J. Am. Med. Inform. Assoc. 5 (1) (1998) 1–11.
[33] B.L. Hazlehurst, D.F. Sittig, V.J. Stevens, et al., Natural language processing in the electronic medical record: assessing clinician adherence to tobacco treatment guidelines, Am. J. Prev. Med. 29 (5) (2005) 434–439.
[34] B.L. Hazlehurst, A. Naleway, J. Mullooly, Detecting possible vaccine adverse events in clinical notes of the electronic medical record, Vaccine 27 (14) (2009) 2077–2083.
[35] National Asthma Education and Prevention Program, National Heart, Lung, and Blood Institute, Program Description, 2013.
[36] National Institutes of Health, National Heart, Lung, and Blood Institute, Expert Panel Report 3 (EPR-3): guidelines for the diagnosis and management of asthma, summary report 2007, J. Allergy Clin. Immunol. 120 (5 Suppl.) (2007) S94–S138.
[37] M.C. Fiore, Treating tobacco use and dependence: an introduction to the US Public Health Service clinical practice guideline, Respir. Care 45 (10) (2000) 1196–1199.
[38] K.L. Walker, O. Kirillova, S.E. Gillespie, D. Hsiao, V. Pishchalenko, A.K. Pai, J.E. Puro, R. Plumley, R. Kudyakov, W. Hu, A. Allisany, M. McBurnie, S.E. Kurtz, B.L. Hazlehurst, Using the CER Hub to ensure data quality in a multi-institution smoking cessation study, J. Am. Med. Inform. Assoc. 21 (6) (2014) 1129–1135.
[39] Certified Health IT Product List, Office of the National Coordinator for Health Information Technology.
[40] J.C. Feblowitz, A. Wright, H. Singh, L. Samal, D.F. Sittig, Summarization of clinical information: a conceptual model, J. Biomed. Inform. 44 (4) (2011) 688–699.
[41] C. Daniel, G.B. Erturkmen, A.A. Sinaci, B.C. Delaney, V. Curcin, L. Bain, Standard-based integration profiles for clinical research and patient safety, AMIA Jt. Summits Transl. Sci. Proc. 2013 (2013) 47–49.
[42] J.G. Klann, M. Mendis, L.C. Phillips, A.P. Goodson, B.H. Rocha, H.S. Goldberg, N. Wattanasin, S.N. Murphy, Taking advantage of continuity of care documents to populate a research repository, J. Am. Med. Inform. Assoc. (2014), Epub Oct 28.
[43] G.M. Weber, K.D. Mandl, I.S. Kohane, Finding the missing link for big biomedical data, JAMA 311 (2014) 2479–2480.
[44] J.D. D'Amore, J.C. Mandel, D.A. Kreda, A. Swain, G.A. Koromia, S. Sundareswaran, L. Alschuler, R.H. Dolin, K.D. Mandl, I.S. Kohane, R.B. Ramoni, Are Meaningful Use Stage 2 certified EHRs ready for interoperability? Findings from the SMART C-CDA Collaborative, J. Am. Med. Inform. Assoc. 21 (6) (2014) 1060–1068.

