Annals of Internal Medicine

RESEARCH AND REPORTING METHODS

Innovations in Data Collection, Management, and Archiving for Systematic Reviews

Tianjing Li, MD, MHS, PhD; S. Swaroop Vedula, MBBS, PhD; Nira Hadar, MS, PhD; Christopher Parkin, MS; Joseph Lau, MD; and Kay Dickersin, MA, PhD

Data abstraction is a key step in conducting systematic reviews because data collected from study reports form the basis of appropriate conclusions. Recent methodological standards and expectations highlight several principles for data collection. To support implementation of these standards, this article provides a step-by-step tutorial for selecting data collection tools; constructing data collection forms; and abstracting, managing, and archiving data for systematic reviews. Examples are drawn from recent experience using the Systematic Review Data Repository for data collection and management. If it is done well, data collection for systematic reviews needs to be done by only 1 team and placed into a publicly accessible database for future use. Technological innovations, such as the Systematic Review Data Repository, will contribute to finding trustworthy answers for many health and health care questions.

According to the Institute of Medicine (IOM), evidence-based clinical decisions require health care systems to meet the challenge of building workforce capacity for comparative effectiveness research (1). Systematic review, a form of comparative effectiveness research in which existing research is synthesized and interpreted, is expected to provide a complete and accurate picture of all that is known about the intervention or disease (2). One crucial step in achieving valid results in systematic reviews is to collect data from relevant study reports accurately and completely, a process known as data abstraction (also called data extraction). In data abstraction, 1 or 2 persons read a research report, such as a published paper, and abstract from it information about the study design and setting, characteristics of study participants, test and comparison interventions (or exposure groups in an epidemiologic study), outcomes, numerical results on treatment effects (or associations), and other key elements relevant to the review. Data abstraction is at the core of a systematic review because accurate data and their synthesis form the basis of appropriate conclusions.

To ensure high-quality data abstraction, the IOM provides several recommendations. Review authors should "use standard data extraction forms developed for the specific systematic review" (3.5.3) and "pilot-test the data extraction forms and process" (3.5.4) (2). The IOM also recommends that review authors should, "at a minimum, use two or more researchers, working independently, to extract quantitative and other critical data from each study. For other types of data, one individual could extract the data while the second individual independently checks for accuracy and completeness. Establish a fair procedure for resolving discrepancies—do not simply give final decision-making power to the senior reviewer" (3.5.1) (2). The Cochrane Collaboration makes recommendations similar to the 3 IOM standards and further suggests that data abstractors should "collect characteristics of the included studies in sufficient detail to populate a table of 'Characteristics of Included Studies'" (3, 4).

These recommendations highlight a few principles for data collection: comprehensiveness, completeness, accuracy, consistency, transparency, efficiency, and accessibility. Clinicians, regardless of their experience in performing systematic reviews, may not have been exposed to the nuts and bolts of data collection methods as part of their medical education. In this article, we provide a step-by-step tutorial for collecting, managing, and archiving data for systematic reviews and suggest steps for developing rigorous data collection forms in the Systematic Review Data Repository (SRDR) (http://srdr.ahrq.gov) to facilitate implementation of the methodological standards and expectations of the IOM and other organizations (Table 1) (5).

Ann Intern Med. 2015;162:287-294. doi:10.7326/M14-1603
For author affiliations, see end of text.

SELECTING DATA COLLECTION TOOLS

Data abstraction for systematic reviews is typically performed using paper forms, electronic forms (such as PDF forms or forms associated with a database, such as Microsoft Access [Microsoft]), or commercial or custom-built data systems that allow online form building, data entry by several users, data sharing, and data management. The choice of format depends largely on the size of the systematic review and the resources available to the review authors (Table 2).

Paper forms (still a commonly used approach) are pilot-tested and then distributed to data abstractors for data collection. Data collected on paper forms must then be entered into a database or sometimes a spreadsheet, compiled, and processed for analysis. These extra steps make paper forms inefficient to use and susceptible to errors. Collecting data on electronic forms is an alternative that makes it easier to process the data.


Table 1. Suggested Steps for Developing Rigorous Data Collection Forms in the SRDR

1. Develop outlines of evidence tables, figures, and meta-analyses needed for the systematic review and identify data elements needed for each
2. Assemble and logically group data elements to facilitate form development and the data collection process
3. For each data element, identify the optimal way of framing the data abstraction item to ensure complete and accurate data; use common core data elements when possible
4. Develop data abstraction forms using word processing software and maintain forms for distribution and discussion
5. Set up and pilot-test data abstraction forms in the SRDR
6. Train data abstractors in using the SRDR for data abstraction or data entry (basic training plus review-specific training)
7. Implement a plan to ensure and control data quality and to monitor progress
8. Export and clean the data for analysis

SRDR = Systematic Review Data Repository.

For example, review authors build forms using Acrobat (Adobe), Microsoft Access, or Google Forms (Google). The design of electronic forms requires familiarity with software packages, specifically off-the-shelf products oriented to form design. For example, using the Adobe Acrobat form builder, one starts by selecting the question type (such as a check box, radio button, or text field) and then specifies the question and response options through a user-friendly wizard. Data abstractors use 1 copy of the form for each study. Data collected from all studies and by several persons can then be imported and compiled electronically into a format ready for cleaning and analysis. Collecting data on electronic forms may be more efficient and result in fewer errors than collecting data on paper forms. We discourage the use of a spreadsheet for data collection because it is neither a database nor a form.

A more sophisticated alternative to electronic forms is a data system designed for capturing and storing data for systematic reviews. Data systems are usually needed for large-scale systematic reviews (such as evidence reports produced as part of the Effective Health Care Program at the Agency for Healthcare Research and Quality) or for reviews that will be updated frequently. A data system comprises the database, computer programs, forms, and procedures that control the flow of data captured in a study (or, in our case, a systematic review) (6). In systematic reviews, data systems are used to collect, maintain, and store data and to monitor progress and performance on data abstraction.

The SRDR is an example of an open-access, state-of-the-art, Web-based data system, which was developed and is maintained by the Agency for Healthcare Research and Quality-designated Evidence-based Practice Center at Brown University (Providence, Rhode Island). It was launched for public use in June 2012 (5). We chose the SRDR for data collection for this tutorial because it integrates the latest technology, is designed specifically for systematic reviews, is available to anyone for free, and bears minimal upfront cost to review authors. In addition, the repository nature of the SRDR contributes to increasing the public availability of clinical trial data and strengthens the validity and reproducibility of systematic reviews.


When using the SRDR, a project leader or the principal investigator creates the project and data abstraction forms by following a user-friendly dialogue and wizard (http://srdr.ahrq.gov/help). The project leader specifies questions and possible responses to set up a form in the SRDR. The questions may be based on existing paper or electronic forms. The project leader can provide data abstractors with varying levels of access to the form and data: permission to edit the form, enter data, or only view the form. Because the SRDR is Web-based, data abstractors at different locations can abstract and enter data directly from relevant reports into the same SRDR database. A space for data abstractors to add comments is an option associated with each data field. For example, data abstractors could note the location of hard-to-find information within a report to facilitate data checking and adjudication.

The Data Comparison Tool in the SRDR automates comparison of data abstracted in duplicate by 2 or more persons, and the 2 abstractors or a third person can then adjudicate discrepancies. The SRDR also keeps track of data abstraction progress (such as how many studies have been abstracted and by whom). Data entered into the SRDR can be readily exported into an appropriate format for data cleaning (the process of detecting and correcting incomplete and inaccurate data) and analysis. Review authors can archive or publish the cleaned data within the SRDR.

As an example, we have used the SRDR for a network meta-analysis project that included approximately 500 trials. The SRDR form we used contains approximately 125 questions and 550 data items. Sixteen data abstractors, some working remotely from other countries, abstracted data for our project using this form and the SRDR.
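The comparison that such a tool automates can be pictured with a short sketch. The following Python fragment is not the SRDR's code; it is a minimal illustration, with hypothetical field names, of how duplicate abstractions of the same study can be compared item by item to surface discrepancies for adjudication.

```python
# Minimal sketch of automated comparison of duplicate data abstraction.
# NOT the SRDR's internal code; field names and values are hypothetical.

def compare_abstractions(record_a: dict, record_b: dict) -> dict:
    """Return the items on which two independent abstractors disagree."""
    discrepancies = {}
    for item in record_a.keys() | record_b.keys():
        value_a = record_a.get(item, "<missing>")
        value_b = record_b.get(item, "<missing>")
        if value_a != value_b:
            discrepancies[item] = (value_a, value_b)
    return discrepancies

# Example: two abstractors' entries for the same trial.
abstractor_1 = {"study_design": "parallel RCT", "n_randomized": 120}
abstractor_2 = {"study_design": "parallel RCT", "n_randomized": 210}

for item, (a, b) in compare_abstractions(abstractor_1, abstractor_2).items():
    print(f"Adjudicate '{item}': abstractor 1 entered {a!r}, abstractor 2 entered {b!r}")
```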

DEVELOPING DATA COLLECTION FORMS

Regardless of whether systematic review data are collected using paper or electronic forms or a data system, such as the SRDR, the key to successful data collection is to construct easy-to-use forms and collect complete and unambiguous data that faithfully represent the source in a structured and organized manner. Development of a good form requires input and expertise from all members of the team (that is, content area experts, epidemiologists or others with formal training in form design, statisticians, and persons who will perform data abstraction), and it is an iterative process (2, 7). A good data abstraction form should minimize the need to go back to the source documents (such as the study reports). Table 1 describes suggested steps for setting up data collection forms in the SRDR; these principles are applicable to any method of data collection for systematic reviews.

Step 1: Develop Outlines of Tables and Figures

Collecting the right amount of data (not too much and not too little) in a systematic review is highly desirable, and this can be achieved efficiently by developing, ahead of time, outlines of the evidence tables and figures that will appear in the systematic review.



It is important that systematic reviewers familiarize themselves with guidelines for reporting systematic reviews and meta-analyses, such as the Preferred Reporting Items for Systematic Reviews and Meta-Analyses and the Meta-analysis of Observational Studies in Epidemiology statements, to ensure that relevant elements and sections are incorporated (8, 9). A complete list of guidelines for reporting health care research can be found at the Enhancing the Quality and Transparency of Health Research Network's Web site (www.equator-network.org).

For each research question addressed in the systematic review, the tables and figures contain qualitative and quantitative data from each study that contribute to assessing what is known about an intervention's effects, how much confidence we have in the evidence, and where the research gaps lie. A typical systematic review includes a flow chart of study selection, a table of the characteristics of included studies (also called an evidence table), a table of the characteristics of excluded or ongoing studies, a table of the risk of bias, and figures showing the meta-analysis results (2-4, 7-10). Meta-analysis, the quantitative analysis that combines results from similar but separate studies, is an optional component of a systematic review. Other tables and figures showing the summary of findings, sensitivity analyses, and subgroup analyses can be added as desired.

Step 2: Assemble and Group Data Elements

Table 3, adapted from the Centre for Reviews and Dissemination's guidance for undertaking reviews in health care (York, United Kingdom) (7) and the Cochrane Handbook for Systematic Reviews of Interventions (4) and built on our own experience, lists the types of data that are often collected to populate the tables and figures constructed in step 1. Important characteristics that would modify the treatment effect or association of interest should be collected. In addition, different data items should be collected depending on the review type. For example, the data items for assessing risk of bias differ among reviews of intervention effectiveness, epidemiology, and diagnostic test accuracy.

The data elements in Table 3 are grouped to facilitate form development and data collection. Tabs in the SRDR are designed to mimic this grouping process (Table 4). For example, questions about study characteristics could be asked under the Design and Quality tabs. It is not necessary to use every single tab in the SRDR, and systematic reviewers should arrange questions in order of flow and ease of use. Because the SRDR is a relational database (that is, its data tables are linked to each other through common keys and identifiers), responses to earlier items may affect subsequent questions. For example, if data abstractors first enter 2 study groups, "Calcium" and "Calcium plus vitamin D," under the Arms tab, the 2 entries will automatically populate the questions under the Arms Detail tab, where all data pertinent to individual study groups can be entered (examples are shown in Figures 1 to 3 of Supplement 1, available at www.annals.org). The same relationship also populates the column headers for items under the Baseline tab to ensure that baseline data from study groups are properly entered, linked, and saved.
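To picture how such a relational structure works, here is a minimal sketch in Python using SQLite. The schema and names (arms, baseline, arm_id) are hypothetical illustrations, not the SRDR's actual schema; the point is that a shared key keeps per-arm data linked to the arm it describes.

```python
# Generic illustration of a relational link through a shared key.
# Table and column names are hypothetical, not the SRDR schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE arms (
    arm_id   INTEGER PRIMARY KEY,
    study_id TEXT NOT NULL,
    name     TEXT NOT NULL   -- e.g., 'Calcium', 'Calcium plus vitamin D'
);
CREATE TABLE baseline (
    arm_id INTEGER REFERENCES arms(arm_id),
    item   TEXT,             -- e.g., 'mean age'
    value  TEXT
);
""")
conn.execute("INSERT INTO arms VALUES (1, 'Smith2010', 'Calcium')")
conn.execute("INSERT INTO arms VALUES (2, 'Smith2010', 'Calcium plus vitamin D')")
conn.execute("INSERT INTO baseline VALUES (1, 'mean age', '62.1')")
conn.execute("INSERT INTO baseline VALUES (2, 'mean age', '61.8')")

# Because baseline rows carry the arm_id key, baseline data stay linked to
# the arm they describe, mirroring how entries under one tab populate later tabs.
for row in conn.execute(
    "SELECT a.name, b.item, b.value FROM baseline b JOIN arms a USING (arm_id)"
):
    print(row)
```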

Table 2. Considerations in Selecting Data Collection Tools

Examples
- Paper forms: printed forms developed using word processing software
- Data systems: DistillerSR (Evidence Partners), the SRDR, and Doctor Evidence

Suitable review type and team sizes
- Small-scale reviews (20 studies), as well as reviews that need constant updating
- All team sizes, especially large teams (i.e., >6 data abstractors)

Cost
- Low (open-access data systems, such as the SRDR)
- High (commercial data systems)

Advantages of data systems
- Allow form building and data collection
- Allow data storage, linking, and sharing
- Can be integrated with title/abstract and full-text screening
- Can readily automate data comparison
- Allow easy monitoring of progress and performance
- Allow real-time data entry over the Internet
- Facilitate coordination among abstractors
- Improve public accessibility

Disadvantages of data systems
- Upfront investment of resources to set up the form and train data abstractors
- Commercial data systems can be expensive
- Require familiarity with data systems
- Susceptible to changes in software versions

PDF = Portable Document Format; SRDR = Systematic Review Data Repository.


Table 3. Types of Data Collected From Individual Studies in Systematic Reviews of Intervention Effectiveness, Epidemiology, and Prognosis of a Disease

General information
- Name of data abstractor, date of data abstraction, and identification features of each report from which data are being abstracted

Study characteristics
- Aim or objectives of the study
- Study design: parallel, factorial, or crossover design for RCTs; cohort, case-control, or cross-sectional design for observational studies
- Single-center or multicenter study; if multicenter, number of recruiting centers
- Region(s) and country/countries from which study participants were recruited
- Recruitment and sampling procedures used
- Enrollment start and end dates; length of participant follow-up
- Sample size or power calculation
- Details on random sequence generation, allocation concealment, and masking for RCTs; methods used to prevent and control for confounding, selection bias, and information bias for observational studies
- Methods used to prevent and address missing data
- Likelihood of reporting and other biases
- Source(s) of monetary or material support ("funding") for the study
- Authors' financial relationships and other potential conflicts of interest

Participant characteristics
- Study eligibility criteria
- Characteristics of participants at the beginning (or baseline) of the study

Intervention (or exposure) and setting
- Setting in which the intervention is delivered or exposure is in effect
- Description of the intervention(s) and comparison intervention(s) for RCTs: routes of delivery, doses, timing, frequency, and length of intervention
- Integrity of interventions (that is, the degree to which specified procedures or components of the intervention are implemented as planned)
- Description of co-interventions
- Description of exposure groups for observational studies: dose, timing, frequency, length of exposure, cumulative exposure, threshold effect, and timing between exposure and expected outcome
- Description of how exposure measurements are made

Outcomes and numerical results
For each prespecified outcome in the systematic review:
- Outcome domain or title (such as anxiety)
- Specific diagnostic or measurement method; for a scale, the name of the scale (such as the Hamilton Anxiety Rating), its upper and lower limits, whether a high or low score is favorable, and the definition of the threshold, if appropriate
- Specific metric (such as change in anxiety from baseline or anxiety at a time point)
- Method of aggregation (such as mean anxiety score or proportion of persons with anxiety)
- Timing of outcome measurements (such as the number and times of follow-up measurements and the outcome measurement time windows allowed)
- Unit of analysis (such as individual participant, clinic, or village)
- Statistical methods used
- Measure of effect or association used (such as mean difference or risk ratio)
- Between-group summary results that quantify the effect or association between the intervention (or exposure) and the outcome; precision of the summary results



Table 3 (continued)
- Type of analysis (such as intention-to-treat or per-protocol) and covariates adjusted for in the statistical model for RCTs; covariates adjusted for in observational studies
- For all groups, and for each outcome at each time point: number of participants randomly assigned, enrolled, or included in the analysis and those who withdrew or were lost to follow-up
- If subgroup analysis is planned, this information will need to be abstracted for each patient subgroup

RCT = randomized, controlled trial.

Step 3: Identify the Optimal Way of Framing the Data Abstraction Item

The next step is to decide the optimal way of framing each data abstraction item to ensure complete, consistent, accurate, and rigorous data collection. The types of questions supported in the SRDR include multiple-choice, drop-down menu, matrix-type, and text questions. Figures 4 to 9 of Supplement 1 show examples of each type of question. Much has previously been written about how to frame data items when developing robust data collection forms for primary research studies (11, 12). We summarize the key messages and highlight a few issues that are pertinent to systematic reviews (Table 5).

First, ask closed-ended questions as much as possible. A closed-ended question is one that defines a list of permissible responses (11). To set up a closed-ended question, one must anticipate and structure the possible responses and add an "other, specify" category because the anticipated list may not be exhaustive. The data collected from closed-ended questions are typically precoded and thus easy to process. In contrast, an open-ended question is characterized by the absence of a defined list of permissible responses and typically yields unformatted written text (11). Open-ended questions are useful when it is not possible to anticipate the different responses that may be given or when it is necessary to avoid leading the data abstractors by indicating permissible replies. However, open-ended questions provide little control over data quality, require post hoc coding, and may make analysis problematic or influenced by the responses.

As an example, data about adverse events are sometimes collected through open-ended questions when the association between adverse events and an intervention is not well-studied. This can be challenging because the adverse events that are of interest to a systematic reviewer may not be reported in the primary studies in a consistent format. The use of an open-ended question (such as, "What are the adverse events reported in the study?") is a shortcut in designing the form; however, data collected this way are almost impossible to summarize or analyze without further processing. As much as possible, important adverse events associated with the intervention should be prespecified, and data should be collected from each study using closed-ended questions. The option "other, specify" could be used to capture unanticipated answers.
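To make the distinction concrete, the following sketch shows what a closed-ended item with precoded responses might look like in code. The item, its options, and the validation rule are hypothetical; they simply illustrate how a defined response list (plus an "other, specify" escape) keeps the collected data precoded and easy to process.

```python
# Minimal sketch of a closed-ended data abstraction item with a defined
# list of permissible responses; names and options are hypothetical.

MASKING_ITEM = {
    "item_id": "Q12",
    "question": "Did the article report that participants were masked "
                "to the treatment assignment?",
    "responses": ["Yes", "No", "Cannot tell", "Not applicable"],
    "allow_other_specify": True,   # captures unanticipated answers
}

def validate_response(item: dict, response: str, other_text: str = "") -> bool:
    """Accept only precoded responses, or 'Other' with specified free text."""
    if response in item["responses"]:
        return True
    return item["allow_other_specify"] and response == "Other" and bool(other_text)

assert validate_response(MASKING_ITEM, "Cannot tell")
assert validate_response(MASKING_ITEM, "Other", "masking of outcome assessors only")
assert not validate_response(MASKING_ITEM, "Probably yes")  # not precoded
```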


Table 4. Types of Data Abstracted From Reports of Individual Studies and Corresponding Tabs to Use in the SRDR

Data abstracted from reports of individual studies: corresponding tab to use in the SRDR
- General information: Publication
- Study characteristics: Design and Quality
- Participant characteristics: Baseline
- Interventions (or exposures) and setting: Groups and Group Details
- Outcomes and quantitative results: Outcomes, Outcome Details, Results, and Adverse Events

SRDR = Systematic Review Data Repository.

Systematic reviewers could use validated standard formats, such as the Medical Dictionary for Regulatory Activities, to categorize adverse event data. The Medical Dictionary for Regulatory Activities, developed in the late 1990s by the International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use, is used by international regulatory agencies to monitor the safety of medical products (www.meddra.org).

Second, avoid asking a question in a way that allows the response to be left blank; instead, use "not applicable," "cannot tell," or "not reported" options as needed. For example, the response options for the question, "Were participants masked to the treatment assignment?," should include "cannot tell" for cases in which reporting is unclear or insufficient, such as when an article states simply that the study was "double-blind." The "cannot tell" option tags uncertain items and prompts data abstractors to contact authors for clarification, especially on data items critical for reaching conclusions.

Third, asking what is reported is more appropriate than asking what was done in a study, keeping in mind that the study report may not fully reflect how the study was actually conducted. The previous question would be rephrased as, "Did the article report that the participants were masked to the treatment assignment?," which, in our view, is preferable to, "Were participants masked to the treatment assignment?"

Fourth, ask 1 question at a time to avoid confusion. For example, the question, "Did the article report that participants and those who measured outcomes were masked to the treatment assignment?," should be separated into at least 2 questions, depending on the number of outcomes: 1 asking whether the article reported masking of the study participants and 1 (or more) asking about masking of outcome assessors.

Fifth, when the data abstractor or analyst has to make a judgment, abstract verbatim the information from the source document used to make the judgment so that the process is transparent. For example, for the assessment of risk of bias, we recommend recording verbatim the methods applied in the study, followed by a judgment question. The judgment question should start with the phrase, "In your opinion, is the study at high, low, or unclear risk for selection bias?"

This method allows readers of systematic reviews to evaluate the judgments for themselves.

Sixth, minimize the mathematical manipulations required during data abstraction; when calculation is needed, record the data as provided in the source document for later calculation with a computer. For example, the SDs that feed into a meta-analysis can often be computed from CIs or P values. This computation should not occur at the data abstraction stage, especially when the data abstractors have different levels of statistical training. The implication for data abstraction is to anticipate the possible data types and build data items accordingly.

Step 4: Develop Data Abstraction Forms

Data abstraction forms can be developed using word processing software, for example. A data abstraction form serves as a permanent reference that can be distributed to others, including programmers and data analysts, and as a guide for creating an electronic data abstraction form and a codebook (a document that describes how each data element is captured and coded in the data system). The form is useful for anticipating variable numbers and names; anticipating skip patterns; and checking clarity, consistency, and coding conventions. Supplement 2 (available at www.annals.org) is an example of a data abstraction form containing key data elements that are relevant to many systematic reviews. Data abstraction forms should document the name of the study, the name of the form, the version number, and the version date (11, 12). Every data item on a data collection form, which corresponds to a field in the data system, needs to be numbered. Definitions and instructions helpful for answering a question should appear next to the question to improve quality and consistency across data abstractors.

Step 5: Set Up and Pilot-Test Data Abstraction Forms in the SRDR

The SRDR developers have developed and maintain training modules on how to build questions and set up data abstraction forms in the SRDR. It is the project leader's responsibility to learn the system and build the form using the paper forms as prototypes. It is essential to develop a user manual with instructions, coding conventions, and definitions specific to the project. A manual plays a role in both quality assurance and documentation.

Table 5. Suggestions About Item Construction

- Ask closed-ended questions
- Avoid asking a question in a way that the response may be left blank
- Ask what is reported in the article instead of what has been done in the study
- Ask 1 question at a time
- When judgment is required, record the raw data (that is, quote directly from the source document) used to make the judgment
- Minimize mathematical manipulations during data abstraction; instead, extract sufficient data for subsequent computations using statistical software
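To illustrate the last suggestion in Table 5, consider recovering an SD at analysis time rather than during abstraction. Assuming a mean reported with a 95% CI under a normal approximation, the standard error is the CI width divided by 2 times 1.96, and the SD is the standard error multiplied by the square root of n; the abstractor records only the CI bounds and n verbatim. The numbers below are hypothetical.

```python
# Illustration of deferring computation to analysis time: recover an SD from
# a reported 95% CI for a mean (normal approximation; large-sample assumption).
import math

def sd_from_ci(lower: float, upper: float, n: int, z: float = 1.96) -> float:
    """SD = sqrt(n) * (upper - lower) / (2 * z) for a 95% CI around a mean."""
    return math.sqrt(n) * (upper - lower) / (2 * z)

# Abstractors record lower, upper, and n verbatim; the SD is derived later.
print(round(sd_from_ci(lower=2.5, upper=4.5, n=100), 2))  # 5.1
```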

For instance, for questions under the Outcomes tab, providing in the user manual a list of the outcomes and particular time points specified by the systematic review protocol improves consistency and completeness in data abstraction. Other examples of instructions include whether quotation marks should be used when recording verbatim text from a report, whether "U.S." should be spelled out as "United States," and whether a percent sign (%) should be entered for proportions.

All data collection forms and data systems must be thoroughly pilot-tested before launch. In our experience, testing should involve several persons abstracting data from at least 3 articles (13). The initial testing focuses on the clarity and completeness of questions and the usefulness of the user manual. After initial testing, the accuracy of the abstracted data should be checked against the source documents or verified data to identify problematic questions and response items that need modification and additional testing.

Step 6: Train Data Abstractors

Training individual data abstractors using the data abstraction form specific to the review should be undertaken as a quality assurance measure. Training should include modules to familiarize abstractors with the data system and the data abstraction form, as well as discussion among abstractors of ambiguous questions or responses to establish consistency. Training to use the SRDR begins with asking each data abstractor to complete the general SRDR training modules respective to his or her role, as required by the SRDR developers. Then, the project leader trains the data abstractors so they can work independently on the target systematic review project. Data abstractors should have a basic understanding of the clinical issues surrounding the topic; have knowledge of study design, analysis, and statistics; and pay attention to detail while following the instructions on the forms and in the user manual. There are advantages to data abstractors having complementary skills and expertise (for example, a topic area specialist and a methodologist) (4, 14).

Training sessions at the project onset and intermittently over the course of the project allow for discussion of areas that cause confusion or problems and thus facilitate high-quality data collection. For example, when data related to a single item on the form are present in multiple locations within a document (such as the abstract, main body of text, tables, and figures) or in several sources (such as publications, ClinicalTrials.gov, and conference abstracts), developing and providing instructions for following an agreed-on algorithm for data abstraction are critical and should be reinforced during the training sessions. The team should document specific issues discussed and decisions made during such sessions and circulate and file the documentation for team members' reference during subsequent data collection.

Step 7: Implement a Quality Assurance and Control Plan and Monitor Progress

Data quality refers to data completeness, consistency, and accuracy. Errors during data abstraction are common and have been well-documented (13-17).


Errors occur when data abstractors omit information from the study documents and when information is transcribed incorrectly because of typographical errors, carelessness, or misunderstanding. The estimated error rate is approximately 30% for single data abstraction and is similar across levels of experience (17). Unfortunately, errors that occur at the data abstraction stage can rarely be detected by peer reviewers or editors. The IOM notes that "the only known effective means of reducing data extraction errors is to have at least two individuals independently extract data" (2). For this reason, we recommend having 2 data abstractors who work independently to collect data in the SRDR.

The Data Comparison Tool in the SRDR can be used to check data consistency and accuracy when 2 data abstractors are paired for double data abstraction. The tool displays the responses from the 2 abstractors underneath each question. Adjudication may be performed through discussion between the abstractors concomitant with a recheck of the source document. The adjudicated results are saved separately from the initial abstraction so that both the original and the adjudicated data are captured. Individual data abstractors mark each tab in the SRDR as complete or incomplete and submit the entire form when they are satisfied with the data entered. The project leader can request a report from the SRDR to monitor the timeliness and progress of data abstraction. The project leader should also periodically export and download the abstracted data to run logic checks (the process of validating data type, range, and consistency) and discover any irregularities.

Step 8: Export and Clean the Data for Analysis

All data, or a subset of the data, abstracted for a systematic review using the SRDR can be exported for cleaning and analysis. When the complete data set is exported, a plain-text file or spreadsheet can be created, with each worksheet containing the data collected from 1 tab in the SRDR (such as Design Details or Outcomes). Individual studies are presented in rows, and data elements are presented in columns, which can be imported directly into statistical software for processing and analysis. When only a subset of the data is needed, a filtered table can be built within the SRDR by selecting the desired data fields and study attributes, and this table can be exported. For instance, users can export data from studies that reported a certain outcome or studies with a low risk of selection bias. The tables (or reports) created can stand alone or be imported back into the SRDR as part of the project record for reporting and public viewing purposes.
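As a sketch of what such logic checks might look like after export, the following Python fragment assumes a hypothetical exported file and column names (srdr_export_design.csv, enroll_start, n_randomized, and so on). It is not an SRDR feature, just an illustration of range and consistency checks run in statistical software, using the row-per-study, column-per-element layout described above.

```python
# Sketch of post-export logic checks (Steps 7 and 8). The file and column
# names are hypothetical; the layout follows the export format described
# above: one row per study, one column per data element.
import pandas as pd

data = pd.read_csv("srdr_export_design.csv")  # hypothetical export file

# Range check: enrollment end should not precede enrollment start.
bad_dates = data[pd.to_datetime(data["enroll_end"]) <
                 pd.to_datetime(data["enroll_start"])]

# Consistency check: participants analyzed cannot exceed those randomized.
bad_counts = data[data["n_analyzed"] > data["n_randomized"]]

for label, frame in [("enrollment dates", bad_dates),
                     ("participant counts", bad_counts)]:
    if not frame.empty:
        print(f"Check {label} in studies: {frame['study_id'].tolist()}")
```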

DISCUSSION

To ensure a fair accounting of what is known, data abstracted for systematic reviews must be accurate, complete, reliable, amenable to analysis and presentation, and accessible for future updates of the review and for data sharing. In this article, we provided guidance to support implementation of the current methodological standards for data abstraction in systematic reviews.


We used the SRDR, a free, Web-based data system, to illustrate principles of data abstraction for systematic reviews. The SRDR represents a technological innovation that can reliably streamline the production of systematic reviews through rigorous data collection and management; open-access, Web-based data sharing among investigators; and faithful data archiving for updating reviews.

The main reason for organized data abstraction is to capture the information necessary to answer the key questions or hypotheses specified in the systematic review. The data abstraction procedure should be structured to minimize subjective interpretation and the need to refer to source documents during data analysis. The data abstracted for a systematic review should be available for use in future updates of the review, regardless of whether the original review authors or a different group of authors update the review. Data abstractors need to be trained and provided with written documentation for ongoing guidance during data abstraction. Ideally, data collected for systematic reviews need be abstracted by only 1 review team and deposited into a publicly accessible database for future use.

Data linking and sharing have become a real challenge for groups involved in producing systematic reviews. With the Cochrane Linked Data Project of The Cochrane Collaboration as an example, efforts have been invested in linking trial data (collected as part of a systematic review) to a study-based trial registry and eventually making them readily available, accessible, and searchable to users who may have different needs (www.cochrane.org/community/development-projects/cochrane-linked-data-project). RevMan (The Cochrane Collaboration), which is used to produce Cochrane systematic reviews and to publish and store data in an XML format, is not a data collection tool per se. The SRDR was designed using a structured query language relational database structure, which is flexible for exporting, searching, and linking data, and it complements the functionality of RevMan. The Cochrane Linked Data Project and the SRDR will contribute to increasing the public availability of study data and to the movement toward open science.

Standardizing and sharing data collection tools, as well as data management systems, among a group of vested review authors working in the same topic area is the next logical step toward streamlining systematic review production. Data collected from studies in a review can be used for updating the review only if the forms for the original review and the update included identically constructed items or questions. Instead of duplicating previous data abstraction efforts, a new systematic review could build on previous work and leverage often-limited resources by using common data elements (18-20). Such elements help to ensure that data are captured and recorded uniformly and that the same data items, measurements, and observations are collected, defined consistently, and stored in the same format (such as numerical vs. character), thereby enhancing data quality and facilitating data reporting, sharing, and archiving.

Harmonizing data elements, especially outcomes, also facilitates comparison and aggregation of results across studies (21). The generic data collection form we share with readers in this article is an example of a first step toward achieving this goal. Because our form contains core data elements that are relevant across many clinical areas, we encourage others to adapt our form for their own systematic reviews and to cite the source when they do so. Adapting such a form is likely to allow rapid implementation of rigorous data collection; improve efficiency; and facilitate data sharing, data reformatting, secondary analysis, and data archiving after review completion.

Many technological advances are likely to increase the value of existing data collection tools for systematic reviews. We anticipate that tools for automated screening of titles and abstracts will enable seamless data flow and handling from the screening of search results to data abstraction (22, 23). New tools that record the location of each data element and automatically enter validated data into a data collection system will further improve the efficiency and reproducibility of systematic reviews (24). With more than 4000 systematic reviews published each year (25), innovation in methods to improve the efficiency and validity of systematic reviews is likely to have profound implications and will be indispensable for finding trustworthy answers to many health and health care questions.

From the Center for Clinical Trials, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland.

Disclaimer: The findings and conclusions of this article are those of the authors, who are responsible for its content, and do not necessarily represent the views of the Agency for Healthcare Research and Quality. No statement in this article should be construed as an official position of the Agency for Healthcare Research and Quality or the U.S. Department of Health and Human Services.

Grant Support: By the National Eye Institute, National Institutes of Health (grant 1 RC1 EY020140).

The SRDR was initially developed by the Tufts University Evidence-based Practice Center and is now maintained by the Brown University Evidence-based Practice Center under contract with the Agency for Healthcare Research and Quality (contract nos. HHSA 290-2007-10055-I and HHSA 290-2012-00012-I).

Disclosures: Dr. Li reports grants from the National Eye Institute during the conduct of the study. Dr. Vedula reports personal fees from Tufts University outside the submitted work. Ms. Hadar has nothing to disclose. Mr. Parkin has nothing to disclose. Dr. Lau reports grants from the Agency for Healthcare Research and Quality during the conduct of the study. Dr. Dickersin reports grants from the National Eye Institute during the conduct of the study and other support from the National Eye Institute outside the submitted work. Forms can be viewed at www.acponline.org/authors/icmje/ConflictOfInterestForms.do?msNum=M14-1603.

Requests for Single Reprints: Tianjing Li, MD, MHS, PhD, Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, E6011, Baltimore, MD 21205; e-mail, tli@jhsph.edu.

Current author addresses and author contributions are available at www.annals.org.

References
1. Olsen LA, Grossmann C, McGinnis JM. Learning What Works: Infrastructure Required for Comparative Effectiveness Research: Workshop Summary. Washington, DC: National Academies Pr; 2011.
2. Eden J, Levit L, Berg A, Morton S, eds; Committee on Standards for Systematic Reviews of Comparative Effectiveness Research; Board on Health Care Services. Finding What Works in Health Care: Standards for Systematic Reviews. Washington, DC: National Academies Pr; 2011.
3. Chandler J, Churchill R, Higgins J, Lasserson T, Tovey D. Methodological standards for the conduct of new Cochrane Intervention Reviews. Version 2.3. The Cochrane Collaboration. 2 December 2013. Accessed at www.editorial-unit.cochrane.org/sites/editorial-unit.cochrane.org/files/uploads/MECIR_conduct_standards%202.3%2002122013.pdf on 14 January 2014.
4. Selecting studies and collecting data. In: Higgins JPT, Green S, eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0. The Cochrane Collaboration. 2011. Accessed at www.cochrane-handbook.org on 23 January 2014.
5. Ip S, Hadar N, Keefe S, Parkin C, Iovin R, Balk EM, et al. A Web-based archive of systematic review data. Syst Rev. 2012;1:15. [PMID: 22588052] doi:10.1186/2046-4053-1-15
6. Data collection and processing. In: Meinert CL. Clinical Trials Handbook: Design and Conduct. Hoboken, NJ: J Wiley; 2012.
7. Centre for Reviews and Dissemination. Systematic Reviews: CRD's Guidance for Undertaking Reviews in Health Care. York, United Kingdom: York Publishing Services; 2009. Accessed at www.york.ac.uk/inst/crd/index_guidance.htm on 8 March 2013.
8. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med. 2009;151:264-9. [PMID: 19622511]
9. Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283:2008-12. [PMID: 10789670]
10. Cochrane Collaboration. Methodological standards for the reporting of Cochrane Intervention Reviews. Version 1.1. 17 December 2012. Accessed at www.editorial-unit.cochrane.org/sites/editorial-unit.cochrane.org/files/uploads/MECIR%20Reporting%20standards%201.1_17122012_2.pdf on 23 January 2014.


11. Data collection considerations. In: Meinert CL. Clinical Trials: Design, Conduct, and Analysis. New York: Oxford Univ Pr; 1986:140-8.
12. Data definition, forms, and database design. In: McFadden E. Management of Data in Clinical Trials. Hoboken, NJ: J Wiley; 1997.
13. Gresham G, Li T. Evaluating the efficiency and accuracy of data abstraction for systematic reviews using the Systematic Review Data Repository [Abstract]. Presented at the 35th Society for Clinical Trials Annual Meeting, Philadelphia, Pennsylvania, 18-21 May 2014. Abstract no. P078.
14. Buscemi N, Hartling L, Vandermeer B, Tjosvold L, Klassen TP. Single data extraction generated more errors than double data extraction in systematic reviews. J Clin Epidemiol. 2006;59:697-703. [PMID: 16765272]
15. Gøtzsche PC, Hróbjartsson A, Maric K, Tendal B. Data extraction errors in meta-analyses that use standardized mean differences. JAMA. 2007;298:430-7. [PMID: 17652297]
16. Jones AP, Remmington T, Williamson PR, Ashby D, Smyth RL. High prevalence but low impact of data extraction and reporting errors were found in Cochrane systematic reviews. J Clin Epidemiol. 2005;58:741-2. [PMID: 15939227]
17. Horton J, Vandermeer B, Hartling L, Tjosvold L, Klassen TP, Buscemi N. Systematic review data extraction: cross-sectional study showed that experience did not increase accuracy. J Clin Epidemiol. 2010;63:289-98. [PMID: 19683413] doi:10.1016/j.jclinepi.2009.04.007
18. Clarke M. Standardising outcomes for clinical trials and systematic reviews. Trials. 2007;8:39. [PMID: 18039365]
19. Williamson PR, Altman DG, Blazeby JM, Clarke M, Devane D, Gargon E, et al. Developing core outcome sets for clinical trials: issues to consider. Trials. 2012;13:132. [PMID: 22867278] doi:10.1186/1745-6215-13-132
20. COMET Initiative. About COMET. 2013. Accessed at www.comet-initiative.org on 23 January 2014.
21. Saldanha IJ, Dickersin K, Wang X, Li T. Outcomes in Cochrane systematic reviews addressing four common eye conditions: an evaluation of completeness and comparability. PLoS One. 2014;9:e109400. [PMID: 25329377] doi:10.1371/journal.pone.0109400
22. Wallace BC, Small K, Brodley CE, Lau J, Schmid CH, Bertram L, et al. Toward modernizing the systematic review pipeline in genetics: efficient updating via data mining. Genet Med. 2012;14:663-9. [PMID: 22481134]
23. Wallace BC, Trikalinos TA, Lau J, Brodley C, Schmid CH. Semi-automated screening of biomedical citations for systematic reviews. BMC Bioinformatics. 2010;11:55. [PMID: 20102628] doi:10.1186/1471-2105-11-55
24. Kiritchenko S, de Bruijn B, Carini S, Martin J, Sim I. ExaCT: automatic extraction of clinical trial characteristics from journal publications. BMC Med Inform Decis Mak. 2010;10:56. [PMID: 20920176] doi:10.1186/1472-6947-10-56
25. Bastian H, Glasziou P, Chalmers I. Seventy-five trials and eleven systematic reviews a day: how will we ever keep up? PLoS Med. 2010;7:e1000326. [PMID: 20877712] doi:10.1371/journal.pmed.1000326


Current Author Addresses: Dr. Li: Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, E6011, Baltimore, MD 21205.
Dr. Vedula: Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, 200 Hackerman Hall, 3400 North Charles Street, Baltimore, MD 21218.
Ms. Hadar, Mr. Parkin, and Dr. Lau: Center for Evidence-based Medicine, Brown University School of Public Health, 121 South Main Street, Providence, RI 02912.
Dr. Dickersin: Johns Hopkins Bloomberg School of Public Health, 615 North Wolfe Street, E6152, Baltimore, MD 21205.

Author Contributions: Conception and design: T. Li, S.S. Vedula, N. Hadar, J. Lau.
Analysis and interpretation of the data: T. Li, K. Dickersin.
Drafting of the article: T. Li, S.S. Vedula, N. Hadar, C. Parkin, J. Lau.
Critical revision of the article for important intellectual content: T. Li, S.S. Vedula, J. Lau, K. Dickersin.
Final approval of the article: T. Li, S.S. Vedula, N. Hadar, J. Lau, K. Dickersin.
Provision of study materials or patients: T. Li, N. Hadar.
Statistical expertise: T. Li.
Obtaining of funding: K. Dickersin.
Administrative, technical, or logistic support: T. Li, S.S. Vedula, N. Hadar, J. Lau.
Collection and assembly of data: T. Li, N. Hadar, C. Parkin.



Copyright © American College of Physicians 2015.
