EDITORIAL Database research in transfusion medicine: the power of large numbers

The article by Edgren and colleagues1 in this issue of TRANSFUSION presents information about the expansion of a previously constructed linked blood donor–transfusion recipient database, the SCANdinavian Donations And Transfusions database (SCANDAT). The “original” SCANDAT consisted of computerized records of blood donation and transfusion activities dating back to the mid-1960s in Sweden and the early 1980s in Denmark with long-term follow-up data from national health registries extending through 2002. This database included data on more than one million blood donors linked to more than 1.3 million transfused recipients. Health outcomes were retrieved by record linkage to Swedish and Danish nationwide health data registries on cancer occurrence, hospital care, and mortality. This linkage was made possible by the use of a national registration number for all health-related activities, thereby providing thorough follow-up of donors and recipients for multiple (long-term) health outcomes. Previous SCANDAT accomplishments include determination of cancer risks in volunteer blood donors compared with general population data, evaluation of the potential transmissibility of cancer to transfusion recipients from donors who subsequently develop cancer, and evaluation of recipient posttransfusion survival stratified by a number of variables including red blood cell (RBC) storage age.2-6 The second version of SCANDAT has added another 8 to 10 years of data, thereby extending the original observations through 2010 to 2012.1 SCANDAT2 contains 25.5 million donation records, 21.3 million transfusion records, 3.7 million unique persons, and 40 million person-years of follow-up. Data quality is high as there is a more than 97% concordance with official annual statistics on blood donations and transfusions, and 96% of all transfusions in Sweden (94% in Denmark) are linkable to their respective donation(s).
This robust linkage will enable investigators to study disease concordance between donors and recipients and to make inferences about disease transmissibility. This capability is enhanced by SCANDAT2’s impressive number of transfused components and unprecedented long-term follow-up morbidity and mortality data.

doi:10.1111/trf.13139 © 2015 AABB

TRANSFUSION 2015;55;1591–1595

As SCANDAT2 illustrates, the routine collection of massive amounts of clinical data in large multiinstitution databases has become increasingly common across multiple medical and surgical specialties. These databases come in two main flavors: 1) administrative databases that exist primarily for reimbursement and insurance purposes and 2) clinical registries that focus on collecting data on patients with a given disease or undergoing a specific procedure.7 As in other medical specialties, the use of complex databases that extract data from various sources is becoming more common in transfusion medicine (TM), where such databases can be used for hemovigilance and monitoring purposes, for blood management evaluations, or to address key research questions. SCANDAT and SCANDAT2 are excellent examples of how TM research questions can be approached using a large multicenter database. The remainder of this editorial will further review how TM research can benefit from such large database queries and will address some important considerations in this type of research. We have organized our discussion in order of the complexity of the efforts needed to compile the databases, although none of these efforts should be characterized as uncomplicated or straightforward!

BLOOD DONOR RESEARCH USING DONOR AND DONATION DATABASES

Because blood donor, blood donation, and/or donor deferral data are under the direct control of blood collection agencies, it has been relatively easy to construct this type of database and to use it for research purposes. The National Heart, Lung, and Blood Institute (NHLBI) Retrovirus Epidemiology Donor Study (REDS) pioneered this effort by combining donor and donation data from five US blood centers and using an external data coordinating center to implement data quality control (QC) procedures and develop, guide, and perform research analyses.8 The American Red Cross pursued similar efforts by compiling data from its regional blood centers into a centralized database known as ARCNET.9 Subsequently, other organizations, both within the United States and internationally, have followed similar data warehouse approaches.8,10 Uses of such databases have included assessing the prevalence and incidence of transfusion-transmissible infections (TTIs) and their demographic determinants, monitoring temporal trends for TTI risks and donor demographics, and evaluating various aspects of donor deferral and donor return. In addition, these databases have provided a representative donor sampling frame for survey- and protocol-based research studies.8

BLOOD DONOR RESEARCH LINKED TO DISEASE REGISTRIES

In addition to SCANDAT, at least two other research teams have evaluated long-term health outcomes in blood donors; both were post hoc efforts that required establishing a relationship with an outside agency. Using provincial health and mortality records, Germain and colleagues11 compared the incidence of coronary artery disease in Quebec, Canada, between two blood donor cohorts: 12,537 donors (125,000 person-years of follow-up) who were ineligible for future donations due to false-positive infectious disease test results and 51,000 donors (517,000 person-years) who remained eligible and continued to donate. Vahidnia and coworkers12 evaluated cancer incidence, epidemiology, and mortality up to 10 to 20 years postdonation in a cohort of 66,984 blood donors who originally consented to data collection and repository sample storage at the Blood Centers of the Pacific. In this study, donor identifying information was cross-referenced against patient identities in the California Cancer Registry and the Social Security Administration Death Master File.

TRANSFUSION RECIPIENT RESEARCH USING CLINICAL OUTCOMES DATABASES FROM OTHER DISCIPLINES

The Centers for Medicare & Medicaid Services (CMS) administrative databases have been accessed to assess the frequency of relatively rare transfusion reactions and their association with patient demographics, components transfused, and ICD-9 diagnostic codes.13 For example, Menis and coworkers13 accessed more than 11 million records on transfused hospitalized patients in the CMS 2007 to 2011 Inpatient and Medicare enrollment common working files to evaluate risk factors for transfusion-related acute lung injury. Although data were confined to patients older than age 65, this limitation is not major given that the majority of transfused patients are in this age group. Of greater concern is the lack of external validation of the coded transfusion reaction diagnoses by subject matter experts.

Clinical outcomes databases established by cardiologists and surgeons have been retrospectively queried to ask questions about transfusion, preoperative anemia, and outcome variables such as short-term mortality.14-16 Although such studies produce important hypothesis-generating results, it is important to use caution when interpreting their findings as the databases were not originally constructed for transfusion research; therefore, the quality of transfusion data may be suboptimal or particular variables may not have been collected. Importantly, studies that find a positive association between transfusions and mortality should evaluate whether severity of illness is a confounder in their analysis.17 This factor may not be readily recognized by investigators who are not TM experts. As illustrated by the following example of severity of illness as the confounder, three criteria must be met for confounding to potentially exist: 1) the confounder (illness severity) must be associated with the exposure (transfusion), e.g., severely ill patients are more likely to become anemic and be transfused; 2) the confounder must be a causal risk factor for the outcome of interest (e.g., mortality) in the nonexposed population (nontransfused patients), e.g., severely ill patients who are not transfused are more likely to die; and 3) the confounder must not be an intermediate step in the causal pathway between the exposure and the outcome, e.g., the transfusion did not cause the patients to be severely ill to begin with. It is important for TM specialists to understand that despite these data and analysis limitations, research using clinical outcomes databases may be well accepted by clinicians within the discipline that has conducted the study.
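The three criteria can be made concrete with a small numerical sketch. The counts below are invented for illustration and do not come from any study cited here; the class and function names are likewise hypothetical. Within each illness-severity stratum, transfused and nontransfused patients die at the same rate, yet because severely ill patients are transfused more often, the crude (unstratified) risk ratio suggests that transfusion is harmful. A Mantel-Haenszel pooled risk ratio, one common way of adjusting for a stratified confounder, removes the spurious association.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Stratum:
    """Counts for one illness-severity stratum (all values hypothetical)."""
    name: str
    deaths_transfused: int
    n_transfused: int
    deaths_nontransfused: int
    n_nontransfused: int


def risk_ratio(deaths_exp: int, n_exp: int, deaths_unexp: int, n_unexp: int) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (deaths_exp / n_exp) / (deaths_unexp / n_unexp)


def crude_risk_ratio(strata: List[Stratum]) -> float:
    """Risk ratio after collapsing all strata, i.e., ignoring illness severity."""
    return risk_ratio(
        sum(s.deaths_transfused for s in strata),
        sum(s.n_transfused for s in strata),
        sum(s.deaths_nontransfused for s in strata),
        sum(s.n_nontransfused for s in strata),
    )


def mantel_haenszel_risk_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata.

    RR_MH = sum(a_i * n0_i / N_i) / sum(c_i * n1_i / N_i), where a_i and c_i
    are deaths among transfused and nontransfused patients, n1_i and n0_i are
    the corresponding group sizes, and N_i is the stratum total.
    """
    numerator = 0.0
    denominator = 0.0
    for s in strata:
        total = s.n_transfused + s.n_nontransfused
        numerator += s.deaths_transfused * s.n_nontransfused / total
        denominator += s.deaths_nontransfused * s.n_transfused / total
    return numerator / denominator


# Hypothetical cohort: mortality is 20% in severely ill and 2% in mildly ill
# patients regardless of transfusion, but severely ill patients are far more
# likely to be transfused (criteria 1 and 2 in the text).
STRATA = [
    Stratum("severe", deaths_transfused=160, n_transfused=800,
            deaths_nontransfused=40, n_nontransfused=200),
    Stratum("mild", deaths_transfused=4, n_transfused=200,
            deaths_nontransfused=16, n_nontransfused=800),
]

if __name__ == "__main__":
    print("crude RR:", round(crude_risk_ratio(STRATA), 2))
    print("Mantel-Haenszel RR:", round(mantel_haenszel_risk_ratio(STRATA), 2))
```

With these invented counts the crude risk ratio is about 2.9, whereas the severity-adjusted ratio is 1.0. Real analyses would use richer severity measures and regression-based adjustment, and the adjustment is only as good as the severity data the database happens to contain.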

TRANSFUSION RECIPIENT RESEARCH USING SPECIFICALLY DESIGNED DATABASES

Based on the above considerations, it is clear that TM and clinical care would benefit from the use of specific transfusion or clinical outcomes databases designed by TM specialists working in conjunction with clinical colleagues. Further, TM expertise should enable the design of an analytical plan that recognizes major potential confounders. Recently, a number of such databases have been established and accessed to examine issues such as transfusion recipient demographics, blood utilization patterns, transfusion triggers, and short-term clinical outcomes. Some examples follow.

In the Netherlands, the Profile of Transfusion Recipients (Proton) database has compiled data from 1996 to 2006 from recipients at 20 hospitals accounting for 28% of that country’s blood usage.18,19 Data from 290,000 recipients who received 2.4 million blood products include recipient demographics and component type obtained from hospital records, primary discharge diagnosis obtained from the Dutch National Medical Registry, and mortality data obtained from the Dutch Municipal Population Register. Published analyses include the demographics of blood component usage and posttransfusion survival.

In a US managed health care system (Kaiser Permanente Northern California, consisting of 21 hospitals), investigators have linked data from the TM laboratory system to comprehensive inpatient electronic health records (EHR) and have built upon their institution’s previous experience with EHR-based outcomes research. The database extends for almost 5 years and contains data from more than 525,000 transfused RBC units. An additional strength of this database is the ability to apply comorbidity and illness severity scales to each patient’s data, which allows for adjusted outcome analyses. Analyses have established trends over time in RBC usage and the lack of impact of RBC transfusion on 30-day patient mortality.20 A model has been constructed to determine recipient factors that best predict RBC transfusion.21

Given the current focus on patient blood management, there are now commercially available software tools to compile such data. Investigators at Johns Hopkins used one of these commercial Web-based blood management systems (IMPACT Online, Haemonetics, Inc., Braintree, MA), which can extract data from the blood bank computer system, the laboratory medicine computer system, the electronic medical record, the physician order entry portal, and the hospital billing computer system. RBC transfusion triggers and posttransfusion hemoglobin measurements were collected and analyzed over a 44-month period (January 2009–August 2012) for 23,559 inpatients in the nine hospital services with the greatest number of RBC-transfused patients.22

The REDS-III program is a 7-year transfusion safety research initiative launched in 2011 by NHLBI.23 The domestic component involves four blood centers, 12 hospitals, a data coordinating center, and a central laboratory. An initial effort of this program involved the establishment of a retrospective recipient database that compiled blood bank and EHR data for 1 year (2010–2011) for all adult inpatients transfused with plasma at 10 of the 12 hospitals.24 This initial pilot effort at extracting data from a variety of hospital-based information systems was somewhat limited in scope but importantly included pre- and posttransfusion coagulation data. Data were compiled on 9,269 patients who received 72,167 units of plasma, transfused in 19,596 doses. Data analyses included the locations in which plasma was transfused, the dosages and category of plasma used (e.g., FFP, PF24, or thawed plasma), the INR triggers for plasma transfusion, and the resultant changes in INR after transfusion.

LINKED DONOR–RECIPIENT DATABASES

SCANDAT is the prototype example of a linked blood donor and transfusion recipient database that evaluates long-term health outcomes in both donors and recipients. In the United States, the lack of a national registration number precludes the easy capture of long-term health outcomes and limits a US-linked donor–recipient database to assessments of short-term health outcomes that will often be geographically or institutionally circumscribed. On the other hand, these databases may be able to compile an expanded number of detailed variables and to conduct additional QC checks that would not be feasible on a larger scale. Linked donor–recipient databases that rely on data collected by blood banks and hospitals are still novel as they are more difficult and expensive to build than isolated donor or recipient databases. The challenge is to reliably link donation and blood component production data from a blood center, to blood component issue data from a hospital’s blood bank transfusion service, to recipient data from a hospital’s EHR. REDS-III is currently constructing such a linked database, combining data from four blood centers and 12 hospitals. Estimated annual transfusion volume exceeds 120,000 RBC units with at least 80% supplied by the REDS-III participating blood centers.23 The REDS-III database infrastructure will allow a particular donation to be traced through component production and, if transfused at a participating hospital, to a data extract from the EHR of the transfusion recipient. This activity will permit the conduct of numerous analyses that characterize blood component utilization patterns in diverse settings, assess transfused recipients’ characteristics, evaluate blood donor and donation effects on recipients’ short-term clinical outcomes, and inform the design and feasibility of future clinical trials and observational studies. Unlike SCANDAT2, the REDS-III databases are not currently linked to any longer-term outcome registries.
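The donation-to-component-to-issue-to-recipient chain described above can be sketched in a few lines. This is an illustration only: the record layouts, field names, and identifiers are hypothetical and are not taken from SCANDAT2 or REDS-III, whose actual schemas are not described in this editorial. The sketch traces one donation forward to the recipients of its components and computes the share of hospital issue records that can be linked back to a blood center component, the kind of linkage rate reported for SCANDAT2.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Component:
    """Blood center record: a component manufactured from a donation."""
    component_id: str
    donation_id: str
    component_type: str


@dataclass(frozen=True)
class Issue:
    """Hospital transfusion service record: a unit issued to a recipient."""
    component_id: Optional[str]  # None when the unit identifier was not captured
    recipient_id: str


@dataclass(frozen=True)
class Recipient:
    """Extract from the hospital EHR (a single hypothetical outcome field)."""
    recipient_id: str
    died_within_30_days: bool


def trace_donation(
    donation_id: str,
    components: List[Component],
    issues: List[Issue],
    recipients: List[Recipient],
) -> List[Tuple[Component, Recipient]]:
    """Follow one donation through its components to the transfused recipients."""
    comp_by_id = {c.component_id: c for c in components if c.donation_id == donation_id}
    rec_by_id = {r.recipient_id: r for r in recipients}
    traced = []
    for issue in issues:
        comp = comp_by_id.get(issue.component_id)
        rec = rec_by_id.get(issue.recipient_id)
        if comp is not None and rec is not None:
            traced.append((comp, rec))
    return traced


def linkage_rate(components: List[Component], issues: List[Issue]) -> float:
    """Fraction of issue records whose unit matches a known blood center component."""
    if not issues:
        raise ValueError("no issue records to evaluate")
    known_ids = {c.component_id for c in components}
    linked = sum(1 for i in issues if i.component_id in known_ids)
    return linked / len(issues)


# Invented example data: donation D1 yields an RBC unit and a plasma unit.
COMPONENTS = [
    Component("C1", "D1", "RBC"),
    Component("C2", "D1", "plasma"),
    Component("C3", "D2", "RBC"),
]
ISSUES = [
    Issue("C1", "R1"),
    Issue("C2", "R2"),
    Issue("C3", "R1"),
    Issue(None, "R3"),  # unit identifier missing, so this record cannot be linked
]
RECIPIENTS = [
    Recipient("R1", died_within_30_days=False),
    Recipient("R2", died_within_30_days=True),
    Recipient("R3", died_within_30_days=False),
]

if __name__ == "__main__":
    for comp, rec in trace_donation("D1", COMPONENTS, ISSUES, RECIPIENTS):
        print(comp.component_type, "->", rec.recipient_id)
    print("linkage rate:", linkage_rate(COMPONENTS, ISSUES))
```

In practice each hop crosses an institutional boundary (blood center, transfusion service, EHR), so identifier formats must be harmonized before any such join is attempted, and unlinkable records should be counted and reported rather than silently dropped.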

REQUIREMENTS FOR HIGH-QUALITY DATABASE RESEARCH

A major advantage of large database research is that it provides access to a very large number of observations (often hundreds of thousands of data points), allowing for the incidence of rare events to be calculated, for subgroup analysis to be performed, and in some cases, for predictive models to be built and tested.7 In addition, analyses of large databases can help inform clinical trial or observational study designs by determining the appropriate sampling frame and informing enrollment projections and timelines. Databases can also provide information on donors or patients who agree to (or refuse to) participate, thus avoiding the need to collect these data elements again as part of the new study. However, because these data have not been collected as part of a randomized clinical trial, we believe that conclusions from such studies should generally be considered as hypothesis generating for further prospective observational studies or focused clinical trials. If the results are clear and strong enough to stand alone, no further research may be needed; if not, and the question is important enough, the results generated from these database analyses become substantial evidence that targeted (and more expensive) research is necessary. In general, due to limitations of data collection, especially the timing of events, it will be difficult to infer causality from database studies.

As with any research, analyses of large databases can produce important and meaningful results, but have pitfalls if done incorrectly.7,25 The main challenge is ensuring data quality by establishing that the data are accurate, have been extracted across multiple institutions using the same definitions and interpretations, and are complete. For example, evidence indicates that in many cases the accuracy of diagnostic coding data (ICD-9 codes) is suboptimal,26 although there are specific databases in which accuracy has been verified.27 For any particular database analysis, any such limitations should be reported so that research findings can be clearly interpreted. Particular care is needed when data are being used for a purpose other than the original reason for which they were collected. Even with strong QC efforts, there is likely to be some uncertainty in data quality, particularly when one integrates or links multicenter data from different types of databases (donor and donation data sets, hospital records, and nationwide health outcome registers). This problem should be substantially mitigated by the statistical power provided by the very large amount of data; furthermore, uncertainty due to missing or less than optimal data can be further investigated by performing sensitivity analyses. Building and analyzing large linked databases requires a cohesive and experienced multidisciplinary team.
TM specialists should be involved to ensure that the appropriate component and transfusion variables are being collected and properly analyzed. Strong IT support is needed, as is expertise in data management, QC procedures, and statistical analysis. Administrative issues including data sharing between institutions, protecting subject confidentiality, adhering to privacy rules, and data “ownership” require careful consideration and planning as well as approval by appropriate review panels. For all of these reasons, setting up large databases is an expensive activity and the return on investment will only be realized if the same data source can be used to effectively address multiple research questions in a timely fashion.

In conclusion, large (usually multicenter) database research in TM, particularly in the area of transfusion recipient outcomes, is an approach that has expanded over recent years and is a paradigm that will increase in importance. It is a key and complementary research tool that should be integrated with other types of research designs to address scientific and clinical practice questions in our field.

CONFLICT OF INTEREST

SK is a paid participant in some of the studies that are referred to in the editorial (REDS and REDS-III). SG works for the funding agency (NHLBI).

Steven Kleinman, MD1
e-mail: [email protected]
Simone A. Glynn, MD, MSc, MPH2

1 UBC School of Medicine, Victoria, BC, Canada
2 Blood Epidemiology and Clinical Therapeutics Branch, Division of Blood Diseases and Resources, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD

REFERENCES

1. Edgren G, Rostgaard K, Wikman A, et al. The new Scandinavian Donations and Transfusions database (SCANDAT2): a blood safety resource with added versatility. Transfusion 2015;55:000-00.
2. Edgren G, Tran TN, Hjalgrim H, et al. Improving health profile of blood donors as a consequence of transfusion safety efforts. Transfusion 2007;47:2017-24.
3. Edgren G, Reilly M, Hjalgrim H, et al. Donation frequency, iron loss, and risk of cancer among blood donors. J Natl Cancer Inst 2008;100:572-9.
4. Edgren G, Hjalgrim H, Reilly M, et al. Risk of cancer after blood transfusion from donors with subclinical cancer: a retrospective cohort study. Lancet 2007;369:1724-30.
5. Kamper-Jorgensen M, Ahlgren M, Rostgaard K, et al. Survival after blood transfusion. Transfusion 2008;48:2577-84.
6. Edgren G, Kamper-Jorgensen M, Eloranta S, et al. Duration of red blood cell storage and survival of transfused patients (CME). Transfusion 2010;50:1185-95.
7. Cook JA, Collins GS. The rise of big clinical databases. Br J Surg 2015;102:e93-101.
8. Kleinman S, King MR, Busch MP, et al. The National Heart, Lung, and Blood Institute retrovirus epidemiology donor studies (Retrovirus Epidemiology Donor Study and Retrovirus Epidemiology Donor Study-II): twenty years of research to advance blood product safety and availability. Transfus Med Rev 2012;26:281-304, 304.e1-2.
9. Dodd RY, Notari EP 4th, Stramer SL. Current prevalence and incidence of infectious disease markers and estimated window-period risk in the American Red Cross blood donor population. Transfusion 2002;42:975-9.
10. Custer B, Bravo M, Bruhn R, et al. Predictors of hemoglobin recovery or deferral in blood donors with an initial successful donation. Transfusion 2014;54:2267-75.
11. Germain M, Delage G, Blais C, et al. Iron and cardiac ischemia: a natural, quasi-random experiment comparing eligible with disqualified blood donors. Transfusion 2013;53:1271-79.
12. Vahidnia F, Hirschler NV, Agapova M, et al. Cancer incidence and mortality in a cohort of US blood donors: a 20-year study. J Cancer Epidemiol 2013;2013:814842.
13. Menis M, Anderson SA, Forshee RA, et al. Transfusion-related acute lung injury and potential risk factors among the inpatient US elderly as recorded in Medicare claims data, during 2007 through 2011. Transfusion 2014;54:2182-93.
14. Wu WC, Schifftner TL, Henderson WG, et al. Preoperative hematocrit levels and postoperative outcomes in older patients undergoing noncardiac surgery. JAMA 2007;297:2481-88.
15. Yang X, Alexander KP, Chen AY, et al. The implications of blood transfusions for patients with non-ST-segment elevation acute coronary syndromes: results from the CRUSADE national quality improvement initiative. J Am Coll Cardiol 2005;46:1490-95.
16. Ferraris VA, Davenport DL, Saha SP, et al. Surgical outcomes and transfusion of minimal amounts of blood in the operating room. Arch Surg 2012;147:49-55.
17. Yazer MH, Triulzi DJ. Things aren’t always as they seem: what the randomized trials of red blood cell transfusion tell us about adverse outcomes. Transfusion 2014;54:3243-46.
18. Borkent-Raven BA, Janssen MP, van der Poel CL, et al. The PROTON study: profiles of blood product transfusion recipients in the Netherlands. Vox Sang 2010;99:54-64.
19. Borkent-Raven BA, Janssen MP, van der Poel CL, et al. Survival after transfusion in the Netherlands. Vox Sang 2011;100:196-203.
20. Roubinian NH, Escobar GJ, Liu V, et al. Trends in red blood cell transfusion and 30-day mortality among hospitalized patients. Transfusion 2014;54:2678-86.
21. Roubinian N, Murphy EL, Swain BE, et al. Predicting red blood cell transfusion in hospitalized patients: role of hemoglobin level, comorbidities, and illness severity. BMC Health Serv Res 2014;14:213.
22. Frank SM, Resar LM, Rothschild JA, et al. A novel method of data analysis for utilization of red blood cell transfusion. Transfusion 2013;53:3052-59.
23. Kleinman S, Busch MP, Murphy EL, et al. The National Heart, Lung, and Blood Institute Recipient Epidemiology and Donor Evaluation Study (REDS-III): a research program striving to improve blood donor and transfusion recipient outcomes. Transfusion 2014;54:942-55.
24. Triulzi D, Gottschall J, Murphy E, et al. A multicenter study of plasma use in the United States. Transfusion 2015;55:1313-9.
25. Cooke CR, Iwashyna TJ. Using existing data to address important clinical questions in critical care. Crit Care Med 2013;41:886-96.
26. van Walraven C, Bennett C, Forster AJ. Administrative database research infrequently used validated diagnostic or procedural codes. J Clin Epidemiol 2011;64:1054-59.
27. Lambert L, Blais C, Hamel D, et al. Evaluation of care and surveillance of cardiovascular disease: can we trust medico-administrative hospital data? Can J Cardiol 2012;28:162-68.

Volume 55, July 2015 TRANSFUSION 1595
