EDITORIAL: HEALTH CARE

More Health for the Money— Toward a More Rigorous Implementation Science

Citation: M. E. Kruk, More health for the money—Toward a more rigorous implementation science. Sci. Transl. Med. 6, 245ed17 (2014).


10.1126/scitranslmed.3009527


Margaret E. Kruk is Associate Professor of Health Policy and Management and directs the Better Health Systems Initiative on global health care coverage and quality at the Mailman School of Public Health, Columbia University, New York, NY 10032, USA. E-mail: [email protected]

Implementation science is in style. Since 2005, the field has gained a journal (Implementation Science), a research network (Global Implementation Initiative), and an annual conference (Global Implementation Conference). The U.S. National Institutes of Health (NIH), Agency for Healthcare Research and Quality, and Patient-Centered Outcomes Research Institute have awarded funding for implementation research in areas ranging from AIDS to mental health. But what is implementation science and why the recent burst in popularity? Eccles defines implementation science as the "scientific study of methods to promote the systematic uptake of clinical research findings and other evidence-based practices into routine practice," while Madon calls it a "scientific framework to guide health-care scale-up" (1, 2). According to NIH, implementation research should make use of theoretical frameworks and multidisciplinary methods to close the gap between discovery and delivery (https://grants.nih.gov/grants/guide/notice-files/NOT-DK-10-001.html). World Bank President Jim Yong Kim proposed that "science of delivery" guide development assistance and technical support for programming in low-income countries (3). In other words, science must move beyond the laboratory in order to generate evidence on how best to deliver proven health solutions to more people.

Still, confusion about implementation science abounds. How does it differ from process evaluation or operations research, which are, after all, also concerned with improving implementation? In the absence of a clear distinction, many activities formerly called operations research or program monitoring and evaluation are being rebranded as implementation science. This lack of clarity is particularly evident in global health, which has a long tradition of relying on routine monitoring of programs as evidence for effectiveness. At the 2012 Health Systems Research Symposium in Beijing, the many sessions on implementation science had little in common in terms of definitions, methods, or measures. Most featured descriptive rather than analytical methods, with program managers outnumbering researchers.

Implementation science and its cousin, policy implementation research, are well-established approaches for studying health care delivery in the United States and Europe but are used less often to improve health programming in low- and middle-income countries (4). Yet scaling up safe and effective health programs is crucial to improving health around the globe. We are near the 2015 finish line of the Millennium Development Goals, a set of global health and development targets aimed at reducing global inequality; several of the health goals will not be met—not for lack of effective interventions but because of their inadequate application in the field. For example, in 2010 in sub-Saharan Africa, the region with the highest maternal and newborn mortality, only 45% of births were attended by a health professional who could handle complications, even though many women live near health facilities. Only 4 in 10 African children at risk for malaria sleep under a bed net, even though malaria is among the top killers of children and nets are both effective and cheap (www.un.org/millenniumgoals/pdf/MDG%20Report%202012.pdf).

So what can implementation science contribute to global efforts to scale up proven health solutions? In a word, science—generalizable, rather than local, knowledge about effective delivery.
Whereas operations research seeks to improve one program, implementation science should produce evidence to improve multiple programs. This goal requires moving beyond a description of what was delivered to a systematic and rigorous analysis of which delivery approaches worked across a variety of health needs and which did not. From these data, one can begin to formulate models for implementation at scale. Although the ubiquity of the term monitoring and evaluation (M&E) suggests that all health programs are evaluated, in fact, very few are. The emphasis instead is on monitoring: tracking program activities and outputs over time.


Monitoring is important, as it allows managers to gauge progress and make program adjustments, but it is not adequate for determining program effectiveness—that is, judging whether the program has improved the health of a population. Evaluating effectiveness, be it of a delivery model or a new drug, requires testing the program against current practice or an alternative program. This in turn requires a research design that maximizes internal validity—one that can isolate true program effects from the effects of confounders (other programs, policies, weather, economic growth) that could produce the same result. One such design borrowed from clinical research is the randomized controlled trial (RCT). These studies randomly allocate individuals, clinics, communities, or districts into intervention and control groups; if done correctly, confounders will be equally distributed among groups, and any differences in outcomes can be attributed to the program.

Although RCTs are the gold standard in clinical research, their use in health service delivery and program implementation has stirred up controversy in the public health community. Some object to the idea of communities as laboratories, protest that randomization is unethical when an intervention is known to be effective (5), or insist that RCTs cannot properly account for important contextual characteristics, such as cultural practices, politics, or organizational culture. A variation of the latter criticism of RCTs is that stringent control of context and other study parameters by researchers makes the results difficult to reproduce in actual health systems (weak external validity) (6). Last, some worry that the focus on a precise experimental design may push out smaller but important local implementation research efforts (5).

RIGOROUS EVALUATION IS FEASIBLE

This debate might lead a casual observer to conclude that experimental implementation science has run amok in the global health arena. In truth, there are vanishingly few randomized assessments of implementation models in health and education. What's more, many of the concerns about randomized assessments can be allayed. For example, communities are already unwitting guinea pigs for many programs that end up failing. Testing programs using rigorous methods strengthens, rather than detracts from, the accountability of government and nongovernmental organizations to the public. As for ethical concerns, an implementation trial tests a new delivery model, not the intervention itself, and we don't know a priori whether the new delivery model works better than standard practice. Thus, implementation science starts from the same stance of equipoise that underpins clinical trials for a new drug or device. Given that a new program might be less effective than an existing approach (or another new program) at producing desired health outcomes, delivery methods must be compared before scaling. There is also no denial or withholding of treatment: Control-group participants have the same access to a standard treatment as they would have without the implementation study. Last, implementation trials are often accused of ignoring context. However, the contrary is often true: Carefully designed evaluations incorporate measurement of potentially influential contextual factors, such as economic development or other programs, and account for these in interpreting the effects. Pragmatic implementation RCTs also typically involve much less control of context than would be typical for a clinical trial, permitting real-world take-up after study completion.
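To make the cluster-level randomization at the heart of such implementation trials concrete, here is a minimal sketch in Python. It is illustrative only: the clinic names are hypothetical, and a real trial would typically stratify the allocation and pre-register it.

```python
# A minimal sketch of cluster-level random assignment for an
# implementation trial. Clinics are the hypothetical unit here;
# in practice the unit might be a community or a district.
import random

clinics = [f"clinic_{i:02d}" for i in range(1, 21)]  # 20 hypothetical clusters

random.seed(42)          # fixed seed so the allocation is auditable
random.shuffle(clinics)  # chance, not convenience, decides assignment

half = len(clinics) // 2
intervention = sorted(clinics[:half])  # receive the new delivery model
control = sorted(clinics[half:])       # continue standard practice

print("Intervention arm:", intervention)
print("Control arm:", control)
```

With a sufficient number of clusters, chance alone tends to balance measured and unmeasured confounders across the two arms, which is what licenses attributing outcome differences to the program.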
However, sometimes experimental evaluation of health programs is not possible or appropriate. For example, program rollout may be politically determined and not amenable to experimental assignment; there may be too few clusters available to permit comparison; or the evaluators are brought on board after implementation begins and cannot assign participants prospectively. In addition, most countries have few "control" regions as more and more organizations embark on global health efforts, often working on similar programs (7). In these cases, other rigorous evaluation options can approximate the internal validity of randomized studies. Quasi-random designs that include well-selected comparison groups or that repeat outcome measures over time (time series) and other so-called plausibility studies (designs that attempt to rule out external factors that may have caused the observed effects) have provided robust evidence for program effectiveness (6, 7).

The reluctance of policy-makers to conduct rigorous evaluation can be allayed by closer collaboration with researchers from the outset of program delivery and a more pragmatic mind-set among evaluators. Rather than imposing a "perfect" research design on program implementers, researchers should take advantage of sequential program rollout to conduct pragmatic prospective evaluations (for example, using the so-called stepped-wedge design).
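As a minimal sketch of what such a stepped-wedge rollout looks like (the district names and number of periods are hypothetical):

```python
# A minimal sketch of a stepped-wedge rollout: every district eventually
# receives the program, but the crossover order is randomized, so earlier
# periods in not-yet-treated districts serve as concurrent controls.
import random

districts = ["district_A", "district_B", "district_C", "district_D"]
n_periods = 5  # measurement periods, including an all-control baseline

random.seed(7)
random.shuffle(districts)  # randomize the order in which districts cross over

# The district at position k crosses from control to intervention in period k + 1,
# so period 0 is a baseline in which no district has the program.
for step, district in enumerate(districts, start=1):
    schedule = ["control" if t < step else "intervention" for t in range(n_periods)]
    print(f"{district}: {schedule}")
```

Because every district is eventually served, the design sidesteps the political objection to permanent control groups while preserving a randomized comparison.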


This approach has proved politically acceptable in many settings. Politicians who are worried about denying programs or services to constituents can commit to extending the program if the delivery model is proven to be successful in the initial districts. Where comparison groups are not available, evaluators can collect data longitudinally, starting before and continuing after program implementation, a framework in which subjects serve as their own controls and that can account for secular trends. In sum, well-designed quasi-random designs are a massive improvement over the prevailing descriptive studies that go by the name of implementation research.

Implementation science also needs to embrace greater rigor in measurement and analysis of data than typically suffices for monitoring. Measures should be validated in the local context and collected electronically to improve data quality and reduce time to analysis. Ensuring representative survey sampling is essential for assessing population rather than clinic-level effects. Stratification, propensity score matching, and use of instrumental variables can assist in mitigating selection bias where randomization is impractical. Analytical methods such as hierarchical linear modeling can assess the contribution of context in the causal chain. Last, qualitative methods are invaluable in understanding mechanisms of action and unearthing unintended consequences of well-intentioned programs and should be used more often in implementation science.
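The sketch below illustrates, on simulated data, two of the analytic tools named above: a propensity score summarizing selection into a program, and a hierarchical (mixed) model with a district-level random intercept to account for clustering. All variable names and effect sizes are hypothetical; in a real analysis, one would match or weight on the estimated propensity score before comparing outcomes.

```python
# A minimal sketch, on simulated data, of a propensity score and a
# hierarchical (mixed) model of the kind described above.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "district": rng.integers(0, 20, n),  # 20 hypothetical districts
    "wealth": rng.normal(0, 1, n),       # a household-level covariate
})
# Simulate selection: wealthier households are more likely to be reached
# by the program, and the program itself adds a modest benefit.
df["treated"] = (rng.random(n) < 1 / (1 + np.exp(-df["wealth"]))).astype(int)
df["outcome"] = 0.5 * df["treated"] + 0.8 * df["wealth"] + rng.normal(0, 1, n)

# Propensity score: probability of receiving the program given covariates.
ps_model = smf.logit("treated ~ wealth", data=df).fit(disp=False)
df["pscore"] = ps_model.predict(df)

# Mixed model: adjusts for the covariate and for district-level clustering
# via a random intercept per district.
mixed = smf.mixedlm("outcome ~ treated + wealth", data=df,
                    groups=df["district"]).fit()
print(mixed.summary())
```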

The approach described above is not new. Experimental and quasi-experimental methods and new measurement approaches have been used extensively in clinical medicine, epidemiology, and education—and increasingly in development economics. However, these methods have yet to be systematically applied to programming in global health.

The most compelling argument for a more scientific approach to implementation research is to improve accountability for investments that affect the health of many people. Scaling up ineffective programs carries large financial costs and delays the uptake of other interventions that can improve people's health. Programs that don't work also carry large policy-opportunity costs, distracting policy-makers from other health priorities. Of course, evidence—no matter how rigorous—will never be the sole driver of scale-up. Politics, incentives, history, and power dynamics all determine whether good ideas will be disseminated. But credible data on what implementation approaches work should be the starting point for decision-makers.

Good intentions and strong convictions motivate the many global efforts to scale up effective health interventions. But these are not enough. To get more health from our investment, we need the humility and intellectual courage to put our favored notions to the test. A more rigorous implementation science would help ensure that our efforts to close global health gaps are based on evidence and not belief.

– Margaret E. Kruk

REFERENCES

1. M. P. Eccles, D. Armstrong, R. Baker, K. Cleary, H. Davies, S. Davies, P. Glasziou, I. Ilott, A. L. Kinmonth, G. Leng, S. Logan, T. Marteau, S. Michie, H. Rogers, J. Rycroft-Malone, B. Sibbald, An implementation research agenda. Implement. Sci. 4, 18 (2009).
2. T. Madon, K. J. Hofman, L. Kupfer, R. I. Glass, Public health. Implementation science. Science 318, 1728–1729 (2007).
3. www.worldbank.org/en/news/speech/2012/10/12/remarks-world-bank-group-president-jim-yong-kim-annual-meeting-plenary-session
4. R. Grol, R. Jones, Twenty years of implementation research. Fam. Pract. 17 (suppl. 1), S32–S35 (2000).
5. R. W. Sanson-Fisher, B. Bonevski, L. W. Green, C. D'Este, Limitations of the randomized controlled trial in evaluating population-based health interventions. Am. J. Prev. Med. 33, 155–161 (2007).
6. C. G. Victora, J. P. Habicht, J. Bryce, Evidence-based public health: Moving beyond randomized trials. Am. J. Public Health 94, 400–405 (2004).
7. C. G. Victora, R. E. Black, J. T. Boerma, J. Bryce, Measuring impact in the Millennium Development Goal era and beyond: A new approach to large-scale effectiveness evaluations. Lancet 377, 85–95 (2010).
