Commentary

Received 11 March 2015; Accepted 11 March 2015; Published online 10 June 2015 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/jrsm.1144

Commentary for history special issue of Research Synthesis Methods

Iain Chalmers*

*Correspondence to: Iain Chalmers, James Lind Initiative, Oxford, OX2 7LG, UK. [email protected]

1. Introduction

I am delighted to have been invited by Will Shadish to contribute to this history special issue of Research Synthesis Methods. The fact that some of the pioneers of research synthesis – Fred Mosteller (Petrosino, 2004) and Tom Chalmers (Dickersin and Chalmers, 2014), for example – are no longer with us is a reminder of our responsibilities. It is already clear that the evolution of scientifically defensible reviews of research will come to be regarded as a development of fundamental importance in the history of science – a 'Meta-Analytic Big Bang', as Will Shadish puts it. We owe it to future historians to ensure proper archiving of relevant material from our era and to record 'witness statements' such as those provided here by Frank Schmidt, Gene Glass, and Robert Rosenthal.

What characteristics do reviews need to have to make them 'scientifically defensible'? Like any other research project (Cooper, 1982), they need to take steps to reduce the likelihood that we will be misled by (i) biases of various kinds and (ii) the play of chance. If appropriate and possible, statistical synthesis of estimates from similar but separate studies can be used to deal with the second of these threats. Most of us can agree that Gene Glass' term 'meta-analysis' covers the statistical synthesis of estimates from similar but separate studies (Glass, 1976); but consensus is unlikely over what term to use to cover the bias-reducing features of scientifically defensible reviews. I will use the term 'systematic review' because this term invites the question 'What system was used to reduce biases?' and demands an answer in terms of bias-reducing methods (Chalmers and Altman, 1995). 'Meta-analysis' is not a relevant answer to that question: statistical synthesis may actually amplify the effects of inadequately controlled biases, and it may or may not be used in systematic reviews.

These terminological matters were touched on in a brief history of research synthesis that Larry Hedges, Harris Cooper, and I published in 2002 (Chalmers et al., 2002). There have been many relevant developments since then, and an updated history of systematic reviews is currently being prepared for publication in the James Lind Library (www.jameslindlibrary.org). As with other papers in the James Lind Library, there will be links from this history to scans of key documents and other relevant material.
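For readers who have never run one, the 'statistical synthesis' half of the enterprise can be as simple as the fixed-effect, inverse-variance pooling sketched below. This is a generic illustration with invented numbers, not a method taken from this commentary or from any of the studies it cites:

```python
# A minimal sketch of fixed-effect, inverse-variance meta-analysis.
# The (log relative risk, standard error) pairs are invented placeholders,
# not estimates from any real trial.
import math

estimates = [(-0.22, 0.15), (-0.10, 0.20), (-0.35, 0.25), (0.05, 0.18), (-0.15, 0.12)]

weights = [1.0 / se ** 2 for _, se in estimates]  # weight = 1 / variance
pooled = sum(w * y for (y, _), w in zip(estimates, weights)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

lo, hi = pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se
print(f"pooled relative risk = {math.exp(pooled):.2f} "
      f"(95% CI {math.exp(lo):.2f} to {math.exp(hi):.2f})")
```

The pooling itself says nothing about how the five estimates were found or selected, which is precisely why a meta-analysis can amplify bias when it is not embedded in a systematic review.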

2. Research synthesis in the social and health sciences

The James Lind Library includes material about research synthesis from both the social and health sciences. Although relevant reports were published before the 1970s (e.g., Park et al., 1928; Pratt et al., 1940; Daniels and Hill, 1952; Leitch, 1959; Wechsler et al., 1965; Smith et al., 1969), it was not until the 1970s that the modern history of research synthesis began to gather momentum, beginning with papers published in 1971 by Light and Smith and by Feldman. Research synthesis offered a potential solution to an increasingly acknowledged problem: the need to help those who wanted to make sense of ever larger mountains of research results and to use them to inform policy, practice, and further research. By the 1980s, very readable introductions to research synthesis were becoming available, such as Richard Light and David Pillemer's Summing Up (1984) and Milos Jenicek's Méta-analyse en médecine (1987).

Some of us in health research first became aware of and acknowledged the research synthesis work of the (mainly) American social scientists and statisticians during the 1970s (Chalmers et al., 2002; Jenicek, 2006; Guyatt and Oxman, 2009). However, developments within health research were also occurring largely independently of those in the social sciences. In 1974, a Swedish radiotherapist combined estimates from five similar randomized trials and showed that radiotherapy after surgery increased mortality in women with breast cancer (Stjernswärd, 1974). When I asked him what had led him to use statistical synthesis, he thought for a moment and then said it was simply 'bondförnuft' ('peasant sense') (Stjernswärd, 2009). Others working in heart disease, cancer, and perinatal care had arrived at similar conclusions (Chalmers et al., 1977; Peto et al., 1977; Chalmers, 1979). My impression is that any 'crossovers' that occurred between research synthesis work in the social and health sciences were mainly in the field of mental health and involved both psychologists and psychiatrists (for example, Smith and Glass, 1977; Davis, 1976).

3. Why so little 'crossover' between the health and social sciences?

Why was there apparently so little 'crossover' between health researchers and social researchers? One of the reasons probably relates to differences in the type of research material available to the two communities for synthesis. The people promoting research synthesis in health research were often also proponents of randomized trials, for which dichotomous outcomes are usual. By contrast, social scientists have had to cope with studies using a variety of research designs, with outcomes assessed using a variety of metrics. The 'technical' solution to this problem was to calculate 'effect sizes', but these yielded overall estimates of effects that were more difficult to interpret and apply in practice than relative risk reductions in dichotomous outcomes such as mortality (see the illustrative sketch at the end of this section).

Another possible reason may have been the rather different consequences of not synthesizing research evidence in the two spheres. Failure to synthesize the results of research assessing the effects of social and educational interventions has probably only very rarely led to avoidable deaths. By contrast, as illustrated most dramatically in an analysis led by Fred Mosteller and Tom Chalmers (Antman et al., 1992), failure to cumulate evidence about the effects of clinical treatments has probably resulted in the avoidable deaths of millions of patients.
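To illustrate the interpretability point made above, the following sketch (all numbers invented for illustration) computes the two kinds of summary estimate side by side: a relative risk for a dichotomous outcome, and a standardized mean difference of the kind typically used as an 'effect size':

```python
# Contrast of the two summary metrics discussed above.
# All counts, means, and SDs are invented, not from any real study.
import math

# Dichotomous outcome: deaths out of n in treated vs control arms
deaths_t, n_t = 30, 200
deaths_c, n_c = 45, 200
rr = (deaths_t / n_t) / (deaths_c / n_c)
print(f"relative risk = {rr:.2f} "
      f"(a {100 * (1 - rr):.0f}% relative risk reduction)")

# Continuous outcome: standardized mean difference (Cohen's d, pooled SD)
mean_t, sd_t, m = 52.0, 10.0, 120
mean_c, sd_c, n = 48.0, 11.0, 120
pooled_sd = math.sqrt(((m - 1) * sd_t**2 + (n - 1) * sd_c**2) / (m + n - 2))
d = (mean_t - mean_c) / pooled_sd
print(f"standardized mean difference d = {d:.2f}")
```

A practitioner can act on 'a third fewer deaths' far more readily than on 'd = 0.38 standard deviations', whatever the underlying outcome scale happened to be.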

4. Publication bias – an enduring problem for both social and health research

Social scientists were the first to provide quantified evidence of biased reporting of research (Sterling, 1959; Smart, 1964; Mahoney, 1977), and they proposed statistical approaches to dealing with the problem in research syntheses (for example, Rosenthal, 1979). A more fundamental approach was proposed by John Simes after he showed that reporting bias had influenced the choice of treatments for women with ovarian cancer (Simes, 1986): he called for an international register of clinical trials. Thirty years later, international registers of clinical trials and requirements to register new studies at inception have become prominent features of the landscape of clinical research. I am unaware of any comparable developments for prospective registration of trials of social and educational interventions.

Registers of clinical trials have allowed quantification of the extent of underreporting of these studies (Dickersin and Chalmers, 2010). It is substantial: reports are unavailable for about 50% of clinical trials registered at inception. The support received by the AllTrials campaign (www.alltrials.org) makes clear that this is now widely regarded as a scandalous betrayal of those who have participated in trials. In the UK, proposals for clinical trials will not receive ethics approval unless the trials have been registered (Chalmers, 2013).
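As one example of the statistical approaches mentioned above, Rosenthal's (1979) 'fail-safe N' asks how many unpublished null results would have to be sitting in file drawers to overturn a combined significance test. Here is a minimal sketch under the usual Stouffer combination of one-tailed z-scores; the z-scores themselves are invented:

```python
# Rosenthal's (1979) 'file drawer' fail-safe N: the number of unseen
# null-result studies (z = 0) needed to push the Stouffer-combined
# z-score, sum(z) / sqrt(k + X), below the one-tailed 0.05 threshold.
# The z-scores below are invented for illustration.
z_scores = [2.1, 1.3, 2.8, 0.9, 1.7]   # one z per located study
k = len(z_scores)
z_crit = 1.645                          # one-tailed p = 0.05

fail_safe_n = (sum(z_scores) ** 2) / z_crit ** 2 - k
print(f"fail-safe N ≈ {fail_safe_n:.0f} filed-away null studies")
```

Such calculations only bound the problem; registration of studies at inception attacks it at its root, which is why Simes' proposal was the more fundamental approach.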

5. New research should begin and end with systematic reviews

Finally, I want to suggest that the researchers who have developed the science and practice of research synthesis have not been sufficiently outspoken about the ethical and scientific relevance of their work. Embarking on additional primary research without reviewing systematically what is already known is unethical, unscientific, and wasteful (Chalmers and Nylenna, 2014; Chalmers et al., 2014). Some research funders are taking a very firm view on this. The National Institute for Health Research in England, for example, advises people applying for support of new primary research as follows:

"Where a systematic review already exists that summarises the available evidence this should be referenced, as well as including reference to any relevant literature published subsequent to that systematic review. Where no such systematic review exists it is expected that the applicants will undertake an appropriate review of the currently available and relevant evidence (using as appropriate a predetermined and described methodology that systematically identifies, critically appraises and then synthesises the available evidence) and then present a summary of the findings of this in their proposal. All applicants must also include reference to relevant on-going studies, e.g. from trial registries" (NIHR, 2013).


And among research regulators, the guidance for researchers issued by the Health Research Authority in the UK now states: "Any project should build on a review of current knowledge. Replication to check the validity of previous research is justified, but unnecessary duplication is unethical" (Health Research Authority, 2014).


Similarly, reports of additional research that fail to interpret new evidence in the context of updated reviews of other relevant research are, in an important sense, uninterpretable. Among the principal general medical journals, The Lancet has given a lead in this respect in editorials, the most recent of which (Kleinert et al., 2014) advises authors as follows:

"Research in context

Evidence before this study: This section should include a description of all the evidence that the authors considered before undertaking this study. Authors should state: the sources (databases, journal or book reference lists, etc) searched; the criteria used to include or exclude studies (including the exact start and end dates of the search), which should not be limited to English language publications; the search terms used; the quality (risk of bias) of that evidence; and the pooled estimate derived from meta-analysis of the evidence, if appropriate.

Added value of this study: Authors should describe here how their findings add value to the existing evidence (including an updated meta-analysis, if appropriate).

Implications of all the available evidence: Authors should state the implications for practice or policy and future research of their study combined with existing evidence."

Increasing recognition that, for scientific and ethical reasons, reports of new research should begin and end with systematic reviews of other relevant research (Clarke et al., 2007) is beginning to influence the behavior of research funders, research regulators, journals, and researchers themselves (Moher et al., in preparation).

Acknowledgement

I am grateful to Andy Oxman for comments on an earlier draft of this paper.

References


Antman EM, Lau J, Kupelnick B, Mosteller F, Chalmers TC. 1992. A comparison of results of meta-analyses of randomized control trials and recommendations of clinical experts. JAMA 268: 240–248.

Chalmers I. 1979. Randomized controlled trials of fetal monitoring 1973–1977. In Thalhammer O, Baumgarten K, Pollak A (eds). Perinatal Medicine (pp. 260–265). Stuttgart: Georg Thieme.

Chalmers I. 2013. Health Research Authority's great leap forward on UK trial registration. BMJ 347: f5776.

Chalmers I, Altman DG. 1995. Systematic Reviews. London: BMJ Publications.

Chalmers I, Nylenna M. 2014. A new network to promote evidence-based research. Lancet 384: 1903–1904.

Chalmers I, Hedges LV, Cooper H. 2002. A brief history of research synthesis. Evaluation and the Health Professions 25: 12–37.

Chalmers I, Bracken MB, Djulbegovic B, Garattini S, Grant J, Gulmezoglu AM, Ioannidis JPA, Oliver S. 2014. How to increase value and reduce waste when research priorities are set. Lancet 383: 156–165.

Chalmers TC, Matta RJ, Smith H, Kunzler A-M. 1977. Evidence favoring the use of anticoagulants in the hospital phase of acute myocardial infarction. New England Journal of Medicine 297: 1091–1096.

Clarke M, Hopewell S, Chalmers I. 2007. Reports of clinical trials should begin and end with up-to-date systematic reviews of other relevant evidence: a status report. Journal of the Royal Society of Medicine 100: 187–190.

Cooper HM. 1982. Scientific principles for conducting integrative research reviews. Review of Educational Research 52: 291–302.

Daniels M, Hill AB. 1952. Chemotherapy of pulmonary tuberculosis in young adults: an analysis of the combined results of three Medical Research Council trials. BMJ 1: 1162–1168.

Davis JM. 1976. Overview: maintenance therapy in psychiatry: II. Affective disorders. American Journal of Psychiatry 133: 1–13.

Dickersin K, Chalmers I. 2010. Recognising, investigating and dealing with incomplete and biased reporting of clinical research: from Francis Bacon to the World Health Organisation. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Dickersin K, Chalmers I. 2014. Thomas C. Chalmers (1917–1995): a pioneer of randomized clinical trials and systematic reviews. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Feldman KA. 1971. Using the work of others: some observations on reviewing and integrating. Sociology of Education 44: 86–102.

Glass GV. 1976. Primary, secondary and meta-analysis of research. Educational Researcher 5(10): 3–8.

Guyatt GH, Oxman AD. 2009. Medicine's methodological debt to the social sciences. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).


Health Research Authority. 2014. Guidance: specific questions that need answering when considering the design of clinical trials. Available from: http://www.hra.nhs.uk/documents/2014/05/guidance-questionsconsiderations-clinical-trials.pdf (accessed Nov 18, 2014).

Jenicek M. 1987. Méta-analyse en médecine. Évaluation et synthèse de l'information clinique et épidémiologique [Meta-analysis in medicine: evaluation and synthesis of clinical and epidemiological information]. St-Hyacinthe and Paris: EDISEM and Maloine Éditeurs.

Jenicek M. 2006. Méta-analyse en médecine: the first book on systematic reviews in medicine. JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Kleinert S, Benham L, Collingridge D, Summerskill W, Horton R. 2014. Further emphasis on research in context. Lancet 384: 2176–2177.

Leitch I. 1959. The place of analytical and critical reviews in any growing biological science and the service they may render to research. In Proceedings of the International Conference on Scientific Information, 2 vols. Washington, DC: National Academies Press, pp. 571–588.

Light RJ, Pillemer DB. 1984. Summing Up. Cambridge, MA: Harvard University Press.

Light RJ, Smith PV. 1971. Accumulating evidence: procedures for resolving contradictions among research studies. Harvard Educational Review 41: 429–471.

Mahoney MJ. 1977. Publication prejudices: an experimental study of confirmatory bias in the peer review system. Cognitive Therapy and Research 1: 161–175.

Moher D, Glasziou P, Chalmers I, Nasser M, Bossuyt PMM, Korevaar DA, Graham ID, Ravaud P, Boutron I. In preparation. Increasing value, reducing waste in biomedical research: who's listening?

National Institute for Health Research. 2013. Guidance notes for applicants that ensure all primary research is informed by a review of the existing literature. Version 5, May. Available from: http://www.nets.nihr.ac.uk/__data/assets/pdf_file/0006/77217/Guidance-notes_literature-review.pdf (accessed Nov 18, 2014).

Park WH, Bullowa JGM, Rosenbluth NM. 1928. The treatment of lobar pneumonia with refined specific antibacterial serum. JAMA 91: 1503–1508.

Peto R, Pike MC, Armitage P, Breslow NE, Cox DR, Howard SV, Mantel N, McPherson K, Peto J, Smith PG. 1977. Design and analysis of randomized clinical trials requiring prolonged observation of each patient. II. Analysis and examples. British Journal of Cancer 35: 1–39.

Petrosino A. 2004. Charles Frederick [Fred] Mosteller (1916–2006). JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Pratt JG, Rhine JB, Smith BM, Stuart CE, Greenwood JA. 1940. Extra-Sensory Perception after Sixty Years: A Critical Appraisal of the Research in Extra-Sensory Perception. New York: Henry Holt.

Rosenthal R. 1979. The 'file drawer problem' and tolerance for null results. Psychological Bulletin 86: 638–641.

Simes RJ. 1986. Publication bias: the case for an international registry of clinical trials. Journal of Clinical Oncology 4: 1529–1541.

Smart RG. 1964. The importance of negative results in psychological research. Canadian Psychologist 5: 225–232.

Smith A, Traganza E, Harrison G. 1969. Studies on the effectiveness of antidepressant drugs. Psychopharmacology Bulletin (suppl): 1–53.

Smith ML, Glass GV. 1977. Meta-analysis of psychotherapy outcome studies. American Psychologist 32: 752–760.

Sterling TD. 1959. Publication decisions and their possible effects on inferences drawn from tests of significance – or vice versa. Journal of the American Statistical Association 54: 30–34.

Stjernswärd J. 1974. Decreased survival related to irradiation postoperatively in early breast cancer. Lancet 304: 1285–1286.

Stjernswärd J. 2009. Meta-analysis as a manifestation of 'bondförnuft' ('peasant sense'). JLL Bulletin: Commentaries on the history of treatment evaluation (www.jameslindlibrary.org).

Wechsler H, Grosser GH, Greenblatt M. 1965. Research evaluating antidepressant medications on hospitalized mental patients: a survey of published reports during a five-year period. The Journal of Nervous and Mental Disease 141: 231–239.

