This article was downloaded by: [Imperial College London Library] On: 11 June 2014, At: 18:03 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biopharmaceutical Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lbps20

An overview of statistical issues and methods of meta-analysis

E. Judith Schmid, Gary G. Koch & Lisa M. LaVange

Department of Biostatistics, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7400

Published online: 29 Mar 2007.

To cite this article: E. Judith Schmid , Gary G. Koch & Lisa M. LaVange (1991) An overview of statistical issues and methods of meta-analysis, Journal of Biopharmaceutical Statistics, 1:1, 103-120, DOI: 10.1080/10543409108835008 To link to this article: http://dx.doi.org/10.1080/10543409108835008


Journal of Biopharmaceutical Statistics, 1(1), 103-120 (1991)


AN OVERVIEW OF STATISTICAL ISSUES AND METHODS OF META-ANALYSIS

Judith E. Schmid, Gary G. Koch, and Lisa M. LaVange
Department of Biostatistics, School of Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7400

Keywords. Meta-analysis; Random effects model; Survey data regression; Combination of studies

Abstract. A meta-analysis is a statistical analysis of the data from some collection of studies in order to synthesize the results. In this paper we discuss issues that frequently arise in meta-analysis and give an overview of the methods used, with particular attention to the use of fixed- and random-effects approaches. The methods are then applied to two sample datasets.

Introduction

Meta-analysis has become increasingly popular over the last decade, but it is still a controversial topic. It is clearly desirable to have standard methods for drawing an overall conclusion from the collection of studies that have investigated any single question, but there are also problems associated with combining data from disparate studies. The benefits of meta-analysis include all the statistical advantages of a large sample size as well as the qualitative information that can be gained by critically comparing and evaluating studies. The criticisms center around the investigator's lack of control over the data: what influence will publication bias, unobservable study aberrations, incompletely reported data, or nonindependence of studies have on the result of the meta-analysis? Although these questions can never be answered completely, their impact can be minimized by careful planning of the analysis through such steps as a thorough search for relevant studies, well-defined criteria for inclusion of studies in the analysis, weighting the more reliable studies more heavily than less reliable studies, and updating the analysis as necessary.

The methods most commonly used in meta-analysis include methods of directly combining p values, such as Fisher's method; the sign test, signed-rank test, and Mantel-Haenszel test to determine the presence of an effect (an association between a set of groups and an outcome); and fixed-effects linear and logistic regression techniques. These are all useful when the individual studies are homogeneous in the pattern of association; however, if significant heterogeneity is present, a random-effects approach may be more reasonable. DerSimonian and Laird (1) have presented an easily implemented procedure for determining the overall association between a set of groups and an outcome for collections of studies in which differences among groups vary randomly across the studies about a population mean or "true difference." Methods for estimating between-group differences in clustered sample settings can also be used by regarding each study as a cluster of subjects and taking into account both the between- and within-study variance. Software for this approach, such as RTILogit (2, 3), is readily available.

Copyright © 1991 by Marcel Dekker, Inc.
Each of these methods will be illustrated with data from a set of clinical trials studying the efficacy of two drugs, ranitidine and cimetidine, in preventing the recurrence of duodenal ulcers in recently healed ulcer patients.

Advantages and Issues of Meta-analysis

The problem of drawing conclusions from a set of different but related studies is not new. In the 1930s, statisticians such as Pearson and Fisher published work on combining p values across studies (4, p. 478), and Cochran (5, p. 101) addressed the question of combining estimates across studies. Meta-analysis is the name given by Glass in 1976 (6) to the process of combining the results of related studies to come to an overall conclusion. Dissatisfied with narrative discussions of collections of studies (". . . we too often substitute literary exposition for quantitative rigor"), he used "meta-analysis" to refer to a statistical analysis of the statistical results of individual studies. Other terms used are "overview" and "superanalysis." Some authors (7) prefer the term "quantitative meta-analysis" (or "quantitative overview") to stress the quantitative and statistical nature of the analysis and contrast it with a qualitative meta-analysis, which is more concerned with such characteristics of studies as their conclusions, design, or potential biases.


Advantages of Meta-analysis

The reason generally given for the increasing number of meta-analyses produced each year is the attempt to make sense of a mass of studies that may be available on any one topic, with varying or conflicting results. A meta-analysis can do this in several ways. The increased sample size can increase the power of the analysis to demonstrate moderate effects that were not consistently detectable in the individual studies but may be important scientifically or medically (8). The increased sample size will also generate more precise estimates of the effect (9) and facilitate subgroup analyses that were not possible before (10). Meta-analysis can lead to a critical evaluation of the studies, highlighting outlying or poorly designed studies (11). And finally, looking at diverse studies can increase evidence of the generalizability of the studies, generate new hypotheses, and direct attention to areas needing further research (9, 12).

Issues and Limitations of Meta-analysis

Inclusion of Studies

The major limitation of meta-analysis is that it can work only with what is available. A meta-analysis can include only studies that are published or in some other way retrievable. The available studies may be incomparable or give inadequate data (13). They may vary in design, quality, outcome measure, or population studied. This leads to two issues that must be addressed at the beginning of a meta-analysis: first, of the available studies, which are similar enough and of high enough quality that the results can be combined? And second, to what extent will the result of the analysis be biased by not including those studies which are not available?

The question of which studies to include is usually referred to as the "apples and oranges" problem. Some authors feel that all possible studies should be included in the analysis, regardless of their differences. Glass (6), for instance, wonders whether there is any great difference between the results of well-designed and poorly designed studies and recommends including both. Most authors, however, believe in minimizing any source of possible bias that can be controlled by the investigator, and for this reason recommend including only studies of high quality with a similarly defined outcome. For studies of clinical trials, for example, this often means that only randomized trials with some form of blinding and identical or very similar treatment regimens and outcome measures are combined (10). The purpose of the meta-analysis is the major factor that determines which studies to include. An analysis that includes more diverse studies can address questions of generalizability of the findings, but it may obscure an association that is clearly present only in well-controlled studies or specific subgroups of a population. Including only very similar studies will enhance the clarity of the result and the precision of the effect estimate for the association between a set of groups and an outcome (14). In any case, the criteria for including studies in the meta-analysis should be well defined before the literature search is begun.

Publication bias is the main concern regarding the availability of studies. It has been shown that authors are more likely to submit, and that editors are more likely to publish, studies with significant results (15, 16). This is also called the "file drawer problem"; those doing meta-analyses wonder about the influence the studies tucked away in file drawers might have on their results. Authors generally agree that some degree of effort should be made to find and include these studies; they are also not optimistic about the outcome of such a search. Another approach has been to try to quantify in some way what the impact of publication bias might be. For instance, Rosenthal's "fail-safe N" (17) estimates the number of additional studies with null results that would be required to reverse the (positive) findings of a meta-analysis. For the most part, however, publication bias is mentioned by authors as a problem which does not at this time have a satisfactory solution, and which must be kept in mind when interpreting results. Finally, a meta-analysis must be updated as more studies become available, in order to maintain its usefulness.
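As an illustrative sketch, the fail-safe N can be computed from the one-sided p values of the available studies under the inverse normal combination rule; the function name and this Python rendering are ours, not the paper's, and the usual one-sided alpha = 0.05 criterion is assumed:

```python
from statistics import NormalDist

def fail_safe_n(p_values, alpha=0.05):
    """Rosenthal-style fail-safe N: the number of additional null (Z = 0)
    studies needed to pull a Stouffer-combined result below one-sided alpha."""
    nd = NormalDist()
    # Convert one-sided p values to standard normal scores.
    z_sum = sum(nd.inv_cdf(1 - p) for p in p_values)
    z_crit = nd.inv_cdf(1 - alpha)  # about 1.645 for alpha = 0.05
    k = len(p_values)
    # Solve z_sum / sqrt(k + X) = z_crit for X, truncating at zero.
    return max(0.0, (z_sum / z_crit) ** 2 - k)
```

A large fail-safe N relative to the number of located studies suggests the combined result is robust to a plausible number of unpublished null studies.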

Independence of Studies

Most statistical procedures used in meta-analysis assume independence of the individual studies. It may be argued that studies done at different times, in different locations, or by different investigators are intrinsically independent. However, the conduct of later studies may be modified according to the outcomes of earlier studies, and studies done in a specific area of science are often interrelated; the scientists in one area often have similar backgrounds and prior beliefs and frequently communicate with each other, which may influence study outcomes (16). As a precaution, the data should be examined for time, center, and investigator effects. Another question of interest is the independence of the subjects within studies. If the studies are not independent or if they are heterogeneous, the subjects within each study may need to be considered as correlated.



Homogeneity of the Association Between Groups and the Outcome Variable

Before the results of studies are combined, there must be some assessment of homogeneity of the results. Of course, it is expected that results of different studies will vary somewhat, and the more diverse the studies, the more heterogeneity one might expect from the results. But particularly for very similar studies, heterogeneity may decrease confidence in the repeatability of an experiment or raise doubts that the investigators were studying the same phenomenon. In any case, if the results vary widely, any overall result must be viewed cautiously since different ways of combining the results may lead to different conclusions. Bailey (18) lists a hierarchy of explanations for heterogeneity in patterns of association across studies. First, the heterogeneity may be attributable to chance (and so be perfectly acceptable). The second most desirable situation would be heterogeneity due to the scale of measurement used; for instance, an outcome that is heterogeneous in odds ratio may be homogeneous when a difference in proportions is examined. Heterogeneity may also be due to factors of design or characteristics of the populations used. Finally, the heterogeneity may be unexplainable or, at any rate, unexplainable with the data available. Accordingly, steps to reduce heterogeneity would be first to consider changing the measurement scale and second, by previous knowledge or examination of the data, to stratify the results based on design or population characteristics. [If the subpopulations are determined by examination of the data, rather than prespecified, more caution must be used in interpreting the results (8).] If the heterogeneity is not explainable it is not reasonable to assume that there is one treatment effect, or even one treatment effect per stratum.
In this case, many authors recommend using a random effects model to account for between-study heterogeneity as well as sampling error within studies.

Weighting the Studies

One final decision to be made before combining the results is how the studies should be weighted. One strategy is to consider each study as providing an equivalent estimate of the true effect or association between groups and outcome, and therefore to weight each study equally. Some attempts have been made to weight studies by their relative quality, but other than noting major aspects of design or obvious numerical errors, this is very difficult to judge and even more difficult to quantify. Another approach has been to conduct separate analyses for groups of studies of similar quality. The most common approach is to weight the larger studies more heavily than the smaller studies, assuming that a larger study will give a better estimate of effect than a comparable small study. It is sometimes argued, however, that a larger study will give a better estimate only for the particular characteristics of that study, and not necessarily a better estimate of the true effect (12).


Methods of Meta-analysis

The methods used are determined to a large extent by the purpose of the meta-analysis and by the type of outcome measure used. A meta-analysis is usually done to determine if there is, on average, a treatment effect either in the studies that have been done or in all possible studies. Sometimes, however, the purpose is to determine if there is at least one set of circumstances where the treatment has an effect. The outcome measure used may be dichotomous, ordinal, or continuous. For dichotomous data, generally, the differences in proportions (p_e - p_c), the risk ratios (p_e/p_c), or the odds ratios [p_e(1 - p_c)/p_c(1 - p_e)] from the individual studies are combined as measures of effect, where p_e and p_c, respectively, are the proportions of experimental and control subjects showing the desired outcome. For continuous data, many scientists use a standardized measure of outcome such as the effect size, which is usually calculated as the difference in means of the experimental and control groups of the individual study divided by the pooled standard deviation of the groups. Hedges and Olkin (19) use an approximately unbiased estimator of effect size: d = J(N - 2)(ybar_e - ybar_c)/s, where J is a bias correction factor depending on the sample size (see Ref. 19, p. 80), with J(N - 2) = 1 - 3/(4N - 9); N is the total number of subjects; ybar_e and ybar_c are the experimental and control means; and s is the pooled standard deviation for the experimental and control subjects.
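The study-level effect measures above can be sketched in a few lines of Python; the function names are ours, and the Hedges-Olkin correction uses the approximation J(N - 2) = 1 - 3/(4N - 9) quoted in the text:

```python
def dichotomous_effects(pe, pc):
    """Risk difference, risk ratio, and odds ratio from the event
    proportions pe (experimental) and pc (control)."""
    return pe - pc, pe / pc, (pe * (1 - pc)) / (pc * (1 - pe))

def hedges_d(mean_e, mean_c, sd_pooled, n_e, n_c):
    """Approximately unbiased standardized mean difference:
    d = J(N - 2) * (ybar_e - ybar_c) / s, with J(N - 2) ~ 1 - 3/(4N - 9)."""
    N = n_e + n_c
    J = 1 - 3 / (4 * N - 9)
    return J * (mean_e - mean_c) / sd_pooled
```

For example, event proportions of 0.2 versus 0.1 give a risk difference of 0.1, a risk ratio of 2, and an odds ratio of 2.25.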

Testing for Homogeneity of Results

A generally applicable test for homogeneity has the form Q = sum_{i=1}^k w_i (y_i - y*)^2, where y_i is a measure of the difference between experimental and control groups in the ith of k studies, y* is the weighted mean (the expected value of the y_i), and w_i is the weight assigned to the ith study, generally the inverse of the variance; Q has a chi-square distribution with k - 1 degrees of freedom. For continuous data, Hedges and Olkin (19) test for homogeneity of effect sizes with the statistic Q = sum_{i=1}^k (d_i - d*)^2 / v_i, where d_i is the unbiased estimator of effect size described above, d* = [sum_{i=1}^k d_i/v_i] / [sum_{i=1}^k 1/v_i] is the weighted mean of the d_i, and v_i = (n_ei + n_ci)/(n_ei n_ci) + d_i^2/[2(n_ei + n_ci)] is the estimated variance of d_i for the ith study.
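The general homogeneity statistic can be sketched directly from its definition, with inverse-variance weights (the function name is ours):

```python
def homogeneity_q(effects, variances):
    """Cochran-style homogeneity statistic Q = sum w_i (y_i - y*)^2 with
    w_i = 1/variance_i; compare to chi-square with k - 1 d.f."""
    w = [1 / v for v in variances]
    y_star = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return sum(wi * (yi - y_star) ** 2 for wi, yi in zip(w, effects))
```

With identical study effects Q is exactly zero, regardless of the weights; values well above k - 1 suggest heterogeneity beyond sampling error.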




For dichotomous data, DerSimonian and Laird (1) similarly use Q = sum_{i=1}^k w_i (y_i - y*)^2 to test homogeneity of outcome. For a difference in proportions, y_i = p_ei - p_ci, y* = sum_{i=1}^k w_i y_i / sum_{i=1}^k w_i, w_i = 1/v_i, and v_i = p_ei(1 - p_ei)/n_ei + p_ci(1 - p_ci)/n_ci, where p_ei (p_ci) is the proportion of subjects in the ith experimental (control) group with the desired outcome. To test for homogeneity of odds ratios, y_i = log[p_ei(1 - p_ci)/p_ci(1 - p_ei)] and 1/w_i = v_i = [n_ei p_ei(1 - p_ei)]^(-1) + [n_ci p_ci(1 - p_ci)]^(-1). If linear or logistic regression models are used, lack of homogeneity of treatment effect can be tested by computing tests of goodness of fit of the model with only main effects for study and treatment, or by testing the interaction of study and treatment. Some authors (8) feel that quantitative study-by-treatment interaction (that is, where the treatment effect varies in magnitude across studies, but not in direction) is inevitable, and that it is only important to test for qualitative interaction, where the treatment effect varies in direction across studies. Gail and Simon (20) give a test for qualitative interaction based on a likelihood ratio statistic. The hypothesis of no qualitative interaction is rejected if Q- = sum_{D_i < 0} D_i^2/sigma_i^2 > c and Q+ = sum_{D_i > 0} D_i^2/sigma_i^2 > c, where D_i is the estimate of the true difference delta_i between the treatment groups in the ith study and sigma_i is its standard deviation; a table for the critical value c can be found in Gail and Simon. The test assumes that the D_i are independent and normally distributed, with mean delta_i and variance sigma_i^2; in large samples the estimate of sigma_i^2 can be used in its place. (Gail and Simon use an estimate of the difference that does not have a bias correction factor and is not standardized by dividing by the pooled standard deviation, but the d_i of Hedges and Olkin can also be used here.)
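The two components of the Gail-Simon statistic can be sketched as follows (the function name is ours; the comparison of both components against the tabulated critical value c is left to the caller, since c comes from the table in Gail and Simon):

```python
def gail_simon_components(diffs, variances):
    """Q- and Q+ of the Gail-Simon likelihood-ratio test for qualitative
    interaction. The null of no qualitative interaction is rejected only
    when BOTH components exceed the tabulated critical value c."""
    q_minus = sum(d * d / v for d, v in zip(diffs, variances) if d < 0)
    q_plus = sum(d * d / v for d, v in zip(diffs, variances) if d > 0)
    return q_minus, q_plus
```

Intuitively, Q- and Q+ measure the evidence for negative and positive treatment effects, respectively; only when both are large is there evidence that the effect truly reverses direction across studies.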

Combining the Results of Individual Studies

Methods of Combining p Values

Among the simplest and earliest proposed procedures for combining the results of separate studies are methods that combine the individual study p values or, equivalently, transform the individual p values into Z, t, or chi-square statistics and then combine them. These tests address H_0: theta_i = 0 for i = 1, 2, . . . , k, where theta_i is the treatment effect in the ith study (for instance, theta_i may be the difference in means or medians in the ith study for continuous data, or the difference in proportions or the log of the odds ratio of the ith study for dichotomous data), and most give equal weight to every study. Fisher's procedure (17) is based on the product of the p values: for k independent studies, Q = -2 sum_{i=1}^k log_e(u_i), where u_i is the ith one-sided p value, and Q has a chi-square distribution with 2k degrees of freedom.
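Fisher's product rule, together with the inverse normal (Stouffer) combination discussed below, can be sketched as follows; the function names are ours, the u_i are one-sided p values, and the Z-score convention (small p mapped to large positive Z) is a choice of sign:

```python
import math
from statistics import NormalDist

def fisher_combined(p_values):
    """Fisher's method: Q = -2 * sum(log u_i) ~ chi-square with 2k d.f.
    The upper-tail p value has a closed form for even degrees of freedom:
    P(chi2_{2k} > q) = exp(-q/2) * sum_{j<k} (q/2)^j / j!."""
    k = len(p_values)
    q = -2 * sum(math.log(u) for u in p_values)
    half = q / 2
    p = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(k))
    return q, p

def stouffer_z(p_values, weights=None):
    """Weighted inverse normal combination:
    Z = sum(w_i Z_i) / sqrt(sum w_i^2), reducing to sum(Z_i)/sqrt(k)
    with equal weights."""
    nd = NormalDist()
    z = [nd.inv_cdf(1 - p) for p in p_values]  # small p -> large positive Z
    w = weights or [1.0] * len(z)
    num = sum(wi * zi for wi, zi in zip(w, z))
    return num / math.sqrt(sum(wi * wi for wi in w))
```

With a single study, fisher_combined reproduces that study's p value, which is a convenient sanity check on the closed-form tail probability.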




The inverse normal method transforms the p values into Z scores, where Z_i = Phi^(-1)(u_i) is the standard normal score associated with the ith p value; Z = sum_{i=1}^k Z_i / sqrt(k) then has a standard normal distribution. Similarly, Z = sum_{i=1}^k t_i / sqrt(sum_{i=1}^k df_i/(df_i - 2)), where t_i is the t statistic with appropriate degrees of freedom df_i associated with the ith p value, is approximately normally distributed, with mean 0 and variance 1. Similar tests are available that allow different weights for different studies; these are mostly of the form Z = sum_{i=1}^k w_i Z_i / sqrt(sum_{i=1}^k w_i^2) (21). However, Fisher's test is the most commonly used procedure. These tests can be used for all types of measurements and for studies that may have measured different but related outcomes. However, they may give confusing results if the significant effects of individual studies are in opposite directions.

Other Nonparametric Methods

To test H_0: theta = 0, where theta is the overall effect of the treatment, the sign test or the Wilcoxon signed-rank test will give a result for the situation where all studies are weighted equally, whereas the Mantel-Haenszel test weights larger studies more heavily (12). The sign test and the signed-rank test are both based on the idea that if there is no difference between treatment and control, the median difference will be 0. The sign test codes the nonzero differences between the control and treatment outcomes in each study as positive or negative and then compares the number of positive results with a binomial distribution with pi = 0.5. The signed-rank test also takes into account the magnitude of the individual treatment-control differences (22). If the data are dichotomous, they can be expressed in a 2 x 2 table:

               Outcome 1   Outcome 2
   Treatment       a           b
   Control         c           d

with a + b + c + d = N, the total number of subjects in the study. Then E(a) = (a + b)(a + c)/N and V(a) = (a + b)(c + d)(a + c)(b + d)/[N^2(N - 1)]. The Mantel-Haenszel statistic Q = {sum_{i=1}^k (a_i - E(a_i))}^2 / sum_{i=1}^k V(a_i) has a chi-square distribution with 1 degree of freedom (9). In much of the meta-analysis literature (8), this is referred to as the "O - E" (observed minus expected) method, and the standard normal distribution of the transformed test statistic Z = sum_{i=1}^k (a_i - E(a_i)) / {sum_{i=1}^k V(a_i)}^(1/2) is used rather than the chi-square distribution of Q.

For continuous data, the analogous procedure tests a weighted difference of means. Let d* = sum_{i=1}^k w_i d_i / sum_{i=1}^k w_i, where d_i is the difference in means between experimental and control groups, w_i = n_ei n_ci/(n_ei + n_ci), and s^2 = sum_{i=1}^k {(n_ei - 1)s_ei^2 + (n_ci - 1)s_ci^2}/(N - 2k). Then F = d*^2 sum_{i=1}^k w_i / s^2 approximately has an F distribution, with 1 and N - 2k degrees of freedom, given that the true differences and variances are equal for all studies (22). The Mantel-Haenszel statistic can also be expressed as Q = {sum_{i=1}^k (n_ei n_ci/n_i)(p_ei - p_ci)}^2 / sum_{i=1}^k V(a_i). The statistics for continuous and dichotomous data both have numerators equal to a weighted sum of differences between experimental and control groups, with the appropriate variance in the denominator.
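Both the dichotomous and the continuous weighted statistics can be sketched directly from the formulas above (the function names are ours; each 2 x 2 table is taken as the tuple (a, b, c, d)):

```python
def mantel_haenszel_q(tables):
    """Mantel-Haenszel ('O - E') statistic over 2x2 tables (a, b, c, d);
    Q ~ chi-square with 1 d.f. under no association."""
    num, var = 0.0, 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        num += a - (a + b) * (a + c) / n                     # a_i - E(a_i)
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    return num * num / var

def weighted_mean_diff_f(diffs, n_e, n_c, sd_e, sd_c):
    """Continuous analog: F = d*^2 * sum(w_i) / s^2 with
    w_i = n_ei * n_ci / (n_ei + n_ci) and s^2 the pooled within-group
    variance; refer F to F(1, N - 2k)."""
    k = len(diffs)
    w = [ne * nc / (ne + nc) for ne, nc in zip(n_e, n_c)]
    d_star = sum(wi * di for wi, di in zip(w, diffs)) / sum(w)
    N = sum(n_e) + sum(n_c)
    s2 = sum((ne - 1) * se ** 2 + (nc - 1) * sc ** 2
             for ne, nc, se, sc in zip(n_e, n_c, sd_e, sd_c)) / (N - 2 * k)
    return d_star ** 2 * sum(w) / s2, 1, N - 2 * k
```

As the text notes, both numerators are weighted sums of treatment-control differences, which is why the two procedures behave analogously.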



Linear Models

There has been much discussion of the use of linear models in meta-analysis. One major area of concern has been the relative merits of using fixed- or random-effects models. Cochran (5), in a 1954 paper concerning the combination of estimates from different experiments, suggested that if the values to be combined do not agree within the limits of experimental error, then a random-effects model of the form y_i = theta_i + e_i = theta + s_i + e_i is more appropriate than the fixed-effect model y_i = theta + e_i. Under the random-effects model there is a distribution of effects across studies rather than a single effect, and the goal is to estimate the mean of the effects, theta. The variance of y_i is then the sum of the sampling variance, var(e_i), and the variance of the s_i, the deviations of the study-specific effects theta_i from the overall mean theta.

Other questions arise with regard to the use of linear models in meta-analysis, concerning the validity of the assumptions. The assumptions usually questioned are the homogeneity of variance and the normality of the outcome variable, given the predictor variable(s). There is often heterogeneity of variance between studies, as the individual studies may have widely varying sample sizes; this is often addressed by using a weighted least-squares approach. There is divided opinion about the question of normality, but many authors feel that moderate deviations from normality will not greatly affect the results.

Hedges and Olkin devote several chapters of their book to the use of fixed- and random-effects models. They use d_i, the unbiased estimate of effect size, as the outcome measure, where d_i = J(N_i - 2)(ybar_ei - ybar_ci)/s_i. As discussed previously, J is a bias correction factor depending on the sample size (Ref. 19, p. 80), with J(N - 2) = 1 - 3/(4N - 9); N_i is the total number of subjects; ybar_ei and ybar_ci are the experimental and control means; and s_i is the pooled standard deviation for the experimental and control subjects. The models assume that y_ei ~ N(mu_ei, sigma_i^2) and that y_ci ~ N(mu_ci, sigma_i^2). Then v_i = var(d_i) = (n_ei + n_ci)/(n_ei n_ci) + delta_i^2/[2(n_ei + n_ci)], where delta_i = (mu_ei - mu_ci)/sigma_i, and v_i is estimated by replacing delta_i with d_i. To account for the heterogeneity of variance that may arise, a weighted estimate of delta is used: d* = [sum_{i=1}^k d_i/v_i] / [sum_{i=1}^k 1/v_i].




jth treatment group in the ith study or group. This model can be fit by software packages such as SAS Proc Catmod (23, 24). If the data are heterogeneous in pattern of association after taking into account the other study characteristics included in the model [as determined by tests of treatment-by-study interaction or model goodness-of-fit tests such as the log-likelihood ratio chi-square statistic (23, 24)], then a random-effects approach may be warranted.

DerSimonian and Laird (1) have presented a procedure for combining dichotomous study results using a random-effects approach, which may be used with either a difference in proportions or an odds ratio as the outcome measure. Using the model y_i = theta_i + e_i = theta + s_i + e_i, the estimate of theta is theta-hat = sum_{i=1}^k w_i* y_i / sum_{i=1}^k w_i*, where w_i* = (1/w_i + var(s))^(-1), and the standard error of theta-hat is (sum_{i=1}^k w_i*)^(-1/2). The estimate of the between-study variance is var(s) = max{0, [Q - (k - 1)] / [sum_{i=1}^k w_i - sum_{i=1}^k w_i^2 / sum_{i=1}^k w_i]}, where Q = sum_{i=1}^k w_i (y_i - y*)^2 is the chi-square statistic used to test for homogeneity and w_i = 1/v_i; the definitions of y_i and v_i for the risk difference and odds ratio were described earlier.

One characteristic of the method of DerSimonian and Laird is that, because the weights are the reciprocals of the sums of the individual study sampling variance and the between-study variance, a large amount of heterogeneity in the studies will tend to push the weighted estimate toward an unweighted estimate. While the presence of heterogeneity should increase the uncertainty in the estimate, an estimate that reflects differences in individual study sample size, variance, or quality may still be more useful than one that does not. An approach that allows differential weighting of studies while taking the between-study heterogeneity of association into account is the use of survey sampling methods.
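The DerSimonian-Laird estimate can be sketched as a short function (the name is ours; tau2 below plays the role of var(s) in the text, and v_i is each study's sampling variance):

```python
def dersimonian_laird(effects, variances):
    """DerSimonian-Laird random-effects pooling: moment estimate of the
    between-study variance tau2, then inverse-variance weights
    w_i* = 1/(v_i + tau2). Returns (pooled estimate, standard error, tau2)."""
    k = len(effects)
    w = [1 / v for v in variances]
    y_star = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - y_star) ** 2 for wi, yi in zip(w, effects))
    # Moment estimator, truncated at zero when Q < k - 1.
    tau2 = max(0.0, (q - (k - 1)) /
               (sum(w) - sum(wi * wi for wi in w) / sum(w)))
    w_star = [1 / (v + tau2) for v in variances]
    theta = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1 / sum(w_star)) ** 0.5
    return theta, se, tau2
```

When the studies are homogeneous (Q below k - 1), tau2 is truncated to zero and the estimate reduces to the fixed-effects inverse-variance average; as tau2 grows, the weights flatten toward equality, illustrating the tendency described above.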
In considering that a sample of studies has been chosen from some population of studies, and a sample of observations has been chosen within each study, there is essentially equivalence with the random-effects model: each observation within a study is thought of as an expression of the outcome for that treatment group, and each study result as an expression of the random variable, effect size. The survey sampling approach differs from the random-effects model by not involving an explicit estimate of the subject or study variance. Rather, a robust estimate of the variance of the treatment effect is computed and is used to produce test statistics about those effects. Software is readily available for applying logistic regression models to survey data. RTILogit (3), executed in conjunction with either Proc Logist of SAS (24, 25) or the Logistic procedure in the SUDAAN (26) system, produces estimates of effects and their variances while taking into account any unequal weighting, stratification, and clustering in the survey design. The procedure allows for specification of individual study (or observation) weights and stratum and primary sampling unit variables. An explanation of the derivation of the formulas used can be found in LaVange et al. (2).


The Dataset

Studies that lend themselves relatively easily to meta-analysis are drug efficacy studies. The benefits of the analysis are tangible: demonstration of moderate improvement of patients with a common disease is important for both public health policy and pharmaceutical product usage. An analysis that identifies a subgroup which reacts differently to a drug, or a study that has a very different outcome than the bulk of studies, can likewise be very important. Drug efficacy studies also limit some of the problems of meta-analysis. Nonsignificant results are often as notable as significant results and therefore not as likely to go unpublished as in some other fields, and experimental conditions such as dosage and type of patient are often similar.

The dataset used as an example of meta-analysis for this paper consists of studies of two drugs used in the treatment of ulcers: cimetidine and ranitidine. These both belong to the class of drugs known as histamine H2-receptor antagonists, which inhibit the secretion of acid into the stomach, thereby enhancing ulcer healing or decreasing the chance of ulcer formation. In the studies in this example, ranitidine and cimetidine are used in ulcer maintenance therapy; that is, they are given in low doses to patients whose ulcers have recently been healed, to prevent further ulcer occurrence. The studies are all 12-month (with the exception of one 11-month study), randomized, double-blind studies comparing ranitidine with cimetidine or one of the two drugs with placebo. In all studies the drug or placebo is taken by patients once daily at bedtime or with the evening meal, and the dose used is 400 mg for cimetidine and 150 mg for ranitidine. The studies in this dataset are identical to those used by Palmer et al. (27), with the addition of the studies by Hovdenak et al. (28) and Van Deventer et al. (29) and the use of the updated values reported by Bolin et al. in 1987 (30), rather than the 1983 figures.
A listing of the dataset can be found in Table 1.

Results

Assessment of Outcome Homogeneity

For each type of study (ranitidine-cimetidine, ranitidine-placebo, cimetidine-placebo, and active drug-placebo, combining both types of placebo studies), the presence of heterogeneity in the outcome variable was assessed by the


Table 1. Listing of Ranitidine and Cimetidine Studies

Authors (in the order listed): Battaglia, Boyd, Van Dommelen, Walt, Gough, Silvis, Bolin, Bresci, Hovdenak, Boyd, Kozarek, Leidberg, Mekel, Van Deventer, Glaxo International, Alstead, Sontag, Moshal, Bardhan, Bianchi-Porro, Eichenberger, Sonnenburg, Venables, Walters, Loof, Berstad, Morin

                                 Recurrences   Total   Recurrences   Total
Ranitidine-Cimetidine Studies         4          18        25          106
                                      4          24         7           28
                                     44         243         7           60
                                      5          37        10           55
Ranitidine-Placebo Studies            6          29        28          125
                                     33         138         5           29
                                      4          22        18           70
                                     37         174
Cimetidine-Placebo Studies           15          67        11           20
                                     10         447         9           25
                                      6          13         7           34
                                      3          19        13           49
                                     14          49         2           23
                                     11          54

chi-square statistics presented by DerSimonian and Laird for both the risk difference and the odds ratio, the Breslow-Day test for homogeneity of odds ratios (given by SAS Proc Freq), and the log-likelihood ratio test from SAS Proc Catmod (24). Results of these tests are shown in Table 2. The ranitidine-cimetidine studies are very homogeneous; the ranitidine-placebo studies are very heterogeneous; and the cimetidine-placebo studies are significantly heterogeneous when the risk difference is used as the outcome variable, but not when the odds ratio is used.


Table 2. Tests of Homogeneity

                 Risk difference       Odds ratio
                 DerSimonian-Laird     DerSimonian-Laird    Breslow-Day       Log-likelihood ratio
Studies   d.f.     Q       p value       Q       p value     Q     p value      Q       p value
R-C         7      4.90     0.672
R/C-P      18     50.86
R-P         7     31.18
C-P        10     19.19
