Special Issue Paper

Received 14 November 2013, Accepted 16 September 2014, Published online 15 October 2014 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.6321

Statist. Med. 2014, 33:5399–5412

A selection model for accounting for publication bias in a full network meta-analysis

Dimitris Mavridis,a,b*† Nicky J. Welton,c Alex Sutton,d and Georgia Salanti a

Copas and Shi suggested a selection model to explore the potential impact of publication bias via sensitivity analysis based on assumptions for the probability of publication of trials conditional on the precision of their results. Chootrakool et al. extended this model to three-arm trials but did not fully account for the implications of the consistency assumption, and their model is difficult to generalize for complex network structures with more than three treatments. Fitting these selection models within a frequentist setting requires maximization of a complex likelihood function, and identification problems are common. We have previously presented a Bayesian implementation of the selection model when multiple treatments are compared with a common reference treatment. We now present a general model suitable for complex, full network meta-analysis that accounts for consistency when adjusting results for publication bias. We developed a design-by-treatment selection model to describe the mechanism by which studies with different designs (sets of treatments compared in a trial) and precision may be selected for publication. We fit the model in a Bayesian setting because it avoids the numerical problems encountered in the frequentist setting, it is generalizable with respect to the number of treatments and study arms, and it provides a flexible framework for sensitivity analysis using external knowledge. Our model accounts for the additional uncertainty arising from publication bias more successfully than the standard Copas model or its previous extensions. We illustrate the methodology using a published triangular network for the failure of vascular graft or arterial patency. Copyright © 2014 John Wiley & Sons, Ltd.

Keywords:

consistency; mixed treatment comparison; propensity for publication; publication bias; study design

1. Introduction

For most healthcare problems, there are a large number of competing interventions. The relative effectiveness of treatments is typically assessed in randomized controlled trials (RCTs), and these RCT comparisons together form a network of treatment comparisons, around which information may flow as long as the network is connected. Such a body of evidence can be synthesized via network meta-analysis (NMA) [1–4]. NMA is now an established method in the evidence-based medicine literature and is increasingly used to assess the comparative effectiveness of healthcare interventions [5–7]. Despite its increasing popularity, there are still methodological gaps in many aspects of the model, including methods to explore and account for publication bias.

Central to NMA methodology is the concept of evidence consistency [3, 8, 9]. Evidence consistency refers to the agreement between direct evidence (evidence arising from studies directly comparing the treatments of interest) and various indirect sources of evidence (that is, the collation of studies that compare the treatments of interest with a common comparator). Inconsistency and heterogeneity are two sources of variability in NMA, the former referring to variation across pairwise comparisons (loop inconsistency)

a Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
b Department of Primary Education, University of Ioannina, Ioannina, Greece
c School of Social and Community Medicine, University of Bristol, Bristol, U.K.
d Department of Health Sciences, University of Leicester, Leicester, U.K.

*Correspondence to: Dimitris Mavridis, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece.
†E-mail: [email protected]


and the latter to variation across trials within the same comparison. The inclusion of study-level covariates has been suggested to explain heterogeneity and reduce inconsistency [10, 11]. Publication bias is a well-documented potential explanation for observed heterogeneity in the pairwise meta-analysis literature and is identified as a major threat that may invalidate the results derived from a meta-analysis [12, 13]. Publication bias may be present in the network of studies and can introduce inconsistency as well as heterogeneity. Chaimani et al. conducted a network meta-epidemiological study and found that in the majority of the 32 networks they analyzed, small studies tended to exaggerate the effectiveness of the active or new intervention [14]. This phenomenon, known as small-study effects, could be partly associated with publication bias.

A plethora of methods has been developed to identify or correct for small-study effects in pairwise meta-analysis. Some methods are based on a visual inspection of the funnel plot or use regression methods to detect whether there is an association between the effect size and a measure of its precision [15, 16]. Chaimani et al. extended the use of funnel plots to NMA by plotting the differences of the study-specific estimates from the respective comparison-specific summary estimate against their standard errors [17]. Its interpretation depends on the direction of the small-study effects. The empirical evidence about the presence of small-study effects in NMA highlights the need to develop and explore methods to properly detect and account for publication bias in NMA, such as selection models [17].

Selection models account for the mechanism by which studies are selected for publication. In particular, the selection model by Copas has become popular in medical evidence synthesis [18–21]. Copas advocated a sensitivity analysis in which the pooled intervention effect is computed under a range of assumptions about the severity of selection bias in the data. This selection model has been evaluated empirically and was found to perform well in comparison with other methods [22, 23]. The use of this model has received little attention in the NMA context. A relevant extension of the model was presented by Chootrakool et al. and accounts for different selection probabilities between two-arm and three-arm studies (studies with more than two treatments) in a frequentist setting [24]. The effect sizes are inherently consistent within a multi-arm trial, and Chootrakool et al. did not take this consistency into account when adjusting estimates for publication bias within a multi-arm trial.

The original approach of Copas and Shi and its extension by Chootrakool et al. fit the selection model in a frequentist setting. They conduct a sensitivity analysis by considering a range of different selection mechanisms, and for each model they assess its goodness-of-fit using a likelihood ratio test. We take a different approach, fitting the model in a Bayesian setting and placing emphasis not on evaluating the fit of the model but on estimating the correlations between the observed effect sizes and the probability of publication; that is, on whether there is evidence of, and on the extent of, publication bias. Fitting the selection model in a frequentist setting requires maximizing a complex likelihood, which may get stuck in a region of flat density, causing identification problems. Carpenter et al. developed an R function to fit the Copas selection model in pairwise meta-analysis. They conducted an empirical analysis and found that in almost 20% of the cases the algorithm failed to converge [22]. Consequently, extending the Copas selection model to complex networks of treatments with multiple arms is challenging within a frequentist setting.

An alternative Bayesian formulation of the model in the context of pairwise meta-analysis and in the special case of star-shaped networks (where each treatment is compared with a reference treatment but not in head-to-head trials) was first suggested in [25]. This is a straightforward extension of simple meta-analysis because a star-shaped network does not involve closed loops of evidence or multi-arm studies, and consequently the consistency assumption does not impose constraints on the estimation of the parameters. In this paper, we properly modify the consistency equations for the adjusted summary estimates within a multi-arm trial, and we adjust summary estimates taking consistency into account. We have also shown previously how external empirical evidence can be obtained and incorporated into the model to estimate treatment effects adjusted for publication bias [25].

In this paper, we present a general form of the selection model applicable to any type of network, extending our earlier work on star networks to general connected networks. We consider the general case of a network that involves different types of studies (different combinations of the treatments, termed study designs) [26, 27]: studies that compare active treatments with a reference, head-to-head studies (comparisons between active treatments), and multi-arm studies. Industry-sponsored trials are likely to include a placebo control [28] and are also likely to report favorable results to gain regulatory approval [29]. Hence, the design of a study may be associated with different levels of publication bias. We develop methodology that allows us to explore whether there is design-by-treatment publication bias. We fit the model in a Bayesian setting using Markov chain Monte Carlo (MCMC), which we see as the natural approach for a number of reasons. A Bayesian fitting of the model overcomes the problem of maximizing a complex


likelihood, allows more flexibility in parameter estimation while fully accounting for the uncertainty in the parameters, relaxes some assumptions of the original model that were made for analytical tractability, and, finally, allows the incorporation of prior knowledge to inform the parameters of the selection model.

The paper is organized as follows. In Section 2, we present the statistical model and discuss the prior distributions of its parameters. In Section 3, we use an example with three interventions, one placebo and two active, to illustrate the methodology, and we conclude with a discussion in Section 4.

2. Methods

2.1. Notation and setting

Suppose that $n$ RCTs have been conducted comparing a total of $T$ treatments ($A, B, C, \dots$). We use the term design to refer to the set of treatments compared in a study [27]. A design is simply each possible subset of at least two treatments in the network. In total, there are $2^T - T - 1$ potential designs, and the designs of interest are those realised in at least one trial. Let $d = 1, \dots, D$ index the designs evaluated in a network and $n_d$ the number of studies included in the network that pertain to the $d$th design comprising $T_d$ treatments. Some of the studies may have more than two arms (multi-arm studies, $T_d > 2$). From a study of design $d$, we may compute $\binom{T_d}{2}$ contrasts. In practice, only $T_d - 1$ contrasts need to be estimated, as the rest can be computed as linear combinations. Hence, information from each study of design $d$ is summarized in $T_d - 1$ effect sizes (e.g., log-odds ratios or mean differences) and their standard errors, along with the $\frac{(T_d - 1)(T_d - 2)}{2}$ correlations between the $T_d - 1$ effect sizes.

Let $\mathbf{y}$ represent the vector of the observed effect sizes estimated in studies. We use two subscript indexes to identify the study and its design, and a superscript to denote the actual contrast being evaluated, so that $y_{id}^{XY}$ refers to the effect size for the $XY$ comparison (where $X$ and $Y$ are two randomly chosen treatments) in the $i$th study of the $d$th design. In two-arm studies, defining both the design and the pair of treatments compared is redundant, as there is a one-to-one correspondence between them.

To illustrate, consider the case of seven studies and three treatments, with the first three studies comparing $AB$ ($d = 1$, $n_1 = 3$), the fourth and fifth comparing $AC$ ($d = 2$, $n_2 = 2$), the sixth comparing $BC$ ($d = 3$, $n_3 = 1$), and the seventh comparing all three treatments, so of design $ABC$ ($d = 4$, $T_d = 3$, $n_4 = 1$). There are three treatments ($T = 3$) and four designs ($D = 4$), because each of the seven studies compares one of these four combinations of treatments ($AB$, $AC$, $BC$, or $ABC$). The vector of effect sizes would be

$$\mathbf{y} = \left(y_{1,1}^{AB}, y_{2,1}^{AB}, y_{3,1}^{AB}, y_{4,2}^{AC}, y_{5,2}^{AC}, y_{6,3}^{BC}, y_{7,4}^{AB}, y_{7,4}^{AC}\right)'.$$

We will also use the vector notation $\mathbf{y}_{id}$ to denote the effect sizes in each study; for example, $\mathbf{y}_{7,4} = \left(y_{7,4}^{AB}, y_{7,4}^{AC}\right)'$ and $\mathbf{y}_{3,1} = \left(y_{3,1}^{AB}\right)$.
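To make the bookkeeping concrete, the following minimal Python sketch (not part of the original paper; all names and data structures are illustrative assumptions) enumerates the contrasts implied by each design in the seven-study example above.

```python
from itertools import combinations

# Seven-study example: treatments A, B, C and designs AB, AC, BC, ABC.
designs = {1: ("A", "B"), 2: ("A", "C"), 3: ("B", "C"), 4: ("A", "B", "C")}
studies = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 4)]  # (study i, design d)

for d, treatments in designs.items():
    T_d = len(treatments)
    contrasts = list(combinations(treatments, 2))   # C(T_d, 2) possible contrasts
    n_basic = T_d - 1                               # only T_d - 1 need estimating
    n_corr = (T_d - 1) * (T_d - 2) // 2             # correlations among those
    print(f"design {d}: contrasts={contrasts}, basic={n_basic}, correlations={n_corr}")

# Stacking T_d - 1 contrasts per study reproduces the vector y of the example:
# each two-arm study contributes one entry, the ABC study contributes two.
y_labels = [(i, d, c)
            for i, d in studies
            for c in list(combinations(designs[d], 2))[: len(designs[d]) - 1]]
print(y_labels)  # eight entries, matching y above
```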

2.2. The assumption of consistency in network meta-analysis

Consider that there are three treatments $A$, $B$, and $C$. Suppose there are no studies directly comparing $A$ versus $B$ to yield a direct treatment effect estimate. We may get an indirect estimate by comparing studies of $A$ versus $C$ with studies of $B$ versus $C$. The true (or average) direct effect of $B$ relative to $A$ is denoted by $\mu_{AB}$, and the indirect estimate of the true effect of $B$ relative to $A$ is $\mu_{AC} - \mu_{BC}$. Consistency implies that direct and indirect evidence are in agreement, that is, $\mu_{AB} = \mu_{AC} - \mu_{BC}$, and by mixing both sources of evidence (direct and indirect) we gain precision and power [30]. Differences between studies comparing $AB$ and $AC$, such as differences in the distribution of effect modifiers across study designs (by which we mean the set of treatments compared in a trial), may have an impact on the results, and the consistency assumption may not hold [9]. Models to account for inconsistency have been suggested both in a Bayesian [8, 31, 32] and in a non-Bayesian setting [27, 33, 34].

Estimators of the effect sizes from multi-arm studies are inherently consistent and correlated because they are estimated on the same group of individuals [4, 35]. Multivariate extensions of the NMA model under consistency and inconsistency are needed to include data from multi-arm studies.

The observed effects can be summarized across studies to estimate a pooled relative treatment effect for each treatment comparison $XY$, which we denote by $\lambda^{XY}$. In general, each study associated with design $d$ provides information about a set of summary effects $\boldsymbol{\lambda}_d$; for example, $\boldsymbol{\lambda}_4 = \left(\lambda^{AB}, \lambda^{AC}\right)'$ and $\boldsymbol{\lambda}_3 = \left(\lambda^{BC}\right)$, where the $\lambda^{XY}$ are the same irrespective of the study design in which the contrast $XY$ is encountered.


Suppose that a specific treatment $A$ is chosen to be the overall reference treatment in the analysis of the network. There is a $(T - 1)$ vector of basic parameters $\boldsymbol{\mu} = \left(\mu_{AB}, \mu_{AC}, \dots, \mu_{AT}\right)'$ containing the treatment effects $\mu_{Aj}$ of each treatment $j$ compared with the reference treatment $A$ ($j = B, C, \dots, T$). The aim of NMA is to estimate the parameters $\mu_{Aj}$ using both direct evidence (included in the $\lambda^{XY}$ parameters) and indirect comparisons (included in combinations of the $\lambda^{XY}$ parameters). Assuming consistency, each $\lambda^{XY}$ can be written as a function of the $\mu_{Aj}$ parameters; for instance, $\lambda^{XY} = \mu_{AY} - \mu_{AX}$, where $\mu_{AA} = 0$. In general, there is a design-specific transformation matrix $X_d$ between $\boldsymbol{\lambda}_d$ and $\boldsymbol{\mu}$ so that $\boldsymbol{\lambda}_d = X_d \boldsymbol{\mu}$. In the example stated earlier, $\boldsymbol{\mu} = \left(\mu_{AB}, \mu_{AC}\right)'$, $X_1 = (1, 0)$, $X_2 = (0, 1)$, $X_3 = (-1, 1)$, and $X_4 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
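The design-specific transformation matrices can be generated mechanically from $\lambda^{XY} = \mu_{AY} - \mu_{AX}$. The sketch below is an illustrative implementation of this mapping; the helper name and treatment encoding are assumptions, not code from the paper.

```python
import numpy as np

# Build X_d mapping the basic parameters mu = (mu_AB, mu_AC, ...)' to the
# contrasts of a design, taking the design's first arm as its baseline.
def design_matrix(design, treatments=("A", "B", "C"), reference="A"):
    non_ref = [t for t in treatments if t != reference]
    baseline = design[0]
    rows = []
    for t in design[1:]:                      # contrasts of each arm vs baseline
        row = np.zeros(len(non_ref))
        if t != reference:
            row[non_ref.index(t)] = 1.0       # +mu_At
        if baseline != reference:
            row[non_ref.index(baseline)] -= 1.0  # -mu_A(baseline)
        rows.append(row)
    return np.array(rows)

print(design_matrix(("A", "B")))       # X_1 = [[ 1, 0]]
print(design_matrix(("A", "C")))       # X_2 = [[ 0, 1]]
print(design_matrix(("B", "C")))       # X_3 = [[-1, 1]]
print(design_matrix(("A", "B", "C")))  # X_4 = 2x2 identity
```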

2.3. Measurement model

Each observed effect size $y_{id}^{XY}$ in a two-arm trial is modelled as normally distributed, $y_{id}^{XY} \sim N\left(\theta_{id}^{XY}, \left(\sigma_i^{XY}\right)^2\right)$, with variance $\left(\sigma_i^{XY}\right)^2$ assumed equal to the square of the observed standard error, $\left(s_i^{XY}\right)^2$. Publication bias is confounded with heterogeneity, and it is therefore advisable to assume a random-effects model [36]. The mean relative treatment effect in the study is modelled as $\theta_{id}^{XY} = \lambda^{XY} + \delta_{id}^{XY}$, where the random effects $\delta_{id}^{XY}$ are normally distributed, $\delta_{id}^{XY} \sim N\left(0, \tau^2\right)$. We assume a common between-study variance (heterogeneity) $\tau^2$ across treatment comparisons. This assumption, although not always realistic, is widely used and often necessary in practice: for the majority of networks there are typically only a few studies per treatment comparison, which renders the estimation of comparison-specific heterogeneities challenging.

For example, in a study of $BC$ design ($d = 3$), the conditional likelihood is $y_{id}^{BC} \mid \theta_{id}^{BC} \sim N\left(\lambda^{BC} + \delta_{id}^{BC}, \left(s_i^{BC}\right)^2\right)$, with overall mean effect size $\lambda^{BC}$. Alternatively, we may write the model for a two-arm trial (i.e., $B$ versus $C$) as $y_{id}^{BC} = \lambda^{BC} + \delta_{id}^{BC} + \varepsilon_{id}^{BC}$, where $\varepsilon_{id}^{BC} \sim N\left(0, \left(s_i^{BC}\right)^2\right)$ is a random error term. The mean is then written as a function of the basic parameters, $\lambda^{BC} = \mu_{AC} - \mu_{AB}$, or equivalently $\boldsymbol{\lambda}_3 = X_3 \boldsymbol{\mu}$.

In a multi-arm trial $i$ of design $d$, the vector of the $T_d - 1$ contrasts is modelled as a multivariate normal distribution. For instance, if three treatments $A$, $B$, and $C$ are included ($d = 4$), then

$$\begin{pmatrix} y_{id}^{AB} \mid \theta_{id}^{AB} \\ y_{id}^{AC} \mid \theta_{id}^{AC} \end{pmatrix} \sim N\left( \begin{pmatrix} \lambda^{AB} \\ \lambda^{AC} \end{pmatrix} + \begin{pmatrix} \delta_{id}^{AB} \\ \delta_{id}^{AC} \end{pmatrix}, \begin{pmatrix} \left(s_i^{AB}\right)^2 & \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) \\ \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) & \left(s_i^{AC}\right)^2 \end{pmatrix} \right).$$

Assuming a common heterogeneity parameter across treatment comparisons, the random effects are

$$\begin{pmatrix} \delta_{id}^{AB} \\ \delta_{id}^{AC} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tau^2 \begin{pmatrix} 1 & \frac{1}{2} \\ \frac{1}{2} & 1 \end{pmatrix} \right), \qquad \begin{pmatrix} \lambda^{AB} \\ \lambda^{AC} \end{pmatrix} = \begin{pmatrix} \mu_{AB} \\ \mu_{AC} \end{pmatrix} \text{ or } \boldsymbol{\lambda}_4 = X_4 \boldsymbol{\mu}.$$

In its general form, the model can be written as

$$\mathbf{y}_{id} = \boldsymbol{\lambda}_d + \boldsymbol{\delta}_{id} + \boldsymbol{\varepsilon}_{id},$$

where $\boldsymbol{\delta}_{id}$ is the vector of random effects for the contrasts in design $d$, with $\boldsymbol{\delta}_{id} \sim N\left(\mathbf{0}, \Delta_d\right)$, and $\boldsymbol{\varepsilon}_{id}$ is a vector of random error terms, with $\boldsymbol{\varepsilon}_{id} \sim N\left(\mathbf{0}, S_{id}\right)$. The matrix $S_{id}$ is the within-study variance–covariance matrix involving the variances $s_i^2$ and the covariances (assumed known) between the elements of $\mathbf{y}_{id}$, and $\Delta_d$ is the between-study variance–covariance matrix. Assuming a common between-study variance across treatment comparisons, we have, in the three-arm case, $\Delta_d = \tau^2 \begin{pmatrix} 1 & \frac{1}{2} \\ \frac{1}{2} & 1 \end{pmatrix}$, and $\tau$ needs to be estimated [2].

2.4. Selection model

To model the probability with which studies are selected for publication, it has been assumed that there is a latent variable underlying each study [20]. This latent variable takes positive values if the specific study is published and negative values otherwise. There are as many latent variables as study designs, and each latent variable represents the propensity for publication given the design of that specific study. Thus, the propensity for publication of a study comparing $AB$ is not the same as the propensity for a study comparing $ABC$. A similar selection model has been developed in a non-Bayesian framework [24].


The propensity for publication for each design and study is denoted by $z_{id}$ and is modeled as a function of two parameters, $\alpha_d$ and $\beta_d$, and a function $f\left(S_{id}\right)$ of the variance–covariance matrix of the estimated contrasts for this design:

$$z_{id} = \alpha_d + \frac{\beta_d}{f\left(S_{id}\right)} + \xi_{id} = u_{id} + \xi_{id},$$

where $\xi_{id} \sim N(0, 1)$, and we constrain $\beta_d \geq 0$ to reflect the belief that larger studies are more likely to be published. The distribution of the random error $\xi_{id}$ defines the scale of the latent variable $z_{id}$. In a two-arm trial of the $d$th design (i.e., $X$ versus $Y$), we take $f$ to be the identity function, $f\left(S_{id}\right) = s_i^{XY}$. Chootrakool et al. consider the average of the standard errors in the study, for example, $f\left(S_{id}\right) = \frac{s_i^{AB} + s_i^{AC}}{2}$ in a three-arm trial [24]. Alternatively, we may consider a measure that takes the correlation structure into account, such as the generalized variance, which is the determinant of the variance–covariance matrix $|\mathbf{S}|$ and is a scalar measure of multidimensional scatter. If we have $T$ treatments, we take $f\left(S_{id}\right) = |\mathbf{S}|^{\frac{1}{2(T-1)}}$. For a three-arm trial, the function $f\left(S_{id}\right)$ becomes

$$f\left(S_{id}\right) = \begin{vmatrix} \left(s_i^{AB}\right)^2 & \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) \\ \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) & \left(s_i^{AC}\right)^2 \end{vmatrix}^{\frac{1}{4}} = \sqrt{s_i^{AB} s_i^{AC}}\; \sqrt[4]{1 - \rho^2_{\left(y_{id}^{AB}, y_{id}^{AC}\right)}}.$$

If the variances for all contrasts are reported, we can derive the arm-based variances and use the latter to derive the contrast-based covariances. However, it is not common for studies to report the variances for all contrasts. In a two-arm trial, the square root of the generalized variance is the estimated standard error, $f\left(S_{id}\right) = s_i$. The generalized variance is negatively correlated with the covariance between the study-specific effect sizes; increasing covariances decrease the generalized variance and consequently increase the propensity for publication.

The probability that a study $i$ with design $d$ is published is equal to

$$P\left(z_{id} > 0\right) = \Phi\left(\alpha_d + \frac{\beta_d}{f\left(S_{id}\right)}\right) = \Phi\left(u_{id}\right), \quad (1)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution. Equation (1) provides an interpretation of the parameters $\alpha_d$ and $\beta_d$. $\Phi\left(\alpha_d\right)$ is the marginal probability that a study with design $d$ and an infinite generalized variance (or standard error, if design $d$ refers to a two-arm study) will be published. Parameter $\beta_d$ is a discrimination parameter, discriminating the probabilities of publication between studies with different variance–covariance matrices. If, for a specific study design, $\beta_d = 0$ and $\alpha_d$ is a large number (such as $\alpha_d = 10$), then the probability of any study with that design being published is 1 irrespective of the estimated standard errors; that is, there is no publication bias. It is expected that $\beta_d$ is positive, because larger studies are more likely to be published and the generalized variance is inversely related to sample size.

The inverse of $\Phi\left(u_{id}\right)$ gives the expected number of studies similar to study $i$ (of design $d$) that have been conducted. By summing the inverse of $\Phi\left(u_{id}\right)$ across studies, we obtain an estimate of the total number of studies, published and unpublished,

$$TS = \sum_{d=1}^{D} \sum_{i=1}^{n_d} \frac{1}{\Phi\left(u_{id}\right)}.$$

Similarly, by summing the inverse of $\Phi\left(u_{id}\right)$ within each design $d$, we estimate the total number of studies of each design. The total number of published and unpublished studies may be misleading, as it does not provide any information on whether the unpublished studies are systematically different from the published ones.
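The quantities above are straightforward to compute. The sketch below evaluates $f(S_{id})$, the publication probability $\Phi(u_{id})$, and the implied total number of studies for hypothetical two-arm and three-arm studies; the values of $\alpha_d$, $\beta_d$ and the covariance matrices are illustrative assumptions, not the paper's estimates.

```python
import numpy as np
from scipy.stats import norm

def f_gen_var(S):
    """Generalized-variance summary f(S) = |S|^(1/(2(T-1)))."""
    T = S.shape[0] + 1                         # the study has T - 1 contrasts
    return np.linalg.det(S) ** (1.0 / (2 * (T - 1)))

def pub_prob(S, alpha_d, beta_d):
    """Publication probability Phi(alpha_d + beta_d / f(S))."""
    u_id = alpha_d + beta_d / f_gen_var(S)
    return norm.cdf(u_id)

S_list = [np.array([[0.25**2]]),               # a two-arm study: f(S) = s_i
          np.array([[0.25**2, 0.03],
                    [0.03, 0.30**2]])]         # a three-arm study
probs = [pub_prob(S, alpha_d=-0.5, beta_d=0.4) for S in S_list]
TS = sum(1.0 / p for p in probs)               # expected published + unpublished
print(probs, TS)
```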

2.5. Combined measurement and selection model

The measurement and selection models for each design do not share common parameters but are connected through their residual terms. For instance, for design $d$ referring to a two-arm trial (i.e., $X$ versus $Y$), we have $\rho_d^{XY} = \operatorname{corr}\left(y_{id}, z_{id}\right)$, where $i$ runs through all studies with design $d$. If $\rho_d^{XY} = 0$, the effect size $y_{id}$ has no effect on whether the study is published or not, and this is the model with no publication bias. If $\rho_d^{XY}$ is positive, then for published studies we have $z_{id} > 0$, and we expect $\xi_{id}$ to be positive ($\xi_{id}$ is not distributed as $N(0, 1)$ in the published studies). Consequently, $\varepsilon_{id}$ would be positive, and the observed effect size $y_{id}$ would systematically overestimate the true effect $\lambda_d$, which is an indication of publication bias. If the outcome measures a harm or risk, we expect $\rho_d^{XY} < 0$ so that, for a given standard deviation, smaller (more negative) effect sizes have a larger probability of publication.

We may assume that the correlation between a specific contrast $XY$ and the propensity for publication depends on the study design. This assumption implies that if contrast $AB$ is engaged in various designs (i.e., in the example presented in Section 2.1, $d = 1$ refers to $AB$ studies and $d = 4$ to $ABC$ studies), then it could be the case that $\rho_1^{AB} \neq \rho_4^{AB}$. For each design there are $\binom{T_d}{2}$ correlations, resulting in a total of $\sum_{d=1}^{D} \binom{T_d}{2}$ correlation parameters that need to be estimated. Treatment comparisons should be given in the same order across studies (e.g., alphabetically: $AB$, $AC$, and $BC$) so that the model clearly identifies what a positive or negative correlation means for a specific comparison (which treatment is favored in the published studies). It is clear that the number of parameters may increase rapidly if there are many designs with multi-arm studies, and we may have little power to detect significant correlations and consequently the presence of publication bias. In total, there are $T - 1$ summary estimates, $\sum_{d=1}^{D} \binom{T_d}{2}$ correlation parameters, and one heterogeneity variance that need to be estimated.

For a two-arm trial, the joint distribution of an effect size for a study design (i.e., $A$ versus $B$) and its propensity for publication is a truncated bivariate normal distribution,

$$\begin{pmatrix} y_{id}^{AB} \\ z_{id}^{AB} \end{pmatrix} \sim N\left( \begin{pmatrix} \theta_{id}^{AB} \\ u_{id}^{AB} \end{pmatrix}, \begin{pmatrix} \left(s_i^{AB}\right)^2 & \rho_d^{AB} s_i^{AB} \\ \rho_d^{AB} s_i^{AB} & 1 \end{pmatrix} \right) I_{z_{id}^{AB} > 0},$$

where $I_R$ is an indicator variable assuming value 1 if $R$ is correct and zero otherwise. For a three-arm trial, the joint distribution of its effect sizes and propensity for publication is a multivariate truncated normal distribution,

$$\begin{pmatrix} y_{id}^{AB} \\ y_{id}^{AC} \\ z_{id} \end{pmatrix} \sim N\left( \begin{pmatrix} \theta_{id}^{AB} \\ \theta_{id}^{AC} \\ u_{id} \end{pmatrix}, \begin{pmatrix} \left(s_i^{AB}\right)^2 & \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) & \rho_d^{AB} s_i^{AB} \\ \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) & \left(s_i^{AC}\right)^2 & \rho_d^{AC} s_i^{AC} \\ \rho_d^{AB} s_i^{AB} & \rho_d^{AC} s_i^{AC} & 1 \end{pmatrix} \right) I_{z_{id} > 0}. \quad (2)$$

The expected values of the effect sizes for the two contrasts in a three-arm study are

$$E\left( \begin{pmatrix} y_{id}^{AB} \\ y_{id}^{AC} \end{pmatrix} \,\middle|\, z_{id} \right) = \begin{pmatrix} \theta_{id}^{AB} \\ \theta_{id}^{AC} \end{pmatrix} + \begin{pmatrix} \rho_d^{AB} s_i^{AB} \\ \rho_d^{AC} s_i^{AC} \end{pmatrix} \left(z_{id} - u_{id}\right) \quad (3)$$

and the corresponding variances are

$$V\left( \begin{pmatrix} y_{id}^{AB} \\ y_{id}^{AC} \end{pmatrix} \,\middle|\, z_{id} \right) = \begin{pmatrix} \left(s_i^{AB}\right)^2 & \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) \\ \operatorname{cov}\left(y_{id}^{AB}, y_{id}^{AC}\right) & \left(s_i^{AC}\right)^2 \end{pmatrix} - \begin{pmatrix} \rho_d^{AB} s_i^{AB} \\ \rho_d^{AC} s_i^{AC} \end{pmatrix} \begin{pmatrix} \rho_d^{AB} s_i^{AB} & \rho_d^{AC} s_i^{AC} \end{pmatrix}.$$

It is clear from Equation (3) that, for any comparison $XY$ with $\rho_d^{XY} > 0$, because $z_{id} > 0$ for a published study, the adjusted mean $\theta_{id}^{XY}$ would be smaller than the unadjusted one, $E\left(y_{id}^{XY} \mid z_{id}\right)$.

The key assumptions in NMA are transitivity and consistency. Within a three-arm study, the transitivity relation implies that $E\left(y_{id}^{AC} \mid z_{id}\right) - E\left(y_{id}^{AB} \mid z_{id}\right) = E\left(y_{id}^{BC} \mid z_{id}\right)$. Using Equation (3), we obtain

$$\theta_{id}^{AC} + \rho_d^{AC} s_i^{AC} \left(z_{id} - u_{id}\right) - \theta_{id}^{AB} - \rho_d^{AB} s_i^{AB} \left(z_{id} - u_{id}\right) = \theta_{id}^{BC} + \rho_d^{BC} s_i^{BC} \left(z_{id} - u_{id}\right).$$

Within a three-arm study, it holds that the random effects adjusted for selection are also consistent, that is,

$$\theta_{id}^{AC} - \theta_{id}^{AB} = \theta_{id}^{BC}.$$

Covariance is a linear operator; hence, we obtain

$$\rho_d^{AC} s_i^{AC} - \rho_d^{AB} s_i^{AB} = \rho_d^{BC} s_i^{BC}, \quad (4)$$

which poses a restriction on the $\binom{T_d}{2}$ correlation coefficients within a multi-arm study of design $d$ that depends on the observed standard deviations. Unlike Chootrakool et al., we take both Equation (4) and the positive definiteness of the covariance matrix in Equation (2) into account when adjusting for publication bias.

In Table I, we describe the selection probabilities by design for the example presented in Section 2.1. There are $\binom{T_d}{2}$ correlation parameters between propensity for publication and effect size for each design.

The most important parameters to inspect are the correlations $\rho_d^{XY}$ between effect sizes and propensity for publication. When the probability of publication for a specific design is not related to the size of the effect(s), the correlation is expected to be zero. As mentioned earlier, differences in correlations for the same comparison across different designs may occur. For instance, we might estimate $\rho_1^{AB} > \rho_4^{AB}$, which means that, for a given effect size and standard error, an $AB$ comparison is more likely to be published if it is in an $ABC$ study rather than in an $AB$ study.

2.6. Prior distributions for model selection parameters

To fit the model described earlier, we need prior distributions for the selection model parameters $\alpha_d$ and $\beta_d$. These parameters connect to the probability of publication for studies with different study designs. Copas and Shi suggest using either fixed values for $\alpha$ and $\beta$ or asking authors to identify an appropriate range of values for $(\alpha, \beta)$ and using these values to explore the impact of publication bias in a sensitivity analysis [18, 20]. Alternatively, the selection parameters $\alpha$ and $\beta$ can be treated as random variables [25].

To extend this method to NMA, we need a lower and an upper bound $\left(P_d^{\text{low}}, P_d^{\text{high}}\right)$ for the probability that a study of design $d$ is published, where these extremes relate to the largest and the smallest observed generalized variances, respectively. $P_d^{\text{low}}$ and $P_d^{\text{high}}$ are modeled as random variables to reflect the uncertainty around them. Then, $\left(\alpha_d, \beta_d\right)$ are calculated for each study design from the inequalities

$$P_d^{\text{low}} \leq P\left(z_{id} > 0 \mid f\left(S_{id}\right)\right) \leq P_d^{\text{high}} \quad \forall\, d,$$

where $P\left(z_{id} > 0 \mid f\left(S_{id}\right)\right) = \Phi\left(u_{id}\right)$, with $\Phi$ the cumulative distribution function of the standard normal distribution. We assume an inverse monotonic relationship between generalized variance and probability of publication so that we can derive values of $\alpha_d$ and $\beta_d$ from the equations

$$\alpha_d + \frac{\beta_d}{\max f\left(S_{id}\right)} = \Phi^{-1}\left(P_d^{\text{low}}\right),$$

$$\alpha_d + \frac{\beta_d}{\min f\left(S_{id}\right)} = \Phi^{-1}\left(P_d^{\text{high}}\right).$$
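Given draws of $P_d^{\text{low}}$ and $P_d^{\text{high}}$ and the observed extremes of $f(S_{id})$, the pair $(\alpha_d, \beta_d)$ follows from a 2-by-2 linear system. The sketch below illustrates this with uniform ranges mirroring the scenarios of Table I; the $f(S)$ extremes are invented for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(7)

P_low = rng.uniform(0.1, 0.2)    # probability at the largest generalized variance
P_high = rng.uniform(0.4, 0.5)   # probability at the smallest generalized variance
f_max, f_min = 0.40, 0.10        # assumed max/min observed f(S_id) for this design

# Solve  alpha_d + beta_d / f_max = Phi^{-1}(P_low)
#        alpha_d + beta_d / f_min = Phi^{-1}(P_high)
A = np.array([[1.0, 1.0 / f_max],
              [1.0, 1.0 / f_min]])
b = norm.ppf([P_low, P_high])
alpha_d, beta_d = np.linalg.solve(A, b)
print(alpha_d, beta_d)           # beta_d >= 0 because P_high >= P_low
```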

If the range of $S_{id}$ values is narrow, then there will be relatively little impact on the adjusted estimates, as the probability of publication will not differ much across studies. If there is a small number of studies and a wide range of $S_{id}$, results may be sensitive to the prior distributions of the selection parameters.

We need to obtain prior distributions $P_d^{\text{low}} \sim U\left(L_{1d}, L_{2d}\right)$ and $P_d^{\text{high}} \sim U\left(U_{1d}, U_{2d}\right)$ to inform the parameters of the selection process. This can be achieved either by using expert opinion or by conducting a sensitivity analysis assuming a plausible range of values for $P_d^{\text{low}}$ and $P_d^{\text{high}}$. Methods to inform the selection model parameters by eliciting expert opinion have been presented for pairwise meta-analysis and star-shaped NMA, where all treatments are compared with placebo [25].

Another important aspect of prior specification that needs attention is that of the prior distributions for the correlation parameters. Restrictions are posed both by Equation (4) and by the fact that the variance–covariance matrix in Equation (2) must be positive definite. A matrix is positive definite if all of its leading principal minors are positive. To ensure this, we need $\rho_d^{AC}\, r\left(y_{id}^{AB}, y_{id}^{AC}\right) - \rho_d^{AB} > 0$, where $r\left(y_{id}^{AB}, y_{id}^{AC}\right)$ is the estimated correlation between effect sizes within a three-arm trial $i$.

Table I. Five scenarios for the probability of publication for studies of each of the four designs.

| Design | Scenario 1: No selection bias | Scenario 2: Moderate probability of publication for designs 1 and 4 | Scenario 3: Severe probability of publication for designs 1 and 4 | Scenario 4: Severe probability of publication for designs 1–4 | Scenario 5: Severe probability of publication for design 1 |
| --- | --- | --- | --- | --- | --- |
| Aspirin versus placebo ($d = 1$) | $P_1^{\text{low}} = P_1^{\text{high}} = 1$ | $P_1^{\text{low}} \sim U(0.4, 0.5)$; $P_1^{\text{high}} \sim U(0.7, 0.8)$ | $P_1^{\text{low}} \sim U(0.1, 0.2)$; $P_1^{\text{high}} \sim U(0.4, 0.5)$ | $P_1^{\text{low}} \sim U(0.1, 0.2)$; $P_1^{\text{high}} \sim U(0.4, 0.5)$ | $P_1^{\text{low}} \sim U(0.1, 0.2)$; $P_1^{\text{high}} \sim U(0.4, 0.5)$ |
| Aspirin + dipyridamole versus placebo ($d = 2$) | $P_2^{\text{low}} = P_2^{\text{high}} = 1$ | $P_2^{\text{low}} = P_2^{\text{high}} = 1$ | $P_2^{\text{low}} = P_2^{\text{high}} = 1$ | $P_2^{\text{low}} \sim U(0.1, 0.2)$; $P_2^{\text{high}} \sim U(0.4, 0.5)$ | $P_2^{\text{low}} = P_2^{\text{high}} = 1$ |
| Aspirin + dipyridamole versus aspirin ($d = 3$) | $P_3^{\text{low}} = P_3^{\text{high}} = 1$ | $P_3^{\text{low}} = P_3^{\text{high}} = 1$ | $P_3^{\text{low}} = P_3^{\text{high}} = 1$ | $P_3^{\text{low}} \sim U(0.1, 0.2)$; $P_3^{\text{high}} \sim U(0.4, 0.5)$ | $P_3^{\text{low}} = P_3^{\text{high}} = 1$ |
| Aspirin + dipyridamole versus aspirin versus placebo ($d = 4$) | $P_4^{\text{low}} = P_4^{\text{high}} = 1$ | $P_4^{\text{low}} \sim U(0.4, 0.5)$; $P_4^{\text{high}} \sim U(0.7, 0.8)$ | $P_4^{\text{low}} \sim U(0.1, 0.2)$; $P_4^{\text{high}} \sim U(0.4, 0.5)$ | $P_4^{\text{low}} \sim U(0.1, 0.2)$; $P_4^{\text{high}} \sim U(0.4, 0.5)$ | $P_4^{\text{low}} = P_4^{\text{high}} = 1$ |

We can re-write Equation (4) so that correlations are study-specific,

$$\rho_{id}^{AC} s_i^{AC} - \rho_{id}^{AB} s_i^{AB} = \rho_{id}^{BC} s_i^{BC},$$

and work out the inequalities from Equation (4) so that $-1 \leq \rho_{id}^{BC} \leq 1$. We get

$$\rho_{id}^{BC} \leq 1 \iff \rho_{id}^{AC} \leq \frac{s_i^{BC} + s_i^{AB} \rho_{id}^{AB}}{s_i^{AC}} = u_i \quad \text{and} \quad \rho_{id}^{BC} \geq -1 \iff \rho_{id}^{AC} \geq \frac{-s_i^{BC} + s_i^{AB} \rho_{id}^{AB}}{s_i^{AC}} = l_i.$$

The restriction for the positive definiteness of the variance–covariance matrix is now $\rho_{id}^{AB} < \rho_{id}^{AC}\, r\left(y_{id}^{AB}, y_{id}^{AC}\right)$. Taking this restriction into account, we assume a hierarchical model over the $\rho_{id}$'s so that $\rho_{id}^{AB} \sim N\left(\rho_d^{AB}, \tau_\rho^2\right)$, $\rho_{id}^{AC} \sim N\left(\rho_d^{AC}, \tau_\rho^2\right) T\left(\max\left(-1, l_i\right), \min\left(1, u_i\right)\right)$, and $\rho_{id}^{BC} = \frac{\rho_{id}^{AC} s_i^{AC} - \rho_{id}^{AB} s_i^{AB}}{s_i^{BC}}$, with $\tau_\rho^2 \sim N(0, 1)\, T(0, \infty)$, where $T(a, b)$ denotes truncation to $[a, b]$. We then estimate the inverse-variance weighted mean of the $\rho_{id}^{BC}$ to obtain a summary estimate $\rho_d^{BC}$.
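The study-specific constraints can be enforced directly. The following sketch (an illustrative helper, not the paper's OpenBUGS code) computes the bounds $l_i$ and $u_i$ and the implied $\rho_{id}^{BC}$ for assumed correlations and standard errors.

```python
import numpy as np

def rho_bc(rho_ab, rho_ac, s_ab, s_ac, s_bc):
    """Derive rho_id^BC from rho_id^AB and rho_id^AC under Equation (4)."""
    u_i = (s_bc + s_ab * rho_ab) / s_ac        # upper bound so that rho_BC <= 1
    l_i = (-s_bc + s_ab * rho_ab) / s_ac       # lower bound so that rho_BC >= -1
    lo, hi = max(-1.0, l_i), min(1.0, u_i)
    rho_ac = np.clip(rho_ac, lo, hi)           # mimic the truncation T(lo, hi)
    return (rho_ac * s_ac - rho_ab * s_ab) / s_bc

# Assumed inputs for illustration only.
print(rho_bc(rho_ab=-0.4, rho_ac=-0.3, s_ab=0.25, s_ac=0.30, s_bc=0.28))
```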

3. Application to antiplatelet data

We re-analyzed the data from a network that has recently been used to illustrate the extension of the Copas model to multi-arm trials [24] and the use of network meta-regression to evaluate the existence of small-study effects [37]. The network consists of 31 studies and three treatments, namely, placebo ($A$), aspirin ($B$), and aspirin plus dipyridamole ($C$). The outcome is failure of vascular graft or arterial patency. There are four study designs: aspirin versus placebo ($d = 1$, $T_1 = 2$), aspirin + dipyridamole versus placebo ($d = 2$, $T_2 = 2$), aspirin + dipyridamole versus aspirin ($d = 3$, $T_3 = 2$), and aspirin + dipyridamole versus aspirin versus placebo ($d = 4$, $T_4 = 3$). The number of studies in each design is $n_1 = 7$, $n_2 = 14$, $n_3 = 4$, and $n_4 = 6$, respectively.

We re-analyzed the data to obtain all relative treatment effects and to rank the treatments. MCMC allows us to estimate the probability that each treatment assumes any of the possible ranks by computing the percentage of times it assumes that rank in the MCMC cycles. It has been suggested to rank the available treatments by computing the Surface Under the Cumulative RAnking curve, known as SUCRA [38]. SUCRA values represent the relative effectiveness of an intervention with respect to an imaginary intervention that is the best without uncertainty; it is the average cumulative probability for each treatment. If a treatment always ranks first, its SUCRA equals one, whereas if it always ranks last, its SUCRA equals zero. For each intervention $t$, we compute the probabilities of assuming each rank, and SUCRA values are estimated as

$$\text{SUCRA}_t = \frac{\sum_{j=1}^{T-1} \text{cum}p_{tj}}{T - 1},$$

where $\text{cum}p_{tj}$ is the cumulative probability that treatment $t$ lies within the first $j$ places.
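For illustration, the sketch below computes SUCRA values from a matrix of rank probabilities; the probabilities shown are invented and are not the antiplatelet results.

```python
import numpy as np

# Rows: treatments; columns: P(rank j) for j = 1..T (assumed values).
rank_probs = np.array([[0.05, 0.15, 0.80],   # placebo
                       [0.60, 0.30, 0.10],   # aspirin
                       [0.35, 0.55, 0.10]])  # aspirin + dipyridamole
T = rank_probs.shape[1]
cump = np.cumsum(rank_probs, axis=1)[:, :-1]   # cump_tj for j = 1..T-1
sucra = cump.sum(axis=1) / (T - 1)
print(sucra)  # 1 = always ranked first, 0 = always ranked last
```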

In all scenarios, we ran a burn-in period of 30,000 draws from two chains, after which convergence was deemed satisfactory as judged by the history plots of the two chains. All results are based on a further run of 170,000 MCMC draws. Vague normal prior distributions are used for the treatment effects (i.e., $N(0, 1000)$). Prior distributions for the correlation parameters are taken as suggested in Section 2.6. We used a normal distribution truncated to $[0, 100]$ for the heterogeneity standard deviation $\tau$, as suggested by Gelman [39]. Code in OpenBUGS is provided (see supplementary material; it can also be found at www.mtm.uoi.gr).

The selection process was informed by the contour-enhanced funnel plots of placebo-controlled trials for two-arm and three-arm studies [20, 40]. Figure 1 shows the contour-enhanced funnel plots of the logarithm of the odds ratio (OR) versus its standard error for aspirin versus placebo (left-hand plot), aspirin + dipyridamole versus placebo (center), and aspirin + dipyridamole versus aspirin (right-hand plot). Circles refer to two-arm studies and triangles to three-arm studies. There is a clear asymmetry in the funnel plot for the $AB$ comparison in two-arm studies, but the $AC$ funnel plot does not look asymmetric, especially for the two-arm trials. There does not seem to be an association between magnitude of effect and standard error in the $BC$ comparison. There are only six three-arm studies; although their funnel plot looks asymmetric, they cannot give evidence for or against the presence of publication bias. Based on the visual inspection of the contour-enhanced funnel plots, we assumed that aspirin versus placebo trials and three-arm trials are likely to suffer from publication bias.

We assumed five scenarios for the probability of publication for studies of the various designs. These scenarios are depicted in Table I. The first scenario assumes that all conducted studies are selected for publication (no publication bias). The second and third scenarios assume that all studies of the second and third designs are selected for publication, while studies of the first and fourth designs are selected for publication with probabilities defined in the third and fourth columns, respectively, of Table I.

Figure 1. Contour-enhanced funnel plots for $y_{id}^{AB}$ against $s_{id}^{AB}$ ($d = 1, 4$; left plot), for $y_{id}^{AC}$ against $s_{id}^{AC}$ ($d = 2, 4$; middle plot), and for $y_{id}^{BC}$ against $s_{id}^{BC}$ ($d = 3, 4$; right plot). Triangles refer to three-arm trials and circles to two-arm studies. Lines refer to contours for different levels of significance.

Table II. Correlations $\rho_d^{XY}$ between effect size for contrast $XY$ in design $d$ and propensity for publication, under the five selection model scenarios presented in Table I.

| Design $d$ (correlation) | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
| --- | --- | --- | --- | --- | --- |
| Aspirin versus placebo ($d = 1$, $n_1 = 7$): $\rho_1^{AB}$ | 0.00 (−0.95, 0.95) | −0.56 (−0.98, 0.25) | **−0.69 (−0.99, −0.09)** | **−0.69 (−0.99, −0.06)** | −0.58 (−0.98, 0.06) |
| Aspirin + dipyridamole versus placebo ($d = 2$, $n_2 = 14$): $\rho_2^{AC}$ | −0.01 (−0.95, 0.94) | 0.00 (−0.95, 0.95) | 0.00 (−0.95, 0.95) | −0.22 (−0.72, 0.33) | 0.00 (−0.95, 0.95) |
| Aspirin + dipyridamole versus aspirin ($d = 3$, $n_3 = 4$): $\rho_3^{BC}$ | 0.00 (−0.95, 0.95) | 0.00 (−0.95, 0.95) | 0.00 (−0.95, 0.96) | −0.11 (−0.84, 0.73) | 0.00 (−0.95, 0.95) |
| Aspirin + dipyridamole versus aspirin versus placebo ($d = 4$, $n_4 = 6$): $\rho_4^{AB}$ | −0.01 (−0.97, 0.88) | −0.39 (−0.91, 0.33) | −0.55 (−0.93, 0.03) | −0.54 (−0.92, 0.06) | 0.00 (−0.83, 0.83) |
| $\rho_4^{AC}$ | 0.01 (−0.96, 0.93) | −0.29 (−0.88, 0.44) | **−0.34 (−0.84, −0.25)** | −0.41 (−0.88, 0.19) | −0.01 (−0.85, 0.83) |
| $\rho_4^{BC}$ | −0.16 (−0.50, 0.18) | −0.18 (−0.52, 0.16) | −0.19 (−0.53, 0.15) | −0.16 (−0.50, 0.18) | −0.17 (−0.51, 0.17) |

Credible intervals are given in parentheses. Credible intervals that do not include zero are printed in bold. In scenario 1 (no publication bias), the $\rho$ estimates are effectively represented by their prior distributions and are not informed by the data; this is the case in all designs for which we have assumed there are no unpublished studies.

The fourth scenario assumes the same selection mechanism for all four designs (column 5 of Table I). The fifth scenario assumes that there are many unpublished aspirin versus placebo studies but none of any other design (column 6 of Table I). Scenarios similar to the first three were also considered in a previous paper that used fixed values instead of probability distributions for $P_d^{\text{low}}$ and $P_d^{\text{high}}$ [24]. Bayesian inference includes uncertainty in the selection process, yielding more realistic predictions.

Tables II and III show the results of model fitting. Table II shows the correlation parameters for all contrasts by design under the five selection model scenarios presented in Table I. The correlation between the treatment effect of aspirin versus placebo and propensity for publication differs in absolute value across designs 1 and 4 when the same selection mechanism is assumed, but the differences are not statistically significant. More specifically, for the second scenario, we have $\rho_1^{AB} = -0.56$ while $\rho_4^{AB} = -0.39$. Differences between two-arm and three-arm studies are more striking in the designs that assume different selection mechanisms; for example, $\rho_2^{AC} = 0.00$ versus $\rho_4^{AC} = -0.29$.


Table III. Treatment effects, 95% credible intervals, heterogeneity estimate $\tau$, and SUCRA values under the five selection model scenarios presented in Table I.

| Treatments | | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 | Scenario 5 |
| --- | --- | --- | --- | --- | --- | --- |
| Placebo | OR (CrI) | Reference | Reference | Reference | Reference | Reference |
| | SUCRA | 0 | 0 | 0 | 0 | 0 |
| | Mean rank | 3 | 3 | 3 | 3 | 3 |
| Aspirin | OR (CrI) | 0.50 (0.37, 0.64) | 0.55 (0.41, 0.71) | 0.64 (0.47, 0.83) | 0.64 (0.45, 0.85) | 0.57 (0.42, 0.73) |
| | SUCRA | 0.90 | 0.83 | 0.69 | 0.78 | 0.77 |
| | Mean rank | 1.19 | 1.35 | 1.62 | 1.43 | 1.45 |
| Aspirin + dipyridamole | OR (CrI) | 0.56 (0.45, 0.69) | 0.58 (0.46, 0.72) | 0.62 (0.48, 0.76) | 0.66 (0.48, 0.89) | 0.58 (0.46, 0.71) |
| | SUCRA | 0.61 | 0.67 | 0.81 | 0.71 | 0.73 |
| | Mean rank | 1.81 | 1.65 | 1.38 | 1.57 | 1.55 |
| Aspirin versus aspirin + dipyridamole | OR (CrI) | 1.12 (0.79, 1.59) | 1.07 (0.79, 1.47) | 0.96 (0.70, 1.11) | 1.04 (0.70, 1.58) | 1.03 (0.76, 1.39) |
| Heterogeneity $\tau$ | | 0.32 (0.12, 0.57) | 0.33 (0.09, 0.58) | 0.31 (0.10, 0.54) | 0.35 (0.15, 0.58) | 0.32 (0.11, 0.55) |

The summary estimates for the head-to-head comparison are also given. CrI, credible interval.

Correlations for the fourth design are constrained by Equation (4) and the positive definiteness of the covariance matrix in Equation (2); the credible intervals (CrIs) obtained from the first scenario show the range of possible values for the three correlations $\rho_4^{AB}$, $\rho_4^{AC}$, and $\rho_4^{BC}$. All 95% CrIs for the correlations include 0 in the second scenario. In the third scenario, which assumes that there are many unpublished studies for designs 1 and 4, the correlation associated with comparison $AB$ is statistically significant for design 1 and marginally non-significant for design 4. Scenario 4 yields similar results to scenario 3, although we have assumed that there are unpublished studies not only of designs 1 and 4 but also of designs 2 and 3. The results from scenario 4 suggest that, even if we assume a low probability of publication for the second and third designs, this does not translate into publication bias: although $\rho_2^{AC}$ and $\rho_3^{BC}$ have negative mean values, their 95% CrIs are centred around zero and suggest that propensity for publication is only weakly associated with the magnitude of the relative treatment effects. Assuming that only the first design has selection bias (scenario 5), we obtain correlations close to zero; there is probably not enough power to detect a significant correlation between effect size and propensity for publication for aspirin versus placebo trials. In scenarios 2, 3, and 4, which place a selection mechanism on the fourth design (three-arm trials), the estimate for the $BC$ comparison (head-to-head) remains essentially unchanged, suggesting that although publication bias may operate in the placebo-controlled comparisons of a three-arm trial, it does not operate in the $BC$ comparison.

Table III shows the estimated heterogeneity standard deviation $\tau$, the SUCRA values, the mean rank, and the summary OR for the three treatments, accompanied by their 95% CrIs, under the five scenarios. There is also a line pertaining to the head-to-head comparison. The estimated heterogeneity increases slightly with increasing selection. The SUCRA values and the estimated ORs suggest that aspirin is the best treatment in all but the third scenario, in which aspirin + dipyridamole appears to be the most effective intervention. It should be noted that ranking measures are associated with much uncertainty, and we should be careful when interpreting rankings. However, it is clear that assuming a strong selection of $AB$ studies decreases the apparent effectiveness of aspirin. The effects of both aspirin and aspirin + dipyridamole shrink as we allow for selection bias, suggesting that their effects estimated from the published data might be exaggerated.

Chootrakool et al. assumed scenarios similar to scenarios 2 and 3 and found similar results. More specifically, we found that the aspirin versus placebo OR increases from 0.50 (no selection bias) to 0.64 (scenarios 3 and 4). Similarly, the aspirin + dipyridamole versus placebo OR increases from 0.56 to 0.66 (scenario 4). Chootrakool et al., in a similar selection bias scenario, estimated the intervention effect of aspirin versus placebo to be 0.61 and of aspirin + dipyridamole versus placebo to be 0.63.

4. Discussion


In this paper, we have described a Bayesian extension of the Copas selection model for NMA, where a different probability of publication is assumed according to three study characteristics: the study size, the study design, and the estimated effect size. We also accounted for consistency between direct and indirect sources of evidence when adjusting results for publication bias. The antiplatelet example illustrates how the probability of publication can be related to study design and how the relative effectiveness of treatments may change if publication bias is present and accounted for. We employed a sensitivity analysis based on scenarios that reflect moderate and severe selection of studies. The scenarios were formed after considering previous assumptions about the same dataset [24], which were corroborated by a visual inspection of the comparison-adjusted funnel plots.

An advantage of the extended Copas selection model presented in this paper is that it allows investigators to explore different assumptions about the implications of study design for the propensity for publication. Study design might interact with sample size and effect size, and therefore it is often impossible to say which characteristics impact most on the propensity for publication. For instance, Chan and Altman found that 26% of published randomized trials are multi-arm studies, and it has been argued that multi-arm trials have a larger probability of publication [41]. However, the larger probability of publication of multi-arm studies could be attributed to their design (comparison of more than one intervention is of interest to decision makers) or to their typically large sample size. Placebo-controlled trials are more likely to show large effects in general, and the relative effectiveness of a treatment compared with placebo may be larger in a two-arm trial than in a multi-arm trial. Even within placebo-controlled trials, studies might have different probabilities of publication depending on the active compound. For example, studies evaluating novel treatments may have a greater likelihood of being published, as there is much interest in the medical community about their effectiveness. Finally, head-to-head studies may also have a higher probability of being published because of their design; such studies are often non-inferiority trials or publicly funded trials, so their results are likely to be published even if the estimated relative effect is small and non-significant.

We generally recommend that expert opinion in the field is collected in order to assess the plausibility of the various scenarios about the role of study design in the propensity for publication and to form reasonable assumptions about the model parameters. Clinical expertise may inform not only the mechanism by which studies are selected for publication, through the parameters $P_d^{\text{low}}$ and $P_d^{\text{high}}$ [26], but also the interaction between study design and propensity for publication.

As the probability of publication for a study may be connected to reasons other than sample size and study design, fitting the model in a Bayesian setting offers the flexibility to introduce covariates to the selection process. For example, we may use an indicator variable $X_{id}$ to express whether study $i$ of design $d$ is sponsored ($X_{id} = 1$) or not ($X_{id} = 0$) and model the propensity for publication of study $i$ as

$$z_{id} = \alpha_d + \frac{\beta_d}{f\left(S_{id}\right)} + \gamma_d X_{id} + \xi_{id}.$$

In this case, the constant will be either $\alpha_d + \gamma_d$ or $\alpha_d$ and will control the probability of publication of an infinitely large study that is either sponsored or not, respectively.
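A minimal sketch of this extension, with all parameter values assumed purely for illustration, is given below.

```python
from scipy.stats import norm

# Propensity with a sponsorship covariate: z_id = alpha_d + beta_d/f(S) +
# gamma_d * X_id + xi_id. The parameter values and f(S) are assumptions.
def pub_prob(f_S, sponsored, alpha_d=-0.5, beta_d=0.4, gamma_d=0.8):
    u_id = alpha_d + beta_d / f_S + gamma_d * sponsored
    return norm.cdf(u_id)  # P(z_id > 0)

# For an infinitely large study (beta_d / f(S) -> 0), the intercept is
# alpha_d + gamma_d if sponsored and alpha_d otherwise.
print(pub_prob(f_S=0.25, sponsored=1), pub_prob(f_S=0.25, sponsored=0))
```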

In all non-Bayesian applications of this selection model, a likelihood ratio statistic is used to test the association between intervention effect and standard error (funnel plot asymmetry). This is based on asymptotic results. Most meta-analyses have only a small number of studies, calling into question the validity of the likelihood methods both for fitting and for testing the model [37]. Bayesian inference using MCMC offers a series of advantages compared with the likelihood methods traditionally used in the literature. There are numerical problems with maximizing the likelihood of the selection model, and asymptotic results may not hold for the small numbers of studies that are common in meta-analysis. In the absence of prior knowledge, the use of probability distributions instead of fixed values for the parameters that govern the selection process provides a more realistic and flexible framework. We place emphasis on the correlation between propensity for publication and the strength of results; if this correlation is significant, bias arises. Exploring the posterior distribution of this correlation via 95% CrIs provides a straightforward method to test for publication bias. Overall, as the Bayesian setting allows us to account for uncertainty in all model parameters (correlations, probabilities for publication, heterogeneity, etc.), our approach reflects the additional uncertainty arising from publication bias more successfully than the standard Copas model or its extension for multi-arm studies.

It should be noted that the consistency equations connect all summary estimates of relative effectiveness between comparisons. The consistency assumption makes non-zero correlations spread to contrasts that themselves have no significant correlation with propensity. Also, because it is possible to adjust the effect sizes of a comparison in a three-arm trial but not in a two-arm trial (or vice versa), we may introduce inconsistency; results should therefore be treated with caution. We suggest setting as control in each design the treatment that is not likely to be favoured in the presence of publication bias, but this is not always possible. We may end up with large positive and negative correlations for different designs if a treatment that is favoured by publication bias is the control in one design and the experimental treatment in another. A limitation of the method is that it requires consistency of bias throughout the studies. For example, if significant $AB$ studies are published in both directions (favouring $A$ and favouring $B$) but non-significant studies are not published, the model may yield a zero correlation although there is a relation between magnitude of effect and propensity for publication. A small number of studies in a design may mask publication bias, and the adjustment in effect estimates may not be substantial or precise. Any adjustment in any effect estimate may spread to other designs via the consistency equations, even if no publication bias is assumed in these designs.

Acknowledgements D. Mavridis and G. Salanti received research funding from the European Research Council (IMMA 260559).

References


1. Caldwell DM, Ades AE, Higgins JP. Simultaneous comparison of multiple treatments: combining direct and indirect evidence. BMJ 2005; 331:897–900.
2. Higgins JPT, Whitehead A. Borrowing strength from external trials in a meta-analysis. Statistics in Medicine 1996; 15:2733–2749.
3. Lu G, Ades AE. Combination of direct and indirect evidence in mixed treatment comparisons. Statistics in Medicine 2004; 23:3105–3124.
4. Salanti G, Higgins JPT, Ades AE, Ioannidis JPA. Evaluation of networks of randomized trials. Statistical Methods in Medical Research 2008; 17:279–301.
5. Bafeta A, Trinquart L, Seror R, Ravaud P. Analysis of the systematic reviews process in reports of network meta-analyses: methodological systematic review. BMJ 2013; 347:f3675. DOI: 10.1136/bmj.f3675.
6. Lee AW. Review of mixed treatment comparisons in published systematic reviews shows marked increase since 2009. Journal of Clinical Epidemiology 2014; 67(2):138–143.
7. Nikolakopoulou A, Chaimani A, Veroniki AA, Vasiliadis HS, Schmid CH, Salanti G. Characteristics of networks of interventions: a description of a database of 186 published networks. PLoS One 2014; 9:e86754. DOI: 10.1371/journal.pone.0086754.
8. Dias S, Welton NJ, Caldwell DM, Ades AE. Checking consistency in mixed treatment comparison meta-analysis. Statistics in Medicine 2010; 29:932–944.
9. Salanti G. Indirect and mixed-treatment comparison, network, or multiple-treatments meta-analysis: many names, many benefits, many concerns for the next generation evidence synthesis tool. Research Synthesis Methods 2012; 3:80–97.
10. Cooper NJ, Sutton AJ, Morris D, Ades AE, Welton NJ. Addressing between-study heterogeneity and inconsistency in mixed treatment comparisons: application to stroke prevention treatments in individuals with non-rheumatic atrial fibrillation. Statistics in Medicine 2009; 28:1861–1881.
11. Salanti G, Marinho V, Higgins JP. A case study of multiple-treatments meta-analysis demonstrates that covariates should be considered. Journal of Clinical Epidemiology 2009; 62(8):857–864.
12. Sutton AJ, Song F, Gilbody SM, Abrams KR. Modelling publication bias in meta-analysis: a review. Statistical Methods in Medical Research 2000; 9:421–445.
13. Thornton A, Lee P. Publication bias in meta-analysis: its causes and consequences. Journal of Clinical Epidemiology 2000; 53:207–216.
14. Chaimani A, Vasiliadis HS, Pandis N, Schmid CH, Welton NJ, Salanti G. Effects of study precision and risk of bias in networks of interventions: a network meta-epidemiological study. International Journal of Epidemiology 2013; 42:1120–1131.
15. Egger M, Smith GD, Schneider M, Minder C. Bias in meta-analysis detected by a simple, graphical test. BMJ 1997; 315:629–634.
16. Moreno SG, Sutton AJ, Ades AE, Stanley TD, Abrams KR, Peters JL, Cooper NJ. Assessment of regression-based methods to adjust for publication bias through a comprehensive simulation study. BMC Medical Research Methodology 2009; 9:2. DOI: 10.1186/1471-2288-9-2.
17. Chaimani A, Higgins JP, Mavridis D, Spyridonos P, Salanti G. Graphical tools for network meta-analysis in STATA. PLoS One 2013; 8:e76654. DOI: 10.1371/journal.pone.0076654.
18. Copas J, Shi JQ. Meta-analysis, funnel plots and sensitivity analysis. Biostatistics 2000; 1(3):247–262.
19. Copas JB. What works? Selectivity models and meta-analysis. Journal of the Royal Statistical Society, Series A 1999; 162(1):95–109.
20. Copas JB, Shi JQ. A sensitivity analysis for publication bias in systematic reviews. Statistical Methods in Medical Research 2001; 10:251–265.
21. Heckman JJ. Sample selection bias as a specification error. Econometrica 1979; 47:153–161.
22. Carpenter JR, Schwarzer G, Rucker G, Kunstler R. Empirical evaluation showed that the Copas selection model provided a useful summary in 80% of meta-analyses. Journal of Clinical Epidemiology 2009; 62:624–631.
23. Schwarzer G, Carpenter J, Rucker G. Empirical evaluation suggests Copas selection model preferable to trim-and-fill method for selection bias in meta-analysis. Journal of Clinical Epidemiology 2010; 63:282–288.
24. Chootrakool H, Shi JQ, Yue R. Meta-analysis and sensitivity analysis for multi-arm trials with selection bias. Statistics in Medicine 2011; 30:1183–1198.
25. Mavridis D, Sutton A, Cipriani A, Salanti G. A fully Bayesian application of the Copas selection model for publication bias extended to network meta-analysis. Statistics in Medicine 2013; 32:51–66.
26. Higgins JPT, Jackson D, Barrett JK, Ades A, White IR. Consistency and inconsistency in network meta-analysis: concepts and models for multi-arm studies. Research Synthesis Methods 2012; 3:98–110.
27. White IR, Barrett JK, Jackson D, Higgins JPT. Consistency and inconsistency in network meta-analysis: model estimation using multivariate meta-regression. Research Synthesis Methods 2012; 3:111–125.
28. Djulbegovic B, Lacevic M, Cantor A, Fields KK, Bennett CL, Adams JR, Kuderer NM, Lyman GH. The uncertainty principle and industry-sponsored research. Lancet 2000; 356:635–638.
29. Lundh A, Sismondo S, Lexchin J, Busuioc OA, Bero L. Industry sponsorship and research outcome. Cochrane Database of Systematic Reviews 2012. DOI: 10.1002/14651858.MR000033.pub2.
30. Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. Journal of Clinical Epidemiology 1997; 50:683–691.
31. Dias S, Welton NJ, Sutton AJ, Caldwell DM, Lu G, Ades A. Inconsistency in networks of evidence based on randomized controlled trials. Medical Decision Making 2013; 33(5):641–656.
32. Lu G, Ades AE. Assessing evidence inconsistency in mixed treatment comparisons. Journal of the American Statistical Association 2006; 101:447–459.
33. Konig J, Krahn U, Binder H. Visualizing the flow of evidence in network meta-analysis and characterizing mixed treatment comparisons. Statistics in Medicine 2013; 32:5414–5429.
34. Krahn U, Binder H, Konig J. A graphical tool for locating inconsistency in network meta-analyses. BMC Medical Research Methodology 2013; 13:35.
35. Franchini AJ, Dias S, Ades AE, Jansen JP, Welton NJ. Accounting for correlation in network meta-analysis with multi-arm trials. Research Synthesis Methods 2012; 3:142–160.
36. Copas JB. A likelihood-based sensitivity analysis for publication bias in meta-analysis. Journal of the Royal Statistical Society, Series C 2013; 62:47–66.
37. Chaimani A, Salanti G. Using network meta-analysis to evaluate the existence of small-study effects in a network of interventions. Research Synthesis Methods 2012; 3:161–176.
38. Salanti G, Ades A, Ioannidis J. Graphical methods and numerical summaries for presenting results from multiple-treatment meta-analysis: an overview and tutorial. Journal of Clinical Epidemiology 2011; 64:163–171.
39. Gelman A. Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 2006; 1:515–534.
40. Peters JL, Sutton AJ, Jones DR, Abrams KR, Rushton L. Contour-enhanced meta-analysis funnel plots help distinguish publication bias from other causes of asymmetry. Journal of Clinical Epidemiology 2008; 61:991–996.
41. Chan AW, Altman DG. Identifying outcome reporting bias in randomised trials on PubMed: review of publications and survey of authors. BMJ 2005; 330:753. DOI: 10.1136/bmj.38356.424606.8F.

Supporting information Additional supporting information may be found in the online version of this article at the publisher’s web site.
