Journal of Clinical Epidemiology 67 (2014) 305–313

A users’ guide to understanding therapeutic substitutions

Edward J. Mills (a,b,*), David Gardner (c), Kristian Thorlund (b,d), Matthias Briel (d,e), Stirling Bryan (b), Brian Hutton (f), Gordon H. Guyatt (d)

a Faculty of Health Sciences, University of Ottawa, 43 Templeton Street, Ottawa, Canada K1N6X1
b Centre for Clinical Epidemiology and Evaluation (C2E2), University of British Columbia, 828 West 10th Ave, Research Pavilion, Vancouver, Canada, V5Z 1M9
c College of Pharmacy, Faculty of Health Professions, Dalhousie University, 5968 College St, Halifax, Nova Scotia, Canada, B3H 4R2
d Department of Clinical Epidemiology & Biostatistics, McMaster University, 1280 Main St. West, Hamilton, Ontario, Canada, L8S 4K1
e Basel Institute for Clinical Epidemiology and Biostatistics, University Hospital Basel, CH-4031, Basel, Switzerland
f Clinical Epidemiology Program, Ottawa Hospital Research Institute, 725 Parkdale Ave, Ottawa, Ontario, Canada, K1Y 4E9

Accepted 3 September 2013; Published online 2 December 2013

Abstract

Therapeutic substitutions are common at the level of ministries of health, clinicians, and pharmacy dispensaries. Guidance in determining whether drugs offer similar risk–benefit profiles is limited. Those making decisions on therapeutic substitutions should be aware of potential biases that make differentiating therapeutic agents difficult. Readers should consider whether the biological mechanisms and doses are similar across agents, whether the evidence is sufficiently valid across agents, and whether the safety and therapeutic effects of each drug are similar. This article uses a problem-based format to address the biological mechanism, validity, and results of a scenario in which therapeutic substitutions may be considered. © 2014 Elsevier Inc. All rights reserved.

Keywords: Therapeutic substitutions; Class effects; Network meta-analysis; Generic substitutions; Statins; Evidence-based medicine

Funding: Pfizer Canada and Canadian Institutes of Health Research (CIHR) provided support for this manuscript. Pfizer Canada’s contribution was made through an institutional donation to the Centre for Clinical Epidemiology & Evaluation (C2E2) at the University of British Columbia.
* Corresponding author. Tel.: 778-317-8530; fax: 604-875-5179. E-mail address: [email protected] (E.J. Mills).
0895-4356/$ - see front matter © 2014 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jclinepi.2013.09.008

What is new?

Key findings
• Therapeutic substitutions are the change of a prescribed medication by a person or institution other than the prescribing clinician. Therapeutic substitutions broadly assume similar effectiveness and safety of the substituted drug. However, methods to assess whether the drugs are similar have been lacking. In this review, we suggest questions a user should ask when determining if substitutions are valid.

What this adds to what was known?
• We ask a series of methodological questions to ascertain the similarity of the drugs, the validity of the evidence, and whether the results are clear. We demonstrate that even in common prescribing areas, such as statin therapy, there are potentially important differences at the level of the biological agent, the quality of the evidence, and the clarity of the results.

What is the implication and what should change now?
• This review suggests users should be cautious in assuming therapeutic similarity and seek convincing evidence before a therapeutic substitution can be applied with confidence.

1. Introduction

By its broadest definition, a therapeutic substitution (or therapeutic interchange) occurs when a medication is automatically provided in a manner other than prescribed, whether by changing the dose, formulation, or medication. A guideline by the American College of Clinical Pharmacy provides a comprehensive review on therapeutic substitutions [1]. This guideline recommends that therapeutic substitution policies be limited to institutions and health systems with a functioning formulary and a pharmacy and therapeutics committee, that the rationale for each substitution policy be readily available to all clinicians, and that the clinical, economic, and humanistic impact be measured.

Determining the suitability of therapeutic substitutions is usually based on an evaluation of the empirical data and pharmacopathophysiologic reasoning. Because of the typical inadequacies of the former and the subjective nature of the latter, a rigorous and reproducible process is required to support the establishment of what are intended to be widely acceptable and valid therapeutic substitution policies.

In this users’ guide (see Box 1), we focus on evaluating the suitability of substituting one medication when another is prescribed as a policy affecting a population of patients, as opposed to when an individual pharmacist uses their clinical discretion to substitute one medication for another. To justify the substitution policy, the replacement medication needs to demonstrate predictable effectiveness that is equal to that of the originally prescribed medication but is preferred because of one or more distinct advantages, including improved tolerability, safety, access, cost, or convenience.

Box 1. Readers’ guide questions

The biological agent
• Are the agents biologically similar?

Are the sources of evidence valid?
• What is the geometry of evidence for your evaluation?
• Does evidence exist from large head-to-head trials?
• If indirect evidence is used, is it sufficiently convincing?
• Are the end points in clinical trials of sufficiently similar importance that a patient would consider them equal?

What are the results?
• Are there important differences in the number of trials representing different agents?
• Are treatment effects similar across agents?
• Would the addition of sufficiently powered evidence change the results of direct or indirect evidence?
• Are adverse events similar across agents?
• What is the overall quality and limitations of the evidence?

Therapeutic substitution differs from generic substitution, which involves the use of a pharmacokinetically equivalent form of the same medication as a generic. A therapeutic substitute need not be from the same pharmacological class and is typically based on evidence in the form of randomized outcome studies. It can differ in its mechanism (pharmacology) and its pharmacokinetics, resulting in clinical differences in its adverse effect and drug interaction profiles, desired and undesired [1].

The ability of health workers to change a prescribed medication without involving the prescribing physician varies within and across countries. In Canada, for example, 6 of the 13 provinces and territories permit generic and therapeutic substitutions [2]. In the United Kingdom, pharmacists are permitted to conduct therapeutic substitutions for within-class agents and across classes for more minor conditions [3]. Much of Europe, as well as New Zealand and Australia, have followed suit [4,5]. In the United States, certain health systems and health management organizations may permit therapeutic substitutions, but this is normally decided upon by private companies rather than a government regulator [6].

Hospitals (health districts and trusts) have had therapeutic substitution policies in place for decades [7], a classic example being the automatic use of oral amoxicillin when oral ampicillin is prescribed. Amoxicillin’s advantages are reduced dosing frequency (every 8 instead of 6 hours) and reliable absorption regardless of stomach contents. Policy decisions for therapeutic substitutions are generally proposed by a hospital or health district’s Drugs and Therapeutics (D&T) committee, composed of hospital physicians, pharmacists, and nurses, and ratified by the Medical Advisory Committee (or its equivalent). Nonetheless, these policies occasionally provoke acrimony, often stemming from judgment-based disagreements or a misunderstanding of the process [8–10].

An underlying assumption with therapeutic substitutions (an assumption that may or may not be accurate) is that the replacement drug offers similar therapeutic efficacy and at least as good a safety profile as the prescribed drug. However, the methods for determining whether two drugs exhibit the same therapeutic effectiveness or safety are not well established. To date, the methodological shortcomings of therapeutic substitution have received inadequate attention [10,11]. Conceptually, determining whether a drug is sufficiently similar to another drug should be based on its evidence profile rather than its name or mechanism of action alone.

Using a series of methodological questions developed by methodological and clinical experts (Box 1), we use the clinical example of 3-hydroxy-3-methylglutaryl coenzyme A reductase inhibitors (statins) to determine whether therapeutic substitution offers patients a sufficiently similar efficacy–safety profile to justify interchangeable use of different statins. We chose statins as the example because they have been well evaluated in more than 80 randomized clinical trials (RCTs) [12], are among the most widely prescribed drugs in the history of modern medicine, used for both primary and secondary cardiovascular disease (CVD) protection [13,14], and are currently permitted for therapeutic substitutions at the level of pharmacists in six provinces in Canada [15].

2. Using the guide

2.1. The biological agent

2.1.1. Are the agents biologically similar?

There is no uniformly accepted definition of a class effect [10,11]. Although the exact mechanism of action of drugs is rarely known, the biological target of a drug may be well established. For example, although all pharmacological antihypertensives reduce blood pressure, there are several unrelated putative mechanisms involved (eg, natriuresis with diuretics, inhibition of vascular cellular calcium influx with calcium channel blockers, and impaired synthesis of the vasoconstrictor angiotensin II with angiotensin-converting enzyme inhibitors). Although these different mechanisms may result in similar changes in blood pressure, their ultimate effect on cardiovascular morbidity and mortality cannot be assumed to be equivalent (and it is not) [16,17]. It is also problematic to assume equivalent clinical effects when the two medications being compared share the same primary pharmacological action. For example, beta blockers are not considered equivalent in their ability to limit cardiovascular event risk despite their shared mechanism of action [18]. Box 2 describes the biological similarity of our example case of statins.

Box 2. Using the guide. The biological agent.

Statins, for example, were understood to derive their beneficial effects from low-density lipoprotein (LDL)-lowering effects [19]. The greater the LDL reduction, the greater the clinical benefit in terms of risk reduction for CVD events [19]. More recently, other actions have been associated with statin benefits, including reduced vascular inflammation, improved endothelial function, and decreased thrombus formation [20–22]. How statins compare among these effects is less well established, which brings forth uncertainties about a class effect and clinical interchangeability.

Differences in drug interactions, via CYP450 metabolism, have been well established and offer a clinical advantage of one statin over another. CYP3A4 is predominantly responsible for metabolism of lovastatin, simvastatin, atorvastatin, and cerivastatin, whereas CYP2C9 predominantly metabolizes fluvastatin (in addition to CYP3A4 and CYP2C8). Rosuvastatin uses mostly CYP2C9, and pravastatin is not predominantly metabolized by any CYP isoenzymes [23]. An automatic therapeutic substitution for the statin with the lowest drug interaction risk would be desirable to avoid the statin’s dose-related adverse effects when combined, for example, with selected protease inhibitors, which are potent CYP3A4 metabolic pathway inhibitors, in patients with human immunodeficiency virus/acquired immunodeficiency syndrome (HIV/AIDS) [24].

If one statin automatically replaces another, for example, when an individual is hospitalized, determining the equivalent dose is often a challenge. The approved dosing ranges across statins may not offer equal efficacy, tolerability, or safety as you move from the lowest to the highest approved (or clinically used) doses [25]. For example, although the Pravastatin or Atorvastatin Evaluation and Infection Therapy–Thrombolysis in Myocardial Infarction 22 (PROVE-IT) trial [26] demonstrated better outcomes with atorvastatin 80 mg/day (its maximum recommended dose) compared with pravastatin 40 mg/day (its usual but not maximum dose), it is not known how these two statins compare at their respective maximum doses of 80 mg/day.

2.2. Are the sources of evidence valid?

2.2.1. What is the geometry of evidence for your evaluation?

Determining whether a drug exerts a therapeutic or harmful effect compared with no treatment or another treatment is complex and is not simply based on the critical appraisal of single or multiple RCTs [11,27]. Multiple sources inform our decision making, and one or few RCTs are unlikely to be sufficiently compelling to provide irrefutable evidence of a drug’s comparative safety or effectiveness [28]. The amount of information available for the different potential interventions will vary and may include direct (head-to-head) RCT evidence or be informed by indirect or observational evidence.

2.2.2. Does evidence exist from large head-to-head trials?

Evidence of the therapeutic similarity of drugs rarely comes from one single trial. Rather, a wide body of evidence, including multiple RCTs with placebo and active comparators, needs to be considered [29]. Following the well-established hierarchies of evidence for establishing therapeutic effects, the best evidence should come from head-to-head (direct) evidence from large clinically relevant RCTs evaluating the agents at their usual doses [11]. However, the availability of this type of information is by far the exception rather than the rule. Available data generally involve comparison with placebo, in fixed- and flexible-dose trials, in which demonstrating statistical significance is easier (ie, less costly and less risky) than establishing noninferiority, equivalency, or superiority to the standard medical management strategy [30].


Fig. 1. (A) An example of a complex network for the evaluation of chronic obstructive pulmonary disease medications; and (B) The same network displaying the connectedness of the network. ICS, inhaled corticosteroids; LABA, long-acting beta-agonist; LAMA, long-acting muscarinic agents; PDE-4, phosphodiesterase-4.
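The connectedness that Fig. 1B depicts can also be checked programmatically. The following is a minimal Python sketch, assuming each trial is reduced to the pair of treatments it compares. The treatment pairs are purely illustrative: they reuse the abbreviations in the caption but do not reproduce the actual network in Fig. 1, and the function names are hypothetical.

```python
from collections import defaultdict, deque

# Illustrative (hypothetical) two-arm trials, each given as a pair of treatments.
ILLUSTRATIVE_TRIALS = [
    ("Placebo", "LABA"),
    ("Placebo", "LAMA"),
    ("Placebo", "ICS"),
    ("Placebo", "PDE-4"),
    ("LABA", "LAMA"),
    ("LABA", "ICS+LABA"),
]


def build_network(trials):
    """Return an undirected graph: treatment -> set of directly compared treatments."""
    graph = defaultdict(set)
    for a, b in trials:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def is_connected(graph):
    """True if every treatment can be reached from every other (breadth-first search)."""
    if not graph:
        return True
    start = next(iter(graph))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return len(seen) == len(graph)


def comparisons_per_treatment(trials):
    """Count how many trials inform each treatment, a crude measure of how well connected it is."""
    counts = defaultdict(int)
    for a, b in trials:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


network = build_network(ILLUSTRATIVE_TRIALS)
print("Connected network:", is_connected(network))
print("Trials per treatment:", comparisons_per_treatment(ILLUSTRATIVE_TRIALS))
```

A treatment informed by only one trial (here, the hypothetical PDE-4 and ICS+LABA nodes) sits at the sparsely connected edge of the network, where comparisons rest on little evidence.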

Findings of nonsignificance between agents in a head-to-head trial need to be interpreted with caution. More often than not, a nonsignificant finding results from a study’s lack of power. Typically, a statistical interpretation of no difference comes from any study in which P > 0.05 or the 95% confidence interval (CI) includes the value of no difference. However, it is important to recognize that nonsignificance is not the same as clinical equivalence, which requires much higher statistical precision [30]. Readers should also recognize that head-to-head evidence may be biased by the choice of doses compared [31,32]. With this in mind, Song et al. [31] have argued that indirect comparisons may provide stronger evidence than direct comparisons for evaluating treatment superiority.
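The distinction between nonsignificance and equivalence can be made concrete with a small Python sketch. It assumes an equivalence margin of 0.80 to 1.25 on the odds ratio scale; that margin is chosen only for illustration and is not proposed by this guide. The function name is hypothetical. The interval used is the atorvastatin vs. pravastatin estimate from the original analysis in Table 1.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    nonsignificant: bool  # the 95% CI includes the value of no difference
    equivalent: bool      # the 95% CI lies entirely inside the equivalence margin


def interpret(ci_low: float, ci_high: float,
              margin_low: float = 0.80, margin_high: float = 1.25,
              null_value: float = 1.0) -> Verdict:
    """Classify a ratio estimate (OR or RR) from its 95% CI.

    The default margin (0.80 to 1.25) is an assumption made for illustration.
    """
    nonsignificant = ci_low <= null_value <= ci_high
    equivalent = margin_low < ci_low and ci_high < margin_high
    return Verdict(nonsignificant, equivalent)


# Atorvastatin vs. pravastatin, original analysis in Table 1: OR 1.03 (0.79, 1.33).
print(interpret(0.79, 1.33))
# A hypothetical, much more precise estimate that would satisfy the assumed margin.
print(interpret(0.90, 1.10))
```

Under the assumed margin, the first estimate is nonsignificant but does not establish equivalence, because the interval remains compatible with differences larger than the margin allows.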

2.2.3. If indirect evidence is used, is it sufficiently convincing?

In the past, indirect comparisons across medications were done by simply comparing individual arms between different trials as if they were from a single trial [33]. This has been referred to as a naive approach, as it usually fails to consider the uncertainty associated with CIs and does not account for differences in prognostic factors (eg, illness severity) at baseline [33]. In 1997, a method called the adjusted indirect comparison method was first reported. This method provides a formal test for differences between pooled estimates and provides guidance for interpretation [34]. The adjusted indirect comparison requires that two medications use a similar control (eg, medication A vs. placebo and medication B vs. placebo). A limitation of this method is that it evaluates only three interventions at a time and uses only indirect evidence.
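As a minimal sketch of the arithmetic behind an adjusted indirect comparison, the Python code below takes two pooled odds ratios against a common control, differences them on the log scale, and adds their variances. It assumes the standard errors can be recovered from the reported 95% CIs, which is an approximation; the function names are hypothetical. The inputs are the atorvastatin vs. control and pravastatin vs. control estimates from the original analysis in Table 1.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def se_from_ci(ci_low: float, ci_high: float) -> float:
    """Approximate the standard error of a log odds ratio from its 95% CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)


def adjusted_indirect(or_a, ci_a, or_b, ci_b):
    """Indirect OR of A vs. B from A vs. control and B vs. control.

    Returns (odds ratio, lower 95% limit, upper 95% limit).
    """
    log_or = math.log(or_a) - math.log(or_b)
    # Variances add because the two pooled estimates come from separate trials.
    se = math.sqrt(se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2)
    return (math.exp(log_or),
            math.exp(log_or - Z95 * se),
            math.exp(log_or + Z95 * se))


# Atorvastatin vs. control 0.80 (0.65, 0.96); pravastatin vs. control 0.78 (0.65, 0.93).
estimate, low, high = adjusted_indirect(0.80, (0.65, 0.96), 0.78, (0.65, 0.93))
print(f"Atorvastatin vs. pravastatin: OR {estimate:.2f} ({low:.2f}, {high:.2f})")
```

The indirect interval is wider than either input interval, which is why indirect comparisons between two individually effective agents so often appear nonsignificant.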

More recently, a method called the multiple treatment comparison (MTC) meta-analysis (also called network meta-analysis) has become popular, as it allows the comparison of multiple interventions, including head-to-head evaluations at the same time as indirect comparisons, as long as all interventions make up a connected network of comparisons (see Fig. 1A). This is particularly relevant as fields of medicine that are rapidly evaluating new interventions may avoid head-to-head trials, and newer agents may all have a similar comparator [35]. There are several important considerations necessary to determine whether an MTC meta-analysis is valid, and we have described these previously [36,37]. We will address four of these briefly in the following.

First, the homogeneity principle addresses whether RCTs comparing any two agents have methods and results that are similar enough to leave one comfortable with combining results. That is, are the doses of the intervention, the included population samples (including illness prognostic factors), and the evaluated outcomes sufficiently similar to justify combining the results of the considered RCTs?

Second is the principle of similarity across the treatment network. Having established that RCTs are sufficiently similar within each comparison, can you say the same when comparing populations and outcomes across all comparisons in the treatment network? That is, with the exception of each intervention, are the RCTs sufficiently similar to justify combining them in an MTC?

Third is the coherence principle. When available, does the indirect evidence cohere, more or less, with the direct evidence? Is the direction of treatment effect consistent between the direct and indirect evidence? Is the magnitude of effect reasonably consistent?

Finally, what is the connectivity of the network? Are there sections of a network that are better connected than other sections of the network (see Fig. 1B)? Nodes in the treatment network that are well connected (ie, informed directly and indirectly by several RCTs) will usually be better powered and better informed in the network. By contrast, a sparsely connected network may have low power and provide weak inferences about the comparisons that are not robustly connected.

2.2.4. Are the end points in clinical trials of sufficiently similar importance that a patient would consider them equal?

End points used in clinical trials range from measures of clear importance to the patient to measures that, in hindsight, are found to have less clinical relevance. Examples of this range of end points are major clinical end points (eg, all-cause or disease-specific mortality), general clinical end points (eg, visits to the emergency room or hospitalization), patient-reported outcomes using scales, and surrogate measures that predict clinically important patient outcomes. Formulary committees can accept differing levels of evidence for outcomes depending on the severity of the disease and the strength of evidence that a surrogate marker will translate to a clinical event. Surrogate measures range in value from strong (eg, viral load in HIV infection or adherence to antipsychotics in schizophrenia) to weak (eg, high-density lipoprotein and triglyceride levels in CVD prediction) to negligible (eg, prostate-specific antigen level as a predictor of prostate cancer outcomes) [38].

Early large clinical trials of an intervention have a greater likelihood of having evaluated major clinical end points than later trials that are used to determine applicability of a medication within specific populations [39]. This is because early RCTs may be mandated to display clinical effectiveness in an area that does not have a well-established and effective standard of care. Once the standard of care is established, newer interventions may need to be evaluated in the presence of the standard of care (eg, medication A plus standard of care vs. placebo plus standard of care) [40]. During the process of establishing the standard of care, the mechanism of action of a medication may become better understood, and a surrogate marker of disease progression will become accepted [41–44]. For example, early RCTs evaluating the effectiveness of antiretroviral treatments for HIV/AIDS originally used progression to AIDS or death as the primary end points [45]. Later trials typically use HIV RNA viral load suppression as the end point, as this is clinically recognized as a surrogate marker for progression to AIDS or death. Box 3 describes the validity of the evidence available for our case example of statin therapy.

Box 3. Using the guide. Are the sources of evidence valid?

Using MTC and two linked articles we previously published [12,25,46], we identified data from 76 RCTs evaluating statins for both primary and secondary prevention of CVD events. The RCTs range from small investigations involving as few as 38 subjects to large studies involving as many as 20,536 subjects. Twenty-five percent of patients were women. Six individual statins were included, with the number of RCTs for each comparison agent ranging from 5 for rosuvastatin vs. controls (n = 30,245) to 25 for pravastatin vs. controls (n = 51,011). Fig. 2 displays the network. The doses used in the studies ranged from lower to higher doses of each statin. The authors examined whether doses changed the results using a metaregression technique and found borderline significance (P = 0.054) that higher doses were associated with increased treatment effects on CVD death. There was no useful head-to-head evidence identified, as no large equivalent-dose, head-to-head statin trials have been conducted [25]. Therefore, the largest available evidence comes from indirect comparisons using inert controls as the comparator. The end points used across clinical trials varied. However, most clinical trials did provide information on CVD death. Because additional evidence would narrow the CIs in any analysis, the addition of new RCTs as displayed in Table 1 could importantly change our interpretation of the existing evidence.

Fig. 2. Studies identified using different statins for the prevention of CVD events.

2.3. What are the results?

2.3.1. Are there important differences in the number of trials representing different agents?

There are usually large differences in the number of RCTs available across the candidate medications for a condition.


Table 1. Pair-wise meta-analysis results for statin vs. control comparison from (1) the original data and (2) the original data plus one large placebo-controlled trial in which the effect is held constant added for each statin

Comparison                     Original MTC        Trials with 5,000 patient arms   Trials with 10,000 patient arms
Pravastatin vs. control        0.78 (0.65, 0.93)   0.78 (0.68, 0.89)                0.78 (0.70, 0.87)
Atorvastatin vs. control       0.80 (0.65, 0.96)   0.80 (0.70, 0.92)                0.80 (0.72, 0.89)
Fluvastatin vs. control        0.61 (0.41, 0.88)   0.61 (0.51, 0.73)                0.61 (0.53, 0.70)
Simvastatin vs. control        0.74 (0.56, 0.98)   0.74 (0.63, 0.87)                0.74 (0.65, 0.84)
Lovastatin vs. control         0.73 (0.43, 1.22)   0.73 (0.61, 0.88)                0.73 (0.64, 0.83)
Rosuvastatin vs. control       0.88 (0.73, 1.06)   0.88 (0.77, 1.00)                0.88 (0.79, 0.98)
Atorvastatin vs. pravastatin   1.03 (0.79, 1.33)   1.03 (0.85, 1.24)                1.03 (0.88, 1.20)
Fluvastatin vs. pravastatin    0.79 (0.51, 1.19)   0.79 (0.63, 0.98)                0.79 (0.66, 0.93)
Simvastatin vs. pravastatin    0.95 (0.68, 1.33)   0.95 (0.77, 1.17)                0.95 (0.81, 1.12)
Lovastatin vs. pravastatin     0.94 (0.55, 1.60)   0.94 (0.75, 1.17)                0.94 (0.79, 1.11)
Rosuvastatin vs. pravastatin   1.13 (0.87, 1.46)   1.13 (0.94, 1.36)                1.13 (0.97, 1.31)
Fluvastatin vs. atorvastatin   0.76 (0.50, 1.18)   0.76 (0.61, 0.96)                0.76 (0.64, 0.91)
Simvastatin vs. atorvastatin   0.93 (0.66, 1.31)   0.93 (0.75, 1.14)                0.93 (0.78, 1.12)
Lovastatin vs. atorvastatin    0.91 (0.53, 1.58)   0.91 (0.73, 1.15)                0.91 (0.77, 1.11)
Rosuvastatin vs. atorvastatin  1.10 (0.84, 1.44)   1.10 (0.91, 1.33)                1.10 (0.94, 1.28)
Simvastatin vs. fluvastatin    1.21 (0.76, 1.97)   1.21 (0.95, 1.54)                1.21 (1.01, 1.46)
Lovastatin vs. fluvastatin     1.20 (0.63, 2.27)   1.20 (0.93, 1.55)                1.20 (0.99, 1.45)
Rosuvastatin vs. fluvastatin   1.44 (0.94, 2.20)   1.44 (1.15, 1.80)                1.44 (1.21, 1.72)
Lovastatin vs. simvastatin     0.99 (0.55, 1.76)   0.99 (0.77, 1.26)                0.99 (0.82, 1.18)
Rosuvastatin vs. simvastatin   1.19 (0.85, 1.66)   1.19 (0.97, 1.46)                1.19 (1.01, 1.40)
Rosuvastatin vs. lovastatin    1.21 (0.69, 2.09)   1.21 (0.96, 1.51)                1.21 (1.02, 1.43)

The number of patients per arm in each added (imaginary) trial is presented in the column titles. All results are odds ratios with 95% confidence intervals. A 5% control risk is assumed for the placebo arm in each imaginary trial. Bolded results indicate statistical significance.
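The narrowing of the intervals in Table 1 can be approximated with a simple calculation. The Python sketch below is not the simulation procedure used for the table. It assumes a fixed-effect, inverse-variance combination of the original estimate with one hypothetical two-arm trial that has the same odds ratio and a 5% control risk, and it recovers the original standard error from the reported CI. The function names are hypothetical.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile


def se_from_ci(ci_low: float, ci_high: float) -> float:
    """Approximate the standard error of a log odds ratio from its 95% CI."""
    return (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)


def hypothetical_trial_se(odds_ratio: float, n_per_arm: int,
                          control_risk: float = 0.05) -> float:
    """Standard error of the log OR in an imaginary trial with expected event counts."""
    control_events = n_per_arm * control_risk
    control_odds = control_risk / (1 - control_risk)
    treated_odds = odds_ratio * control_odds
    treated_events = n_per_arm * treated_odds / (1 + treated_odds)
    variance = (1 / control_events + 1 / (n_per_arm - control_events)
                + 1 / treated_events + 1 / (n_per_arm - treated_events))
    return math.sqrt(variance)


def ci_after_adding_trial(odds_ratio: float, ci, n_per_arm: int):
    """95% CI after pooling the original estimate with one added imaginary trial.

    The point estimate is unchanged because the added trial has the same OR.
    """
    weight_original = 1 / se_from_ci(*ci) ** 2
    weight_added = 1 / hypothetical_trial_se(odds_ratio, n_per_arm) ** 2
    se = math.sqrt(1 / (weight_original + weight_added))
    log_or = math.log(odds_ratio)
    return math.exp(log_or - Z95 * se), math.exp(log_or + Z95 * se)


# Rosuvastatin vs. control, original analysis in Table 1: OR 0.88 (0.73, 1.06).
for n in (5_000, 10_000):
    low, high = ci_after_adding_trial(0.88, (0.73, 1.06), n)
    print(f"{n:>6} patients per arm: 0.88 ({low:.2f}, {high:.2f})")
```

For rosuvastatin vs. control, this shortcut gives intervals close to those tabulated, which illustrates that the added precision comes from the extra events rather than from any change in the estimated effect.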

Interventions that have been evaluated in multiple RCTs provide evidence on a large number of patients over a long period of time. Therefore, we would expect that there are RCTs that demonstrate large treatment effects and other RCTs that demonstrate nonsignificant or smaller treatment effects [47]. As these are all pooled, we would expect that biases at the individual trial level will become less important as they are washed out by the larger amount of data coming from multiple RCTs [31,48]. Thus, we expect that the pooled estimate will provide a better-powered and more precise estimate of effect. However, when evidence for a medication comes only from a small number of RCTs, the effect of trial-level biases will have a greater impact on the treatment estimate. Small trials, for example, may exhibit large treatment effects that are later judged unreliable in the presence of larger and multiple RCTs [47]. When there is noteworthy imbalance in the number of RCTs per medication, the comparison between medications may spuriously demonstrate superiority of a newer medication supported by little information over an older, well-evaluated medication supported by a large amount of information. This is especially of concern when there is selective publication of the limited number of trials.

2.3.2. Are treatment effects similar across agents?

Even among RCTs of the same medication with the same control among similar populations, we would expect heterogeneity of point estimates and CIs simply due to chance. In a pair-wise meta-analysis, I² is the most commonly used measure of whether RCTs appear to exhibit treatment effects that differ beyond the play of chance [49]. I² is a useful measure because it provides an estimate of heterogeneity on a scale between 0% and 100%, with lower estimates suggesting less heterogeneity. No such measure exists with indirect comparisons or MTCs.

Evaluating whether different medications within a class display similar effects may be nuanced. Although readers may examine whether CIs overlap or whether hypothesis tests are significant, there are other issues to consider. It is possible for medication A of a class to display a treatment effect [eg, relative risk (RR) = 0.81; 95% CI: 0.75, 0.96] that is convincing. Medication B, within the same class, exhibits a nonsignificant treatment effect that is very similar in effect size but with a different interpretation (eg, RR = 0.85; 95% CI: 0.70, 1.02). In this scenario, a reader who does not believe the class effect may recommend the use of medication A but not medication B. Another reader, who accepts the class effect, may be convinced of the treatment effects of medication A and willing to accept that medication B would likely be similar if it were sufficiently powered. This same reader may become more skeptical, however, if another medication (medication C) within the assumed class exhibits no treatment effects (eg, RR = 1.00; 95% CI: 0.90, 1.20). It is important to recognize that the CIs overlap between A vs. B, B vs. C, and A vs. C, which would typically be interpreted as a statistical finding of no difference among the three medications. This example demonstrates that our confidence in a class effect diminishes as new evidence is inconsistent in treatment efficacy. Although this example includes statistically significant and nonsignificant findings, the same issues will apply when findings are statistically significant across all medications but one appears to offer a much larger treatment effect than the rest.

2.3.3. Would the addition of sufficiently powered evidence change the results of direct or indirect evidence?

It is unlikely that head-to-head evidence will be solely responsible for informing our decisions regarding therapeutic substitutions. Indirect evidence is far more likely to be used [50]. As new information is added to an evaluation, the reader may become more convinced that treatments differ. We use our statin MTC to illustrate how new trials, when added to the information base, can affect the results in important ways [12]. We recognize that in cardiovascular RCTs, very large sample sizes are used to demonstrate small but important treatment effects [51]. Therefore, in this example, we examined via simulation what effect adding 5,000 or 10,000 new patients to each intervention arm in a set of new hypothetical placebo-controlled two-arm statin RCTs (one for each statin) would have on our existing analysis. We held the event rates constant based on the original RCTs included [12]. Table 1 demonstrates that in the absence of heterogeneity of the new trials, the point estimate will remain stable, but the CIs will narrow. The results show that in the original RCT-driven meta-analysis, the differences among medications were nonsignificant. When an additional 5,000 patients are added to each comparison (with the same event rates), significant differences are found between fluvastatin and each of pravastatin, atorvastatin, and rosuvastatin. When an additional 10,000 patients are added, the differences are significant between fluvastatin and each of pravastatin, atorvastatin, simvastatin, and rosuvastatin, and also between rosuvastatin and both simvastatin and lovastatin.

2.3.4. Are adverse events similar across agents?

A therapeutic substitution needs to give as much consideration to the tolerability and safety of the replacement medication as it does to its comparative effectiveness. The comparison is made easier when the substitute is more tolerable and safe, but this is often not known or known not to be the case. Two medications that share a very similar chemical structure and pharmacology often share similar but not identical adverse effect profiles. Tolerability can be crudely compared by adverse event-related dropout rates seen in clinical trials. However, safety, especially for uncommon and rare serious adverse effects, usually requires consideration of other less reliable forms of research, including comparative observational studies (eg, case–control and cohort studies) and pharmacovigilance surveillance systems based on spontaneous reporting of observed adverse events. Although much of the adverse effect profiles may be similar among medications of the same chemical and pharmacologic class, serious idiosyncratic adverse effects may occur with one agent but much less often if at all with the other (eg, liver failure with troglitazone vs.

311

Box 4 Using the guide. What are the results? Returning to our statin scenario, there is imbalance in the number of RCTs included in the statin MTC, with as few as 5 RCTs for rosuvastatin and 25 for pravastatin. Therefore, we should expect some imbalance in the treatment effects on important end points. Treatments did not show a similar consistency in treatment efficacy. In this circumstance, rosuvastatin displayed nonsignificant treatment effects for reducing CVD deaths. Given our understanding of different statins, we wonder whether this is a true effect or an artifact of the predominantly primary prevention trials involved with the use of pravastatin (thus with lower power due to the reduced number of events) [54]. Certain statins exhibit different adverse events than others. An older statin, cerivastatin, for example, was withdrawn from the global market because of its adverse event profile associated with rhabdomyolysis [55]. In the present evaluation of statins, adverse events were reported for each individual medication in a separate linked article [46]. Using indirect comparisons, the authors found that specific statins had slightly different safety profiles. Atorvastatin significantly elevated aspartate transaminase levels compared with pravastatin [odds ratio (OR): 2.21; 95% CI: 1.13, 4.29], and simvastatin significantly increased creatine kinase levels compared with rosuvastatin (OR: 4.39; 95% CI: 1.01, 19.07). In resolving our scenario, we are left with weak inferences that all statins exhibit the same treatment effects. We know that there are different biological mechanisms of how the medications are metabolized. We also recognize that the evidence comes only from indirect comparisons of medications that are imbalanced in the number of patients and RCTs involved. Finally, the medications appear to exhibit different treatment effects that are not entirely explained by their differing patient populations. 
This leaves us with uncertainty that all statins are the same and can be used interchangeably.

pioglitazone) [52]. The adverse effect profile is generally better characterized for older vs. newer medications, which supports therapeutic substitutions of older medications known to have few serious adverse effects. The decision regarding substitution becomes more complicated when the two medications are similarly tolerable and the substitute medication, which is usually the older medication, has well characterized but very low rates of serious adverse effects, whereas the link between the newer agent and its putative idiosyncratic serious adverse effects remains tenuous.
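The indirect comparisons reported in Box 4 can be illustrated with a minimal sketch of an adjusted indirect comparison through a common comparator, in the spirit of the approach described by Bucher et al. [34]. We do not know the exact procedure used in the linked safety analysis [46], so this should be read as an illustration only. All input values below are hypothetical, and the function names `se_from_ci` and `indirect_or` are our own.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of a log odds ratio from its 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def indirect_or(or_a, ci_a, or_b, ci_b, z=1.96):
    """Adjusted indirect comparison of A vs. B via a common comparator.

    or_a, ci_a: OR and (lower, upper) CI for A vs. the common comparator.
    or_b, ci_b: OR and (lower, upper) CI for B vs. the common comparator.
    """
    # The log odds ratios subtract; their variances add.
    log_or = math.log(or_a) - math.log(or_b)
    se = math.sqrt(se_from_ci(*ci_a, z) ** 2 + se_from_ci(*ci_b, z) ** 2)
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)


# Hypothetical inputs (not taken from any trial): each drug vs. placebo.
or_ab, lower, upper = indirect_or(1.50, (1.10, 2.05), 1.00, (0.75, 1.33))
print(f"Indirect OR (A vs. B): {or_ab:.2f} (95% CI: {lower:.2f}, {upper:.2f})")
```

Because the variances of the two direct estimates add, the indirect CI is wider than either input CI, which is one reason inferences drawn from indirect evidence alone tend to be weaker.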


Another challenge occurs when both agents have known but quite different adverse effect profiles. For example, nevirapine, a non-nucleoside reverse transcriptase inhibitor (NNRTI) used as a component of HIV therapy, is associated with chronic toxicity resulting in hepatopathy, severe rash, and fatigue [53]. Efavirenz, an alternative NNRTI, is typically much better tolerated but is more commonly associated with psychiatric symptoms that are not seen with nevirapine [53]. Nevirapine offers an attractive substitute for efavirenz for people with an active mental illness, but this preference may not apply to all patients affected by such a therapeutic substitution policy, as some patients, such as those with higher CD4 T-cell levels (>250 cells/µL), could have increased rates of adverse events.

2.3.5. What are the overall quality and limitations of the evidence?

It is important to assess the overall quality of the evidence used to inform therapeutic substitutions so that a reader can determine if the evidence provides strong inferences. The following aspects are hallmarks of high-quality evidence for class effects: individual studies are at low risk of bias and publication bias is unlikely; studies are well powered and sample sizes are large with CIs that are correspondingly narrow; the findings have been replicated in a number of similar well-designed RCTs; and a plausible biological explanation exists. In most cases, some of these criteria are met, but not all. Therefore, confirming therapeutic substitution suitability will remain challenging. Box 4 describes the results for our case example of statin therapy.
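As a rough illustration of why added patients narrow CIs while leaving the point estimate stable (Section 2.3.3), the following Python sketch pools hypothetical two-arm placebo-controlled trials using fixed-effect inverse-variance weighting. This is a deliberate pairwise simplification and not the MTC model used in [12]. The event rates (8% vs. 10%), the trial sizes, and the function names are assumptions made for illustration only.

```python
import math


def log_or_and_var(events_trt, n_trt, events_ctl, n_ctl):
    """Log odds ratio and its variance (Woolf method) for one two-arm trial."""
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed(trials):
    """Fixed-effect inverse-variance pooling of (log_or, variance) pairs."""
    weights = [1 / var for _, var in trials]
    pooled = sum(w * est for w, (est, _) in zip(weights, trials)) / sum(weights)
    return pooled, 1 / sum(weights)


def or_with_ci(log_or, var, z=1.96):
    """Return (OR, lower, upper) on the odds ratio scale."""
    half = z * math.sqrt(var)
    return math.exp(log_or), math.exp(log_or - half), math.exp(log_or + half)


# Hypothetical existing evidence: one trial with 1,000 patients per arm.
existing = [log_or_and_var(80, 1000, 100, 1000)]


def with_new_trial(extra_per_arm):
    """Add one hypothetical trial with the same event rates and re-pool."""
    events_trt = round(0.08 * extra_per_arm)
    events_ctl = round(0.10 * extra_per_arm)
    new = log_or_and_var(events_trt, extra_per_arm, events_ctl, extra_per_arm)
    return or_with_ci(*pool_fixed(existing + [new]))


print("existing evidence:", or_with_ci(*pool_fixed(existing)))
for extra in (5000, 10000):
    print(f"+{extra} per arm:", with_new_trial(extra))
```

In this toy setup the pooled OR does not move because the event rates are held constant, while the CI, which includes 1 for the single small trial, excludes it once the larger trial is added. Real additions of evidence would rarely be this clean, because new trials typically bring some heterogeneity.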

3. Conclusion

In this article, we have addressed several necessary questions for evaluating whether one medication can be safely substituted for another. In most scenarios, the choice of therapeutic substitution will not be entirely justified by the evidence. Therefore, readers should be cautious about concluding broad class effects and assuming therapeutic equivalence.

Acknowledgments

E.J.M. and K.T. have consulted for Merck & Co Inc, Pfizer Ltd, Novartis, Takeda, or GlaxoSmithKline on MTC issues. E.J.M. and K.T. have received grant funding from the Canadian Institutes of Health Research (CIHR) Drug Safety & Effectiveness Network to develop methods and educational materials on MTCs. E.J.M. receives salary support from the CIHR through a Canada Research Chair. K.T. receives salary support from the CIHR Drug Safety & Effectiveness Network. D.G. has completed an investigator-initiated grant funded by Pfizer. There are no other relevant competing interests. E.J.M. had full access to all the data in

the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The funding agencies had no role in the design or conduct of the study; in the collection, management, analysis, or interpretation of the data; or in the preparation, review, or approval of the manuscript.

References

[1] Gray T, Bertch K, Galt K, Gonyeau M, Karpiuk E, Oyen L, et al. Guidelines for therapeutic interchange-2004. Pharmacotherapy 2005;25:1666–80.
[2] Canadian Pharmacists Association. Summary of pharmacists’ expanded scope of practice activities across Canada. http://www.pharmacists.ca/cpha-ca/assets/File/pharmacy-in-canada/ExpandedScopeChart.pdf. 2012.
[3] Duerden MG, Hughes DA. Generic and therapeutic substitutions in the UK: are they a good thing? Br J Clin Pharmacol 2010;70:335–41.
[4] Johnston A, Asmar R, Dahlof B, Hill K, Jones DA, Jordan J, et al. Generic and therapeutic substitution: a viewpoint on achieving best practice in Europe. Br J Clin Pharmacol 2011;72:727–30.
[5] Gumbs PD, Verschuren WM, Souverein PC, Mantel-Teeuwisse AK, de Wit GA, de Boer A, et al. Society already achieves economic benefits from generic substitution but fails to do the same for therapeutic substitution. Br J Clin Pharmacol 2007;64:680–5.
[6] Vivian JC. Generic-substitution laws. US Pharm 2008;33:30–4.
[7] Doering PL, McCormick WC, Klapp DL, Russell WL. Therapeutic substitution and the hospital formulary system. Am J Hosp Pharm 1981;38:1949–51.
[8] Kereiakes DJ, Willerson JT. Therapeutic substitution: guilty until proven innocent. Circulation 2003;108:2611–2.
[9] Antman EM, Ferguson JJ. Should evidence-based proof of efficacy as defined for a specific therapeutic agent be extrapolated to encompass a therapeutic class of agents? Circulation 2003;108:2604–7.
[10] Furberg CD, Psaty BM. Should evidence-based proof of drug efficacy be extrapolated to a “class of agents”? Circulation 2003;108:2608–10.
[11] McAlister FA, Laupacis A, Wells GA, Sackett DL. Users’ guides to the medical literature XIX. Applying clinical trial results B. Guidelines for determining whether a drug is exerting (more than) a class effect. JAMA 1999;282:1371–7.
[12] Mills EJ, Wu P, Chong G, Ghement I, Singh S, Akl EA, et al. Efficacy and safety of statin treatment for cardiovascular disease: a network meta-analysis of 170,255 patients from 76 randomized trials. QJM 2011;104:109–24.
[13] Mills EJ, Rachlis B, Wu P, Devereaux PJ, Arora P, Perri D. Primary prevention of cardiovascular mortality and events with statin treatments: a network meta-analysis involving more than 65,000 patients. J Am Coll Cardiol 2008;52:1769–81.
[14] Briel M, Nordmann AJ, Bucher HC. Statin therapy for prevention and treatment of acute and chronic cardiovascular disease: update on recent trials and metaanalyses. Curr Opin Lipidol 2005;16:601–5.
[15] Canadian Pharmacists Association October 2012. Summary of pharmacists’ expanded scope of practice activities across Canada. http://blueprintforpharmacy.ca/docs/pdfs/pharmacists’-expanded-scope_summary-chart—cpha—oct-29-2012.pdf.
[16] Major outcomes in high-risk hypertensive patients randomized to angiotensin-converting enzyme inhibitor or calcium channel blocker vs diuretic: the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT). JAMA 2002;288:2981–97.
[17] Dahlof B, Sever PS, Poulter NR, Wedel H, Beevers DG, Caulfield M, et al. Prevention of cardiovascular events with an antihypertensive regimen of amlodipine adding perindopril as required versus atenolol adding bendroflumethiazide as required, in the Anglo-Scandinavian Cardiac Outcomes Trial-Blood Pressure Lowering Arm (ASCOT-BPLA): a multicentre randomised controlled trial. Lancet 2005;366:895–906.
[18] Lindholm LH, Carlberg B, Samuelsson O. Should beta blockers remain first choice in the treatment of primary hypertension? A metaanalysis. Lancet 2005;366:1545–53.
[19] Baigent C, Keech A, Kearney PM, Blackwell L, Buck G, Pollicino C, et al. Efficacy and safety of cholesterol-lowering treatment: prospective meta-analysis of data from 90,056 participants in 14 randomised trials of statins. Lancet 2005;366:1267–78.
[20] Colivicchi F, Tubaro M, Mocini D, Genovesi Ebert A, Strano S, Melina G, et al. Full-dose atorvastatin versus conventional medical therapy after non-ST-elevation acute myocardial infarction in patients with advanced non-revascularisable coronary artery disease. Curr Med Res Opin 2010;26:1277–84.
[21] Rajpathak SN, Kumbhani DJ, Crandall J, Barzilai N, Alderman M, Ridker PM. Statin therapy and risk of developing type 2 diabetes: a meta-analysis. Diabetes Care 2009;32:1924–9.
[22] Colivicchi F, Guido V, Tubaro M, Ammirati F, Montefoschi N, Varveri A, et al. Effects of atorvastatin 80 mg daily early after onset of unstable angina pectoris or non-Q-wave myocardial infarction. Am J Cardiol 2002;90:872–4.
[23] Willrich MA, Hirata MH, Hirata RD. Statin regulation of CYP3A4 and CYP3A5 expression. Pharmacogenomics 2009;10:1017–24.
[24] Ray GM. Antiretroviral and statin drug-drug interactions. Cardiol Rev 2009;17:44–7.
[25] Mills EJ, O’Regan C, Eyawo O, Wu P, Mills F, Berwanger O, et al. Intensive statin therapy compared with moderate dosing for prevention of cardiovascular events: a meta-analysis of >40 000 patients. Eur Heart J 2011;32:1409–15.
[26] Cannon CP, Braunwald E, McCabe CH, Rader DJ, Rouleau JL, Belder R, et al. Intensive versus moderate lipid lowering with statins after acute coronary syndromes. N Engl J Med 2004;350:1495–504.
[27] Guyatt GH, Sackett DL, Sinclair JC, Hayward R, Cook DJ, Cook RJ. Users’ guides to the medical literature. IX. A method for grading health care recommendations. Evidence-Based Medicine Working Group. JAMA 1995;274:1800–4.
[28] Atkins D, Best D, Briss PA, Eccles M, Falck-Ytter Y, Flottorp S, et al. Grading quality of evidence and strength of recommendations. BMJ 2004;328:1490.
[29] Song F, Altman DG, Glenny AM, Deeks JJ. Validity of indirect comparison for estimating efficacy of competing interventions: empirical evidence from published meta-analyses. BMJ 2003;326:472.
[30] Piaggio G, Elbourne DR, Altman DG, Pocock SJ, Evans SJ. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. JAMA 2006;295:1152–60.
[31] Song F, Harvey I, Lilford R. Adjusted indirect comparison may be less biased than direct comparison for evaluating new pharmaceutical interventions. J Clin Epidemiol 2008;61:455–63.
[32] Gardner DM, Baldessarini RJ, Waraich P. Modern antipsychotic drugs: a critical overview. CMAJ 2005;172:1703–11.
[33] Glenny AM, Altman DG, Song F, Sakarovitch C, Deeks JJ, D’Amico R, et al. Indirect comparisons of competing interventions. Health Technol Assess 2005;9:1–134. [iii–iv].
[34] Bucher HC, Guyatt GH, Griffith LE, Walter SD. The results of direct and indirect treatment comparisons in meta-analysis of randomized controlled trials. J Clin Epidemiol 1997;50:683–91.


[35] Ioannidis JP. Perfect study, poor evidence: interpretation of biases preceding study design. Semin Hematol 2008;45:160–6.
[36] Mills EJ, Thorlund K, Ioannidis JP. Demystifying trial networks and network meta-analysis. BMJ 2013;346:f2914.
[37] Mills EJ, Ioannidis JP, Thorlund K, Schunemann HJ, Puhan MA, Guyatt GH. How to use an article reporting a multiple treatment comparison meta-analysis. JAMA 2012;308:1246–53.
[38] Bucher HC, Guyatt GH, Cook DJ, Holbrook A, McAlister FA. Users’ guides to the medical literature: XIX. Applying clinical trial results. A. How to use an article measuring the effect of an intervention on surrogate end points. Evidence-Based Medicine Working Group. JAMA 1999;282:771–8.
[39] FDA. Dose-response information to support drug registration. ICH-E4. http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm073115.pdf. 1994.
[40] Prasad V, Cifu A, Ioannidis JP. Reversals of established medical practices: evidence to abandon ship. JAMA 2012;307:37–8.
[41] Buyse M, Molenberghs G. Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics 1998;54:1014–29.
[42] Buyse M, Molenberghs G, Burzykowski T, Renard D, Geys H. The validation of surrogate endpoints in meta-analyses of randomized experiments. Biostatistics 2000;1:49–67.
[43] Buyse M, Piedbois P. On the relationship between response to treatment and survival time. Stat Med 1996;15:2797–812.
[44] Buyse M, Sargent DJ, Grothey A, Matheson A, de Gramont A. Biomarkers and surrogate end points – the challenge of statistical validation. Nature Reviews Clinical Oncology 2010;7:309–17.
[45] Kent DM, Mwamburi DM, Bennish ML, Kupelnick B, Ioannidis JP. Clinical trials in sub-Saharan Africa and established standards of care: a systematic review of HIV, tuberculosis, and malaria trials. JAMA 2004;292:237–42.
[46] Alberton M, Wu P, Druyts E, Briel M, Mills EJ. Adverse events associated with individual statin treatments for cardiovascular disease: an indirect comparison meta-analysis. QJM 2012;105:145–57.
[47] Pereira TV, Horwitz RI, Ioannidis JP. Empirical evaluation of very large treatment effects of medical interventions. JAMA 2012;308:1676–84.
[48] Mills EJ, Thorlund K, Ioannidis JP. Calculating additive treatment effects from multiple randomized trials provides useful estimates of combination therapies. J Clin Epidemiol 2012;65:1282–8.
[49] Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ 2003;327:557–60.
[50] Madan J, Stevenson MD, Cooper KL, Ades AE, Whyte S, Akehurst R. Consistency between direct and indirect trial evidence: is direct evidence always more reliable? Value Health 2011;14:953–60.
[51] Yusuf S, Collins R, Peto R. Why do we need some large, simple randomized trials? Stat Med 1984;3:409–22.
[52] Rajagopalan R, Iyer S, Perez A. Comparison of pioglitazone with other antidiabetic drugs for associated incidence of liver failure: no evidence of increased risk of liver failure with pioglitazone. Diabetes Obes Metab 2005;7:161–9.
[53] Drake SM. NNRTIs – a new class of drugs for HIV. J Antimicrob Chemother 2000;45:417–20.
[54] Ridker PM, Danielson E, Fonseca FA, Genest J, Gotto AM Jr, Kastelein JJ, et al. Rosuvastatin to prevent vascular events in men and women with elevated C-reactive protein. N Engl J Med 2008;359:2195–207.
[55] Staffa JA, Chang J, Green L. Cerivastatin and reports of fatal rhabdomyolysis. N Engl J Med 2002;346:539–40.
