Psychopharmacology (1992) 106: S93-S95

Psychopharmacology © Springer-Verlag 1992

Active control equivalence trials: some methodological aspects

U. Ferner and N. Neumann

Department of Clinical Research, Biometrics, F. Hoffmann-La Roche Ltd, CH-4002 Basel, Switzerland

Abstract. The therapeutic efficacy of an investigational new drug is best demonstrated in clinical trials against a placebo control. However, ethical requirements increasingly preclude the use of an inactive control substance as a treatment alternative, and clinical testing to compare the effects of active treatments has become more frequent. The statistical evaluation and interpretation of these trials are complex and require the demonstration of 'equivalence' or 'at-least equivalence' between the experimental drug and a comparable standard therapy. The design of controlled clinical trials to compare the effects of moclobemide, a new and selective monoamine oxidase-A inhibitor, with those of standard tricyclic antidepressant drugs involved the refinement of selection strategies, formulation of hypotheses, means of statistical analysis and clinical interpretation.

Key words: Equivalence testing - Active control study design - Moclobemide - Clinical trial methodology

Offprint requests to: U. Ferner

The efficacy of a new drug under investigation must be determined from data obtained in carefully designed and well-controlled clinical studies. Ideally, to demonstrate a difference between treatments, clinical trials should include placebo therapy as a control, but it is not always ethically acceptable to include an inactive control substance as a treatment alternative. Though clinical testing to compare the effects of active treatments is becoming increasingly frequent, the planning and statistical evaluation of these trials are more complex. This paper describes important parameters in the design of controlled clinical trials to compare the effects of moclobemide, a new and selective monoamine oxidase-A inhibitor, with those of standard tricyclic antidepressant drugs.

Formulation of hypotheses

Active control trials, or equivalence studies, aim to show that the drug under investigation is not worse than a standard treatment for that condition by more than a pre-defined and clinically meaningful difference δ (Tables 1 and 2). The null hypothesis to be used in these cases is that standard treatment is better than the investigational drug by a pre-determined amount δ. A precise definition of this amount is necessary before the therapies can be described as 'equivalent' or 'at-least equivalent'. This approach differs from the traditional null hypothesis used in placebo-controlled trials, which states

Table 1. Hypothesis for equivalence

$H_0: |\mu_{INV} - \mu_C| > \delta$
$H_1: |\mu_{INV} - \mu_C| \le \delta$

[Diagram: equivalence region from $-\delta$ to $\delta$ on the axis of $(\mu_{INV} - \mu_C)$, with a 95% CI marked]

$\mu$ = parameter of central tendency for investigational drug (INV) or active control (C); CI = confidence interval

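Table 2. Hypothesis for 'at least' equivalence

$H_0: \mu_{INV} - \mu_C < -\delta$
$H_1: \mu_{INV} - \mu_C \ge -\delta$

[Diagram: equivalence region extending upward from $-\delta$ on the axis of $(\mu_{INV} - \mu_C)$, with 0 marked]

$\mu$ = parameter of central tendency for investigational drug (INV) or active control (C)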

Table 3. Sample size estimation

1. Difference testing

   $n = \frac{2\sigma^2}{\delta^2}\,(t_{1-\alpha/2} + t_{1-\beta})^2$

   $\delta$ = detectable difference

2. Equivalence testing

   $n = \frac{2\sigma^2}{\delta^2}\,(t_{1-\alpha} + t_{1-\beta})^2$

   $\delta$ = half length of the equivalence region

3. At least equivalence testing

   $n = \frac{2\sigma^2}{\delta^2}\,(t_{1-\alpha} + t_{1-\beta})^2$

   $\delta$ = length of the equivalence region

$n$ = sample size per group
$\alpha$ = probability for error type 1
$\beta$ = probability for error type 2
$t_{1-\alpha/2}$ = $(1-\alpha/2)$-quantile of the central t-distribution
$t_{1-\alpha(\beta)}$ = $(1-\alpha(\beta))$-quantile of the central t-distribution
$\sigma^2$ = variance

Table 4. Hypothesis for equivalence, α = 0.10, β = 0.05

Δ (HDRS)    Std dev.    Sample size (per group)
5           10          70
5           8           45
3           10          191
3           8           123

Table 5. Hypothesis for equivalence, 2α = 0.10, β = 0.05

Equiv. region (HDRS)    Std dev.    Sample size (per group)
(-5, 5)                 10          86
(-5, 5)                 8           55
(-3, 3)                 10          235
(-3, 3)                 8           151

Δ = difference between investigational drug and active control; HDRS = total sum of Hamilton depression rating scale; Std dev. = standard deviation

[Fig. 1 panel labels: imipramine, desipramine, clomipramine, amitriptyline]

that, for a given clinical parameter, there is no difference between the experimental drug and the control treatment. If this null hypothesis were retained for the examination of active control substances, its non-rejection would show only that the findings remained consistent with this hypothesis, and not that equivalence had been demonstrated. In practice, the demonstration of at-least equivalence in active control trials requires that the upper confidence limit on the measured difference in the main criteria between standard treatment and investigational drug be less than an a priori chosen acceptable difference δ. Consideration of the confidence limits reduces the risk of misinterpreting a non-significant difference as an indication of equivalence. Analysis of previous three-arm randomised trials comparing the effects of a standard therapy, an investigational new drug, and a placebo control has shown that the interpretation of the results obtained from a direct comparison of the two active substances is not necessarily the same as that reached when the placebo arm is also taken into account. Therefore, special precautions have to be taken in trials which lack the benefit of placebo controls (Leber, 1989).
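The confidence-interval reasoning described here can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the function name `classify_ci`, the sign convention (the interval is for the investigational mean minus the control mean, with larger values favouring the investigational drug) and the example intervals are all assumptions made for the sketch.

```python
def classify_ci(lower: float, upper: float, delta: float) -> str:
    """Classify a confidence interval for (mu_INV - mu_C).

    Assumes larger values favour the investigational drug and that
    delta > 0 is the acceptable difference fixed before the trial.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    if lower > upper:
        raise ValueError("lower must not exceed upper")
    if lower >= -delta and upper <= delta:
        return "equivalent"            # whole interval inside [-delta, delta]
    if lower >= -delta:
        return "at least equivalent"   # inferiority beyond delta is excluded
    return "not demonstrated"          # interval reaches below -delta


# Hypothetical intervals on an arbitrary scale, with delta = 0.10
for ci in [(-0.05, 0.04), (-0.05, 0.30), (-0.25, 0.10)]:
    print(ci, classify_ci(*ci, delta=0.10))
```

An interval classified as "equivalent" also satisfies the weaker at-least-equivalence criterion. A wide interval around zero falls into "not demonstrated", which is the case where a non-significant difference test could otherwise be misread as equivalence.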

[Figure 1: plot of the difference in means with 95% confidence intervals, by study number]

Fig. 1. Difference in means of the therapeutic index of the Hamilton total score with 95% confidence intervals. Therapeutic index: weighted area under the curve, standardized to the interval (-1, +1)

Sample size estimations

The number of cases needed for equivalence testing can be assessed by defining an acceptable difference δ, an estimate of the variability σ, and the type 1 and type 2 error rates. These errors state the risk of the unjustified rejection or acceptance of the null hypothesis. Formulae for sample size estimation are given in Table 3. For example, to establish the equivalence of two treatment groups within an equivalence region of ±5, given a variability of σ = 10, a type 1 error of 2α = 10% and a type 2 error of β = 5%, one needs a sample size of 86 subjects within each treatment group. If the detectable difference δ between the treatment groups is small, then the sample size must be increased. Although in trials involving placebo the detectable difference may be substantial, in active control trials it can be expected to be much smaller. If the underlying variability σ within the patient population is larger, then the sample size must also be increased.
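The worked example in this section (δ = 5, σ = 10, 2α = 10%, β = 5%, giving 86 subjects per group) can be checked with a short calculation. The sketch below assumes the equivalence formula has the form n = 2σ²(q(1-α) + q(1-β))²/δ² and replaces the t-quantiles with standard normal quantiles, an approximation that is reasonable for groups of this size. The function name `n_equivalence` is illustrative.

```python
from statistics import NormalDist


def n_equivalence(delta: float, sigma: float, alpha: float, beta: float) -> float:
    """Approximate sample size per group for equivalence testing.

    delta: half length of the equivalence region
    sigma: standard deviation of the outcome
    alpha, beta: one-sided type 1 and type 2 error probabilities
    Normal quantiles stand in for the t-quantiles (assumption).
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    z = NormalDist().inv_cdf
    return 2 * sigma**2 / delta**2 * (z(1 - alpha) + z(1 - beta)) ** 2


# 2*alpha = 0.10 corresponds to alpha = 0.05
print(round(n_equivalence(delta=5, sigma=10, alpha=0.05, beta=0.05), 1))
print(round(n_equivalence(delta=5, sigma=8, alpha=0.05, beta=0.05), 1))
```

Under these assumptions the first call gives about 86.6, in line with the 86 subjects per group quoted above. The formula also makes the two statements in the text explicit: n grows with the square of σ and with the inverse square of δ, so halving the equivalence margin roughly quadruples the required sample size. Other published table entries may differ by a few patients, since the exact computation is not given.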

If the two types of error are taken into account, confidence levels can be set. The exact number of patients required will depend on the degree to which the underlying variability in the patient population can be controlled and the extent to which the difference between the treatments being studied is considered meaningful (Tables 4 and 5). The number of patients will need to be increased if the requirement for the treatments to be considered 'equivalent' is made stricter, i.e. if the interval between the confidence limits is to be decreased.

Equivalence criteria

The criteria for determination of equivalence should be decided prior to the start of the trial. Based on the hypothesis that has been formulated, the chosen size of the treatment groups, and the position of the outermost confidence limits, the difference between the active control and the test drug has to be shown to be less than a previously defined acceptable amount. This difference can be stated in terms of means, proportions, or ratios.

Trials with moclobemide

Early comparative trials of moclobemide examined the difference in therapeutic index, using the Hamilton total score. The index is standardised to the interval -1 to +1, and can be considered as a weighted area under the curve. The trials performed prior to 1988 compared moclobemide with imipramine, desipramine, clomipramine, and amitriptyline. The mean value differences of the therapeutic index were consistent and close to zero, showing that there were no differences between moclobemide and the standard treatments. However, the position of the 95% confidence limits varied considerably among the studies (Fig. 1). The 301 study (Fig. 1) had the smallest confidence interval, a mean difference of the therapeutic index of approximately zero, and included approximately 145 assessable patients per group. In the absence of a placebo-controlled study, this trial could provide evidence of 'at-least equivalence' between the two treatment groups. The remaining studies in this group had sample sizes of between 10 and 40 patients per treatment group: although these data can provide additional information, they cannot act as substantial evidence to show that moclobemide has the same efficacy as the other classical tricyclic antidepressant treatments.

When the pooled clinical data (without adjustment between centres) from worldwide trials comparing the efficacies of moclobemide, clomipramine, and imipramine (see Table 6) were examined, the results were not easy to interpret. Although there were more patients in each of the treatment groups involved, it was not possible to obtain substantial evidence of equivalence from these studies, owing to the relatively wide 95% confidence interval of the binomially distributed variable. A more recent trial, which included 98 patients treated with moclobemide and 93 treated with clomipramine, showed a difference of 20 percentage points between the response rates to moclobemide and clomipramine. This difference was significant at the 5% level, and avoided the problem of 'equivalence' or 'at-least equivalence'.

When planning sample sizes for the testing of equivalence, it is necessary to avoid binomially distributed random variables such as responder/non-responder or cured/non-cured classifications, since many more patients would be needed to take account of these. Continuously distributed variables, e.g. Hamilton or Montgomery-Åsberg total sums, are more useful in trials which compare one active drug with others.

Table 6. Clinical global assessment

Treatment       Total cases   Very good/good   Proportion   Difference in proportions   95% CI
Moclobemide     270           149              0.55         -0.06                       (-0.14, 0.02)
Clomipramine    267           163              0.61
Moclobemide     252           154              0.61         -0.03                       (-0.11, 0.06)
Imipramine      240           154              0.64
Moclobemide     98            79               0.81         0.20                        (0.07, 0.33)
Clomipramine    93            57               0.61
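The differences and 95% confidence intervals in Table 6 can be approximately recomputed from the case counts. The paper does not state how its intervals were obtained, so the sketch below assumes a simple normal-approximation (Wald) interval for the difference of two independent proportions. The function name `diff_ci` is illustrative, and the results agree with the published values only to within about 0.01 to 0.02.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(x1: int, n1: int, x2: int, n2: int, level: float = 0.95):
    """Wald interval for p1 - p2 (an assumed method, not stated in the paper)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    d = p1 - p2
    return d, d - z * se, d + z * se


# (very good/good, total) for moclobemide versus comparator, from Table 6
rows = {
    "vs clomipramine (pooled)": (149, 270, 163, 267),
    "vs imipramine (pooled)": (154, 252, 154, 240),
    "vs clomipramine (recent trial)": (79, 98, 57, 93),
}
for label, counts in rows.items():
    d, lo, hi = diff_ci(*counts)
    print(f"{label}: {d:+.2f} ({lo:+.2f}, {hi:+.2f})")
```

Even with 240 to 270 patients per group, the half-width of the interval for a proportion near 0.6 is roughly 0.08, which illustrates why the pooled responder data gave only limited evidence about equivalence. In the more recent trial the interval lies entirely above zero, consistent with the significant difference reported.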

Conclusion

The planning of adequate and meaningful active control trials to demonstrate 'equivalence' or 'at-least equivalence' is not easy. Only continuously distributed variables such as the Hamilton or Montgomery-Åsberg total scores should be used. Responder/non-responder criteria lead to a binomial distribution from which it is difficult to assess equivalence because much larger sample sizes are needed, and these criteria should not be considered for active control trials.
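The sample size penalty of a responder classification can be illustrated with a rough calculation. The sketch below rests on several assumptions that are not from the paper: the outcome is normally distributed with standard deviation σ, a "responder" is anyone on the favourable side of a cut-off placed at the common mean (so the response rate is 0.5 when the treatments are equal), the margin δ on the continuous scale is translated into the corresponding shift in response rate, and normal quantiles replace t-quantiles. The function names are illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def n_continuous(delta: float, sigma: float, alpha: float, beta: float) -> float:
    """Per-group n for a continuous outcome with equivalence margin delta."""
    z = STD_NORMAL.inv_cdf
    return 2 * sigma**2 / delta**2 * (z(1 - alpha) + z(1 - beta)) ** 2


def n_dichotomised(delta: float, sigma: float, alpha: float, beta: float) -> float:
    """Per-group n when the same outcome is reduced to responder/non-responder.

    Assumes a cut-off at the common mean, so p = 0.5 under equality, and
    converts delta into the implied change in response rate.
    """
    z = STD_NORMAL.inv_cdf
    p = 0.5
    margin = STD_NORMAL.cdf(delta / sigma) - 0.5   # shift in response rate
    return 2 * p * (1 - p) / margin**2 * (z(1 - alpha) + z(1 - beta)) ** 2


cont = n_continuous(5, 10, 0.05, 0.05)
binary = n_dichotomised(5, 10, 0.05, 0.05)
print(f"continuous: {cont:.0f}, dichotomised: {binary:.0f}, ratio: {binary / cont:.2f}")
```

Under these assumptions the dichotomised endpoint needs roughly 1.7 times as many patients per group for the same margin and error rates. The exact factor depends on where the cut-off is placed and on the true response rates, so this should be read as an order-of-magnitude illustration rather than a planning tool.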

References

Leber PD (1989) Hazards of inference: the active control investigation. Epilepsia 30 [Suppl 1]: S57-S63
Makuch R, Johnson M (1989) Issues in planning and interpreting active control equivalence studies. J Clin Epidemiol 42 [6]: 503-511
