A method of designing clinical trials for combination drugs.

STATISTICS IN MEDICINE, VOL. 11, 1065-1074 (1992)

A METHOD OF DESIGNING CLINICAL TRIALS FOR COMBINATION DRUGS

JOSEPH G. PIGEON
Department of Mathematical Sciences, Villanova University, Villanova, PA 19085, U.S.A.

MARGARET DIPONZIO COPENHAVER
Statistical Services, Cytogen Corporation, Princeton, NJ 08540, U.S.A.

AND

JAMES P. WHIPPLE
Biostatistics and Research Data Systems, Merck Sharp and Dohme Research Laboratories, West Point, PA 19486, U.S.A.

SUMMARY

Many pharmaceutical companies are now exploring combination drug therapies as an alternative to monotherapy. Consequently, it is of interest to investigate the simultaneous dose response relationship of two active drugs to select the lowest effective combination. In this paper, we propose a method for designing clinical trials for drug combinations that seems to offer several advantages over the 4 x 3 or even larger factorial studies that have been used to date. In addition, our proposed method provides a convenient formula for calculating the required sample size.

Received May 1990; revised December 1991
0277-6715/92/081065-10$05.00 © 1992 by John Wiley & Sons, Ltd.

1. INTRODUCTION

For many patients and indications, treatment with a combination of two or more drugs is often superior to treatment with a single drug. When two or more drugs are given simultaneously, however, a number of different drug interactions are possible, including synergism and antagonism. Thus precise knowledge of such interaction is desirable for the safe and efficacious treatment of patients. Consequently, clinical trials must provide the drug manufacturer and various regulatory agencies with the necessary data to select an effective drug combination for marketing and, at the same time, to obtain useful and reliable information for dose titration. This paper describes a method of designing a clinical trial to obtain valuable information about the simultaneous dose response relationship between two active drugs in a practical and cost effective manner.

Factorial designs¹⁻³ have been used for many years in agricultural and industrial research primarily because of their ability to provide information about main effects and their interactions efficiently in a single experiment. Furthermore, because each factor is evaluated at more than one level of the other factors, they provide a broad inductive base for any inferences made from the study. In addition, factorial designs often provide the basic foundation for response surface methodology.

Response surface methods⁴⁻⁶ are statistical techniques to build and explore empirical models. They assume that the relationship between a response variable, y, and a set of independent factors, x, may be described by the following mathematical model:

$$y = f(x, \beta) + \varepsilon,$$

where x is a p-dimensional vector of suitably transformed and coded input variables, $\beta$ is a p-dimensional vector of model parameters and $\varepsilon$ is a vector of random errors assumed mutually independent with zero mean and common variance $\sigma^2$. The usual response surface methods assume $f(x, \beta)$ is polynomial and the concern is with optimizing the response variable over some appropriate design region. Note that response surface methods make no attempt to model or estimate the true response surface, only to approximate the true response surface with a convenient mathematical function. Thus, assuming appropriate choice of the design region, a quadratic function will usually approximate the true response surface adequately over the design region and is commonly used. Consequently, we restrict our attention to quadratic response surfaces.

The application of response surface methodology to the development of combination therapies is not a new idea. For example, Carter et al.⁷ discuss design and analysis issues in cancer chemotherapy, where the use of three or more drugs in combination is not uncommon. Interest in factorial designs and response surface methods, however, has recently been generated by the suggestion that they might have use in clinical trials of drug combinations for other diseases such as hypertension. We believe that these methods may indeed be useful for some clinical trials of drug combinations, depending on the objectives of the trial. Specifically, if the objective of the trial is to demonstrate that the combination is superior to each of its components, then these methods have little value. There are better and simpler designs to achieve this objective. If the objective, however, is to estimate the simultaneous dose response relationship or to determine an effective combination for marketing, then these methods may be quite valuable.
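For two drugs, the quadratic response surface to which attention is restricted above can be written out term by term. The display below is only a sketch in assumed notation: the coded doses $x_1$, $x_2$ and the coefficient labels $\beta_0, \ldots, \beta_{12}$ are introduced here for illustration and do not appear in the original text.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2
    + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon
```

Written this way, the model has six coefficients, which agrees with the harmonic average of 1/6 quoted later for the factorial designs.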
One needs to recognize that some trials, especially those early in a drug development programme, should have objectives that involve estimation as the primary mode of statistical inference and other trials, particularly those later in the development programme, should have objectives involving hypothesis testing. Thus, early in the development programme of a combination drug, the clinical trials conducted should estimate the simultaneous dose response relationship to select a combination for study in later trials. It is only these early trials of estimation for which we propose a factorial/response surface design. We believe that this distinction between estimation and hypothesis testing is necessary because of the different implications associated with each form of statistical inference. That is, with interest in estimating the simultaneous dose response relationship of two drugs, we need to study adequately the design space of the two components through the selection of a sufficient number of dose levels of each component and by the proper selection of those dose levels. On the other hand, with interest in testing specific hypotheses among the cells, we need to eliminate all cells not involved in the hypothesis. This creates a problem since we may not know a priori the specific hypothesis among the cells. Thus the objectives of estimation and hypothesis testing differ. Estimation objectives call for more cells, properly distributed in the design space, and hypothesis testing objectives call for fewer cells, properly chosen.

Assuming that we desire to conduct a clinical trial for estimation of the simultaneous dose response relationship, how should we design it? Several trials⁸⁻¹⁰ reported in the literature seem to have been designed by a ‘seat of the pants’ approach in that selection of both the number of doses and the dose levels appears to have been done without any statistical design considerations.
This approach has produced a variety of rather large factorial or grid designs including 3 x 4, 4 x 4, 4 x 5 and 5 x 5. In addition, because of the huge sample size requirement of the larger factorial trials, consideration has also been given to dropping some of the cells. The problem with this ‘seat of the pants’ approach to clinical trial design is that there is little or no consideration of the statistical aspects of experimental design such as the number of dose levels and the proper selection of dose levels.

2. PROPOSED FACTORIAL DESIGN

We propose a design strategy based on the assumption that the trial's primary objective is to estimate the simultaneous dose response relationship of the two components when given in combination. Associated with any particular point, x, on a fitted response surface is the variance of the predicted value, $\hat{y}$, at that point. By defining information,⁶,¹¹ I(x), as the reciprocal of the standardized variance of $\hat{y}$ at the point x; that is,

$$I(x) = \frac{\sigma^2}{m \, \mathrm{var}(\hat{y})},$$

where m is the total sample size over the region of estimation, we can calculate the amount of information at any point in the design space and construct an information surface. Note that this information surface is a function of the particular approximating polynomial (usually quadratic) used to estimate the surface and also of the experimental design. That is, different designs will have different information surfaces associated with them.

Box¹¹ has shown that the harmonic average of the amount of information at each design point is equal to 1/p, where p is the number of coefficients in the polynomial, regardless of the location of the design points. Thus, in at least one sense, the total amount of information associated with fitting a polynomial response surface is constant for all designs. The location of the design points, however, does affect the distribution of information and hence the shape of the information surface. Consequently, the problem of comparing competing designs may not be the selection of a design with more total information. Rather, it may be the selection of a design that distributes the information over the region of estimation in a more desirable manner. Since the shape of the information surface is perhaps the simplest and most direct measure of desirability, we propose to select a design based on a satisfactory distribution of information. In other words, we seek a design whose information surface has desirable characteristics such as rotatability or symmetry.

We need the following definitions for any discussions of information surfaces. We say a design is rotatable if the amount of information at a fixed distance from the origin is the same in all directions. We say a design is symmetric with respect to an axis of symmetry L if to each point R of the information surface there corresponds a point S also of the information surface such that L bisects RS perpendicularly.
We say a design is symmetric with respect to a plane P if to each point R of the information surface there corresponds a point S also of the information surface such that P bisects RS perpendicularly.

Figures 1 to 4 display the information surfaces associated with four different designs used to estimate a quadratic response surface. The x and y axes correspond to coded values of the suitably transformed doses of drug 1 and drug 2, respectively, and the z axis corresponds to information. The use of coded values simplifies the computations, often results in uncorrelated parameter estimates and standardizes the doses by removing the unit of measurement. Specifically, we choose to assign the lowest dose of each drug a coded value of $-\sqrt{2}$ and the highest dose of each drug a coded value of $+\sqrt{2}$. Thus, each design covers the same range of coded values for each drug. We plot the information surfaces, however, for values of x from -2 to +2 and y from -2 to +2 for convenience and clarity of presentation.
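The definition of information can be evaluated directly from a list of design points. The following is a minimal sketch, not the authors' code: it assumes a full quadratic model in two coded doses (six coefficients), coded dose levels of -sqrt(2), 0 and +sqrt(2), and a central composite design with factorial points at ±1 and axial points at ±sqrt(2). The function names (`quad_terms`, `solve`, `information`, `harmonic_mean_information`) are hypothetical helpers introduced here.

```python
import math

def quad_terms(x1, x2):
    """Model terms of a full quadratic in two coded doses (p = 6)."""
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def information(design, x1, x2):
    """I(x) = sigma^2 / (m var(y_hat)) = 1 / (m f(x)' (X'X)^-1 f(x))."""
    rows = [quad_terms(a, b) for a, b in design]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    f = quad_terms(x1, x2)
    z = solve(xtx, f)  # z = (X'X)^-1 f(x)
    return 1.0 / (len(design) * sum(fi * zi for fi, zi in zip(f, z)))

def harmonic_mean_information(design):
    """Harmonic average of the information over all runs of the design."""
    inverse_info = [1.0 / information(design, a, b) for a, b in design]
    return len(inverse_info) / sum(inverse_info)

S = math.sqrt(2.0)  # assumed coded value of the highest dose
levels = (-S, 0.0, S)
factorial_3x3 = [(a, b) for a in levels for b in levels]
factorial_3x3_cp = factorial_3x3 + [(0.0, 0.0)]  # double allocation at centre
ccd = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0),
       (-S, 0.0), (S, 0.0), (0.0, -S), (0.0, S), (0.0, 0.0)]

for name, design in [("3 x 3", factorial_3x3),
                     ("3 x 3 + centre", factorial_3x3_cp),
                     ("central composite", ccd)]:
    print(name, information(design, 0.0, 0.0), harmonic_mean_information(design))
```

Under these assumptions the harmonic average should come out at 1/6 for every design, while the information at the centre differs between designs, which is the point made in the text about total information versus its distribution.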

[Figure 1. A proposed ‘seat of the pants’ design; 4 x 4 with missing cells. Information surface plotted over coded doses from -2.00 to 2.00.]

Figure 1 shows the information surface associated with a ‘4 x 4 seat of the pants’ design with unequally spaced dose levels and four empty cells. Obviously, this design does not result in a desirable distribution of information as the surface is neither rotatable nor symmetrical in any reasonable way. Figure 2 represents the information surface of a commonly used response surface design, a central composite design. We note that even though the design has achieved rotatability and complete symmetry, the design has a rather obvious lack of information in the centre of the design space. By adding additional patients at the centre point of the central composite design, however, we can raise the amount of information in the centre of the design to a desired level, while still maintaining the rotatability of the design. Figure 3 represents the information surface of a 3 x 3 factorial and we note that although it does not achieve rotatability, it does achieve a certain amount of symmetry. Finally, Figure 4 displays the information surface of a 3 x 3 factorial with two centre points that corresponds to a clinical trial with a double patient allocation at the middle design point. We note that this design has the same general shape of the information surface as the 3 x 3 factorial design, but that the surface seems generally higher.

Although a visual examination of the information plots aids in comparing designs, a quantitative comparison of the designs is also possible by examining the corresponding equations of the information surfaces. We have not given the equation of the information surface represented in Figure 1 because of its complexity; however, the equations of the other three comparative designs in polar co-ordinates are as follows:


Central composite design:

$$I(r, \theta) = \frac{32}{99r^4 - 252r^2 + 288},$$

3 x 3 design:

$$I(r, \theta) = \frac{16}{-27r^4 \cos^2(\theta) \sin^2(\theta) + 18r^4 - 36r^2 + 80},$$

3 x 3 with 2 centre points:

$$I(r, \theta) = \frac{168}{-315r^4 \cos^2(\theta) \sin^2(\theta) + 180r^4 - 220r^2 + 600}.$$

[Figure 2. Central composite design. Information surface plotted over coded doses from -2.00 to 2.00.]

Following the definitions of rotatability and symmetry given earlier, the following theorems are useful and appear without proof.

Theorem 1. A design is rotatable if and only if its information equation in polar co-ordinates is a function of only r.

Theorem 2.1. A design is symmetric with respect to the z axis if and only if its information equation in polar co-ordinates remains unaltered by replacing $\theta$ by $\theta + \pi$.

Theorem 2.2. A design is symmetric with respect to the yz plane if and only if its information equation in polar co-ordinates remains unaltered by replacing $\theta$ by $\pi - \theta$.

Theorem 2.3. A design is symmetric with respect to the xz plane if and only if its information equation in polar co-ordinates remains unaltered by replacing $\theta$ by $-\theta$.

Theorem 2.4. A design is symmetric with respect to the x = y plane if and only if its information equation in polar co-ordinates remains unaltered by replacing $\theta$ by $\pi/2 - \theta$.

Theorem 2.5. A design is symmetric with respect to the x = -y plane if and only if its information equation in polar co-ordinates remains unaltered by replacing $\theta$ by $3\pi/2 - \theta$.

[Figure 3. 3 x 3 factorial design. Information surface plotted over coded doses from -2.00 to 2.00.]

Applying these theorems to the information equations above indicates that only the central composite design is rotatable but that all three designs are symmetric in accordance with Theorems 2.1 to 2.5. Thus, even though the factorial designs are not rotatable, they still possess considerable symmetry. Since the axes and planes of symmetry of Theorems 2.1 to 2.5 constitute a complete set of reasonable symmetries for a design, we define a ‘nearly rotatable’ design as a nonrotatable design that is symmetric in the sense of Theorems 2.1 to 2.5. Thus, we may classify the central composite design as a rotatable design and the two factorial designs as ‘nearly rotatable’ designs.
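The application of Theorems 1 and 2.1 to 2.5 can be checked numerically. The sketch below is illustrative only: it takes the three information equations as read from the (partly garbled) source, and the helper names `is_rotatable` and `is_symmetric`, the grid of test points and the tolerance are assumptions made here.

```python
import math

def info_ccd(r, theta):
    """Central composite design; theta is unused because the form depends on r only."""
    return 32.0 / (99 * r**4 - 252 * r**2 + 288)

def info_3x3(r, theta):
    cs = (math.cos(theta) * math.sin(theta)) ** 2
    return 16.0 / (-27 * r**4 * cs + 18 * r**4 - 36 * r**2 + 80)

def info_3x3_cp(r, theta):
    cs = (math.cos(theta) * math.sin(theta)) ** 2
    return 168.0 / (-315 * r**4 * cs + 180 * r**4 - 220 * r**2 + 600)

# Substitutions for theta named in Theorems 2.1 to 2.5.
SUBSTITUTIONS = {
    "2.1 (z axis)": lambda t: t + math.pi,
    "2.2 (yz plane)": lambda t: math.pi - t,
    "2.3 (xz plane)": lambda t: -t,
    "2.4 (x = y plane)": lambda t: math.pi / 2 - t,
    "2.5 (x = -y plane)": lambda t: 3 * math.pi / 2 - t,
}

RADII = [0.25 * k for k in range(9)]             # r from 0 to 2
ANGLES = [2 * math.pi * k / 36 for k in range(36)]

def is_rotatable(info):
    """Theorem 1: information depends on r only."""
    return all(math.isclose(info(r, t), info(r, 0.0), rel_tol=1e-9)
               for r in RADII for t in ANGLES)

def is_symmetric(info, substitution):
    """Theorems 2.x: information unchanged when theta is replaced."""
    return all(math.isclose(info(r, t), info(r, substitution(t)), rel_tol=1e-9)
               for r in RADII for t in ANGLES)

for name, info in [("central composite", info_ccd),
                   ("3 x 3", info_3x3),
                   ("3 x 3 + centre", info_3x3_cp)]:
    symmetric = all(is_symmetric(info, s) for s in SUBSTITUTIONS.values())
    print(name, "rotatable:", is_rotatable(info), "symmetric:", symmetric)
```

A grid check of this kind cannot prove the theorems, but it should reproduce the classification in the text: one rotatable design and two ‘nearly rotatable’ designs.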

[Figure 4. 3 x 3 factorial design; two centre points. Information surface plotted over coded doses from -2.00 to 2.00.]

Although a rotatable design seems to be generally preferable to a ‘nearly rotatable’ design, we believe that either factorial design is to be preferred for clinical trials because the number of dose levels per drug for either factorial design is only three as opposed to the five required by the central composite design. The number of levels used in a design may not be that critical in other experimental settings, but in a clinical trial, it is imperative that the number of different dose levels be kept to a minimum. The price to be paid for this reduction in the number of dose levels is loss of rotatability. However, in our opinion, the information surface of the factorial designs still has enough symmetry and uniformity to provide a useful estimate of the response surface and yet the design is not unduly complicated. Consequently, the choice of the factorial design over the central composite design represents a balance between the pure statistical considerations (that is, rotatability) and the practical aspects associated with the conduct of a clinical trial (that is, simplicity).

The final decision for a trial design comes down to a choice between the 3 x 3 and the 3 x 3 with two centre points. Since the two designs have the same shape or distribution of information, we need to decide what has been gained by adding another centre point or having a double patient allocation at the origin. As previously mentioned, the harmonic average of the information at each design point is 1/6 for each design. Thus, in this sense, the total amount of information is the same for the two designs. However, it seems reasonable to compare designs with respect to the mean amount of information, say $I_m$. By integrating the information equation over the entire design space and dividing by the area of the design space, we obtain $I_m = 0.251$ for the 3 x 3 factorial design and $I_m = 0.283$ for the 3 x 3 with two centre points design. Based on this approximate 13 per cent increase in mean information, we recommend the 3 x 3 with two centre points as the design of choice.

Finally, the basic 3 x 3 design with two centre points should be augmented with monotherapies of each of the components and with a placebo. This adds another seven cells to the 3 x 3 design and makes the final design a 4 x 4. Even though the final design is actually 4 x 4, we believe that the estimation of the response surface should use only the 3 x 3 data. Since the response surface represents the joint or combined action of the two components, it makes sense to estimate it using only the data obtained from the combination of the two components; that is, the 3 x 3 data. The monotherapies and the placebo are included in the trial only for model validation purposes. The final design layout is:

                     Drug B
Drug A      Pbo    Low    Mid    High
Pbo          n      n      n      n
Low          n      n      n      n
Mid          n      n      2n     n
High         n      n      n      n
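The mean information values quoted above for the two factorial designs can be approximated by numerical integration. The sketch below is an illustration under stated assumptions, not the authors' calculation: it takes the design space to be the square from -sqrt(2) to +sqrt(2) in each coded dose, rewrites the polar information equations in Cartesian form using r^4 cos^2(θ) sin^2(θ) = x^2 y^2, and uses a simple midpoint rule. The function name `mean_information` and the grid size are choices made here.

```python
import math

S = math.sqrt(2.0)  # assumed coded value of the highest dose

def info_3x3(x, y):
    """Information for the 3 x 3 design in Cartesian coded doses."""
    r2 = x * x + y * y
    return 16.0 / (-27.0 * x * x * y * y + 18.0 * r2 * r2 - 36.0 * r2 + 80.0)

def info_3x3_cp(x, y):
    """Information for the 3 x 3 design with two centre points."""
    r2 = x * x + y * y
    return 168.0 / (-315.0 * x * x * y * y + 180.0 * r2 * r2 - 220.0 * r2 + 600.0)

def mean_information(info, half_width=S, n=200):
    """Midpoint-rule average of info over the square [-half_width, half_width]^2."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += info(x, y)
    return total / (n * n)  # integral divided by the area of the square

im_3x3 = mean_information(info_3x3)
im_3x3_cp = mean_information(info_3x3_cp)
print(round(im_3x3, 3), round(im_3x3_cp, 3), round(im_3x3_cp / im_3x3, 3))
```

If the assumptions about the design space hold, the two averages should land close to the reported 0.251 and 0.283, with a ratio near 1.13.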

3. SAMPLE SIZE FORMULA

We can use the information surface associated with the proposed design to calculate the sample size required to estimate the response to within any desired value with any desired confidence over some appropriate region of the design space. Suppose that we select a region of the design within which we desire to estimate the response to within $\pm\delta$ with 100(1 - $\alpha$) per cent confidence; that is,

$$z_{\alpha/2} \sqrt{\mathrm{var}(\hat{y})} \le \delta$$

within the selected region, where $z_{\alpha/2}$ is the standard normal deviate with exactly $\alpha/2$ in one tail. Using the above definition of information with m = 10n, we have

$$n = \frac{z_{\alpha/2}^2 \, \sigma^2}{10 \, \delta^2 \, I(x)},$$

where n is the number of patients per cell. Note that 10n is the total number of patients in the 9 cells of the 3 x 3 design space. Since the entire design requires N = 17n patients, we finally obtain the following expression for the total sample size:

$$N = \frac{17 \, z_{\alpha/2}^2 \, \sigma^2}{10 \, \delta^2 \, I(x)}.$$
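The sample size calculation described here can be sketched in a few lines. This is an illustration under assumptions rather than the published formula verbatim: it combines the stated requirement that z times the standard error of the predicted response not exceed δ with the definition I(x) = σ²/(m var(ŷ)) and m = 10n, then scales to N = 17n. The function name `sample_size`, the rounding up to a whole patient per cell, and the numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size(sigma, delta, info, alpha=0.05, fit_runs=10, total_runs=17):
    """Patients per cell (n) and in total (N) for precision +/- delta.

    sigma      : assumed standard deviation of the response
    delta      : desired half-width of the confidence interval
    info       : lowest information value on the contour bounding the region
    fit_runs   : multiples of n used to fit the surface (10n in the 3 x 3 part)
    total_runs : multiples of n in the whole augmented design (17n)
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 normal quantile
    n_cell = math.ceil((z * sigma) ** 2 / (fit_runs * delta ** 2 * info))
    return n_cell, total_runs * n_cell

# Hypothetical inputs: SD of 8 units, precision of +/- 3 units, I(x) = 0.25.
n_cell, n_total = sample_size(sigma=8.0, delta=3.0, info=0.25)
print(n_cell, n_total)
```

Choosing a lower information contour (a wider region of the design space) increases the required sample size, since `info` appears in the denominator.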

Thus, we can select the information contour that covers any desired region of design space and calculate the sample size required to estimate the response to within $\pm\delta$ over that desired region with 100(1 - $\alpha$) per cent confidence.

4. DISCUSSION

The actual selection of the doses needs careful consideration in the planning of these types of clinical trials. In particular, the lowest dose of each drug should be the lowest dose one expects to use in a combination and the highest dose should be the highest dose one expects to use in a combination. We envision that preliminary information for selecting doses has already been gleaned from previous trials with these therapies administered in an add-on (or stepped) fashion. Or perhaps the interest in the combination arose from previous studies with similar agents; for example, ACE inhibitors and diuretics in the treatment of hypertension. So, with enough preliminary data, and perhaps some safety or pilot studies, one should have enough information available to select appropriate doses of each drug for this study. An additional consideration is that the range of doses needs to be wide enough to capture an effective combination and yet narrow enough to approximate the true response surface with a quadratic function. Finally, because the doses must be evenly spaced in the coded units, we can choose only two of the three arbitrarily. Thus if we choose the lowest and the highest doses, then we must use as the middle dose their midpoint in coded units.

The proposed design offers several advantages over previously used ‘seat of the pants’ designs. First, it has the minimum number of cells and dose levels required to estimate a quadratic surface. Even though the central composite design and our proposed design each has nine cells, the proposed design uses only three dose levels per drug as opposed to the five required by the central composite design. Second, because of the 3 x 3 structure, the design allows for comprehensible comparisons. In particular, we can measure the dose response of each drug and compare them at each level of the other drug. These types of comparisons are impossible with a central composite design. Third, as a result of the shape of the information surface, most information is in the design regions where we might find interpolation necessary.
In other words, if we have selected the dose levels appropriately, then interest centres around the middle of the design space which is precisely where the information is at its peak. The disadvantage of our proposed design is that we do not use 41 per cent of the subjects toward the primary objective of the trial; that is, estimating the simultaneous dose response relationship of the two drugs when given in combination. In other words, we estimate the response surface using only 10/17 of all subjects in the trial. This may seem a waste of valuable and costly resources, but we think it simply too risky to run a trial of this magnitude and importance without a placebo cell and the monotherapy dose responses. We note that this is not unlike many placebo controlled trials that compare two active drugs since these trials do not use 33 per cent of the patients for the primary objective of comparing the two active drugs. We recommend exclusion of the monotherapy and the placebo cells from the estimation of the response surface for several compelling reasons. First, since the response surface represents the joint or combined action of the two components, we should estimate it using only data that relate to the joint or combined action of the two drugs; that is, the 3 x 3 data. Second, inclusion of the monotherapy and the placebo cells presents several practical problems in the estimation of the response surface. For example, there may be a considerable negative effect on the information surface. Since the information equation of a 4 x 4 factorial has the same general form as the information equations of the 3 x 3 factorials, the information surface of this design has the same general shape as the 3 x 3 designs. Thus, even with all four doses equally spaced, there would be no improvement in the shape of the surface. 
But inclusion of the monotherapies in the estimation will probably result in unequal spacing among the coded values, which will cause distortion of the information surface, and the design will no longer be ‘nearly rotatable’. Furthermore, if the three doses are evenly spaced on the log scale, then there is the problem of taking the log of zero. From still another point of view, inclusion of the monotherapies in the estimation only serves to spread the information over a larger design space, which results in the shifting of information from areas where it is desired (that is, between the lowest and highest doses one expects to use in the combination) to areas where it is not desired (that is, lower than the lowest dose one expects to use in the combination).

Finally, we note that since the objectives of the trial involve estimation and not hypothesis testing, the sample size formula was obtained from a process that involved the maximum desired width of a confidence interval and not from a power argument, as is typical in most clinical trials.
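The dose-selection rule discussed in this section (choose the lowest and highest doses, then take the middle dose at their midpoint in coded units) can be illustrated as follows. This is a sketch under assumptions: the log transform is only one possible 'suitable transformation', the coded end points of ±sqrt(2) and the doses of 5 and 20 mg are hypothetical, and the function names are introduced here. The last line restates the 10/17 share of patients used to fit the surface.

```python
import math

SQRT2 = math.sqrt(2.0)  # assumed coded value of the highest dose

def middle_dose(low, high, transform=math.log, inverse=math.exp):
    """Dose whose transformed value is midway between those of low and high."""
    return inverse(0.5 * (transform(low) + transform(high)))

def coded_value(dose, low, high, transform=math.log):
    """Map a dose to coded units so that low -> -sqrt(2) and high -> +sqrt(2)."""
    centre = 0.5 * (transform(low) + transform(high))
    half_range = 0.5 * (transform(high) - transform(low))
    return SQRT2 * (transform(dose) - centre) / half_range

low, high = 5.0, 20.0          # hypothetical doses in mg
mid = middle_dose(low, high)   # geometric mean under the log transform
codes = [coded_value(d, low, high) for d in (low, mid, high)]
print(mid, codes)

# Share of patients in the augmented design used to fit the surface (10n of 17n).
fraction_used = 10 / 17
print(round(100 * (1 - fraction_used)), "per cent not used for surface estimation")
```

Note that a placebo dose of zero cannot be passed through the log transform, which mirrors the practical objection raised above to including the placebo and monotherapy cells in the fit.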

5. CONCLUDING REMARKS

In this paper, we have argued for the use of factorial/response surface designs in clinical trials of combination drugs. We note, however, that we should use them early in the development plan of a drug combination and with those trials aimed at estimating the combined dose response relationship of the two drugs. Specifically, we recommend the use of an augmented 3 x 3 with a double patient allocation in the middle cell. We base this recommendation on a comparison of information surfaces for various competing designs and we chose it because it yields a reasonably symmetric and uniform information surface while maintaining relative simplicity with respect to the number of dose levels and the total number of cells. We recommend further estimation of the response surface using only the data from the 3 x 3 and excluding the monotherapy and placebo data. The main reason for this recommendation is the common sense notion that we should not estimate the response of a drug combination with the use of data from cells that do not contain some combination of the drugs. Finally, to document the efficacy of the combination for regulatory agencies worldwide, we suggest the study of the selected combination in a second trial to demonstrate that the combination is indeed superior to each of its components.

REFERENCES

1. Fisher, R. A. The Design of Experiments, 8th edn., Oliver and Boyd, Edinburgh, 1935.
2. Yates, F. ‘Complex experiments’, Supplement to the Journal of the Royal Statistical Society, 11, No. 2, 181-247 (1935).
3. Box, G. E. P., Hunter, W. G. and Hunter, J. S. Statistics for Experimenters, Wiley, New York, 1978.
4. Box, G. E. P. and Wilson, K. B. ‘On the experimental attainment of optimum conditions’, Journal of the Royal Statistical Society, Series B, 13, 1-38 (1951).
5. Myers, R. H. Response Surface Methodology, Allyn and Bacon, Boston, 1971.
6. Box, G. E. P. and Draper, N. R. Empirical Model Building and Response Surfaces, Wiley, New York, 1987.
7. Carter, W. H., Wampler, G. L. and Stablein, D. M. Regression Analysis of Survival Data in Cancer Chemotherapy, Marcel Dekker, New York, 1983.
8. Goldberg, M. R. and Offen, W. W. ‘Pinacidil with and without hydrochlorothiazide: dose response relationships from results of a 4 x 3 factorial design study’, Drugs, 36, (suppl. 7), 83-92 (1988).
9. Jamerson, B. D., Cady, W. J., Andrade, C. E., Haverstock, D. and Stewart, W. ‘Multifactorial study to determine the dose response characteristics of Diltiazem and Hydrochlorothiazide’, Abstract PIIF-5, Clinical Pharmacology and Therapeutics, 45, No. 2, 145 (1988).
10. Weir, M. R., Burris, J. F., Oparil, S., Weber, M. A. and Cady, W. J. ‘A multifactorial evaluation of the antihypertensive efficacy of the combination of Diltiazem and Hydrochlorothiazide’, Abstract 1081, American Journal of Hypertension, 2, No. 5, 21A (1989).
11. Box, G. E. P. ‘Choice of response surface design and alphabetic optimality’, Utilitas Mathematica, 21B, 11-55 (1982).
