Journal of Biopharmaceutical Statistics

ISSN: 1054-3406 (Print) 1520-5711 (Online) Journal homepage: http://www.tandfonline.com/loi/lbps20

Randomized Phase II Clinical Trials Sin-Ho Jung & Daniel J. Sargent To cite this article: Sin-Ho Jung & Daniel J. Sargent (2014) Randomized Phase II Clinical Trials, Journal of Biopharmaceutical Statistics, 24:4, 802-816, DOI: 10.1080/10543406.2014.901343 To link to this article: https://doi.org/10.1080/10543406.2014.901343

Accepted author version posted online: 03 Apr 2014. Published online: 03 Apr 2014.


Journal of Biopharmaceutical Statistics, 24: 802–816, 2014 Copyright © Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2014.901343

RANDOMIZED PHASE II CLINICAL TRIALS

Sin-Ho Jung (1, 2) and Daniel J. Sargent (3)

(1) Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
(2) Biostatistics and Clinical Epidemiology Center, Samsung Medical Center, Sungkyunkwan University, Seoul, Korea
(3) Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, Minnesota, USA

Traditionally, Phase II trials have been conducted as single-arm trials to compare the response probabilities between an experimental therapy and a historical control. Historical control data, however, often have a small sample size, are collected from a different patient population, or use a different response assessment method, so that a direct comparison between a historical control and an experimental therapy may be severely biased. Randomized Phase II trials entering patients prospectively to both experimental and control arms have been proposed to avoid any bias in such cases. The small sample sizes for typical Phase II clinical trials imply that the use of exact statistical methods for their design and analysis is appropriate. In this article, we propose two-stage randomized Phase II trials based on Fisher’s exact test, which does not require specification of the response probability of the control arm for testing. Through numerical studies, we observe that the proposed method controls the type I error accurately and maintains a high power. If we specify the response probabilities of the two arms under the alternative hypothesis, we can identify good randomized Phase II trial designs by adopting Simon’s minimax and optimal design concepts that were developed for single-arm Phase II trials.

Key Words: Fisher’s exact test; Minimax design; Optimal design; Two-stage design; Unbalanced allocation.

1. INTRODUCTION

The purpose of a Phase II cancer clinical trial is to investigate whether an experimental therapy has promising efficacy and thus is worth further investigation. For Phase II clinical trials of cytotoxic drugs, the most popular primary outcome is overall response of the experimental therapy, meaning the therapy shrinks the tumor(s). As an effort to speed the assessment of new therapies, a Phase II clinical trial usually recruits a small number of patients only to the experimental therapy arm to be compared to a historical control. This implies that the traditional single-arm Phase II trials are appropriate only when reliable and valid data for an existing standard therapy are available for the same patient population. Furthermore, the response assessment method in the historical control data should be identical to the one that will be used for a new study.

Received December 3, 2012; Accepted May 4, 2013. Address correspondence to Sin-Ho Jung, PhD, Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, 11070 Hock Plaza, Suite 1102, DUMC Box 2721, Durham, NC 27710, USA; E-mail: [email protected]


If no historical control data satisfying these conditions exist, or the existing data are too small to represent the whole patient population, we have to consider a randomized Phase II clinical trial with a prospective control to be compared with the experimental therapy under investigation. Cannistra (2009) recommends a randomized Phase II trial if a single-arm design is subject to any of the issues just described. Readers may refer to Gan et al. (2010) for further issues associated with the choice between a single-arm Phase II trial and a randomized Phase II trial.

Let px and py denote the response probabilities of an experimental arm and a control arm, respectively. In a randomized Phase II trial, we want to test H0: px ≤ py against H1: px > py. The null distribution of the binomial test statistic depends on the common response probability px = py (see Jung, 2008). Consequently, if the true response probabilities are different from the specified ones, the testing based on binomial distributions may not maintain the type I error close to the specified design value. In order to avoid this issue, Jung (2008) proposes to control the type I error rate at px = py = 1/2. This results in strong conservativeness when the true response probability is different from 50%. Asymptotic tests avoid specification of px = py by replacing them with their consistent estimators, but the sample sizes of Phase II trials usually are not large enough for a good large-sample approximation. Fisher’s (1935) exact test has been a popular testing method for comparing two sample binomial proportions with small sample sizes. In a randomized Phase II trial setting, Fisher’s exact test is based on the distribution of the number of responders on one arm conditioning on the total number of responders, which is a sufficient statistic for the common response probability px = py under H0.
Hence, the rejection value of Fisher’s exact test does not require specification of the common response probability px = py under H0. In this article, we propose two-stage randomized Phase II trial designs based on Fisher’s exact test. Using some example designs, we show that Fisher’s exact test accurately controls the type I error over a wide range of true response values, and is more powerful than Jung’s binomial test if the true response probabilities are different from 50%. If we can project the true response probabilities accurately at the design stage, we can identify efficient designs by adopting Simon’s (1989) optimal and minimax design concepts that were proposed for single-arm Phase II trials. We provide tables of minimax and optimal two-stage designs under various practical design settings.

In this article, we limit our focus to randomized Phase II trials for evaluating the efficacy of an experimental therapy compared to a prospective control. Although we demonstrate the proposed method using response as the endpoint, it can be applied to any binomial endpoint, for example, the proportion of patients progression free at a fixed time point (say 6 months). Other types of randomized Phase II trial designs have been proposed by many investigators, including Simon, Wittes and Ellenberg (1985), Sargent and Goldberg (2001), Thall, Simon and Ellenberg (1989), Palmer (1991), and Steinberg and Venzon (2002). Rubinstein et al. (2005) discuss the strengths and weaknesses of some of these methods, and propose a method for randomized Phase II screening designs based on the usual large-sample approximation. As we demonstrate in Section 3.2, designs based on large-sample theory may not control the type I error accurately with small sample sizes, which motivates the present work.

2. SINGLE-STAGE DESIGN

If patient accrual is fast or it takes a lengthy time (say, longer than 6 months) for response assessment, we may consider using a single-stage design. Suppose that n patients

are randomized to each arm, and let X and Y denote the number of responders in arms x (experimental) and y (control), respectively. Let qk = 1 − pk for arm k (= x, y). Then the frequencies (and response probabilities in parentheses) can be summarized as in Table 1.

Table 1 Frequencies (and response probabilities in parentheses) of a single-stage randomized Phase II trial

Response   Arm 1         Arm 2         Total
Yes        x (px)        y (py)        z
No         n − x (qx)    n − y (qy)    2n − z
Total      n             n

At the design stage, n is prespecified. Fisher’s exact test is based on the conditional distribution of X given the total number of responders Z = X + Y, with probability mass function

$$f(x \mid z, \theta) = \frac{\binom{n}{x}\binom{n}{z-x}\theta^{x}}{\sum_{i=m_-}^{m_+}\binom{n}{i}\binom{n}{z-i}\theta^{i}}$$

for m− ≤ x ≤ m+, where m− = max(0, z − n), m+ = min(z, n), and θ = px qy /(py qx) denotes the odds ratio.

Suppose that we want to limit the type I error rate to be below α*. Given X + Y = z, we reject H0: px = py (i.e., θ = 1) in favor of H1: px > py (i.e., θ > 1) if X − Y ≥ a, where a is the smallest integer satisfying

$$\alpha(z) \equiv P(X - Y \ge a \mid z, H_0) = \sum_{x=\lceil (z+a)/2 \rceil}^{m_+} f(x \mid z, \theta = 1) \le \alpha^*,$$

and ⌈c⌉ is the round-up integer of c. Hence, the critical value a depends on the total number of responders z. Under H1: θ = θ1 (> 1), the power conditional on X + Y = z is given by

$$1 - \beta(z) \equiv P(X - Y \ge a \mid z, H_1) = \sum_{x=\lceil (z+a)/2 \rceil}^{m_+} f(x \mid z, \theta_1).$$

We propose to choose n so that the marginal power is no smaller than a specified power level 1 − β*, that is,

$$E\{1 - \beta(Z)\} = \sum_{z=0}^{2n} \{1 - \beta(z)\}\, g(z) \ge 1 - \beta^*,$$

where g(z) is the probability mass function of Z = X + Y under H1: px > py, which is given as

$$g(z) = \sum_{x=m_-}^{m_+} \binom{n}{x}\binom{n}{z-x}\, p_x^{x}\, q_x^{n-x}\, p_y^{z-x}\, q_y^{n-z+x}$$


for z = 0, 1, ..., 2n. Note that the marginal type I error rate is controlled below α* since the conditional type I error rate is controlled below α* for any z value. Given a type I error rate and a power (α*, 1 − β*) and a specific alternative hypothesis H1: (px, py), we find a sample size n as follows.

Algorithm for Single-Stage Design:
1. For n = 1, 2, ...,
   a. For z = 0, 1, ..., 2n, find the smallest a = a(z) such that α(z) = P(X − Y ≥ a|z, θ = 1) ≤ α* and calculate the power conditional on X + Y = z for the chosen a = a(z), 1 − β(z) = P(X − Y ≥ a|z, θ1).
   b. Calculate the marginal power 1 − β = E{1 − β(Z)}.
2. Find the smallest n such that 1 − β ≥ 1 − β*.

Given a fixed n, Fisher’s test, which is based on the conditional distribution, is valid under θ = 1 (i.e., it controls the type I error rate exactly), and its power conditional on the total number of responders depends only on the odds ratio θ1 under H1. However, the marginal power, and hence the sample size n, depends on (px, py), so that we need to specify (px, py) at the design stage. If (px, py) are misspecified, the trial may be over- or underpowered, but the type I error in data analysis will always be appropriately controlled.

3. TWO-STAGE DESIGN

For ethical and economical reasons, clinical trials are often conducted using multiple stages. Phase II trials usually enter a small number of patients, so that practically the number of stages is two at the most. We consider designs with the same features as popular two-stage Phase II trial designs, with an early stopping rule when the experimental therapy has a low probability of achieving additional benefits to the patients. Suppose that nl (l = 1, 2) patients are randomized to each arm during stage l (= 1, 2). Let n1 + n2 = n denote the maximal sample size for each arm, Xl and Yl denote the number of responders during stage l in arms x and y, respectively, X = X1 + X2 and Y = Y1 + Y2. At the design stage, nl are prespecified.
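The single-stage algorithm of Section 2 can be sketched in Python as follows. This is a minimal illustration and not the authors’ implementation: the function names (`cond_pmf`, `critical_x`, `marginal_power`, `single_stage_n`) and the search cap `n_max` are assumptions introduced here. The critical value is expressed through the smallest rejecting count x_crit, which defines the same rejection region as X − Y ≥ a with a = 2 x_crit − z. The sketch assumes 0 < py < px < 1 and sample sizes of Phase II scale.

```python
from math import comb

def cond_pmf(n, z, theta):
    """Conditional pmf f(x | z, theta) of X given X + Y = z, n patients per arm."""
    lo, hi = max(0, z - n), min(z, n)
    w = [comb(n, x) * comb(n, z - x) * theta ** x for x in range(lo, hi + 1)]
    total = sum(w)
    return {lo + i: v / total for i, v in enumerate(w)}

def critical_x(n, z, alpha_star):
    """Smallest x_crit with P(X >= x_crit | z, theta = 1) <= alpha_star.

    Returns (x_crit, conditional type I error). x_crit = m_plus + 1 means
    that no outcome leads to rejection for this z.
    """
    f0 = cond_pmf(n, z, 1.0)
    x_crit, tail = max(f0) + 1, 0.0
    for x in sorted(f0, reverse=True):       # grow the upper tail one point at a time
        if tail + f0[x] > alpha_star:
            break
        tail += f0[x]
        x_crit = x
    return x_crit, tail

def binom_pmf(n, p):
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def marginal_power(n, px, py, alpha_star):
    """E{1 - beta(Z)}: conditional power averaged over g(z) under H1."""
    theta1 = px * (1 - py) / (py * (1 - px))
    bx, by = binom_pmf(n, px), binom_pmf(n, py)
    power = 0.0
    for z in range(2 * n + 1):
        x_crit, _ = critical_x(n, z, alpha_star)
        f1 = cond_pmf(n, z, theta1)
        g_z = sum(bx[x] * by[z - x] for x in f1)          # pmf of Z at z
        power += g_z * sum(p for x, p in f1.items() if x >= x_crit)
    return power

def single_stage_n(px, py, alpha_star, power_star, n_max=150):
    """First n (scanning upward) whose marginal power reaches power_star."""
    for n in range(1, n_max + 1):
        if marginal_power(n, px, py, alpha_star) >= power_star:
            return n
    return None
```

For instance, `single_stage_n(0.25, 0.05, 0.15, 0.8)` runs the search for one of the settings tabulated in Table 2, so its result can be compared with the single-stage n reported there.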
Note that X1 and X2 are independent, and given Xl + Yl = zl, Xl has the conditional probability mass function

$$f_l(x_l \mid z_l, \theta) = \frac{\binom{n_l}{x_l}\binom{n_l}{z_l - x_l}\theta^{x_l}}{\sum_{i=m_{l-}}^{m_{l+}}\binom{n_l}{i}\binom{n_l}{z_l - i}\theta^{i}} \tag{1}$$

for ml− ≤ xl ≤ ml+, where ml− = max(0, zl − nl) and ml+ = min(zl, nl). We consider a two-stage randomized Phase II trial whose rejection values are chosen conditional on z1 and z2 as follows:


1. Stage 1: Randomize n1 patients to each arm; observe x1 and y1.
   a. Given z1 (= x1 + y1), find a stopping value a1 = a1(z1).
   b. If x1 − y1 ≥ a1, proceed to stage 2.
   c. Otherwise, stop the trial.
2. Stage 2: Randomize n2 patients to each arm; observe x2 and y2 (z2 = x2 + y2).
   a. Given (z1, z2), find a rejection value a = a(z1, z2).
   b. Accept the experimental arm if x − y ≥ a.

Now, the question is how to choose rejection values (a1, a) conditioning on (z1, z2).

3.1. Choice of a1 and a

In this section, we assume that n1 and n2 are given. We consider different options of choosing a1. For example:

• We may wish to stop the trial if the experimental arm is worse than the control. In this case, we choose a1 = 0. This a1 is constant with respect to z1.

• We may choose a1 so that the conditional probability of early termination (PET) given z1 is no smaller than a level γ0 (= 0.6 to 0.8) under H0: θ = 1, that is,

$$\mathrm{PET}_0(z_1) = P(X_1 - Y_1 < a_1 \mid z_1, H_0) = \sum_{x_1=m_{1-}}^{[(a_1+z_1)/2]-1} f_1(x_1 \mid z_1, \theta = 1) \ge \gamma_0,$$

where [c] denotes the largest integer not exceeding c.

• We may choose a1 so that the conditional probability of early termination given z1 is no larger than a level γ1 (= 0.02 to 0.1) under H1: θ = θ1, that is,

$$\mathrm{PET}_1(z_1) = P(X_1 - Y_1 < a_1 \mid z_1, H_1) = \sum_{x_1=m_{1-}}^{[(a_1+z_1)/2]-1} f_1(x_1 \mid z_1, \theta_1) \le \gamma_1.$$
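The early-termination probabilities in the options above can be evaluated directly from the conditional distribution of X1 by summing over the outcomes with x1 − y1 < a1. The sketch below is illustrative only: `cond_pmf`, `pet`, and `a1_for_pet0` are hypothetical helper names, and picking the smallest a1 that reaches γ0 is an assumption made here, since the text only requires PET0(z1) ≥ γ0.

```python
from math import comb

def cond_pmf(n, z, theta):
    """Conditional pmf of the stage count X_l given X_l + Y_l = z (equation (1))."""
    lo, hi = max(0, z - n), min(z, n)
    w = [comb(n, x) * comb(n, z - x) * theta ** x for x in range(lo, hi + 1)]
    total = sum(w)
    return {lo + i: v / total for i, v in enumerate(w)}

def pet(n1, z1, a1, theta):
    """P(X1 - Y1 < a1 | z1, theta); note that X1 - Y1 = 2 * X1 - z1."""
    f1 = cond_pmf(n1, z1, theta)
    return sum(p for x1, p in f1.items() if 2 * x1 - z1 < a1)

def a1_for_pet0(n1, z1, gamma0):
    """Smallest a1 with PET0(z1) >= gamma0 (assumed tie-breaking rule)."""
    for a1 in range(-n1, n1 + 2):     # PET is 0 at a1 = -n1 and 1 at a1 = n1 + 1
        if pet(n1, z1, a1, 1.0) >= gamma0:
            return a1
    return None
```

With n1 = 2 and z1 = 2, the null distribution of X1 puts mass 1/6, 4/6, 1/6 on 0, 1, 2, so `pet(2, 2, 0, 1.0)` equals 1/6, and the same call with θ = 3 gives the smaller value 1/22.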

Among these options, we propose to use a1 = 0. Most standard optimal two-stage Phase II trials also stop early when the observed response probability from stage 1 is no larger than the specified response probability under H0; refer to Simon (1989) and Jung et al. (2004) for single-arm trial cases and Jung (2008) for randomized trial cases. With a1 fixed at 0, we choose the second-stage rejection value a conditioning on (z1, z2). Given type I error rate α*, a is chosen as the smallest integer satisfying

$$\alpha(z_1, z_2) \equiv P(X_1 - Y_1 \ge a_1,\ X - Y \ge a \mid z_1, z_2, \theta = 1) \le \alpha^*.$$

We calculate α(z1, z2) by

$$P\{X_1 \ge (a_1 + z_1)/2,\ X_1 + X_2 \ge (a + z_1 + z_2)/2 \mid z_1, z_2, \theta = 1\} = \sum_{x_1=m_{1-}}^{m_{1+}} \sum_{x_2=m_{2-}}^{m_{2+}} I\{x_1 \ge (a_1 + z_1)/2,\ x_1 + x_2 \ge (a + z_1 + z_2)/2\}\, f_1(x_1 \mid z_1, 1)\, f_2(x_2 \mid z_2, 1),$$


where I(·) is the indicator function. Under H1: θ = θ1, the power conditional on (z1, z2) is obtained by

$$1 - \beta(z_1, z_2) = P(X_1 - Y_1 \ge a_1,\ X - Y \ge a \mid z_1, z_2, \theta_1) = \sum_{x_1=m_{1-}}^{m_{1+}} \sum_{x_2=m_{2-}}^{m_{2+}} I\{x_1 \ge (a_1 + z_1)/2,\ x_1 + x_2 \ge (a + z_1 + z_2)/2\}\, f_1(x_1 \mid z_1, \theta_1)\, f_2(x_2 \mid z_2, \theta_1).$$
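The double sums for α(z1, z2) and 1 − β(z1, z2) translate almost line by line into code. The following sketch is a hedged illustration, not the authors’ software: `cond_pmf`, `reject_prob`, and `rejection_value` are names chosen here, and the brute-force scan over a is an assumption that favors clarity over speed.

```python
from math import comb
from itertools import product

def cond_pmf(n, z, theta):
    """Conditional pmf of the stage count X_l given X_l + Y_l = z (equation (1))."""
    lo, hi = max(0, z - n), min(z, n)
    w = [comb(n, x) * comb(n, z - x) * theta ** x for x in range(lo, hi + 1)]
    total = sum(w)
    return {lo + i: v / total for i, v in enumerate(w)}

def reject_prob(n1, n2, z1, z2, a1, a, theta):
    """P(X1 - Y1 >= a1, X - Y >= a | z1, z2, theta) by direct enumeration."""
    f1, f2 = cond_pmf(n1, z1, theta), cond_pmf(n2, z2, theta)
    return sum(
        p1 * p2
        for (x1, p1), (x2, p2) in product(f1.items(), f2.items())
        if 2 * x1 - z1 >= a1 and 2 * (x1 + x2) - (z1 + z2) >= a
    )

def rejection_value(n1, n2, z1, z2, alpha_star, a1=0):
    """Smallest integer a whose conditional type I error is at most alpha_star."""
    n = n1 + n2
    for a in range(-n, n + 2):        # at a = n + 1 the rejection region is empty
        if reject_prob(n1, n2, z1, z2, a1, a, 1.0) <= alpha_star:
            return a
    return None
```

Calling `reject_prob` with θ = 1 gives the conditional type I error, and calling it with θ = θ1 at the same a gives the conditional power, so neither quantity needs px or py separately.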

Note that, as in the single-stage case, the calculation of type I error rate α(z1, z2) and rejection values (a1, a) does not require specification of the common response probability px = py under H0, and that the conditional power 1 − β(z1, z2) requires specification of the odds ratio θ1 under H1, but not the response probabilities for the two arms, px and py.

3.2. Choice of n1 and n2

In this section we discuss how to choose sample sizes n1 and n2 at the design stage based on various optimality criteria. Given (α*, β*), we propose to choose n1 and n2 so that the marginal power is maintained above 1 − β* while controlling the conditional type I error rates for any (z1, z2) below α* as described in Section 3.1. For stage l (= 1, 2), the marginal distribution of Zl = Xl + Yl has a probability mass function

$$g_l(z_l) = \sum_{x_l=m_{l-}}^{m_{l+}} \binom{n_l}{x_l}\binom{n_l}{z_l - x_l}\, p_x^{x_l}\, q_x^{n_l - x_l}\, p_y^{z_l - x_l}\, q_y^{n_l - z_l + x_l} \tag{2}$$

for zl = 0, ..., 2nl. Under H0: px = py = p0, this is expressed as

$$g_{0l}(z_l) = p_0^{z_l}\, q_0^{2n_l - z_l} \sum_{x_l=m_{l-}}^{m_{l+}} \binom{n_l}{x_l}\binom{n_l}{z_l - x_l}.$$

Z1 and Z2 are independent. Hence, we choose n1 and n2 so that the marginal power is no smaller than a specified level 1 − β*, that is,

$$1 - \beta \equiv \sum_{z_1=0}^{2n_1} \sum_{z_2=0}^{2n_2} \{1 - \beta(z_1, z_2)\}\, g_1(z_1)\, g_2(z_2) \ge 1 - \beta^*.$$

The marginal type I error is calculated by

$$\alpha \equiv \sum_{z_1=0}^{2n_1} \sum_{z_2=0}^{2n_2} \alpha(z_1, z_2)\, g_{01}(z_1)\, g_{02}(z_2).$$

Since the conditional type I error rate is controlled below α* for any (z1, z2), the marginal type I error rate is no larger than α*.
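A compact way to evaluate the marginal type I error and power of a given (n1, n2) is to tabulate, for each (z1, z2), the distribution of X1 + X2 on the event that the trial continues, and then average over the distributions of Z1 and Z2. The sketch below is one possible arrangement under stated assumptions: all function names are hypothetical, a1 is fixed at 0 as proposed in Section 3.1, and the null response probability `p0` defaults to py, which is a choice made here because the marginal type I error depends on p0.

```python
from math import comb

def cond_pmf(n, z, theta):
    lo, hi = max(0, z - n), min(z, n)
    w = [comb(n, x) * comb(n, z - x) * theta ** x for x in range(lo, hi + 1)]
    total = sum(w)
    return {lo + i: v / total for i, v in enumerate(w)}

def total_dist(n1, n2, z1, z2, theta, a1=0):
    """P(X1 - Y1 >= a1, X1 + X2 = s | z1, z2, theta) for each reachable s."""
    f1, f2 = cond_pmf(n1, z1, theta), cond_pmf(n2, z2, theta)
    out = {}
    for x1, p1 in f1.items():
        if 2 * x1 - z1 < a1:              # trial stops after stage 1
            continue
        for x2, p2 in f2.items():
            out[x1 + x2] = out.get(x1 + x2, 0.0) + p1 * p2
    return out

def cond_alpha_power(n1, n2, z1, z2, theta1, alpha_star):
    """Conditional type I error and power for the largest valid rejection region."""
    d0 = total_dist(n1, n2, z1, z2, 1.0)
    d1 = total_dist(n1, n2, z1, z2, theta1)
    alpha, s_crit = 0.0, None
    for s in sorted(d0, reverse=True):    # X - Y >= a is equivalent to X1 + X2 >= s_crit
        if alpha + d0[s] > alpha_star:
            break
        alpha += d0[s]
        s_crit = s
    if s_crit is None:                    # no rejection possible for this (z1, z2)
        return 0.0, 0.0
    return alpha, sum(p for s, p in d1.items() if s >= s_crit)

def z_pmf(n, px, py):
    """pmf of Z = X + Y for independent Bin(n, px) and Bin(n, py) (equation (2))."""
    bx = [comb(n, k) * px ** k * (1 - px) ** (n - k) for k in range(n + 1)]
    by = [comb(n, k) * py ** k * (1 - py) ** (n - k) for k in range(n + 1)]
    g = [0.0] * (2 * n + 1)
    for x, a in enumerate(bx):
        for y, b in enumerate(by):
            g[x + y] += a * b
    return g

def marginal_alpha_power(n1, n2, px, py, alpha_star, p0=None):
    p0 = py if p0 is None else p0         # assumed null response probability
    theta1 = px * (1 - py) / (py * (1 - px))
    g1, g2 = z_pmf(n1, px, py), z_pmf(n2, px, py)
    g01, g02 = z_pmf(n1, p0, p0), z_pmf(n2, p0, p0)
    alpha = power = 0.0
    for z1 in range(2 * n1 + 1):
        for z2 in range(2 * n2 + 1):
            a_z, pw_z = cond_alpha_power(n1, n2, z1, z2, theta1, alpha_star)
            alpha += a_z * g01[z1] * g02[z2]
            power += pw_z * g1[z1] * g2[z2]
    return alpha, power
```

Because every conditional type I error is capped at α* before averaging, the marginal value returned first cannot exceed α*, which mirrors the argument in the text.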


Although we do not have to specify the response probabilities for testing, we need to do so when choosing (n1, n2) at the design stage. If the specified response probabilities are different from the true ones, then the marginal power may be different from that expected. But in this case, our proposed test is still valid in the sense that it always controls both the conditional and marginal type I error rates below the specified level. Let

$$\mathrm{PET}_0 \equiv E\{\mathrm{PET}_0(Z_1) \mid H_0\} = \sum_{z_1=0}^{2n_1} \mathrm{PET}_0(z_1)\, g_{01}(z_1)$$

denote the marginal probability of early termination under H0. Then, among those (n1, n2) satisfying the (α*, 1 − β*)-condition, the Simon-type (1989) minimax and the optimal designs can be chosen as follows:

• Minimax design chooses (n1, n2) with the smallest maximal sample size n (= n1 + n2).

• Optimal design chooses (n1, n2) with the smallest marginal expected sample size EN under H0, where EN = n1 × PET0 + n × (1 − PET0).

Tables 2 to 5 report the sample sizes (n, n1) of the minimax and optimal two-stage designs for α* = 0.15 or 0.2, 1 − β* = 0.8 or 0.85, and various combinations of (px, py) under H1. For comparison, we also list the sample size n of the single-stage design under each setting. Note that the maximal sample size of the minimax design is slightly smaller than or equal to the sample size of the single-stage design. If the experimental therapy is inefficacious, however, the expected sample sizes of minimax and optimal designs are much smaller than the sample size of the single-stage design. We also observe from Tables 2 and 5 that the sample sizes under (α*, 1 − β*) = (0.15, 0.8) are similar to those under (α*, 1 − β*) = (0.2, 0.85).

One of the popular approaches for randomized Phase II trials is to use the asymptotic method. Given (α*, py, n1, n2), we find c satisfying

$$\alpha^* = P\left(X_1 - Y_1 \ge 0,\ \frac{X - Y}{\sqrt{2n\hat{p}\hat{q}}} \ge c \,\middle|\, p_x = p_y\right)$$

using the normal approximation to binomial distributions, where p̂ = (X + Y)/2n and q̂ = 1 − p̂. For an approximate critical value c, the exact type I error rate is calculated by using the true binomial distribution. For a specified px (≠ py), the exact power is calculated similarly. From Table 2, the minimax design under (α*, 1 − β*, px, py) = (0.15, 0.8, 0.5, 0.35) has (n, n1) = (86, 66), for which the asymptotic method has α = 0.157 and 1 − β = 0.840. Since the sample size is relatively large in this case, the asymptotic method controls the type I error rate close to the nominal α* = 0.15. Now, we consider the minimax design (n, n1) = (29, 11) under (α*, 1 − β*, px, py) = (0.15, 0.8, 0.35, 0.05).
In this case, the asymptotic method has α = 0.244, which is far larger than the nominal α* = 0.15 because of the small sample size.

4. NUMERICAL STUDIES

Jung (2008) proposed a randomized Phase II design method based on the binomial test, called MaxTest in this paper, by controlling the type I error rate at px = py = 50%. Since the type I error rate of the two-sample binomial test is maximized at px = py = 50%, this test will be conservative if the true response probability under H0 is different from 50%.
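The expected sample size EN and the minimax and optimal criteria of Section 3.2 reduce to simple arithmetic once PET0 is known for each candidate design. The sketch below is illustrative: `expected_n` and `pick_designs` are hypothetical names, the candidate list is made up for the example, and breaking ties among minimax candidates by EN is an assumption borrowed from common practice rather than something stated in the text. The value PET0 = 0.28 used for the design (n, n1) = (29, 11) is back-calculated here from the EN of 23.96 reported for that design in Table 2.

```python
def expected_n(n1, n, pet0):
    """EN = n1 * PET0 + n * (1 - PET0), the expected per-arm sample size under H0."""
    return n1 * pet0 + n * (1 - pet0)

def pick_designs(candidates):
    """Select minimax and optimal designs from (n, n1, pet0) tuples.

    Every candidate is assumed to satisfy the (alpha*, 1 - beta*) condition already.
    """
    minimax = min(candidates, key=lambda d: (d[0], expected_n(d[1], d[0], d[2])))
    optimal = min(candidates, key=lambda d: expected_n(d[1], d[0], d[2]))
    return minimax, optimal

# Hypothetical feasible designs; only (29, 11, 0.28) is tied to a reported EN.
candidates = [(30, 12, 0.40), (29, 11, 0.28), (32, 9, 0.45)]
minimax, optimal = pick_designs(candidates)
```

In this made-up list the minimax choice is the design with n = 29, while the optimal choice is the design with n = 32, whose earlier and more likely stopping gives the smallest EN.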


Table 2 Single-stage designs, and minimax and optimal two-stage Fisher designs for (α*, 1 − β*) = (0.15, 0.8)

                   Single-stage design  Minimax two-stage design          Optimal two-stage design
py    px    θ        n  α       1−β     (n, n1)    α       1−β     EN     (n, n1)    α       1−β     EN
0.05  0.15  3.353   79  0.0827  0.8005  (78, 40)   0.0823  0.8001  63.00  (81, 26)   0.0836  0.8008  60.83
0.05  0.20  4.750   45  0.0631  0.8075  (44, 17)   0.0620  0.8033  35.11  (44, 17)   0.0620  0.8033  35.11
0.05  0.25  6.333   29  0.0450  0.8109  (29, 11)   0.0448  0.8014  23.96  (29, 11)   0.0448  0.8014  23.96
0.10  0.25  3.000   56  0.0884  0.8033  (56, 25)   0.0896  0.8003  43.46  (58, 19)   0.0925  0.8016  42.80
0.10  0.30  3.857   36  0.0747  0.8016  (36, 16)   0.0783  0.8009  28.41  (37, 12)   0.0800  0.8024  28.03
0.15  0.30  2.429   65  0.1036  0.8023  (65, 36)   0.1036  0.8004  52.42  (69, 22)   0.1060  0.8009  49.48
0.15  0.35  3.051   41  0.0879  0.8025  (41, 19)   0.0919  0.8010  32.01  (42, 14)   0.0925  0.8016  30.99
0.20  0.35  2.154   74  0.1076  0.8054  (74, 42)   0.1097  0.8001  59.74  (79, 26)   0.1133  0.8003  56.17
0.20  0.40  2.667   46  0.0953  0.8074  (46, 23)   0.1016  0.8007  36.19  (49, 14)   0.1061  0.8019  34.81
0.25  0.40  2.000   83  0.1149  0.8022  (81, 37)   0.1147  0.8005  61.35  (84, 30)   0.1155  0.8006  60.21
0.25  0.45  2.455   47  0.0986  0.8056  (47, 27)   0.1007  0.8005  38.25  (50, 17)   0.1053  0.8003  36.10
0.30  0.45  1.909   85  0.1112  0.8005  (85, 65)   0.1115  0.8000  75.76  (95, 27)   0.1210  0.8004  65.02
0.30  0.50  2.333   53  0.1121  0.8015  (49, 26)   0.1100  0.8007  38.88  (52, 19)   0.1113  0.8014  37.82
0.35  0.50  1.857   86  0.1155  0.8006  (86, 66)   0.1153  0.8000  76.73  (95, 32)   0.1186  0.8002  66.78
0.35  0.55  2.270   54  0.1009  0.8026  (54, 21)   0.1169  0.8016  39.62  (55, 19)   0.1168  0.8012  39.43
0.40  0.55  1.833   87  0.1228  0.8032  (87, 59)   0.1216  0.8004  74.05  (94, 35)   0.1205  0.8000  67.36
0.40  0.60  2.250   54  0.1015  0.8013  (54, 22)   0.1150  0.8017  39.95  (56, 18)   0.1147  0.8006  39.56
0.45  0.60  1.833   87  0.1265  0.8032  (87, 59)   0.1252  0.8004  74.03  (94, 35)   0.1234  0.8000  67.32
0.45  0.65  2.270   54  0.1043  0.8026  (54, 21)   0.1151  0.8016  39.53  (55, 19)   0.1148  0.8012  39.33
0.50  0.65  1.857   86  0.1263  0.8006  (86, 66)   0.1260  0.8000  76.69  (95, 32)   0.1235  0.8002  66.63
0.50  0.70  2.333   53  0.1033  0.8015  (49, 26)   0.1147  0.8007  38.77  (52, 19)   0.1126  0.8014  37.62
0.55  0.70  1.909   85  0.1237  0.8005  (85, 65)   0.1234  0.8000  75.70  (96, 26)   0.1203  0.8010  64.87
0.55  0.75  2.455   47  0.1114  0.8056  (47, 27)   0.1240  0.8005  38.09  (50, 17)   0.1221  0.8003  35.75
0.60  0.75  2.000   83  0.1173  0.8022  (81, 37)   0.1134  0.8005  61.08  (84, 30)   0.1193  0.8006  59.83
0.60  0.80  2.667   46  0.1208  0.8074  (46, 23)   0.1163  0.8007  35.87  (50, 12)   0.1145  0.8011  34.13
0.65  0.80  2.154   74  0.1150  0.8054  (74, 42)   0.1214  0.8001  59.46  (81, 23)   0.1213  0.8006  55.56
0.65  0.85  3.051   41  0.1020  0.8025  (41, 19)   0.1004  0.8010  31.48  (43, 12)   0.1060  0.8024  30.12
0.70  0.85  2.429   65  0.1088  0.8023  (65, 36)   0.1083  0.8004  51.98  (69, 22)   0.1139  0.8009  48.57
0.70  0.90  3.857   36  0.1023  0.8016  (36, 16)   0.1068  0.8009  27.53  (38, 9)    0.1075  0.8004  26.45
0.75  0.90  3.000   56  0.1098  0.8033  (56, 25)   0.1099  0.8003  42.52  (59, 17)   0.1110  0.8016  41.30
0.75  0.95  6.333   29  0.0929  0.8109  (29, 11)   0.0919  0.8014  21.76  (30, 7)    0.0949  0.8022  21.32
0.80  0.95  4.750   45  0.0948  0.8075  (44, 17)   0.1036  0.8033  32.81  (46, 11)   0.1041  0.8007  32.23
0.85  0.95  3.353   79  0.1065  0.8005  (78, 40)   0.1071  0.8001  61.38  (83, 22)   0.1098  0.8011  57.67
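As a quick worked check of the θ column, the odds ratio is θ = px qy /(py qx) with qk = 1 − pk. For the setting (py, px) = (0.05, 0.25) this gives

$$\theta = \frac{p_x q_y}{p_y q_x} = \frac{0.25 \times 0.95}{0.05 \times 0.75} = \frac{0.2375}{0.0375} \approx 6.333,$$

which agrees with the tabulated value. The mirrored setting (py, px) = (0.75, 0.95) gives the same ratio, 0.95 × 0.25 /(0.75 × 0.05), which is why the θ column reads the same from either end.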

Here, we compare the performance of our Fisher’s test with that of MaxTest. All the calculations in this section are based on exact distributions, not on simulations. Figure 1 displays the type I error rate and power in the range of 0 < py < 1 − Δ for single-stage designs with n = 60 per arm, Δ = px − py = 0.15 or 0.2 under H1 and α = 0.1, 0.15 or 0.2 under H0: px = py. The solid lines are for Fisher’s test and the dotted lines are for MaxTest; the lower two lines represent type I error rate and the upper two lines represent power. As is well known, Fisher’s test controls the type I error conservatively over the range of py. The conservativeness gets slightly stronger with small py values close to 0. MaxTest controls the type I error accurately around py = 0.5, but becomes more conservative for py values far from 0.5, especially with small py values. For α = 0.1, Fisher’s test and MaxTest have similar power around 0.2 ≤ py ≤ 0.4 except that MaxTest is slightly more powerful for py ≈ 0.4. Otherwise, Fisher’s test is more powerful. The difference in power between the two methods becomes larger with Δ = 0.15. We observe similar trends overall, but the


Table 3 Single-stage designs, and minimax and optimal two-stage Fisher designs for (α*, 1 − β*) = (0.15, 0.85)

                   Single-stage design  Minimax two-stage design          Optimal two-stage design
py    px    θ        n  α       1−β     (n, n1)    α       1−β     EN     (n, n1)    α       1−β     EN
0.05  0.15  3.353   92  0.0868  0.8502  (92, 48)   0.0870  0.8500  74.20  (94, 35)   0.0881  0.8502  71.17
0.05  0.20  4.750   51  0.0677  0.8526  (51, 24)   0.0676  0.8506  41.27  (52, 18)   0.0678  0.8505  40.61
0.05  0.25  6.333   35  0.0531  0.8581  (34, 11)   0.0516  0.8506  27.56  (34, 11)   0.0516  0.8506  27.56
0.10  0.25  3.000   65  0.0952  0.8529  (65, 37)   0.0950  0.8502  53.18  (68, 24)   0.0963  0.8501  50.29
0.10  0.30  3.857   41  0.0785  0.8515  (41, 21)   0.0823  0.8500  33.09  (42, 16)   0.0847  0.8502  32.14
0.15  0.30  2.429   78  0.1058  0.8521  (78, 48)   0.1067  0.8502  64.71  (82, 29)   0.1091  0.8506  59.40
0.15  0.35  3.051   49  0.0948  0.8541  (49, 23)   0.0976  0.8506  38.15  (51, 17)   0.1004  0.8527  37.28
0.20  0.35  2.154   88  0.1113  0.8518  (88, 43)   0.1120  0.8502  67.92  (93, 32)   0.1147  0.8500  66.49
0.20  0.40  2.667   52  0.0990  0.8506  (52, 29)   0.0992  0.8506  42.01  (56, 18)   0.1062  0.8506  40.16
0.25  0.40  2.000   94  0.1132  0.8511  (94, 68)   0.1133  0.8502  82.03  (102, 37)  0.1203  0.8503  72.98
0.25  0.45  2.455   59  0.1067  0.8535  (59, 30)   0.1122  0.8503  46.22  (62, 19)   0.1129  0.8506  43.71
0.30  0.45  1.909  104  0.1210  0.8504  (100, 55)  0.1200  0.8501  79.37  (107, 39)  0.1209  0.8503  76.34
0.30  0.50  2.333   60  0.1034  0.8524  (60, 39)   0.1062  0.8501  50.53  (65, 22)   0.1138  0.8510  46.31
0.35  0.50  1.857  106  0.1148  0.8508  (106, 79)  0.1148  0.8501  93.40  (114, 39)  0.1252  0.8503  80.04
0.35  0.55  2.270   61  0.1089  0.8551  (61, 37)   0.1082  0.8507  50.16  (65, 24)   0.1112  0.8504  46.96
0.40  0.55  1.833  107  0.1178  0.8520  (107, 74)  0.1170  0.8500  91.60  (115, 45)  0.1205  0.8500  83.00
0.40  0.60  2.250   61  0.1147  0.8538  (61, 38)   0.1132  0.8500  50.57  (66, 23)   0.1138  0.8502  47.07
0.45  0.60  1.833  107  0.1214  0.8520  (107, 74)  0.1203  0.8500  91.59  (115, 45)  0.1196  0.8500  82.95
0.45  0.65  2.270   61  0.1184  0.8551  (61, 37)   0.1163  0.8507  50.11  (65, 24)   0.1156  0.8504  46.86
0.50  0.65  1.857  106  0.1215  0.8508  (106, 79)  0.1210  0.8501  93.36  (114, 39)  0.1240  0.8503  79.88
0.50  0.70  2.333   60  0.1176  0.8524  (60, 39)   0.1163  0.8501  50.45  (65, 22)   0.1151  0.8510  45.07
0.55  0.70  1.909  104  0.1180  0.8504  (100, 55)  0.1159  0.8501  79.22  (107, 39)  0.1234  0.8503  76.08
0.55  0.75  2.455   59  0.1145  0.8535  (59, 30)   0.1104  0.8503  46.00  (62, 19)   0.1114  0.8506  43.28
0.60  0.75  2.000   94  0.1205  0.8511  (94, 68)   0.1256  0.8502  81.91  (102, 37)  0.1277  0.8503  72.57
0.60  0.80  2.667   52  0.0999  0.8506  (52, 29)   0.1071  0.8506  41.72  (56, 18)   0.1147  0.8506  39.56
0.65  0.80  2.154   88  0.1180  0.8518  (88, 43)   0.1146  0.8502  67.52  (93, 32)   0.1178  0.8500  65.68
0.65  0.85  3.051   49  0.1158  0.8541  (49, 23)   0.1160  0.8506  37.60  (52, 15)   0.1155  0.8519  36.31
0.70  0.85  2.429   78  0.1141  0.8521  (78, 48)   0.1190  0.8502  64.33  (84, 26)   0.1201  0.8503  58.49
0.70  0.90  3.857   41  0.0980  0.8515  (41, 21)   0.1000  0.8500  32.34  (43, 14)   0.1050  0.8514  30.87
0.75  0.90  3.000   65  0.1082  0.8529  (65, 37)   0.1102  0.8502  52.50  (69, 22)   0.1150  0.8500  48.76
0.75  0.95  6.333   35  0.1019  0.8581  (34, 11)   0.0997  0.8506  24.75  (35, 7)    0.1016  0.8505  24.43
0.80  0.95  4.750   51  0.0983  0.8526  (51, 24)   0.1021  0.8506  39.45  (53, 15)   0.1073  0.8518  37.47
0.85  0.95  3.353   92  0.1065  0.8502  (92, 48)   0.1101  0.8500  72.52  (98, 27)   0.1140  0.8503  67.92

difference in power becomes smaller with Δ = 0.2, especially when combined with a large α (= 0.2). Figure 2 displays the type I error rate and power of two-stage designs with n1 = n2 = 30 per arm. We observe that, compared to MaxTest, Fisher’s test controls the type I error more accurately in most of the range of py values. If α = 0.1, Fisher’s test is more powerful than MaxTest over the range of py < 0.2 or py > 0.6. But with a larger α, such as 0.15 or 0.2, MaxTest is more powerful in the range of py between 0.2 and 0.6. As in the single-stage design case, the difference in power diminishes as Δ and α increase. While Fisher’s exact test is more powerful than the binomial test in single-stage designs, their performance is comparable in two-stage designs. With the a1 value fixed at 0, the two-stage designs based on Fisher’s exact test are not fully optimal. A fully optimal choice of a1 will depend on z1, and the two-stage design using the optimal a1 value is believed to outperform the two-stage designs based on the binomial test, as in the single-stage design case. We will develop


Table 4 Single-stage designs, and minimax and optimal two-stage Fisher designs for (α*, 1 − β*) = (0.2, 0.8)

                   Single-stage design  Minimax two-stage design          Optimal two-stage design
py    px    θ        n  α       1−β     (n, n1)    α       1−β     EN     (n, n1)    α       1−β     EN
0.05  0.15  3.353   65  0.1078  0.8031  (65, 36)   0.1121  0.8005  53.73  (68, 24)   0.1152  0.8004  52.14
0.05  0.20  4.750   38  0.0799  0.8040  (38, 15)   0.0813  0.8007  30.73  (38, 15)   0.0813  0.8007  30.73
0.05  0.25  6.333   26  0.0509  0.8050  (25, 10)   0.0481  0.8005  20.98  (25, 10)   0.0481  0.8005  20.98
0.10  0.25  3.000   48  0.1200  0.8043  (47, 19)   0.1222  0.8002  36.08  (47, 19)   0.1222  0.8002  36.08
0.10  0.30  3.857   30  0.0991  0.8012  (30, 19)   0.1022  0.8005  25.71  (31, 13)   0.1066  0.8024  24.43
0.15  0.30  2.429   54  0.1343  0.8005  (54, 40)   0.1349  0.8003  47.88  (60, 18)   0.1445  0.8015  42.94
0.15  0.35  3.051   35  0.1220  0.8057  (34, 15)   0.1209  0.8022  26.46  (35, 13)   0.1221  0.8020  26.44
0.20  0.35  2.154   63  0.1512  0.8003  (62, 29)   0.1491  0.8011  47.66  (65, 23)   0.1507  0.8002  47.09
0.20  0.40  2.667   39  0.1264  0.8005  (39, 26)   0.1342  0.8005  33.40  (40, 14)   0.1408  0.8001  29.46
0.25  0.40  2.000   67  0.1424  0.8028  (67, 47)   0.1467  0.8001  54.95  (73, 24)   0.1582  0.8003  51.75
0.25  0.45  2.455   41  0.1295  0.8101  (40, 35)   0.1292  0.8000  37.78  (45, 12)   0.1477  0.8001  31.59
0.30  0.45  1.909   68  0.1518  0.8006  (68, 55)   0.1517  0.8001  62.04  (75, 30)   0.1570  0.8003  55.02
0.30  0.50  2.333   41  0.1392  0.8034  (41, 28)   0.1389  0.8002  35.25  (45, 16)   0.1445  0.8003  32.72
0.35  0.50  1.857   69  0.1631  0.8012  (69, 54)   0.1625  0.8001  62.10  (76, 32)   0.1602  0.8010  56.29
0.35  0.55  2.270   42  0.1515  0.8092  (42, 25)   0.1479  0.8004  34.50  (46, 16)   0.1491  0.8003  33.20
0.40  0.55  1.833   70  0.1713  0.8043  (70, 50)   0.1692  0.8005  60.81  (77, 32)   0.1656  0.8010  56.78
0.40  0.60  2.250   42  0.1581  0.8079  (42, 26)   0.1545  0.8007  34.90  (45, 18)   0.1528  0.8007  33.32
0.45  0.60  1.833   70  0.1751  0.8043  (70, 50)   0.1727  0.8005  60.80  (77, 32)   0.1685  0.8010  56.75
0.45  0.65  2.270   42  0.1618  0.8092  (42, 25)   0.1572  0.8004  34.46  (46, 16)   0.1549  0.8003  33.11
0.50  0.65  1.857   69  0.1746  0.8012  (69, 54)   0.1737  0.8001  62.07  (76, 32)   0.1686  0.8010  56.19
0.50  0.70  2.333   41  0.1601  0.8034  (41, 28)   0.1581  0.8002  35.19  (45, 16)   0.1542  0.8003  32.53
0.55  0.70  1.909   68  0.1716  0.8006  (68, 55)   0.1711  0.8001  62.00  (77, 28)   0.1651  0.8022  55.11
0.55  0.75  2.455   41  0.1589  0.8101  (41, 23)   0.1559  0.8000  37.74  (45, 12)   0.1505  0.8001  31.17
0.60  0.75  2.000   67  0.1660  0.8028  (67, 47)   0.1638  0.8001  57.84  (73, 24)   0.1601  0.8003  51.37
0.60  0.80  2.667   39  0.1491  0.8005  (39, 26)   0.1472  0.8005  33.23  (40, 14)   0.1430  0.8001  28.98
0.65  0.80  2.154   63  0.1522  0.8003  (62, 29)   0.1531  0.8011  47.31  (65, 23)   0.1599  0.8002  46.58
0.65  0.85  3.051   35  0.1311  0.8057  (34, 15)   0.1425  0.8022  25.94  (36, 11)   0.1458  0.8003  25.71
0.70  0.85  2.429   54  0.1466  0.8005  (54, 40)   0.1518  0.8003  47.68  (60, 18)   0.1590  0.8015  42.03
0.70  0.90  3.857   30  0.1386  0.8012  (30, 19)   0.1427  0.8005  25.27  (33, 7)    0.1428  0.8008  22.99
0.75  0.90  3.000   48  0.1446  0.8043  (47, 19)   0.1427  0.8002  35.09  (48, 17)   0.1458  0.8009  35.09
0.75  0.95  6.333   26  0.1305  0.8050  (25, 10)   0.1249  0.8005  19.04  (26, 6)    0.1267  0.8007  18.65
0.80  0.95  4.750   38  0.1392  0.8040  (38, 15)   0.1406  0.8007  28.60  (40, 9)    0.1419  0.8017  28.16
0.85  0.95  3.353   65  0.1411  0.8031  (65, 36)   0.1410  0.8005  52.42  (70, 20)   0.1483  0.8010  49.45

an efficient algorithm to identify fully optimal two-stage designs using Fisher’s exact test in our future study.

5. UNBALANCED TWO-STAGE RANDOMIZED TRIALS

One may want to accrue more patients to one arm than the other for some reason, for example, to treat more patients with an experimental therapy than with a control. In this case, the test statistic based on the difference in the number of responders between the two arms that has been considered so far is not appropriate. Let ml and nl denote the sample sizes at stage l (= 1, 2) of arms x and y, respectively (m = m1 + m2, n = n1 + n2). Also, let Xl and Yl denote the number of responders among stage l patients of arms x and y, respectively (X = X1 + X2, Y = Y1 + Y2). If we want to assign γ times as many patients to arm x as to arm y, then we have ml = γ × nl and m = γ × n. Note that a choice of γ = 1 corresponds to the balanced two-stage designs considered in the previous sections. When


Table 5 Single-stage designs, and minimax and optimal two-stage Fisher designs for (α*, 1 − β*) = (0.2, 0.85)

                     Single-stage design     Minimax two-stage design             Optimal two-stage design
py    px    θ        n    α       1−β        (n, n1)    α       1−β     EN        (n, n1)    α       1−β     EN
0.05  0.15  3.353    81   0.1148  0.8535     (78, 37)   0.1197  0.8503  62.00     (81, 29)   0.1214  0.8501  61.52
0.05  0.20  4.750    44   0.0897  0.8459     (44, 19)   0.0924  0.8504  35.50     (44, 19)   0.0924  0.8504  35.50
0.05  0.25  6.333    30   0.0618  0.8542     (30, 11)   0.0621  0.8519  24.68     (30, 11)   0.0621  0.8519  24.68
0.10  0.25  3.000    56   0.1292  0.8511     (56, 36)   0.1301  0.8501  47.58     (59, 21)   0.1321  0.8503  43.97
0.10  0.30  3.857    35   0.1046  0.8545     (35, 19)   0.1097  0.8521  28.76     (37, 12)   0.1132  0.8502  28.03
0.15  0.30  2.429    65   0.1411  0.8523     (65, 35)   0.1411  0.8506  52.01     (69, 26)   0.1453  0.8508  50.85
0.15  0.35  3.051    43   0.1294  0.8577     (42, 19)   0.1344  0.8508  32.60     (43, 16)   0.1345  0.8511  32.19
0.20  0.35  2.154    74   0.1469  0.8528     (74, 51)   0.1497  0.8502  63.64     (80, 30)   0.1575  0.8507  58.22
0.20  0.40  2.667    45   0.1298  0.8559     (45, 27)   0.1349  0.8501  37.22     (48, 18)   0.1443  0.8505  35.50
0.25  0.40  2.000    80   0.1575  0.8515     (78, 50)   0.1530  0.8500  65.29     (84, 36)   0.1567  0.8515  62.60
0.25  0.45  2.455    46   0.1402  0.8514     (46, 32)   0.1398  0.8501  39.81     (50, 20)   0.1453  0.8508  37.18
0.30  0.45  1.909    87   0.1531  0.8506     (87, 68)   0.1561  0.8501  78.21     (92, 36)   0.1670  0.8503  66.87
0.30  0.50  2.333    53   0.1468  0.8501     (50, 29)   0.1568  0.8510  40.70     (52, 21)   0.1547  0.8500  38.57
0.35  0.50  1.857    89   0.1541  0.8533     (89, 63)   0.1537  0.8502  76.97     (93, 40)   0.1660  0.8501  68.97
0.35  0.55  2.270    54   0.1392  0.8523     (53, 27)   0.1589  0.8503  41.47     (55, 23)   0.1599  0.8506  40.96
0.40  0.55  1.833    89   0.1600  0.8508     (89, 71)   0.1595  0.8501  80.61     (96, 39)   0.1686  0.8516  70.12
0.40  0.60  2.250    54   0.1405  0.8510     (53, 28)   0.1569  0.8501  41.85     (57, 21)   0.1563  0.8502  41.25
0.45  0.60  1.833    89   0.1637  0.8508     (89, 71)   0.1632  0.8501  80.60     (96, 39)   0.1713  0.8516  70.08
0.45  0.65  2.270    54   0.1437  0.8523     (53, 27)   0.1564  0.8503  41.41     (55, 23)   0.1559  0.8506  40.88
0.50  0.65  1.857    89   0.1649  0.8533     (89, 63)   0.1627  0.8502  76.92     (92, 50)   0.1604  0.8502  72.67
0.50  0.70  2.333    53   0.1426  0.8501     (50, 29)   0.1548  0.8510  40.60     (52, 21)   0.1559  0.8500  38.40
0.55  0.70  1.909    87   0.1610  0.8506     (87, 68)   0.1603  0.8501  78.15     (92, 36)   0.1664  0.8503  66.64
0.55  0.75  2.455    46   0.1437  0.8514     (46, 32)   0.1616  0.8501  39.70     (50, 20)   0.1659  0.8508  36.89
0.60  0.75  2.000    80   0.1485  0.8515     (78, 50)   0.1621  0.8500  65.14     (85, 34)   0.1692  0.8502  62.01
0.60  0.80  2.667    45   0.1664  0.8559     (45, 27)   0.1615  0.8501  36.99     (48, 18)   0.1576  0.8505  35.02
0.65  0.80  2.154    74   0.1672  0.8528     (74, 51)   0.1679  0.8502  63.45     (82, 27)   0.1627  0.8501  57.62
0.65  0.85  3.051    43   0.1544  0.8577     (42, 19)   0.1439  0.8508  32.05     (44, 14)   0.1473  0.8505  31.35
0.70  0.85  2.429    65   0.1480  0.8523     (65, 35)   0.1503  0.8506  51.56     (69, 26)   0.1591  0.8508  50.09
0.70  0.90  3.857    35   0.1286  0.8545     (35, 19)   0.1362  0.8521  28.13     (37, 12)   0.1461  0.8502  26.71
0.75  0.90  3.000    56   0.1536  0.8511     (56, 36)   0.1545  0.8501  47.08     (59, 21)   0.1538  0.8503  42.70
0.75  0.95  6.333    30   0.1408  0.8542     (30, 11)   0.1373  0.8519  22.36     (31, 7)    0.1386  0.8519  21.94
0.80  0.95  4.750    44   0.1290  0.8549     (44, 19)   0.1429  0.8504  33.53     (46, 14)   0.1451  0.8520  33.02
0.85  0.95  3.353    81   0.1506  0.8535     (78, 37)   0.1496  0.8503  60.17     (83, 26)   0.1516  0.8506  58.94

$\gamma \neq 1$, it does not make sense to directly compare the numbers of responders between arms at each stage. For the odds ratio, $\theta = p_x q_y/(q_x p_y)$, where $q_k = 1 - p_k$, we want to design a study for testing $H_0: \theta = 1$ versus $H_a: \theta = \theta_a\ (> 1)$. We propose the following two-stage design:

• Stage 1: Accrue $m_1$ patients to arm x and $n_1$ patients to arm y, and observe $X_1$ and $Y_1$. For $\hat{p}_{x,1} = X_1/m_1$, $\hat{p}_{y,1} = Y_1/n_1$, $\hat{q}_{k,1} = 1 - \hat{p}_{k,1}$, and $\hat\theta_1 = (\hat{p}_{x,1}\hat{q}_{y,1})/(\hat{q}_{x,1}\hat{p}_{y,1})$:
  a. If $\hat\theta_1 < 1$, reject arm x and stop the trial.
  b. Otherwise, proceed to the second stage.
• Stage 2: Accrue an additional $m_2$ patients to arm x and $n_2$ patients to arm y, and observe $X_2$ and $Y_2$. For $\hat{p}_x = X/m$, $\hat{p}_y = Y/n$, $\hat{q}_k = 1 - \hat{p}_k$, and $\hat\theta = (\hat{p}_x\hat{q}_y)/(\hat{q}_x\hat{p}_y)$:
  a. Accept arm x for further investigation if $\hat\theta \ge a$.
  b. Otherwise, reject arm x.
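The two-stage rule above can be sketched in a few lines of Python. This is only an illustration: the function names `odds_ratio` and `two_stage_decision` are hypothetical, and the handling of a zero denominator in the sample odds ratio (0/0 treated as 1, c/0 as +infinity) is an assumption made here, since the text does not state a convention for those boundary cases.

```python
def odds_ratio(x, m, y, n):
    """Sample odds ratio x(n - y) / {y(m - x)} for x/m vs. y/n responders."""
    num, den = x * (n - y), y * (m - x)
    if den == 0:
        # Assumed convention (not from the paper): 0/0 -> 1, c/0 -> +inf.
        return 1.0 if num == 0 else float("inf")
    return num / den


def two_stage_decision(x1, y1, x2, y2, m1, n1, m2, n2, a):
    """Return 'stop', 'accept', or 'reject' following the two-stage rule."""
    if odds_ratio(x1, m1, y1, n1) < 1:
        return "stop"  # Stage 1a: reject arm x and stop the trial
    # Stage 2: pool both stages and compare the odds ratio with a
    theta_hat = odds_ratio(x1 + x2, m1 + m2, y1 + y2, n1 + n2)
    return "accept" if theta_hat >= a else "reject"


# Example with gamma = 2 (m_l = 2 * n_l) and a hypothetical critical value a
print(two_stage_decision(8, 3, 8, 1, 20, 10, 20, 10, a=2.0))
```

In the example, the stage 1 odds ratio is 56/36 and the pooled odds ratio is 256/96, so the trial continues to stage 2 and arm x is accepted for a = 2.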


Figure 1 Single-stage designs with n = 60 per arm: type I error rate and power for Fisher’s test (solid lines) and MaxTest (dotted lines).

Figure 2 Two-stage designs with n1 = n2 = 30 per arm: type I error rate and power for Fisher’s test (solid lines) and MaxTest (dotted lines).

Given $Z_l = z_l$, $X_l$ has probability mass function

$$f_l(x_l \mid z_l, \theta) = \frac{\binom{m_l}{x_l}\binom{n_l}{z_l - x_l}\theta^{x_l}}{\sum_{i=m_{l-}}^{m_{l+}}\binom{m_l}{i}\binom{n_l}{z_l - i}\theta^{i}}$$

for $m_{l-} \le x_l \le m_{l+}$, where $m_{l-} = \max(0, z_l - n_l)$ and $m_{l+} = \min(z_l, m_l)$. Let $\hat\theta_1 = x_1(n_1 - y_1)/\{y_1(m_1 - x_1)\}$ and $\hat\theta = \{x(n - y)\}/\{y(m - x)\}$ denote the estimates of $\theta$ after stages 1 and 2, respectively. Note that $\hat\theta_1 = \hat\theta_1(x_1)$ is a function of $x_1$ given $z_1$, and $\hat\theta = \hat\theta(x_1, x_2)$ is a function of $(x_1, x_2)$ given $(z_1, z_2)$. Given type I error rate $\alpha^*$ and $(m_1, m_2, z_1, z_2)$, we find $a$ satisfying $\alpha(z_1, z_2) \le \alpha^*$, where

$$\alpha(z_1, z_2) = P\{\hat\theta_1(X_1) \ge 1, \hat\theta(X_1, X_2) \ge a \mid z_1, z_2, H_0\} = \sum_{x_1=m_{1-}}^{m_{1+}} \sum_{x_2=m_{2-}}^{m_{2+}} I\{\hat\theta_1(x_1) \ge 1, \hat\theta(x_1, x_2) \ge a\}\, f_1(x_1 \mid z_1, 1)\, f_2(x_2 \mid z_2, 1).$$
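As a rough illustration of the conditional probability mass function and the conditional type I error rate defined above, the following Python sketch evaluates both by direct enumeration. The function names are hypothetical, and the boundary convention for the sample odds ratio (0/0 treated as 1, c/0 as +infinity) is an assumption that the text does not specify.

```python
from math import comb


def pmf(x, z, m, n, theta):
    """f_l(x | z, theta): conditional pmf of X_l given Z_l = z."""
    lo, hi = max(0, z - n), min(z, m)
    if not lo <= x <= hi:
        return 0.0
    den = sum(comb(m, i) * comb(n, z - i) * theta**i for i in range(lo, hi + 1))
    return comb(m, x) * comb(n, z - x) * theta**x / den


def odds_ratio(x, m, y, n):
    """Sample odds ratio; the zero-denominator handling is an assumption."""
    num, den = x * (n - y), y * (m - x)
    if den == 0:
        return 1.0 if num == 0 else float("inf")
    return num / den


def cond_type1(a, m1, n1, z1, m2, n2, z2):
    """alpha(z1, z2) = P{theta1_hat >= 1, theta_hat >= a | z1, z2, H0}."""
    total = 0.0
    for x1 in range(max(0, z1 - n1), min(z1, m1) + 1):
        if odds_ratio(x1, m1, z1 - x1, n1) < 1:
            continue  # the trial stops after stage 1
        for x2 in range(max(0, z2 - n2), min(z2, m2) + 1):
            x = x1 + x2
            y = z1 + z2 - x
            if odds_ratio(x, m1 + m2, y, n1 + n2) >= a:
                total += pmf(x1, z1, m1, n1, 1.0) * pmf(x2, z2, m2, n2, 1.0)
    return total


# Hypothetical inputs: gamma = 2, seven responders per stage, a = 2.5
print(cond_type1(2.5, 20, 10, 7, 20, 10, 7))
```

With theta = 1 the pmf reduces to the ordinary hypergeometric distribution underlying Fisher's exact test, which is one way to sanity-check the implementation.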

The conditional power of the two-stage design is calculated by

$$1 - \beta(z_1, z_2) = P\{\hat\theta_1(X_1) \ge 1, \hat\theta(X_1, X_2) \ge a \mid z_1, z_2, H_a\} = \sum_{x_1=m_{1-}}^{m_{1+}} \sum_{x_2=m_{2-}}^{m_{2+}} I\{\hat\theta_1(x_1) \ge 1, \hat\theta(x_1, x_2) \ge a\}\, f_1(x_1 \mid z_1, \theta_a)\, f_2(x_2 \mid z_2, \theta_a).$$
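One possible way to choose the critical value and evaluate the conditional power for a given (z1, z2) is sketched below in Python. It assumes that the intended a is the smallest attainable value of the pooled odds ratio estimate whose conditional type I error rate does not exceed the nominal level; the text only says that a satisfies this bound, so that choice, the function names, and the zero-denominator convention for the odds ratio are all assumptions.

```python
from math import comb


def pmf(x, z, m, n, theta):
    """Conditional pmf f_l(x | z, theta) of X_l given Z_l = z."""
    lo, hi = max(0, z - n), min(z, m)
    if not lo <= x <= hi:
        return 0.0
    den = sum(comb(m, i) * comb(n, z - i) * theta**i for i in range(lo, hi + 1))
    return comb(m, x) * comb(n, z - x) * theta**x / den


def odds_ratio(x, m, y, n):
    """Sample odds ratio; 0/0 -> 1 and c/0 -> +inf are assumed conventions."""
    num, den = x * (n - y), y * (m - x)
    if den == 0:
        return 1.0 if num == 0 else float("inf")
    return num / den


def critical_value(alpha_star, theta_a, m1, n1, z1, m2, n2, z2):
    """Return (a, conditional type I error, conditional power) for (z1, z2)."""
    null, alt = {}, {}  # probability of each theta_hat value under H0 and Ha
    for x1 in range(max(0, z1 - n1), min(z1, m1) + 1):
        if odds_ratio(x1, m1, z1 - x1, n1) < 1:
            continue  # stopped at stage 1, so arm x cannot be accepted
        for x2 in range(max(0, z2 - n2), min(z2, m2) + 1):
            x = x1 + x2
            v = odds_ratio(x, m1 + m2, z1 + z2 - x, n1 + n2)
            p0 = pmf(x1, z1, m1, n1, 1.0) * pmf(x2, z2, m2, n2, 1.0)
            pa = pmf(x1, z1, m1, n1, theta_a) * pmf(x2, z2, m2, n2, theta_a)
            null[v] = null.get(v, 0.0) + p0
            alt[v] = alt.get(v, 0.0) + pa
    a, alpha, power = None, 0.0, 0.0  # a stays None if no value is feasible
    for v in sorted(null, reverse=True):  # grow the rejection region downward
        if alpha + null[v] > alpha_star:
            break
        a, alpha, power = v, alpha + null[v], power + alt[v]
    return a, alpha, power


# Hypothetical balanced example with 10 responders in each stage
print(critical_value(0.15, 3.0, 20, 20, 10, 20, 20, 10))
```

Because a depends on (z1, z2), repeating this search over all possible (z1, z2) would produce the kind of lookup list of a values that a protocol could include.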

Noting that Zl is the sum of two independent binomial random variables, Xl and Yl , its probability mass function is given as

$$g_l(z_l) = \sum_{x_l=m_{l-}}^{m_{l+}} \binom{m_l}{x_l}\binom{n_l}{z_l - x_l}\, p_x^{x_l} q_x^{m_l - x_l}\, p_y^{z_l - x_l} q_y^{n_l - z_l + x_l}$$

for $z_l = 0, \ldots, m_l + n_l$. Under $H_0: \theta = 1$, $g_l(z_l)$ is expressed as

$$g_{0l}(z_l) = p_y^{z_l} q_y^{m_l + n_l - z_l} \sum_{x_l=m_{l-}}^{m_{l+}} \binom{m_l}{x_l}\binom{n_l}{z_l - x_l}.$$

Hence, the marginal type I error rate and power of the above two-stage design are calculated by

$$\alpha = E\{\alpha(Z_1, Z_2)\} = \sum_{z_1=0}^{m_1+n_1} \sum_{z_2=0}^{m_2+n_2} \alpha(z_1, z_2)\, g_{01}(z_1)\, g_{02}(z_2)$$

and

$$1 - \beta = E\{1 - \beta(Z_1, Z_2)\} = \sum_{z_1=0}^{m_1+n_1} \sum_{z_2=0}^{m_2+n_2} \{1 - \beta(z_1, z_2)\}\, g_1(z_1)\, g_2(z_2),$$

respectively. Since $\alpha(z_1, z_2) \le \alpha^*$ for all $(z_1, z_2)$, we have $\alpha \le \alpha^*$.

For a specified type I error rate $\alpha^*$ and power $1 - \beta^*$, we want to select a two-stage design satisfying $\alpha \le \alpha^*$ and $1 - \beta \ge 1 - \beta^*$. Under $H_0$, the probability of early termination and the expected sample size for arm x are calculated as $\mathrm{PET}_0 = P\{\hat\theta_1(X_1) < 1 \mid z_1, H_0\}$ and $\mathrm{EN} = m_1 \times \mathrm{PET}_0 + m \times (1 - \mathrm{PET}_0)$, respectively. Among the two-stage designs satisfying the $(\alpha^*, 1 - \beta^*)$-restriction, the optimal design is defined as the one with the smallest EN. The minimax design is defined as the one with the smallest $m$ (or $m + n$) among the two-stage designs satisfying the $(\alpha^*, 1 - \beta^*)$-restriction.

6. DISCUSSIONS

We have proposed design and analysis methods for two-stage randomized Phase II clinical trials based on Fisher’s exact test. While the binomial test by Jung (2008) requires specification of the response probability of the control arm, $p_y$, or conservatively controls the type I error rate at $p_y = 0.5$, Fisher’s exact test does not require specification of $p_y$. If $p_x$ and $p_y$ under $H_1$ can be accurately specified at the design stage, we can calculate the expected sample size under $H_0$ and the sample sizes $(n_1, n_2)$ for the minimax and optimal two-stage designs. Even if $p_x$ and $p_y$ are misspecified at the design stage, Fisher’s test accurately controls the type I error rate and maintains a higher power than the binomial test, especially if $p_x$ and $p_y$ are different from 50%.
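The early-termination probability and expected sample size that define the optimal design for the unbalanced case can be sketched as follows. This Python fragment computes the conditional quantity given z1, exactly as the formula is written; the function names are hypothetical, and an unconditional value would presumably require averaging over the null distribution of Z1, which is not shown here.

```python
from math import comb


def pet0(m1, n1, z1):
    """P{theta1_hat < 1 | z1, H0} under the central hypergeometric pmf."""
    lo, hi = max(0, z1 - n1), min(z1, m1)
    # theta1_hat < 1 is equivalent to x1 / m1 < y1 / n1 with y1 = z1 - x1,
    # written as a cross-product to avoid dividing by zero.
    stop = sum(
        comb(m1, x1) * comb(n1, z1 - x1)
        for x1 in range(lo, hi + 1)
        if x1 * n1 < (z1 - x1) * m1
    )
    return stop / comb(m1 + n1, z1)


def expected_n(m1, m, pet):
    """EN = m1 * PET0 + m * (1 - PET0) for arm x."""
    return m1 * pet + m * (1 - pet)


# Hypothetical design: m1 = 20 of m = 40 patients in arm x, n1 = 10, z1 = 7
p = pet0(20, 10, 7)
print(p, expected_n(20, 40, p))
```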


Jung’s (2008) designs based on the binomial test are specified by the sample sizes and rejection values $(n_1, n_2, a_1, a)$. For the two-stage Fisher’s exact test, however, the rejection value of the second stage, $a$, is chosen depending on the total numbers of responders in the two stages, $(z_1, z_2)$, so the protocol of a randomized Phase II trial based on Fisher’s exact test may include a list of $a$ values for all possible outcomes of $(z_1, z_2)$.

Even though the sample sizes $(n_1, n_2)$ are determined at the design stage, the realized sample sizes when the study is completed may differ slightly from the prespecified ones. Such a discrepancy poses no problem for our method, because the two-stage Fisher’s exact test is performed conditional on the realized sample sizes as well as the total numbers of responders.

The computing time of our sample size calculation procedure depends on how large the target sample size is, but it takes only a few seconds for most of the cases reported in Tables 2 to 5. The FORTRAN program to find minimax and optimal designs is available from the first author.

FUNDING

This research was supported by a grant from the National Cancer Institute (CA142538-01).

REFERENCES

Cannistra, S. A. (2009). Phase II trials in Journal of Clinical Oncology. Journal of Clinical Oncology 27(19):3073–3076.
Fisher, R. A. (1935). The logic of inductive inference (with discussion). Journal of Royal Statistical Society 98:39–82.
Gan, H. K., Grothey, A., Pond, G. P., Moore, M. J., Siu, L. L., Sargent, D. J. (2010). Randomized Phase II trials: Inevitable or inadvisable? Journal of Clinical Oncology 28(15):2641–2647.
Jung, S. H. (2008). Randomized phase II trials with a prospective control. Statistics in Medicine 27:568–583.
Jung, S. H., Lee, T. Y., Kim, K. M., George, S. (2004). Admissible two-stage designs for phase II cancer clinical trials. Statistics in Medicine 23:561–569.
Palmer, C. R. (1991). A comparative phase II clinical trials procedure for choosing the best of three treatments. Statistics in Medicine 10:1327–1340.
Rubinstein, L. V., Korn, E. L., Freidlin, B., Hunsberger, S., Ivy, S. P., Smith, M. A. (2005). Design issues of randomized phase II trials and a proposal for phase II screening trials. Journal of Clinical Oncology 23(28):7199–7206.
Sargent, D. J., Goldberg, R. M. (2001). A flexible design for multiple armed screening trials. Statistics in Medicine 20:1051–1060.
Simon, R. (1989). Optimal two-stage designs for phase II clinical trials. Controlled Clinical Trials 10:1–10.
Simon, R., Wittes, R. E., Ellenberg, S. S. (1985). Randomized phase II clinical trials. Cancer Treatment Reports 69:1375–1381.
Steinberg, S. M., Venzon, D. J. (2002). Early selection in a randomized phase II clinical trial. Statistics in Medicine 21:1711–1726.
Thall, P. F., Simon, R., Ellenberg, S. S. (1989). A two-stage design for choosing among several experimental treatments and a control in clinical trials. Biometrics 45:537–547.
