Sample size calculations for the development of biosimilar products.


Seung-Ho Kang & Yongjo Kim

Department of Applied Statistics, Yonsei University, Seoul, Korea

Accepted author version posted online: 17 Jul 2014. Published online: 31 Oct 2014.

To cite this article: Seung-Ho Kang & Yongjo Kim (2014) Sample Size Calculations for the Development of Biosimilar Products, Journal of Biopharmaceutical Statistics, 24:6, 1215-1224, DOI: 10.1080/10543406.2014.941984 To link to this article: http://dx.doi.org/10.1080/10543406.2014.941984


Journal of Biopharmaceutical Statistics, 24: 1215–1224, 2014 Copyright © Taylor & Francis Group, LLC ISSN: 1054-3406 print/1520-5711 online DOI: 10.1080/10543406.2014.941984


The most widely used design for a Phase III comparative study to demonstrate biosimilarity between a biosimilar product and a renovator biological product is the equivalence trial, whose aim is to show that the difference between two population means of a primary endpoint is less than a prespecified equivalence margin. A well-known sample size formula for the equivalence trial is given by

$$n_1 = k n_2, \qquad n_2 = \frac{(z_\alpha + z_{\beta/2})^2 \sigma^2}{(\delta - |\mu_T - \mu_R|)^2}\left(1 + \frac{1}{k}\right).$$

Since this formula is obtained from the approximate power rather than the exact power, we investigate its accuracy in this article. We conclude that the sample size formula is very conservative. Specifically, we show that, under some conditions, the exact power at the sample size calculated from the formula to have power 1 − β is actually 1 − β/2. Therefore, the use of the sample size formula may impose a large extra cost on biotechnology companies. We propose that the sample size be calculated numerically from the exact power. The R code to calculate the sample size numerically is provided in this article.

Key Words: Equivalence trial; Follow-on biologics; Margin; Power; Sample size formula.

Received May 30, 2013; Accepted May 23, 2014. Address correspondence to Seung-Ho Kang, PhD, Department of Applied Statistics, Yonsei University, 262 Seongsanno SeoDaeMun-Gu, Seoul 120-749, Korea; E-mail: [email protected]

1. INTRODUCTION

Since many biological products will soon lose patent protection, the development of biosimilar products has received much attention from both the biotechnology industry and the regulatory agencies (Chow et al., 2011). To obtain approval for a biosimilar product, one must demonstrate similarity of quality, efficacy, and safety to a renovator biological product with comprehensive data. The aims of comparability studies are to demonstrate that the quality of a biosimilar product is highly similar to that of a renovator biological product and to make sure there is no adverse impact on efficacy and safety. In the development of a biosimilar product, a stepwise approach is usually recommended, beginning with quality studies and followed by nonclinical and clinical studies. In the last step of the stepwise approach, a Phase III comparative study is usually required if there are uncertainties about the biosimilarity of the two products based on quality studies and nonclinical studies. Equivalence trials are frequently conducted as the Phase III comparative study.

Sample size calculation plays an important role in all confirmatory clinical trials, including the equivalence trials that demonstrate the biosimilarity of a biosimilar product and a renovator biological product. It ensures that a sufficient number of subjects is enrolled to provide substantial evidence of safety and efficacy of a biosimilar product and a renovator biological product. However, the sample size also affects the cost of a trial, because a larger number of patients results in a higher cost. Therefore, it is important to calculate the minimal sample size that achieves the prespecified power. A well-known sample size formula for equivalence trials is given by (Chow, 2003, p. 60)

$$n_1 = k n_2, \qquad n_2 = \frac{(z_\alpha + z_{\beta/2})^2 \sigma^2}{(\delta - |\mu_T - \mu_R|)^2}\left(1 + \frac{1}{k}\right), \tag{1}$$

where zα is the upper α quantile of the standard normal distribution (e.g., z0.05 = 1.645), μT and μR represent the population mean of a primary endpoint of a biosimilar product and a renovator biological product, respectively, σ² is the population variance of the primary endpoint, δ (> 0) is the prespecified equivalence margin, k is the allocation ratio, and n1 and n2 are the sample sizes of the first group and the second group, respectively.

In this article we investigate the accuracy of the sample size calculation based on equation (1). We show that it is very conservative: it requires larger sample sizes than are actually needed. Specifically, we show that, under some conditions, the exact power at the sample size calculated from equation (1) to have power 1 − β is actually 1 − β/2. In other words, if the sample size is calculated from equation (1) to have 80% power, the exact power with the same sample size is actually 90% in many practical cases. This implies that the use of the sample size formula in equation (1) may impose a large extra cost on biotechnology companies. We propose a new sample size calculation method that evaluates the power function numerically and accurately. Since no closed formula is available for the new sample size calculation, the R code to compute the sample size is provided.
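As an illustration of equation (1), the following minimal R sketch evaluates the closed formula and rounds up to whole patients. The function name `n_closed_form` and its argument names (`diff` for μT − μR, `k` for the allocation ratio) are assumptions made for this example, not notation from the article.

```r
# Closed-form sample size from equation (1).
# Function and argument names are illustrative assumptions.
n_closed_form <- function(delta, diff, sigma, alpha = 0.05, beta = 0.20, k = 1) {
  stopifnot(delta > abs(diff), sigma > 0, k > 0)
  z_alpha <- qnorm(1 - alpha)      # upper alpha quantile of N(0, 1)
  z_beta2 <- qnorm(1 - beta / 2)   # upper beta/2 quantile of N(0, 1)
  n2 <- (z_alpha + z_beta2)^2 * sigma^2 * (1 + 1 / k) / (delta - abs(diff))^2
  n2 <- ceiling(n2)                # round up to a whole number of patients
  c(n1 = ceiling(k * n2), n2 = n2)
}

n_closed_form(delta = 1, diff = 0.60, sigma = 1)
```

With δ = 1, σ = 1, μT − μR = 0.60, α = 0.05, β = 0.2 and k = 1, the formula gives about 107.05, so the call returns 108 patients per group.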

2. THE EQUIVALENCE TRIAL

Although several measures have been studied for demonstrating the biosimilarity between a biosimilar product and a renovator biological product (Chow et al., 2010a; Chow et al., 2010b; Chow et al., 2010c; Hsieh et al., 2010; Lei and Olson, 2010), the most widely used measure is the difference between two population means. Therefore, equivalence trials are frequently employed to show that the difference between two population means of a primary endpoint is less than a prespecified equivalence margin. For example, equivalence trials were conducted when Epoetin alpha, Epoetin zeta, Somatropin, and Filgrastim were approved by the European Medicines Agency (EMEA) (EMEA, 2006, 2007a, 2007b, 2008). The U.S. Food and Drug Administration (FDA) draft guidance also emphasizes the importance of the equivalence trial as follows (U.S. FDA, 2012, 17):


A study employing a two-sided test in which the null hypothesis is that either (1) the proposed product is inferior to the reference product or (2) the proposed product is superior to the reference product based on a pre-specified equivalence margin is the most straightforward study design for accomplishing this objective.

Similarly, the World Health Organization (WHO) guideline also recommends the use of the equivalence trial as follows (WHO, 2009, 30):


Equivalence trials are strongly recommended for medicinal products with a narrow safety margin, such as insulin, to ensure the SBP (biosimilar product) is not less and not more effective than the RBP (renovator biological product).

In such equivalence trials, patients are randomized into two groups; the patients in the first group receive a biosimilar product and the patients in the second group receive a renovator biological product, respectively. The hypotheses of interest are given by

$$H_0: |\mu_T - \mu_R| \ge \delta \quad \text{vs.} \quad H_a: |\mu_T - \mu_R| < \delta \tag{2}$$

where δ (> 0) is a prespecified equivalence margin.

3. THE WELL-KNOWN CLOSED-SAMPLE SIZE CALCULATION FORMULA

Let X_{T,i} (i = 1, ..., n1) and X_{R,i} (i = 1, ..., n2) denote the response variables from the biosimilar product in the first group and the renovator biological product in the second group, respectively. It is assumed that the X_{T,i} and X_{R,i} are independent and normally distributed with means μT and μR, respectively, and a common variance σ². Furthermore, the common variance σ² is assumed to be known for the calculation of the sample size. The hypotheses of equivalence are then given by equation (2) in section 2. The hypotheses in equation (2) can be decomposed into two one-sided hypotheses as follows:

$$H_{01}: \mu_T - \mu_R \le -\delta \quad \text{vs.} \quad H_{a1}: \mu_T - \mu_R > -\delta$$

and

$$H_{02}: \mu_T - \mu_R \ge \delta \quad \text{vs.} \quad H_{a2}: \mu_T - \mu_R < \delta.$$

If both H01 and H02 are rejected at the significance level α, it can be concluded that H0 is rejected at the significance level α, which means that the biosimilar product and the renovator biological product are claimed to be biosimilar. Both H01 and H02 are rejected at a significance level α if

$$Z_L < -z_\alpha \quad \text{and} \quad Z_U > z_\alpha$$

where

$$Z_L = \frac{\bar{X}_T - \bar{X}_R - \delta}{\sigma\sqrt{1/n_1 + 1/n_2}} \quad \text{and} \quad Z_U = \frac{\bar{X}_T - \bar{X}_R + \delta}{\sigma\sqrt{1/n_1 + 1/n_2}}$$


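where X̄T and X̄R are the sample means of the primary endpoint in the first group and the second group, respectively. Under the alternative hypothesis Ha : |μT − μR| < δ, the power of this test is given by

\begin{align}
& P(Z_L < -z_\alpha \text{ and } Z_U > z_\alpha \mid H_a) \notag \\
&= P(Z_L < -z_\alpha \mid H_a) + P(Z_U > z_\alpha \mid H_a) - P(Z_L < -z_\alpha \text{ or } Z_U > z_\alpha \mid H_a) \tag{3} \\
&= P(Z_L < -z_\alpha \mid H_a) + P(Z_U > z_\alpha \mid H_a) - \left[1 - P(Z_L \ge -z_\alpha \text{ and } Z_U \le z_\alpha \mid H_a)\right] \notag \\
&\ge P(Z_L < -z_\alpha \mid H_a) + P(Z_U > z_\alpha \mid H_a) - 1 \notag \\
&= \Phi\left(\frac{\delta - (\mu_T - \mu_R)}{\sigma\sqrt{1/n_1 + 1/n_2}} - z_\alpha\right) + \Phi\left(\frac{\delta + (\mu_T - \mu_R)}{\sigma\sqrt{1/n_1 + 1/n_2}} - z_\alpha\right) - 1 \tag{4} \\
&\ge 2\Phi\left(\frac{\delta - |\mu_T - \mu_R|}{\sigma\sqrt{1/n_1 + 1/n_2}} - z_\alpha\right) - 1 \tag{5}
\end{align}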

where Φ is the cumulative distribution function of the standard normal distribution. The sample size is often calculated based on equation (5) in order to obtain a closed formula (for example, Chow, 2003, p. 60). In other words, the sample size needed to achieve power 1 − β can be obtained by solving the following equation:

$$1 - \beta = 2\Phi\left(\frac{\delta - |\mu_T - \mu_R|}{\sigma\sqrt{1/n_1 + 1/n_2}} - z_\alpha\right) - 1.$$

Rearranging gives Φ(·) = 1 − β/2, and hence

$$z_{\beta/2} = \frac{\delta - |\mu_T - \mu_R|}{\sigma\sqrt{1/n_1 + 1/n_2}} - z_\alpha.$$
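The remaining algebra is short and can be written out explicitly. Substituting n1 = k n2, so that 1/n1 + 1/n2 = (1/n2)(1 + 1/k), and solving for n2:

```latex
\begin{align*}
z_\alpha + z_{\beta/2}
  &= \frac{\delta - |\mu_T - \mu_R|}{\sigma\sqrt{\frac{1}{n_2}\left(1 + \frac{1}{k}\right)}}
   = \frac{(\delta - |\mu_T - \mu_R|)\sqrt{n_2}}{\sigma\sqrt{1 + \frac{1}{k}}}, \\
n_2
  &= \frac{(z_\alpha + z_{\beta/2})^2 \sigma^2}{(\delta - |\mu_T - \mu_R|)^2}\left(1 + \frac{1}{k}\right).
\end{align*}
```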

This leads to the well-known sample size calculation method for the equivalence test, whose formula is given by equation (1) in section 1.

4. THE COMPARISON OF THE EXACT AND THE APPROXIMATE POWER

The well-known sample size calculation formula for the equivalence test is derived in section 3 and given by equation (1) in section 1. However, the power obtained from equation (5) is only an approximate power. Furthermore, the exact power with the sample size obtained from equation (1) may be greater than 1 − β, because the two inequalities in equations (4) and (5) are employed in the derivation of the approximate power. Therefore, it is of interest to compare the exact power obtained from equation (3) with the approximate power calculated from equation (5). Both the exact and the approximate powers are computed numerically and presented in Table 1. Table 1 shows that the exact powers are always greater than the approximate powers. In many cases when the approximate powers are about 80%, the exact powers are about 90%. For example, when α = 0.05, n1 = n2 = 100, δ = 1, σ = 1, and μT − μR = 0.58, the exact power is 90.7%, but the approximate power is only 81.4%. A difference of about 10 percentage points in power is too large to ignore. Figure 1 also shows the differences between the two powers graphically when α = 0.05, n1 = n2 = 100, σ = 1, and δ = 1. As the value of μT − μR increases, the accuracy of the approximate power drops rapidly. When the value of μT − μR is close to 0.99, the approximate powers drop below zero, which is unacceptable.

Table 1 The exact powers and the approximate powers (α = 0.05)

| n1 = n2 | δ | σ | μT − μR | Exact | Approx. | σ | μT − μR | Exact | Approx. |
|---|---|---|---|---|---|---|---|---|---|
| 100 | 1 | 1 | 0.50 | 0.971 | 0.941 | 2 | 0.02 | 0.940 | 0.931 |
| 100 | 1 | 1 | 0.52 | 0.959 | 0.919 | 2 | 0.04 | 0.938 | 0.919 |
| 100 | 1 | 1 | 0.54 | 0.946 | 0.892 | 2 | 0.06 | 0.935 | 0.906 |
| 100 | 1 | 1 | 0.56 | 0.928 | 0.857 | 2 | 0.08 | 0.931 | 0.892 |
| 100 | 1 | 1 | 0.58 | 0.907 | 0.814 | 2 | 0.10 | 0.925 | 0.875 |
| 100 | 1 | 1 | 0.60 | 0.881 | 0.763 | 2 | 0.12 | 0.918 | 0.857 |
| 100 | 1 | 1 | 0.62 | 0.851 | 0.702 | 2 | 0.14 | 0.910 | 0.837 |
| 100 | 1 | 1 | 0.64 | 0.816 | 0.632 | 2 | 0.16 | 0.900 | 0.814 |
| 100 | 1 | 1 | 0.66 | 0.776 | 0.552 | 2 | 0.18 | 0.889 | 0.790 |
| 100 | 1 | 1 | 0.68 | 0.731 | 0.463 | 2 | 0.20 | 0.876 | 0.763 |
| 100 | 2 | 2 | 1.10 | 0.937 | 0.875 | 4 | 0.10 | 0.937 | 0.913 |
| 100 | 2 | 2 | 1.12 | 0.928 | 0.857 | 4 | 0.15 | 0.932 | 0.895 |
| 100 | 2 | 2 | 1.14 | 0.918 | 0.837 | 4 | 0.20 | 0.925 | 0.875 |
| 100 | 2 | 2 | 1.16 | 0.907 | 0.814 | 4 | 0.25 | 0.916 | 0.852 |
| 100 | 2 | 2 | 1.18 | 0.895 | 0.790 | 4 | 0.30 | 0.905 | 0.826 |
| 100 | 2 | 2 | 1.20 | 0.881 | 0.763 | 4 | 0.35 | 0.892 | 0.796 |
| 100 | 2 | 2 | 1.22 | 0.867 | 0.734 | 4 | 0.40 | 0.876 | 0.763 |
| 100 | 2 | 2 | 1.24 | 0.851 | 0.702 | 4 | 0.45 | 0.859 | 0.726 |
| 100 | 2 | 2 | 1.26 | 0.834 | 0.668 | 4 | 0.50 | 0.840 | 0.685 |
| 100 | 2 | 2 | 1.28 | 0.816 | 0.632 | 4 | 0.55 | 0.818 | 0.641 |
| 200 | 1 | 1 | 0.64 | 0.974 | 0.949 | 2 | 0.30 | 0.968 | 0.936 |
| 200 | 1 | 1 | 0.66 | 0.960 | 0.920 | 2 | 0.32 | 0.960 | 0.920 |
| 200 | 1 | 1 | 0.68 | 0.940 | 0.880 | 2 | 0.34 | 0.951 | 0.902 |
| 200 | 1 | 1 | 0.70 | 0.912 | 0.824 | 2 | 0.36 | 0.940 | 0.880 |
| 200 | 1 | 1 | 0.72 | 0.875 | 0.751 | 2 | 0.38 | 0.927 | 0.854 |
| 200 | 1 | 1 | 0.74 | 0.830 | 0.660 | 2 | 0.40 | 0.912 | 0.824 |
| 200 | 1 | 1 | 0.76 | 0.774 | 0.549 | 2 | 0.42 | 0.895 | 0.790 |
| 200 | 1 | 1 | 0.78 | 0.710 | 0.421 | 2 | 0.44 | 0.875 | 0.751 |
| 200 | 1 | 1 | 0.80 | 0.638 | 0.277 | 2 | 0.46 | 0.854 | 0.708 |
| 200 | 1 | 1 | 0.82 | 0.561 | 0.123 | 2 | 0.48 | 0.830 | 0.660 |
| 200 | 2 | 2 | 1.10 | 0.997 | 0.995 | 4 | 0.50 | 0.982 | 0.964 |
| 200 | 2 | 2 | 1.15 | 0.995 | 0.990 | 4 | 0.55 | 0.976 | 0.952 |
| 200 | 2 | 2 | 1.20 | 0.990 | 0.981 | 4 | 0.60 | 0.968 | 0.936 |
| 200 | 2 | 2 | 1.25 | 0.982 | 0.964 | 4 | 0.65 | 0.958 | 0.916 |
| 200 | 2 | 2 | 1.30 | 0.968 | 0.936 | 4 | 0.70 | 0.945 | 0.891 |
| 200 | 2 | 2 | 1.35 | 0.945 | 0.891 | 4 | 0.75 | 0.930 | 0.861 |
| 200 | 2 | 2 | 1.40 | 0.912 | 0.824 | 4 | 0.80 | 0.912 | 0.824 |
| 200 | 2 | 2 | 1.45 | 0.865 | 0.730 | 4 | 0.85 | 0.890 | 0.781 |
| 200 | 2 | 2 | 1.50 | 0.803 | 0.607 | 4 | 0.90 | 0.865 | 0.730 |
| 200 | 2 | 2 | 1.55 | 0.727 | 0.454 | 4 | 0.95 | 0.836 | 0.672 |

Figure 1 The comparison of the exact and the approximate powers (δ = 1, σ = 1, n1 = n2 = 100; horizontal axis: μT − μR from 0.3 to 0.99; vertical axis: power from −0.9 to 1.0).

The differences between the two powers may produce different sample sizes. R code was written to compute the sample sizes based on both the exact powers and the approximate powers. Table 2 displays the sample sizes needed to achieve 80% and 90% power under each of the two power calculations. The sample sizes based on the approximate powers are always greater than those based on the exact powers. For instance, when δ = 1, σ = 1, α = 0.05, β = 0.2, and μT − μR = 0.60, the sample size based on the approximate power is 108, but the sample size based on the exact power is only 78. A difference of 30 patients in each group may incur a large extra cost. Another interesting phenomenon is that the sample sizes to achieve 80% power based on the approximate power are the same as those to achieve 90% power based on the exact power. This occurs in 30 of the 32 cases in Table 2 and can be explained by the following theorem.

Theorem 1. Let n1 = n2 and

$$w = z_\alpha - \frac{(\mu_T - \mu_R) + \delta}{\sigma\sqrt{1/n_1 + 1/n_2}}.$$

When w is so small that Φ(w) is negligible, the exact power with the sample size to achieve power 1 − β based on the approximate power is actually 1 − β/2.

Proof. The exact power in equation (3) is given by

\begin{align*}
P(Z_L < -z_\alpha \text{ and } Z_U > z_\alpha \mid H_a)
&\approx P(Z_L < -z_\alpha \mid H_a), && \text{since } \Phi(w) \approx 0 \\
&= \Phi\left(-z_\alpha - \frac{(\mu_T - \mu_R) - \delta}{\sigma\sqrt{2/n_1}}\right) \\
&= \Phi(z_{\beta/2}), && \text{since } n_1 = 2\sigma^2 (z_\alpha + z_{\beta/2})^2 / (\delta - |\mu_T - \mu_R|)^2 \\
&= 1 - \beta/2.
\end{align*}
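The exact power in equation (3) and the approximate power in equation (5) can both be evaluated with `pnorm`. The sketch below assumes a known common σ; the function names `exact_power` and `approx_power`, the argument name `diff` (for μT − μR), and the truncation at zero with `pmax` are choices made for this example. It rewrites the rejection region Z_L < −zα and Z_U > zα as an interval for X̄T − X̄R.

```r
# Exact power, equation (3): the test rejects when
#   -delta + z * se < Xbar_T - Xbar_R < delta - z * se,
# where Xbar_T - Xbar_R ~ N(diff, se^2) under H_a.
# Names are illustrative assumptions, not taken from the article.
exact_power <- function(n1, n2, delta, diff, sigma, alpha = 0.05) {
  se <- sigma * sqrt(1 / n1 + 1 / n2)
  z  <- qnorm(1 - alpha)
  upper <- (delta - diff) / se - z     # standardized upper limit
  lower <- z - (delta + diff) / se     # standardized lower limit (w in Theorem 1)
  pmax(0, pnorm(upper) - pnorm(lower)) # region is empty when upper <= lower
}

# Approximate power, the lower bound in equation (5).
approx_power <- function(n1, n2, delta, diff, sigma, alpha = 0.05) {
  se <- sigma * sqrt(1 / n1 + 1 / n2)
  2 * pnorm((delta - abs(diff)) / se - qnorm(1 - alpha)) - 1
}

exact_power(100, 100, delta = 1, diff = 0.58, sigma = 1)
approx_power(100, 100, delta = 1, diff = 0.58, sigma = 1)
```

For n1 = n2 = 100, δ = 1, σ = 1 and μT − μR = 0.58 the two calls give values close to the corresponding Table 1 entries (about 0.907 and about 0.81).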


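Table 2 Sample size calculations based on the exact powers and the approximate powers

| δ | σ | μT − μR | Exact (80% power) | Approx. (80% power) | Exact (90% power) | Approx. (90% power) |
|---|---|---|---|---|---|---|
| 1 | 1 | 0.50 | 50 | 69 | 69 | 87 |
| 1 | 1 | 0.55 | 62 | 85 | 85 | 107 |
| 1 | 1 | 0.60 | 78 | 108 | 108 | 136 |
| 1 | 1 | 0.65 | 101 | 140 | 140 | 177 |
| 1 | 1 | 0.70 | 138 | 191 | 191 | 241 |
| 1 | 1 | 0.75 | 198 | 275 | 275 | 347 |
| 1 | 1 | 0.80 | 310 | 429 | 429 | 542 |
| 1 | 1 | 0.85 | 550 | 762 | 762 | 963 |
| 2 | 1 | 1.40 | 35 | 48 | 48 | 61 |
| 2 | 1 | 1.45 | 41 | 57 | 57 | 72 |
| 2 | 1 | 1.50 | 50 | 69 | 69 | 87 |
| 2 | 1 | 1.55 | 62 | 85 | 85 | 107 |
| 2 | 1 | 1.60 | 78 | 108 | 108 | 136 |
| 2 | 1 | 1.65 | 101 | 140 | 140 | 177 |
| 2 | 1 | 1.70 | 138 | 191 | 191 | 241 |
| 2 | 1 | 1.75 | 198 | 275 | 275 | 347 |
| 1 | 2 | 0.10 | 72 | 85 | 92 | 107 |
| 1 | 2 | 0.20 | 81 | 108 | 109 | 136 |
| 1 | 2 | 0.30 | 102 | 140 | 140 | 177 |
| 1 | 2 | 0.40 | 138 | 191 | 191 | 241 |
| 1 | 2 | 0.50 | 198 | 275 | 275 | 347 |
| 1 | 2 | 0.60 | 310 | 429 | 429 | 542 |
| 1 | 2 | 0.70 | 550 | 762 | 762 | 963 |
| 1 | 2 | 0.80 | 1237 | 1713 | 1713 | 2165 |
| 2 | 3 | 0.50 | 51 | 69 | 69 | 87 |
| 2 | 3 | 0.60 | 58 | 79 | 79 | 100 |
| 2 | 3 | 0.70 | 66 | 92 | 92 | 116 |
| 2 | 3 | 0.80 | 78 | 108 | 108 | 136 |
| 2 | 3 | 0.90 | 92 | 128 | 128 | 162 |
| 2 | 3 | 1.00 | 112 | 155 | 155 | 195 |
| 2 | 3 | 1.10 | 138 | 191 | 191 | 241 |
| 2 | 3 | 1.20 | 174 | 241 | 241 | 305 |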

Φ(w) is negligible for most of the cases in Table 2. For example, when δ = 1, σ = 1, α = 0.05, n1 = 78, n2 = 108, and μT − μR = 0.6, Φ(w) = 3.5 × 10^{−17}.
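Theorem 1 can be checked numerically for a single configuration. The sketch below takes δ = 1, σ = 1, μT − μR = 0.60, α = 0.05 and β = 0.2 with equal allocation, computes the sample size from equation (1), and then evaluates Φ(w) and the exact power at that sample size. All variable names are assumptions made for this example.

```r
# Numerical check of Theorem 1 with n1 = n2 = n (names are illustrative).
alpha <- 0.05; beta <- 0.20
delta <- 1; sigma <- 1; diff <- 0.60
z_alpha <- qnorm(1 - alpha)

# Sample size per group from equation (1) with k = 1.
n <- ceiling(2 * sigma^2 * (z_alpha + qnorm(1 - beta / 2))^2 /
             (delta - abs(diff))^2)

se <- sigma * sqrt(2 / n)
w  <- z_alpha - (diff + delta) / se                 # w as defined in Theorem 1
phi_w <- pnorm(w)                                   # term treated as negligible
main_term <- pnorm(-z_alpha - (diff - delta) / se)  # P(Z_L < -z_alpha | H_a)
exact <- main_term - phi_w                          # exact power, equation (3)

# Phi(w) at the exact-power sample size n1 = n2 = 78, for comparison.
phi_w_78 <- pnorm(z_alpha - (diff + delta) / (sigma * sqrt(2 / 78)))

c(n = n, phi_w = phi_w, exact = exact, target = 1 - beta / 2)
```

Here n is 108, Φ(w) is far below machine-level relevance, and the exact power sits just above the 1 − β/2 = 0.90 predicted by the theorem; the small excess comes from rounding n up. At n1 = n2 = 78 the same expression gives Φ(w) of about 3.5 × 10^{−17}.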

5. CONCLUSION AND DISCUSSION

In this article, we investigate the accuracy of the sample size calculation formula based on the approximate power in Chow (2003, p. 60). The formula turns out to be very conservative. Specifically, in many practical cases the sample sizes calculated to achieve 80% power based on the approximate power actually produce 90% exact power. Therefore, the sample size formula based on the approximate power may cause the biotechnology industry to spend more than necessary. We propose that the sample size be calculated from the exact power. Although a closed formula for the exact sample size is not available, it is not hard to compute the exact powers numerically. The R code to calculate the sample size accurately is provided in the appendix.

In section 3, the exact power is derived when the common population variance σ² is assumed to be known for the calculation of the sample size. When the unequal population variances σT² and σR² are assumed to be known, Zhang et al. (2013) derived the following formula for the exact power, which can be employed to compute the sample size. This formula represents the probability that the 100(1 − 2α)% confidence interval for μT − μR lies within (−δ, δ) under the alternative hypothesis.


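\begin{align*}
& P\left(-\delta \le \bar{X}_T - \bar{X}_R - z_\alpha\sqrt{\frac{\sigma_T^2}{n_1} + \frac{\sigma_R^2}{n_2}} \ \text{ and } \ \bar{X}_T - \bar{X}_R + z_\alpha\sqrt{\frac{\sigma_T^2}{n_1} + \frac{\sigma_R^2}{n_2}} \le \delta \,\Big|\, H_a\right) \\
&= \Phi\left(\frac{\delta - z_\alpha C \cdot CV_R - (\mu_T - \mu_R)}{C \cdot CV_R}\right) - \Phi\left(\frac{-[\delta - z_\alpha C \cdot CV_R] - (\mu_T - \mu_R)}{C \cdot CV_R}\right)
\end{align*}

where

$$n_1 = a n_2, \qquad \sigma_T^2 = b\sigma_R^2, \qquad CV_R = \frac{\sigma_R}{|\mu_R|}, \qquad C = \sqrt{\frac{1}{n_2} + \frac{b}{a n_2}}\;|\mu_R|.$$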
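Since C · CV_R equals the standard error of X̄T − X̄R, namely the square root of σT²/n1 + σR²/n2, the expression above can be evaluated directly from the two variances. The following R sketch does so; the function name `exact_power_unequal`, its argument names, and the truncation at zero with `pmax` (for the case δ ≤ zα · se, where the confidence interval can never fit inside the margin) are assumptions of this example rather than part of the cited formula.

```r
# Exact power with known, possibly unequal variances.
# Written in terms of the standard error, which equals C * CV_R in the text.
# Names and the pmax() truncation are illustrative assumptions.
exact_power_unequal <- function(n1, n2, delta, diff, sigma_T, sigma_R,
                                alpha = 0.05) {
  se <- sqrt(sigma_T^2 / n1 + sigma_R^2 / n2)
  z  <- qnorm(1 - alpha)
  half_width <- delta - z * se   # acceptance limit for Xbar_T - Xbar_R
  pmax(0, pnorm((half_width - diff) / se) - pnorm((-half_width - diff) / se))
}

# Equal variances: same setting as the Table 1 example with n1 = n2 = 100.
exact_power_unequal(100, 100, delta = 1, diff = 0.58, sigma_T = 1, sigma_R = 1)
# Larger test-product variance, all else equal.
exact_power_unequal(100, 100, delta = 1, diff = 0.58, sigma_T = 2, sigma_R = 1)
```

With σT = σR the function reduces to the common-variance exact power, so the first call returns a value close to the 0.907 reported in Table 1; increasing σT lowers the power.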

Although the population variances (σ² or σT² and σR²) are unknown, they are usually assumed to be known in many practical cases and are estimated from earlier or similar studies. If the population variance(s) are assumed to be unknown, they are usually replaced with the sample variance(s), and t distributions are used instead of the standard normal distribution. However, when the required sample sizes are not small, as in the case of clinical trials for biosimilar products, the difference between the sample sizes obtained from these two methods is usually very small, given that the t distribution converges to the standard normal distribution as the degrees of freedom increase to infinity.

APPENDIX

Power and sample size computations are done in R, an open-source statistical software package available at www.r-project.org. In the code that follows, samplesize.cal is a function that computes the smallest sample size for the given power based on either the approximate power or the exact power.

    samplesize.cal=function(quan,power)
    {
      delta=2
      diffmu=1.20
      sigma=3
      q=qnorm(0.95)
      n1.e=19
      exactpower=0
      while (exactpower
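A self-contained sketch of such a search is given below. It is not the authors' listing: the function names `equiv_power` and `sample_size_equiv`, the `method` and `n_max` arguments, and the starting value n = 2 are assumptions made for this example, and equal allocation (n1 = n2 = n) with a known common σ is assumed. Because both power functions increase with n when |μT − μR| < δ, stepping n upward until the target is reached yields the smallest adequate sample size.

```r
# Power of the equivalence test for n patients per group (n1 = n2 = n).
# All names here are illustrative assumptions.
equiv_power <- function(n, delta, diff, sigma, alpha = 0.05,
                        method = c("exact", "approx")) {
  method <- match.arg(method)
  se <- sigma * sqrt(2 / n)
  z  <- qnorm(1 - alpha)
  if (method == "exact") {
    # P(Z_L < -z_alpha and Z_U > z_alpha | H_a), equation (3)
    max(0, pnorm((delta - diff) / se - z) - pnorm(z - (delta + diff) / se))
  } else {
    # lower bound in equation (5)
    2 * pnorm((delta - abs(diff)) / se - z) - 1
  }
}

# Smallest n per group whose power reaches the target, found by stepping n.
sample_size_equiv <- function(power, delta, diff, sigma, alpha = 0.05,
                              method = c("exact", "approx"), n_max = 1e6) {
  method <- match.arg(method)
  stopifnot(delta > abs(diff), power > 0, power < 1)
  n <- 2
  while (equiv_power(n, delta, diff, sigma, alpha, method) < power) {
    n <- n + 1
    if (n > n_max) stop("no sample size up to n_max reaches the target power")
  }
  n
}

# Setting of the fragment above: delta = 2, diffmu = 1.20, sigma = 3.
sample_size_equiv(0.80, delta = 2, diff = 1.20, sigma = 3, method = "exact")
sample_size_equiv(0.80, delta = 2, diff = 1.20, sigma = 3, method = "approx")
```

For δ = 2, σ = 3, μT − μR = 1.20 and 80% power, the two calls return 174 and 241 patients per group, matching the last row of Table 2.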
