Journal of Biopharmaceutical Statistics

ISSN: 1054-3406 (Print) 1520-5711 (Online) Journal homepage: http://www.tandfonline.com/loi/lbps20

A new PK Equivalence test for a bridging study

Steven J. Novick, Xiang Zhang & Harry Yang

To cite this article: Steven J. Novick, Xiang Zhang & Harry Yang (2016): A new PK Equivalence test for a bridging study, Journal of Biopharmaceutical Statistics, DOI: 10.1080/10543406.2016.1148712

To link to this article: http://dx.doi.org/10.1080/10543406.2016.1148712

Accepted author version posted online: 16 Feb 2016.


Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=lbps20 Download by: [University of California, San Diego]

Date: 29 February 2016, At: 10:19

Title: A new PK Equivalence test for a bridging study

Short Title: A new PK Equivalence test for a bridging study

Authors: Steven J. Novick*, Xiang Zhang, and Harry Yang


Steven Novick is Director, Statistical Sciences, MedImmune, Gaithersburg, MD, [email protected] Xiang Zhang is a PhD student, Department of Statistics, North Carolina State University, Raleigh, NC, [email protected]. Harry Yang is Senior Director, Statistical Sciences, MedImmune, Gaithersburg, MD, [email protected]

Abstract

In a bridging study, the plasma drug concentration-time curve is generally used to assess bioequivalence between two formulations. Selected pharmacokinetic (PK) parameters including the area under the concentration-time curve (AUC), the maximum plasma concentration or peak exposure ($C_{\max}$), and drug half-life ($T_{1/2}$) are compared to ensure comparable bioavailability of the two formulations. Comparability in these PK parameters, however, does not necessarily imply equivalence of the entire concentration-time profile. In this article, we propose an alternative metric of equivalence based on the maximum difference between PK profiles of the two formulations. A test procedure based on Bayesian analysis and accounting for uncertainties in model parameters is developed. Through both theoretical derivation and empirical simulation, it is shown that the new method provides better control over consumer's risk.

* corresponding author Steven Novick [email protected]

Short Abstract

In a bridging study, the plasma drug concentration-time curve is generally used to assess bioequivalence between two formulations. Selected pharmacokinetic parameters including the area under the concentration-time curve, the maximum plasma concentration or peak exposure, and drug half-life are compared to ensure comparable bioavailability of the two formulations. Comparability in these PK parameters, however, does not necessarily imply equivalence of the entire concentration-time profile. In this article, we propose an alternative metric of equivalence based on the maximum difference between PK profiles of the two formulations. The new method provides better control over consumer's risk.

Keywords: Bayesian posterior probability; Bioequivalence; Concentration time curve; Pharmacokinetics

1. Introduction

Pharmacokinetics (PK) studies drug absorption, distribution, metabolism, and excretion (ADME) after in vivo administration. The drug concentrations in either blood or plasma are measured at different time points and form the drug concentration-time curve or profile. PK parameters such as AUC, $C_{\max}$, and $T_{1/2}$ are usually estimated based on the PK profile and used as surrogate efficacy endpoints in bridging studies to assess the effect of a change in manufacturing process or formulation during drug development or to establish equivalence between a new test formulation and the original marketed reference formulation. It is a regulatory requirement that a formal bridging study be carried out to evaluate the effects of post-approval changes (FDA, 1995) or to establish comparability between a drug and the reference drug (FDA, 2006; EMEA, 2010). For post-approval scale up or other changes, in vitro dissolution testing may be used as a surrogate measure of bioavailability to avoid the risk and expense of human trials and to facilitate the implementation of improvements in processes and products (FDA, 1995; Novick et al., 2015).

In a clinical bridging study, patients are randomized to receive either the original reference (R) formulation or the new test (T) formulation. Blood samples, obtained at various time points, are tested to determine serum drug concentrations, giving rise to two sets of ADME profiles. Let $f(\boldsymbol{\theta}_i, t)$ denote the average concentration of drug agent in a patient population at time t for i = R, T. It is conceivable that a direct method intended to establish comparable rate and extent of bioavailability between the two formulations is to show the similarity between $f(\boldsymbol{\theta}_i, t)$ for i = R, T. By contrast, in a traditional PK study, an indirect method is used to assess the similarity between functions $f(\boldsymbol{\theta}_R, t)$ and $f(\boldsymbol{\theta}_T, t)$ (FDA, 1992; EMEA, 2010; Chow and Liu, 1999). Specifically, bioequivalence is equated to similarity in three parameters, namely area under the curve (AUC), the maximum concentration ($C_{\max}$), and the half-life of the drug ($T_{1/2}$). For a single-dose PK study, these parameters are defined by

$$\mathrm{AUC}^{(i)}[\tau] = \int_0^{\tau} f(\boldsymbol{\theta}_i, t)\,dt, \qquad C_{\max}^{(i)}[\tau] = \sup_{t\in(0,\tau)} f(\boldsymbol{\theta}_i, t), \qquad \text{and} \qquad T_{1/2}^{(i)}[\tau] = \sup_{t\in(0,\tau)} \left\{ t : f(\boldsymbol{\theta}_i, t) = \tfrac{1}{2}\, C_{\max}^{(i)}[\tau] \right\},$$

where i = R, T and where τ is a time point such as 8, 12, or 24 hours. It is assumed that $\tau > T_{1/2}^{(R)}$ and $\tau > T_{1/2}^{(T)}$ and so the τ argument is dropped from both $C_{\max}$ and $T_{1/2}$.

The bioequivalence hypotheses for AUC, $C_{\max}$, and $T_{1/2}$ are given by $H_0: |\log(\psi^{(R)}/\psi^{(T)})| \ge \log(1.25)$ vs. $H_1: |\log(\psi^{(R)}/\psi^{(T)})| < \log(1.25)$, where ψ stands for AUC[τ], $C_{\max}$, or $T_{1/2}$. To be declared bioequivalent, tests for the three parameters must all result in an $H_1$ declaration. We call this the traditional bioequivalence method.

While the AUC, $C_{\max}$, and $T_{1/2}$ tests have proven useful to compare PK profiles, as noted by Mauger and Chinchilli (2000), analysis based solely on these metrics may give misleading

results if they fail to account for important characteristics of the concentration-time profiles, and equivalence of the PK metrics does not necessarily imply equivalence in the concentration-time profile (Westlake, 1988). In addition, useful information such as correlation among repeated measurements is not utilized in profile comparisons based on PK parameters (Liao, 2005). In fact, it is possible to declare bioequivalence with two curves that exhibit meaningful differences over a time interval. For example, consider the oral, first-order, single-compartment PK model with parameters $K_a$ and $K_e$ to respectively denote the first order absorption and elimination rate constants and parameter α = F×Dose/V, where F is the fraction of dose absorbed and V is the volume of the compartment, given by

$$f(\boldsymbol{\theta}, t) = \alpha K_a \, \frac{\exp(-K_e t) - \exp(-K_a t)}{K_a - K_e}. \tag{1}$$

Let $\boldsymbol{\theta}_R = (K_e = 0.331, K_a = 0.357, \alpha = 300)$ and $\boldsymbol{\theta}_T = (K_e = 0.277, K_a = 0.301, \alpha = 300)$. With these parameters and τ = 20, $|\log(\mathrm{AUC}^{R}[\tau]/\mathrm{AUC}^{T}[\tau])| = \log(1.18)$, $|\log(C_{\max}^{R}/C_{\max}^{T})| = \log(1.00)$, and $|\log(T_{1/2}^{R}/T_{1/2}^{T})| = \log(1.19)$ and so the two PK profiles are declared bioequivalent. A graph of the two curves is shown in the left panel of Figure 1. The right panel of Figure 1 shows the absolute log ratio of $f(\boldsymbol{\theta}_R, t)$ and $f(\boldsymbol{\theta}_T, t)$ against time, with points above the dotted line differing by more than log(1.25). Differences exceeding this bound occur for all time points after about 7.2 hours. While the ratio of areas under the curve from 0 to 7.2 hours equals 1.04, the ratio of areas under the curve from 7.2 to 20 hours equals 1.53, strongly suggesting a significant difference in the PK profiles and illustrating the weakness of summary metrics like AUC that can "average" out regions of inequivalence with those that are strongly equivalent. Such issues may arise when testing equivalence of two formulations of a delayed-release drug. The timing of the drug's release is a key characteristic of the drug. Provided the sampling period is long enough, the AUC, $C_{\max}$, and $T_{1/2}$ might be similar even though there might be a meaningful difference in the times taken for the two formulations to start being released.
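The Figure 1 example can be checked numerically. The following is a minimal sketch, assuming model (1) with the quoted parameter values, a fine time grid on (0, 20], and a plain trapezoidal rule; the helper names (`one_compartment`, `trapezoid`, `summary_metrics`) are illustrative and not taken from the article.

```python
import numpy as np

def one_compartment(t, ke, ka, alpha=300.0):
    """Model (1): oral, first-order, single-compartment curve (assumes ka != ke)."""
    return alpha * ka * (np.exp(-ke * t) - np.exp(-ka * t)) / (ka - ke)

def trapezoid(y, t):
    """Trapezoidal rule written out explicitly (hypothetical helper)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

def summary_metrics(t, y):
    """Grid approximations of AUC[tau], Cmax and T1/2 as defined in the text."""
    auc = trapezoid(y, t)
    cmax = float(y.max())
    # Latest grid time at which the curve is still at or above half of Cmax.
    t_half = float(t[y >= 0.5 * cmax].max())
    return auc, cmax, t_half

tau = 20.0
t = np.linspace(0.0, tau, 20001)[1:]   # drop t = 0, where f = 0 and log f is undefined
f_ref = one_compartment(t, ke=0.331, ka=0.357)
f_test = one_compartment(t, ke=0.277, ka=0.301)

auc_r, cmax_r, thalf_r = summary_metrics(t, f_ref)
auc_t, cmax_t, thalf_t = summary_metrics(t, f_test)

# Pointwise absolute log ratio, its maximum, and the first time it reaches log(1.25).
abs_log_ratio = np.abs(np.log(f_ref) - np.log(f_test))
max_ratio = float(np.exp(abs_log_ratio.max()))
first_exceed = float(t[abs_log_ratio >= np.log(1.25)].min())

print("AUC ratio (T/R):   ", round(auc_t / auc_r, 3))
print("Cmax ratio (T/R):  ", round(cmax_t / cmax_r, 3))
print("T1/2 ratio (T/R):  ", round(thalf_t / thalf_r, 3))
print("max pointwise ratio:", round(max_ratio, 3))
print("first time ratio >= 1.25:", round(first_exceed, 2))
```

Under these assumptions all three summary ratios stay inside the 1.25 bound, while the pointwise ratio exceeds it over the later part of the interval, which is the behaviour the example is meant to illustrate.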

Several alternative approaches have been proposed to establish bioequivalence through direct comparison between the profiles of two formulations. Mauger and Chinchilli (2000) suggest using the following index for testing profile similarity:

$$\psi = \operatorname{sign}(\theta) \int_0^{\tau} |r(t)|\,dt, \tag{2}$$

where $\theta = (\mathrm{AUC}_T - \mathrm{AUC}_R)/\mathrm{AUC}_R$ and $r(t) = \{f(\boldsymbol{\theta}_T, t) - f(\boldsymbol{\theta}_R, t)\}/f(\boldsymbol{\theta}_R, t)$. The index ψ has the property that $|\psi| \ge |\theta|$. Therefore it is more stringent than the AUC index. However, the index has the potential to be too sensitive to clinically unimportant differences in profiles. As a remedy, an alternative index ψ(S) is proposed by the same authors by smoothing the integrand in (2), using an arbitrary smoothing kernel S:

$$\psi(S) = \operatorname{sign}(\theta) \int_0^{\tau} \left| \int S(u, t)\, r(t + u)\,du \right| dt. \tag{3}$$

It is shown that when kernel S is a moving average, the resultant index $\psi(S_{MA})$ satisfies $|\theta| \le |\psi(S_{MA})| \le |\psi|$. In other words, the index $\psi(S_{MA})$ provides a compromise between the traditional AUC-based index θ and the profile-based index ψ. The drawback with ψ and $\psi(S_{MA})$ is that they are also summary metrics and are, as such, less sensitive to relatively large and meaningful differences in profiles, as the integration in (2) and (3) smooths out the large differences. Furthermore, although the indices are stringent when compared to AUC, it is unclear how they compare to $C_{\max}$ or the combination of AUC, $C_{\max}$ and $T_{1/2}$ in making bioequivalence inferences. Finally, neither index accounts for uncertainties in model parameters.

In this article, we propose an alternative set of hypotheses for evaluating bioequivalence using the maximum difference between PK profiles of the two formulations by

$$H_0: \max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| \ge \log(\delta) \quad \text{vs.} \quad H_1: \max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(\delta) \tag{4}$$

for some pre-specified time point τ and equivalence limit δ. Chen and Huang (2009) offer the same set of hypotheses with the modification of taking the maximum over the set of time points $t \in (t_1, t_2)$ and setting δ = 1.25. Chen and Huang (2009) evaluate the hypotheses by using parametric bootstrapping to create an upper 100(1-α)% confidence limit for $\max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)|$ and declaring $H_1$ if the upper confidence limit is less than log(1.25). We modernize the testing procedure of Chen and Huang (2009) via Bayesian methods. It is also shown that $H_1$ of (4) is a more stringent measure than both AUC and $C_{\max}$. It therefore provides greater protection against consumer's risk than the traditional PK bioequivalence analysis. The test of equivalence between two formulations is carried out through a Bayesian test procedure, which incorporates uncertainties in model parameters. The Bayesian paradigm of comparing two curves was earlier proposed in Novick, Yang, and Peterson (2012) in which two bioassay curves are shown to be equivalent across a range of concentrations after an x-axis shift. Novick et al. (2015) utilize a similar strategy for dissolution curve comparison in a chemistry, manufacture, and controls (CMC) setting. Through a simulation study, the method is shown to have good performance when compared to the traditional PK approach. A relaxation of the hypotheses in (4) is also provided, which declares equivalence if the proportion of inconsequential differences exceeds a pre-specified limit. The proposed method is evaluated through a simulation study.

Figure 1. The PK reference and test curves (left) and absolute log ratios of the two curves (right). Model parameters are $\boldsymbol{\theta}_R = (K_e = 0.331, K_a = 0.357, \alpha = 300)$ and $\boldsymbol{\theta}_T = (K_e = 0.277, K_a = 0.301, \alpha = 300)$ using model (1).

2. Theoretical Properties

Next, it is shown that $\max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(1.25)$ implies equivalence in both AUC and $C_{\max}$, meaning that $H_1$ in (4) is, in a sense, a stricter test.

Result 2.1: The alternative hypothesis in (4) implies $|\log(\mathrm{AUC}^{R}[\tau]) - \log(\mathrm{AUC}^{T}[\tau])| < \log(\delta)$.

Consider the case in which $\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t) < \log(\delta)$ for all t ∈ (0, τ). It follows that $f(\boldsymbol{\theta}_R, t) < \delta f(\boldsymbol{\theta}_T, t)$ and so $\log\{\int_0^{\tau} f(\boldsymbol{\theta}_R, t)\,dt \,/ \int_0^{\tau} f(\boldsymbol{\theta}_T, t)\,dt\} < \log(\delta)$. Similarly, by starting with $\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t) > -\log(\delta)$ for all t ∈ (0, τ), one obtains $\log\{\int_0^{\tau} f(\boldsymbol{\theta}_R, t)\,dt \,/ \int_0^{\tau} f(\boldsymbol{\theta}_T, t)\,dt\} > -\log(\delta)$. Thus, $H_1$ in (4) implies the AUC bioequivalence alternative hypothesis.

Result 2.2: The alternative hypothesis in (4) implies $|\log(C_{\max}^{R}) - \log(C_{\max}^{T})| < \log(\delta)$.

Consider $\log f(\boldsymbol{\theta}_T, t) < \log(\delta) + \log f(\boldsymbol{\theta}_R, t) \le \log(\delta) + \log(C_{\max}^{R})$ for all t ∈ (0, τ). It follows that $\log(C_{\max}^{T}) < \log(\delta) + \log(C_{\max}^{R})$. Similarly, by starting with $\log f(\boldsymbol{\theta}_R, t) < \log(\delta) + \log f(\boldsymbol{\theta}_T, t) \le \log(\delta) + \log(C_{\max}^{T})$ for all t ∈ (0, τ), one obtains $\log(C_{\max}^{R}) < \log(\delta) + \log(C_{\max}^{T})$, that is, $\log(C_{\max}^{T}) > -\log(\delta) + \log(C_{\max}^{R})$. Thus, $H_1$ in (4) implies the $C_{\max}$ bioequivalence alternative hypothesis.

Taken together, Results 2.1 and 2.2 indicate that the metric introduced in this section is more stringent than AUC and $C_{\max}$. It does not, however, generally follow that the alternative hypothesis in (4) implies $|\log(T_{1/2}^{R}/T_{1/2}^{T})| < \log(\delta)$. Take, for example, the zero-order single-compartment PK model given by $f(\boldsymbol{\theta}, t) = C_0 \exp(-K_e t)$, where $C_0$ is the initial concentration and $K_e$ is the rate of excretion. If $C_0 = 100$, $K_e^{R} = 0.80$, $K_e^{T} = 1.01$, δ = 1.25, and τ = 1, then $\max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| = \log(1.23)$, but $|\log(T_{1/2}^{R}/T_{1/2}^{T})| = \log(1.26)$. Although $H_1$ in (4) does not imply the traditional $T_{1/2}$ bioequivalence result, the performance of the new metric and $T_{1/2}$ can be evaluated through simulations.
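The counterexample above is easy to reproduce. The sketch below assumes the mono-exponential curve $C_0 \exp(-K_e t)$ with the stated values and uses the fact that, for this curve, the half-life under the article's definition is $\log(2)/K_e$ (both half-lives fall below τ = 1). Variable names are illustrative.

```python
import math
import numpy as np

c0, ke_ref, ke_test, delta, tau = 100.0, 0.80, 1.01, 1.25, 1.0

# Log-curves on a grid over [0, tau]; the log difference is linear in t.
t = np.linspace(0.0, tau, 1001)
log_f_ref = np.log(c0) - ke_ref * t
log_f_test = np.log(c0) - ke_test * t

# Maximum pointwise ratio, attained as t approaches tau.
max_ratio = math.exp(float(np.max(np.abs(log_f_ref - log_f_test))))

# Half-life of a mono-exponential curve: time to fall to half of C0.
t_half_ref = math.log(2.0) / ke_ref
t_half_test = math.log(2.0) / ke_test
half_life_ratio = t_half_ref / t_half_test

print("max pointwise ratio:", round(max_ratio, 4))        # about 1.23, inside delta
print("half-life ratio:    ", round(half_life_ratio, 4))  # about 1.26, outside delta
```

The profile criterion is met with δ = 1.25 while the half-life ratio falls just outside the same bound, so neither criterion implies the other in general.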

3. A Bayesian test of bioequivalence

The proposed metric to measure bioequivalence is the Bayesian posterior probability

$$p(\delta, \tau) = \Pr\left( \max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(\delta) \;\middle|\; \text{data} \right). \tag{5}$$

A decision can be made to accept $H_1$ in (4) if $p(\delta,\tau) > p_0$ (say $p_0 = 0.95$); otherwise $H_0$ is accepted. The Bayesian posterior probability is estimated through the following procedure.

i. A joint likelihood is assumed for C | t, $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T, \Sigma)$, where C and t respectively denote the combined vector of concentrations and times for both reference and test data sets and Σ denotes a vector of other model parameters, such as those needed for variance terms.

ii. A prior distribution is placed on $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T, \Sigma)$.

iii. Draw a sample from the joint posterior distribution of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ | C.

iv. For every random draw of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ in iii, determine if $\max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(\delta)$.

v. After repeating steps iii and iv a large number of times, the posterior probability $p(\delta,\tau)$ is estimated by the proportion of times that $\max_{t\in(0,\tau)} |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(\delta)$.

Given that subjects are often serially sampled in a PK study, along with the nonlinear nature of $f(\boldsymbol{\theta}, t)$, the likelihood function in step i may be given by a nonlinear hierarchical model. Common software packages, such as JAGS (Plummer, 2003), OpenBUGS (Lunn, Thomas, Best, and Spiegelhalter, 2000) or Stan (Stan, 2014) make it easy to draw samples from the joint posterior distribution of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ (step iii). Still, given the level of complexity of the modeling, it may be advantageous to employ parallel processing capabilities as we did in our simulations. In Section 4, we show results of realistic simulations to illustrate the power of the test method via $p(\delta,\tau)$ given in this section, compared with the traditional bioequivalence testing.
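Steps iv and v of the procedure can be sketched as follows, assuming posterior draws are already available as arrays with columns $(K_e, K_a, \alpha)$ and that model (1) is the curve of interest. The function name `posterior_prob_equivalence`, the grid size, and the synthetic "draws" (log-normal jitter around fixed values, standing in for real MCMC output) are assumptions made for illustration only.

```python
import numpy as np

def posterior_prob_equivalence(draws_ref, draws_test, delta, tau, n_grid=500):
    """Estimate p(delta, tau) in (5) from paired posterior draws.

    draws_ref, draws_test: arrays of shape (n_draws, 3), columns (Ke, Ka, alpha).
    The maximum over (0, tau) is approximated on a grid of positive times.
    """
    t = np.linspace(tau / n_grid, tau, n_grid)[None, :]

    def curves(draws):
        ke, ka, alpha = draws[:, 0:1], draws[:, 1:2], draws[:, 2:3]
        # Model (1); assumes ka != ke in every draw.
        return alpha * ka * (np.exp(-ke * t) - np.exp(-ka * t)) / (ka - ke)

    abs_log_ratio = np.abs(np.log(curves(draws_ref)) - np.log(curves(draws_test)))
    # Step iv: check the criterion per draw; step v: average over draws.
    return float(np.mean(abs_log_ratio.max(axis=1) < np.log(delta)))

# Synthetic stand-in for MCMC output (NOT a fitted posterior).
rng = np.random.default_rng(0)
theta_ref = np.array([0.257, 0.294, 300.0])
theta_test = np.array([0.248, 0.294, 300.0])
draws_ref = theta_ref * np.exp(rng.normal(0.0, 0.02, size=(4000, 3)))
draws_test = theta_test * np.exp(rng.normal(0.0, 0.02, size=(4000, 3)))

p_hat = posterior_prob_equivalence(draws_ref, draws_test, delta=1.35, tau=18.0)
print("estimated p(delta, tau):", p_hat)
print("declare H1:", p_hat > 0.95)
```

In practice the two arrays would come from the same joint posterior sample, so that row k of each array belongs to the same MCMC iteration.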

4. Simulated results

In this section, the statistic $p(\delta,\tau)$ from (5), used to test the hypotheses in (4), is characterized and compared with the traditional bioequivalence test for six different scenarios. Data were generated from the oral, first-order, single-compartment PK model in (1) with α = 300 so that

$$f(\boldsymbol{\theta}, t) = 300 K_a \, \frac{\exp(-K_e t) - \exp(-K_a t)}{K_a - K_e}.$$

It was assumed that concentrations were log-normally distributed so that $\log_{10}(C_{ijt}) = \log_{10}\{f(\boldsymbol{\theta}_{ij}, t)\} + \varepsilon_{ijt}$, where $C_{ijt}$ denotes the concentration of subject j (j = 1, 2, …, N) on formulation i (i = R, T) at time point t, $\log_e(\boldsymbol{\theta}_{ij}) \sim N(\log_e(\boldsymbol{\theta}_i), \boldsymbol{V})$, $\boldsymbol{\theta}_i = (K_{e,i}, K_{a,i}, \alpha_i = 300)$, and the $\varepsilon_{ijt}$ are normally distributed errors with constant standard deviation σ. This system describes a hierarchical nonlinear model. The number of subjects was set to N = 20 for each formulation (i = R, T) with $K_a$ and $K_e$ values given in Table 1 for the six different scenarios,

$$\boldsymbol{V} = \begin{pmatrix} 0.01 & 0 & 0 \\ 0 & 0.01 & 0 \\ 0 & 0 & 0.01 \end{pmatrix},$$

and σ = 0.022. On the concentration scale, the within-subject residual variability was set to %CV (coefficient of variation) = 5% and the total %CV (between + within errors) ranged from 15% to 25%. The concentration of the compound was measured for each subject at 0.5, 0.8, 1.5, 2, 3.5, 5, 7, 9, 12, 14, and 16 hours (11 time points). The left panel of Figure 2 shows the reference curve $f(\boldsymbol{\theta}_R, t)$ with the eleven time points plotted on the curve. The right panel of Figure 2 shows a typical simulated reference data set, which illustrates the within-subject and between-subject variability.
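The data-generating model described above can be sketched as follows. This is a minimal illustration assuming a diagonal V with 0.01 on the diagonal, independent residuals on the log10 scale, and NumPy's random generator; the function name `simulate_formulation` and the seed are hypothetical and not part of the article.

```python
import numpy as np

# Sampling times in hours, as listed in the text (11 time points).
TIMES = np.array([0.5, 0.8, 1.5, 2.0, 3.5, 5.0, 7.0, 9.0, 12.0, 14.0, 16.0])

def simulate_formulation(theta_mean, n_subjects=20, v_diag=0.01, sigma=0.022, rng=None):
    """Simulate concentrations for one formulation.

    theta_mean: population values (Ke, Ka, alpha).
    Returns an array of shape (n_subjects, len(TIMES)).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Subject-level parameters: log_e(theta_ij) ~ N(log_e(theta_i), V), V diagonal.
    log_theta = rng.normal(np.log(theta_mean), np.sqrt(v_diag), size=(n_subjects, 3))
    ke, ka, alpha = (col[:, None] for col in np.exp(log_theta).T)
    # Subject-specific mean curve from model (1).
    mean_curve = alpha * ka * (np.exp(-ke * TIMES) - np.exp(-ka * TIMES)) / (ka - ke)
    # Residual error acts additively on the log10 scale.
    log10_conc = np.log10(mean_curve) + rng.normal(0.0, sigma, size=mean_curve.shape)
    return 10.0 ** log10_conc

rng = np.random.default_rng(2016)
conc_ref = simulate_formulation(np.array([0.257, 0.294, 300.0]), rng=rng)
conc_test = simulate_formulation(np.array([0.248, 0.294, 300.0]), rng=rng)
print(conc_ref.shape, conc_test.shape)
```

Each row is one subject's serially sampled profile; fitting the hierarchical model to such data is a separate step that this sketch does not cover.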

For each simulated run, a nonlinear hierarchical model was fitted to the data by Bayesian methods in order to draw posterior samples of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$. A modicum of prior knowledge of the reference distribution was assumed, and since V and σ are common parameters to both reference and test data sets, both reference and test posterior distributions were informed. For the variance parameter prior distributions, it was assumed that $\boldsymbol{V}^{-1} \sim \text{Wishart}(20 \times (\text{true } \boldsymbol{V}), \text{degrees of freedom} = 20)$, and σ ~ U(0, 1). For the mean-model parameters, the assumed prior distribution was $\log_e(\theta_k^{(i)}) \sim N(\text{Mean} = \text{true } \log_e(\theta_k^{(i)}), \text{SD} = 0.5)$ for i = R, T, where $\theta_k^{(i)}$ denotes the kth element of the vector $\boldsymbol{\theta}^{(i)}$. We centered the prior distribution for $\boldsymbol{\theta}^{(i)}$ at the actual parameter value to save some computational time. In practice, we would center the prior distribution for $\boldsymbol{\theta}^{(i)}$ at an estimated value.

To draw samples from the posterior distribution, four independent MCMC chains were run with JAGS software, each with ten million posterior samples after a burn-in of 10,000 and thinning by 200. The total of 200,000 final posterior draws typically resulted in effective sample sizes above 10,000 for all modeled parameters, though some Monte Carlo runs occasionally resulted in a minimum effective sample size as small as 3,000 draws. With forty million posterior draws (200,000 after thinning), we felt that the sample sizes were adequate to make inferences.

From each simulated run, the posterior draws of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ were used to estimate $p(\delta,\tau)$ from (5) as well as $\mathrm{AUC}^{(i)}[\tau]$, $C_{\max}^{(i)}$, and $T_{1/2}^{(i)}$ as functions of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ with upper time limit τ = 18 hours. The AUC test posterior probability was calculated as $\Pr(|\log(\mathrm{AUC}^{R}[\tau]/\mathrm{AUC}^{T}[\tau])| < \log(1.25) \mid \text{data})$ […] rate of declaring $H_1$ to be too low when $\boldsymbol{\theta}_R = \boldsymbol{\theta}_T$ (approximately 45% acceptance rate of $H_1$ with $p(\delta,\tau) > 0.95$); so, for testing (4) with $p(\delta,\tau)$, the equivalence limit δ = 1.35 was used.

The value of the equivalence bound δ = 1.35 was selected after very careful consideration and represents a bound that is close in value to the traditional testing bound of 1.25 while acknowledging the stricter requirements for testing (4) at every time point in (0, τ). A similar decision was made in Novick et al. (2015) in which the traditional dissolution test of $f_2 > 50$ suggests that the maximum difference between two dissolution curve profiles be less than 10%; yet, in that publication, the equivalence bound was pushed to 15%. It is neither our intent nor our place to propose δ = 1.35 as the officially sanctioned equivalence bound. Rather, we propose δ = 1.35 as a reasonable bound for these simulated data and expect future debate and discussion to ensue. Alternatively, one might maintain δ at the value 1.25 but lower the acceptance probability $p_0$ to 0.9 or even 0.8. For the case $\boldsymbol{\theta}_R = \boldsymbol{\theta}_T$, with $p_0 = 0.9$, the acceptance rate of $H_1$ is 74% and with $p_0 = 0.8$, the acceptance rate of $H_1$ increases to 90%.

For each of the six scenarios, 100 Monte Carlo runs were performed in order to compare the PK profiles of the two curves. Figure 3 shows the reference and test curve means for each of the six scenarios given in Table 1. Scenario 1 represents equality of reference and test curves. Scenarios 2 and 3 represent equivalence with different test curve parameter values, but with the same maximum absolute log ratio between the two curves. Scenarios 4 and 5 are both cases on the $H_0$/$H_1$ border for (4); i.e., the maximum absolute log ratio = log(δ). Scenario 6 is a case on the $H_0$/$H_1$ border for the traditional bioequivalence test and is inequivalent in terms of (4). The maximum absolute log ratio of the two curves for t ∈ (0, τ) is shown in the "Max Ratio" column in Table 1. The posterior testing probabilities for each of the 100 Monte Carlo runs and each of six scenarios are shown in Figure 4.

Curve | $K_e$ | $K_a$ | Max Ratio
Reference | 0.257 | 0.294 |
Test 1 | 0.257 | 0.294 | $\log_{10}(1) = 0$
Test 2 | 0.248 | 0.294 | $\log_{10}(1.1)$
Test 3 | 0.257 | 0.316 | $\log_{10}(1.1)$
Test 4 | 0.228 | 0.294 | $\log_{10}(\delta) = \log_{10}(1.35)$
Test 5 | 0.257 | 0.370 | $\log_{10}(\delta) = \log_{10}(1.35)$
Test 6 | 0.198 | 0.294 | $\log_{10}(1.88)$

Table 1. The $K_a$ and $K_e$ parameter values for the reference curve and the six different test curves. The last column shows the maximum absolute log ratio between the test and reference curves in the time interval from 0 to 18 hours.

For scenarios 1-3, the AUC, $C_{\max}$, $T_{1/2}$ and traditional (all three) bioequivalence tests all resulted in posterior probabilities of 1.0 to declare $H_1$. These scenarios all represent bioequivalence with regards to the alternative hypothesis in (4). For these scenarios, $H_1$ was declared with the $p(\delta,\tau)$ statistic ($p(\delta,\tau) > 0.95$) respectively in 97, 65, and 70 cases out of 100. It is clear that the traditional bioequivalence test is more powerful for these cases. Scenarios 4 and 5 represent the $H_0$/$H_1$ border for the hypotheses in (4). For these scenarios, the posterior probability to declare bioequivalence via AUC, $C_{\max}$, $T_{1/2}$ and the traditional test remained high. Using the traditional test, the alternative hypothesis ($H_1$) was declared in 81 and 92 cases out of 100, respectively. On the other hand, the $p(\delta,\tau)$ statistic appears to be uniformly distributed between 0 and 1, a desirable frequentist testing property. This suggests that, under the frequentist belief that $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ have single true values, when using a vague prior, the $p(\delta,\tau)$ test statistic provides good type I error control. With an informative prior, the type I error must be examined through simulation. Finally, Scenario 6 represents the $H_0$/$H_1$ border for the AUC test with clear equivalence in the $C_{\max}$ and $T_{1/2}$ tests. In this case, the $C_{\max}$ and $T_{1/2}$ posterior probabilities were generally high and the AUC and traditional test posterior probabilities were roughly uniformly distributed, with perhaps a little extra probability mass near 0.5. The values for $p(\delta,\tau)$ were all less than 0.1 for this case. Scenarios 4, 5, and 6 illustrate the cases for which we see inequivalence in the definition given by (4), yet still readily declare bioequivalence with the traditional test method.

We also ran a smaller simulation with values of τ > 18 for Scenario 1 (results not shown), but noted that the ratio $f(\boldsymbol{\theta}_R, t)/f(\boldsymbol{\theta}_T, t)$ often dramatically increases when $f(\boldsymbol{\theta}_R, t)$ or $f(\boldsymbol{\theta}_T, t)$ takes on values near zero, even when $|f(\boldsymbol{\theta}_R, t) - f(\boldsymbol{\theta}_T, t)|$ is quite small. This situation resulted in very few declarations of $H_1$, even though the curve means were truly equal. In the simulations of this section, by setting τ to a value such that $f(\boldsymbol{\theta}_R, \tau) = 0.1\, C_{\max}^{R}$, the problem was averted.
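The "Max Ratio" column of Table 1 can be cross-checked from the tabulated rate constants. The sketch below assumes model (1) with α = 300, τ = 18, a fine grid of positive times, and that the first numeric column of the table is $K_e$ and the second is $K_a$ (an assumption about the column order that this check supports). Helper names are illustrative.

```python
import numpy as np

def one_compartment(t, ke, ka, alpha=300.0):
    """Model (1); assumes ka != ke."""
    return alpha * ka * (np.exp(-ke * t) - np.exp(-ka * t)) / (ka - ke)

def max_ratio(theta_ref, theta_test, tau=18.0, n_grid=18000):
    """exp of the maximum absolute log ratio over a grid on (0, tau]."""
    t = np.linspace(tau / n_grid, tau, n_grid)
    q = np.abs(np.log(one_compartment(t, *theta_ref))
               - np.log(one_compartment(t, *theta_test)))
    return float(np.exp(q.max()))

REFERENCE = (0.257, 0.294)          # (Ke, Ka)
TESTS = {                           # scenario -> (Ke, Ka)
    1: (0.257, 0.294),
    2: (0.248, 0.294),
    3: (0.257, 0.316),
    4: (0.228, 0.294),
    5: (0.257, 0.370),
    6: (0.198, 0.294),
}

ratios = {k: max_ratio(REFERENCE, theta) for k, theta in TESTS.items()}
for scenario, value in ratios.items():
    print(f"Test {scenario}: max ratio = {value:.3f}")
```

Under these assumptions the computed ratios agree with the tabulated 1, 1.1, 1.1, 1.35, 1.35, and 1.88 to roughly two decimal places.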


Figure 2. The reference curve mean with points to show the times in the experimental design (left) and a typical simulated reference data set with N=20 subjects (right).


Figure 3: The reference and test curve means for the six simulations in Section 4.


Figure 4: Posterior probability to declare equivalence (H 1 ) for the traditional bioequivalence tests and the proposed test of (4) for each of six simulated scenarios. A dashed line was drawn where the posterior probability = 0.95.


5. A relaxation of the Bayesian test of bioequivalence

From the simulated results in Section 4, the performance of the test statistic $p(\delta,\tau)$ against that of the traditional bioequivalence test is illustrated for six scenarios. In this section, a relaxation of the hypotheses (4) that leaves the test limit δ unchanged is proposed to boost the posterior probability of bioequivalence. Let $Q(t) = |\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)|$. The integral $\int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt$ (taken relative to the interval length τ) denotes the proportion of t ∈ (0, τ) such that $|\log f(\boldsymbol{\theta}_R, t) - \log f(\boldsymbol{\theta}_T, t)| < \log(\delta)$. This integral may be estimated by the proportion of $Q(t_k) < \log(\delta)$ in a large sequence of $t_k \in (0, \tau)$. The proposed relaxed hypotheses with 0.5 < q ≤ 1 are given by

$$H_0: \int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt < q \quad \text{vs.} \quad H_1: \int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt \ge q. \tag{6}$$

The alternative hypothesis in (4) is equivalent to $\int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt = 1$. The Bayesian posterior probability metric for testing (6) is

$$p(\delta, \tau, q) = \Pr\left( \int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt \ge q \;\middle|\; \text{data} \right). \tag{7}$$

A decision can be made to accept $H_1$ in (6) if $p(\delta,\tau,q) > p_0$ (say $p_0 = 0.95$); otherwise $H_0$ is accepted. The Bayesian posterior probability (7) is estimated through the following procedure.

i. A joint likelihood is assumed for C | t, $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T, \Sigma)$, where C and t respectively denote the combined vector of concentrations and times for both reference and test data sets and Σ denotes a vector of other model parameters, such as those needed for variance terms.

ii. A prior distribution is placed on $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T, \Sigma)$.

iii. Draw a sample from the joint posterior distribution of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ | C.

iv. For every random draw of $(\boldsymbol{\theta}_R, \boldsymbol{\theta}_T)$ in iii, determine if $\int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt \ge q$.

v. After repeating steps iii and iv a large number of times, the posterior probability $p(\delta,\tau,q)$ is estimated by the proportion of times that $\int_0^{\tau} I\{Q(t) < \log(\delta)\}\,dt \ge q$.

The simulated scenarios in Section 4 were re-run fifty times each with q = 0.8, 0.9, 0.95, and 1.0, computing $p(\delta,\tau,q)$ with τ = 18 hours and δ = 1.35. The results of the simulation are shown in Figure 5. From Figure 5, it is fairly plain that as q decreases, the posterior probability to declare bioequivalence ($H_1$) increases. With q = 0.8, the $p(\delta,\tau,q)$ test statistic becomes competitive with the traditional bioequivalence test when the curves are close together (scenarios 1-3) while still rejecting bioequivalence more often than the traditional test when the curves are farther apart (scenarios 4-6).
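The relaxed statistic in (7) can be sketched along the same lines as the strict one. The example below assumes model (1), approximates the proportion of time within the bound by the fraction of grid points satisfying $Q(t_k) < \log(\delta)$, and takes posterior draws as sequences of $(K_e, K_a, \alpha)$ triples; the function names and the two hand-made "draws" are illustrative only.

```python
import numpy as np

def curve(t, ke, ka, alpha=300.0):
    """Model (1); assumes ka != ke."""
    return alpha * ka * (np.exp(-ke * t) - np.exp(-ka * t)) / (ka - ke)

def proportion_within(theta_ref, theta_test, delta, tau, n_grid=2000):
    """Fraction of grid times in (0, tau] with Q(t) < log(delta)."""
    t = np.linspace(tau / n_grid, tau, n_grid)
    q_t = np.abs(np.log(curve(t, *theta_ref)) - np.log(curve(t, *theta_test)))
    return float(np.mean(q_t < np.log(delta)))

def relaxed_posterior_prob(draws_ref, draws_test, delta, tau, q):
    """Estimate p(delta, tau, q) in (7) as the share of draws meeting the q threshold."""
    props = np.array([proportion_within(r, s, delta, tau)
                      for r, s in zip(draws_ref, draws_test)])
    return float(np.mean(props >= q))

ref = (0.257, 0.294, 300.0)     # reference curve of Table 1
far = (0.198, 0.294, 300.0)     # Test 6 of Table 1

prop_same = proportion_within(ref, ref, delta=1.35, tau=18.0)
prop_far = proportion_within(ref, far, delta=1.35, tau=18.0)
print("proportion within bound, identical curves:", prop_same)
print("proportion within bound, Test 6 vs reference:", round(prop_far, 3))

# Two hand-made paired draws, only to show the call pattern.
p_relaxed = relaxed_posterior_prob([ref, ref], [ref, far], delta=1.35, tau=18.0, q=0.8)
print("p(delta, tau, q=0.8) on the toy draws:", p_relaxed)
```

Setting q = 1 recovers the strict criterion of (4) on the chosen grid, and lowering q can only leave the estimate unchanged or raise it.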

Figure 5: Posterior probability to declare equivalence ($H_1$) for the traditional bioequivalence test and the proposed tests of (6) for each of six simulated scenarios. A dashed line was drawn where the posterior probability = 0.95.

6. Discussion

Bioequivalence is traditionally tested with intersection-union testing of AUC, $C_{\max}$, and $T_{1/2}$ test methods. We propose a novel hypothesis for testing bioequivalence by comparing the closeness of the reference and test PK curves across a continuous range of time points. The equivalence test provides better control over consumer's risk from a compliance point of view. A Bayesian method is developed based on the posterior distribution of model parameters. The method calculates the posterior probability that the two curves are close together on the log scale for all time points in the time range from 0 to τ. The method requires specification of an equivalence bound δ and an upper time limit τ, a task that should not be taken lightly. A relaxation of the method was also proposed so that one can claim the two curves are close together for 100q% of the time points in the range from 0 to τ.

As mentioned in Section 4, declaring $H_1$ in (4) is difficult when one or both of $f(\boldsymbol{\theta}_R, t)$ and $f(\boldsymbol{\theta}_T, t)$ approaches zero even when $f(\boldsymbol{\theta}_R, t) = f(\boldsymbol{\theta}_T, t)$. This occurs because non-biologically-significant absolute differences $|f(\boldsymbol{\theta}_R, t) - f(\boldsymbol{\theta}_T, t)|$ can translate into large ratios. The problem appeared to be resolved through careful choice of τ such that $f(\boldsymbol{\theta}_R, \tau) = 0.1\, C_{\max}^{R}$. To provide more protection against such problems in testing (4), one can determine whether $|\log(f(\boldsymbol{\theta}_R, t)/f(\boldsymbol{\theta}_T, t))| < \log(\delta)$ over a set of time points $C_\tau \subseteq (0, \tau)$ instead of the entire range (0, τ). For example, $C_\tau$ might be set to all time points t in (0, τ) such that $f(\boldsymbol{\theta}_R, t) \ge 0.1\, C_{\max}^{R}$. A claim of $|\log(f(\boldsymbol{\theta}_R, t)/f(\boldsymbol{\theta}_T, t))| < \log(\delta)$ for all $t \in C_\tau$ implies $|\log(C_{\max}^{R}/C_{\max}^{T})| < \log(\delta)$ and implies closeness in the AUC results on the set $C_\tau$.

Finally, we propose one last test of bioequivalence that compares absolute differences of $f(\boldsymbol{\theta}_R, t)$ and $f(\boldsymbol{\theta}_T, t)$ on the original (not log-transformed) scale. For a finite time point τ, consider the set of hypotheses

$H_0$: $\max_{t\in(0,\tau)} |f(\boldsymbol{\theta}_R, t) - f(\boldsymbol{\theta}_T, t)| \ge$

$H_1$: $\max_{t\in(0,\tau)} |f(\boldsymbol{\theta}_R, t) - f(\boldsymbol{\theta}_T, t)|$
