This article was downloaded by: [University of Colorado - Health Science Library] On: 09 April 2015, At: 22:24 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biopharmaceutical Statistics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lbps20

Testing for Parallelism in the Heteroscedastic Four-Parameter Logistic Model

Kurex Sidik (a) & Jeffrey N. Jonkman (b)

(a) Bristol-Myers Squibb Company, Princeton, NJ, USA

(b) Department of Mathematics & Statistics, Grinnell College, 1116 8th Avenue, Grinnell, IA 50112, USA

Accepted author version posted online: 28 Jan 2015.

To cite this article: Kurex Sidik & Jeffrey N. Jonkman (2015): Testing for Parallelism in the Heteroscedastic Four-Parameter Logistic Model, Journal of Biopharmaceutical Statistics, DOI: 10.1080/10543406.2014.1003432

To link to this article: http://dx.doi.org/10.1080/10543406.2014.1003432

Disclaimer: This is a version of an unedited manuscript that has been accepted for publication. As a service to authors and researchers we are providing this version of the accepted manuscript (AM). Copyediting, typesetting, and review of the resulting proof will be undertaken on this manuscript before final publication of the Version of Record (VoR). During production and pre-press, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal relate to this version also.


Abstract

For bioassay data in drug discovery and development, it is often important to test for parallelism of the mean response curves for two preparations, such as a test sample and a reference sample, in determining the potency of the test preparation relative to the reference standard. For assessing parallelism under a four-parameter logistic model, tests of the parallelism hypothesis may be conducted based on the equivalence t-test or the traditional F-test. However, bioassay data often have heterogeneous variance across dose levels. Specifically, the variance of the response may be a function of the mean, frequently modeled as a power of the mean. Therefore, in this paper we discuss estimation and tests for parallelism under the power variance function. Two examples are considered to illustrate the estimation and testing approaches described. A simulation study is also presented to compare the empirical properties of the tests under the power variance function with the results from ordinary least squares fits, which ignore the non-constant variance pattern.

Keywords: Dose-response curve; Intersection-union test; Nonlinear model; Parallel assay; Dilutional similarity; Variance function; Weighted least squares

1. Introduction

The four-parameter nonlinear logistic model is commonly used to model assay or dose-response data. Although it is simple and convenient to assume constant variance of the responses in fitting the model and making inferences such as testing for parallel curves in potency assays, the variance of a bioassay response often changes across levels of the dose in a fashion related to the mean response (e.g., Rodbard and Frazier, 1975; Finney and Phillips, 1977; Raab, 1981; Peck et al., 1984; Davidian, Carroll, and Smith, 1988; O’Connell, Belanger, and Haaland, 1993; Belanger, Davidian, and Giltinan, 1996; Gottschalk and Dunn, 2005). In particular, the variance is taken to be a function of the mean model, typically in the form of a power of the mean model. Therefore, we address the issue of testing for parallelism in determining relative potency under the four-parameter nonlinear logistic model with a power of the mean variance function. This paper may be viewed as an extension of, or follow-up to, the paper by Jonkman and Sidik (2009) on the problem of parallelism testing for the same nonlinear model under constant variance.

Several parameterizations exist for the four-parameter logistic model (e.g., Finney, 1978; O’Connell, Belanger, and Haaland, 1993). We adopt the following parameterization, which we have generally found useful for fitting bioassay data:

$$Y = a + \frac{d - a}{1 + \exp[b(c - \log X)]} + \varepsilon, \tag{1}$$

where Y is the response of interest and X is the dose or concentration of a given preparation. The model parameters a, b, c, and d are interpreted as follows. The parameters a and d are the lower and upper plateau values, respectively. That is, a is the lower asymptote of the curve, which corresponds to the mean response at zero dose for an increasing response curve, or to the mean response at an arbitrarily large dose for a decreasing response curve. Similarly, d is the upper asymptote of the curve. The parameter b is the slope factor, which is essentially the slope of the linear part of the model curve. Finally, c is the logarithm of the dose corresponding to a mean response midway between the lower and upper plateaus. Thus, c is important for quantifying the potency of a given preparation under this model. Note that we typically work in the log base 10 scale, so that “log” and “exp” in (1) above are in base 10.

In this paper, we will assume that the variance of the response is a function of the mean model:

$$\mathrm{Var}(Y \mid X) = \sigma^2 \{E(Y \mid X)\}^{2\theta}, \tag{2}$$

where θ and σ² are the parameters of the variance function, which must be estimated from the sample data, as are the mean function parameters. Although we assume the power model as a variance function, the methods discussed in this paper should be easily extended to other forms of variance functions.

In parallel assays the test preparation and reference standard are often expected to exhibit dilutional similarity, meaning that the test preparation behaves as a dilution of the standard preparation, and hence are expected to have parallel response profiles. If the response curves for the two preparations are parallel, that is, if the slope parameters as well as the upper and lower plateaus are equivalent for both preparations, then the relative potency of the test sample compared to the reference standard may be defined simply in terms of the parameter c. If the two response curves are not parallel, then the relative potency changes depending on the level of the mean response used to assess the potency (Schofield, 2003; Jonkman and Sidik, 2009). Thus, it is standard practice to statistically test the assumption of parallelism prior to estimating the relative potency (see United States Pharmacopeia USP <1032> and <1034>, 2012; European Pharmacopeia EP Chapter 5.3, 2008). The typical procedure for establishing parallelism in current practice is to test the hypothesis of equal or practically equivalent lower and upper plateaus and slope parameters for the test and reference samples using the nested model F statistic, the equivalence t-test, or some other approaches (e.g., Reeve, 2000; Callahan and Sajjadi, 2003; Hauck et al., 2005; Gottschalk and Dunn, 2005; Jonkman and Sidik, 2009). In particular, the equivalence t-test and the F-test are both commonly used for testing parallelism, especially the equivalence t-test (see USP <1032> and <1034>, 2012; Jonkman and Sidik, 2009).
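As a minimal sketch of the model just described, the following Python functions evaluate the mean curve (1) in the base-10 scale and the power variance function (2), and illustrate dilutional similarity as a horizontal shift on the log dose axis. The function names and numerical values are hypothetical. The potency convention used here (reference dose divided by the equally effective test dose, that is, 10 raised to the power c_ref − c_test) is a common one that we assume for illustration; it is not quoted from the paper.

```python
import math


def four_pl_mean(x, a, b, c, d):
    """Mean response of model (1) at dose x > 0, with log and exp in base 10."""
    return a + (d - a) / (1.0 + 10.0 ** (b * (c - math.log10(x))))


def power_variance(mean, sigma2, theta):
    """Variance function (2): sigma^2 * mean^(2*theta); requires mean > 0."""
    return sigma2 * mean ** (2.0 * theta)


def relative_potency(c_ref, c_test):
    """Assumed convention: potency of test relative to reference for parallel
    curves, i.e. the ratio of equally effective doses 10^c_ref / 10^c_test."""
    return 10.0 ** (c_ref - c_test)


# Hypothetical parallel curves: common a, b, d and sample-specific c.
a, b, d = 0.2, 1.5, 3.0
c_ref, c_test = 1.0, 0.7
rho = relative_potency(c_ref, c_test)

# Dilutional similarity: the test curve at dose x equals the reference
# curve at dose rho * x.
shift_gap = max(
    abs(four_pl_mean(x, a, b, c_test, d) - four_pl_mean(rho * x, a, b, c_ref, d))
    for x in (0.5, 5.0, 50.0)
)
```

With these values the test curve is the reference curve shifted left by 0.3 on the log10 dose scale, so rho is about 2 and shift_gap is zero up to rounding error.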

The current literature discusses the problem of testing parallelism mostly assuming constant variance of the response. However, the variance of the response in bioassay is usually heterogeneous. We believe that the practice of testing parallelism while ignoring the nonconstant variance pattern can lead to misleading conclusions. Considering that the variance of the response in bioassays is usually proportional to a power of the mean response, as in the power model function (see Rodbard and Frazier, 1975; Finney and Phillips, 1977; Davidian, Carroll, and Smith, 1988; O’Connell, Belanger, and Haaland, 1993; Gottschalk and Dunn, 2005), in this paper we discuss tests of parallelism in the four-parameter logistic model under the heterogeneous variance model (2).

The outline of this paper is as follows. In Section 2 we discuss the tests of the parallelism hypothesis. Section 3 gives the details of the estimation method for fitting the model under the variance function. In Section 4 we present two examples that illustrate the procedure under the heteroscedastic model and contrast it with the results obtained by the usual constant variance approach. A simulation study is presented in Section 5, and some of the implications of the simulation results are discussed. A brief discussion and summary are presented in Section 6.

2. Testing for parallelism under the heteroscedastic model

In assessing the parallelism between a test sample and a reference standard, the heteroscedastic four-parameter logistic regression model is given by

$$Y_{ij} = f(x_{ij}; \beta_i) + \varepsilon_{ij}, \quad i = 1, 2;\ j = 1, \ldots, n_i, \tag{3}$$

where i = 1 for the reference sample and i = 2 for the test sample. The $\varepsilon_{ij}$ are assumed to be independent $N(0, \sigma_{ij}^2)$ random errors, and $f(x_{ij}; \beta_i)$ is the mean model,

$$f(x_{ij}; \beta_i) = a_i + \frac{d_i - a_i}{1 + \exp[b_i(c_i - \log x_{ij})]}, \quad i = 1, 2;\ j = 1, \ldots, n_i,$$

where $\beta_i = (a_i, b_i, c_i, d_i)$ are the parameters of the model. The heterogeneous variance $\sigma_{ij}^2$ is assumed equal to a power function of the mean model, i.e.,

$$\sigma_{ij}^2 = \sigma^2 \{f(x_{ij}; \beta_i)\}^{2\theta}, \quad i = 1, 2;\ j = 1, \ldots, n_i, \tag{4}$$

where θ and σ² are the variance parameters. Note that we assume the variance parameters are the same for both the reference sample and the test sample, that is, θ and σ² are independent of i for i = 1, 2. This essentially means that the degree of change in variance over the dose values is similar for the two samples. This assumption should be reasonably in agreement with reality since the test preparation and reference standard are supposed to contain the same effective constituents in parallel assays, and hence we would expect that the level of heterogeneity in the variance of the measured response is inherently about the same for the two samples in the same assay.

Under the heteroscedastic four-parameter logistic models (3) and (4), consider testing for parallelism using the two commonly used approaches, namely the F-test and the intersection-union test (IUT) for equivalence. The F-test tests the null hypothesis

$$H_0: a_1 = a_2,\ b_1 = b_2,\ \text{and}\ d_1 = d_2, \tag{5}$$

where the alternative is that at least one of a, b, or d differs between the two samples. The IUT uses t-tests with suitable a priori equivalence limits $R_L$ and $R_U$ to test the hypotheses (Berger and Hsu, 1996; Jonkman and Sidik, 2009)

$$H_0:\ \frac{a_1}{a_2} \le R_L \ \text{or}\ \frac{a_1}{a_2} \ge R_U \ \text{or}\ \frac{b_1}{b_2} \le R_L \ \text{or}\ \frac{b_1}{b_2} \ge R_U \ \text{or}\ \frac{d_1}{d_2} \le R_L \ \text{or}\ \frac{d_1}{d_2} \ge R_U$$

versus

$$H_1:\ \frac{a_1}{a_2} > R_L \ \text{and}\ \frac{a_1}{a_2} < R_U \ \text{and}\ \frac{b_1}{b_2} > R_L \ \text{and}\ \frac{b_1}{b_2} < R_U \ \text{and}\ \frac{d_1}{d_2} > R_L \ \text{and}\ \frac{d_1}{d_2} < R_U. \tag{6}$$

Note that different equivalence limits for a, b, and d could be used if desired, as shown in one of the examples in Section 4, as long as the limits are pre-specified for each ratio, typically by consulting with the assay scientists, by evaluating historical data from the assay or similar assays, or perhaps by conducting exploratory studies for the express purpose of setting reasonable equivalence limits. We note that setting appropriate equivalence limits for a given assay is nontrivial and may be challenging in practice, as also noted by Hauck et al. (2005). For discussion on setting the equivalence limits, see the papers by Plikaytis et al. (1994), Hauck et al. (2005), and Callahan and Sajjadi (2003).

Tests of parallelism can be carried out based on the weighted least squares fit of the model (3), with weights $w_{ij} = 1/f^{2\theta}(x_{ij}, \beta_i)$ formed using an estimate $\hat\theta$ of the variance parameter θ and model parameter estimates $\hat\beta_i$. The weighted least squares estimation under the variance function can be obtained using a numerical iterative procedure along with the normal distribution likelihood function of the data by incorporating the unknown variance parameter θ (Davidian and Carroll, 1987; Carroll and Ruppert, 1988; Giltinan and Ruppert, 1989). We will discuss the parameter estimation method specifically for testing parallelism under the heteroscedastic model in Section 3. Based on the weighted least squares estimation, inference about the parameters such as tests of hypothesis (e.g., Wald F-test and t-tests for individual parameters) may be carried out in a manner similar to the inference methods used with ordinary nonlinear least squares (OLS) estimation (e.g., Gallant, 1987, pp. 124–125; Davidian and Carroll, 1987).

The F-test of (5) is based on comparing the fit of the heteroscedastic non-parallel or full model (3) to the fit of the heteroscedastic parallel or reduced model

$$Y_{ij} = a + \frac{d - a}{1 + \exp[b(c_i - \log x_{ij})]} + \varepsilon_{ij}, \quad i = 1, 2;\ j = 1, \ldots, n_i, \tag{7}$$

under the null hypothesis (5) (see Reeve, 2000). Note that here $a = a_1 = a_2$, $b = b_1 = b_2$, and $d = d_1 = d_2$. An approximate F statistic for a significant difference between the two models can be defined as follows,

$$F = \frac{[\mathrm{WSSE}(\tilde\beta_i \mid \hat\theta) - \mathrm{WSSE}(\hat\beta_i \mid \hat\theta)]/3}{\mathrm{WSSE}(\hat\beta_i \mid \hat\theta)/(n - 8)}, \tag{8}$$

where $\mathrm{WSSE}(\tilde\beta_i \mid \hat\theta)$ and $\mathrm{WSSE}(\hat\beta_i \mid \hat\theta)$ are the weighted residual sums of squares, $\tilde\beta_i$ and $\hat\beta_i$ are the parameter estimates under the reduced and full models, respectively, and $n = n_1 + n_2$. Under the heteroscedastic model, the weights would respectively be $\tilde w_{ij} = 1/f^{2\hat\theta}(x_{ij}, \tilde\beta_i)$ and $\hat w_{ij} = 1/f^{2\hat\theta}(x_{ij}, \hat\beta_i)$ in fitting the reduced and full models, using an estimate $\hat\theta$ of the nuisance parameter θ in practice. However, strictly speaking that would not give a valid F-test due to the different mean models used to form the weights under the two models (see Gallant, 1987). Therefore, we suggest constructing the F-test for testing (5) using powers of the mean observed responses to form the weights for both the full and reduced models instead, i.e.,

$$\tilde w_{ij} = \hat w_{ij} = 1/\bar y_{ij}^{\,2\hat\theta}, \quad i = 1, 2;\ j = 1, \ldots, n_i, \tag{9}$$

given an estimate $\hat\theta$ from a suitable estimation method. Note that here $\bar y_{ij}$ is the mean of the observed responses at the dose value $x_{ij}$. The hypothesis (5) is typically rejected if F is larger than the critical value from the $F_{3, n-8}$ distribution with specified level α (see also Gallant, 1987; Gottschalk and Dunn, 2005). It should be noted that the observed mean responses are unbiased estimates of the true means, reflect the actual patterns in the data, and have increasing precision as more replicates are available. The observed means may not give precise estimates of the weights if few replicates are available, although the fitted responses may also produce poor estimates in that case. Also, if the model fit is poor then the observed responses may be a better choice for forming the weights than the fitted responses.

In contrast, to assess parallelism by testing (6), the interest is to test whether the mean response curves for the reference sample and the test sample are practically parallel, i.e., to test whether the lower and upper plateaus and the slope are equivalent for the two samples. The test of this hypothesis can be conducted based on the IUT method discussed by Berger and Hsu (1996) (see also Sasabuchi, 1980). Specifically, the test is carried out by performing six one-sided tests of the parameters of (3), corresponding to the component hypotheses of (6), each at level α. By the IUT method, the test rejects $H_0$ only if all six component tests reject at level α, and this procedure will have at most level α without requiring any multiplicity adjustment. For the details of IUT theory, see Berger (1982), Casella and Berger (1990), and Berger and Hsu (1996). We propose to use the usual t-tests of the component hypotheses to implement the IUT of (6), similar to the common practice used in nonlinear regression analysis for making inference under similar heteroscedastic models (e.g., Davidian and Haaland, 1990; O’Connell et al., 1993; Belanger et al., 1996). In particular, the test of practical parallelism formulated in terms of the ratio as in hypothesis (6) can be conducted using the standard t-test by fitting a reparameterized version of the heteroscedastic model (3) and (4) such that the ratios of the lower plateaus, the upper plateaus, and the slopes of the two samples appear as parameters in the model. In summary, the test of the hypothesis (6) is carried out using six one-sided t-tests about the individual ratio parameters based on the results of the fitted heteroscedastic model. This is the approach we used for the example data in Section 4.

Equivalently, the general methodology described by Berger and Hsu (1996) can be easily implemented to test the hypotheses in terms of a difference based on the fitted heteroscedastic model (3) and (4). In particular, the hypothesis (6) can be equivalently defined in terms of the following differences (see Jonkman and Sidik, 2009):

$$H_0:\ a_1 - R_L a_2 \le 0 \ \text{or}\ a_1 - R_U a_2 \ge 0 \ \text{or}\ b_1 - R_L b_2 \le 0 \ \text{or}\ b_1 - R_U b_2 \ge 0 \ \text{or}\ d_1 - R_L d_2 \le 0 \ \text{or}\ d_1 - R_U d_2 \ge 0$$

versus

$$H_1:\ a_1 - R_L a_2 > 0 \ \text{and}\ a_1 - R_U a_2 < 0 \ \text{and}\ b_1 - R_L b_2 > 0 \ \text{and}\ b_1 - R_U b_2 < 0 \ \text{and}\ d_1 - R_L d_2 > 0 \ \text{and}\ d_1 - R_U d_2 < 0. \tag{10}$$

Note that $R_L$ and $R_U$ are known constants, and the parameters $a_2$, $b_2$, and $d_2$ are assumed to be positive in this equivalent representation of the hypothesis (6). Note that $a_2$ and $d_2$ are typically positive because bioassay responses in general are positive, and $b_2$ may be positive or negative depending on specific assay types or the measurement of interest. Since $b_1$ and $b_2$ will typically have the same sign in a given assay, the ratio $b_1/b_2$ in (6) will be positive, and hence $b_1/b_2 = |b_1/b_2| = |b_1|/|b_2|$. Thus, as we proposed in our previous paper (Jonkman and Sidik, 2009), we work with absolute values of the slopes $b_1$ and $b_2$ in testing the parallelism hypothesis (10). The six one-sided t-tests for testing the corresponding components of the hypothesis (10) can be constructed based on fitted results of the heteroscedastic model (3) and (4) similar to the approximate t-tests of the hypothesis under the homoscedastic four-parameter logistic model discussed by Jonkman and Sidik (2009). For example, to test the first component of the equivalence hypothesis about the lower plateau, i.e., $H_{01}: a_1 - R_L a_2 \le 0$ versus $H_{11}: a_1 - R_L a_2 > 0$, the t-statistic is

$$T_1 = \frac{\hat a_1 - R_L \hat a_2}{\sqrt{\widehat{\mathrm{Var}}(\hat a_1) + R_L^2\, \widehat{\mathrm{Var}}(\hat a_2)}}, \tag{11}$$

where $\widehat{\mathrm{Var}}(\hat a_1)$ and $\widehat{\mathrm{Var}}(\hat a_2)$ are respectively the estimated variances of the parameter estimates $\hat a_1$ and $\hat a_2$ obtained from fitting the heteroscedastic model (3) and (4). Details of the variance estimation are discussed later, in Section 3. This component hypothesis $H_{01}$ is rejected in favor of $H_{11}$ if $T_1 > t_{n-8,\alpha}$. Similarly, the second approximate t-test in the sequence then tests $H_{02}: a_1 - R_U a_2 \ge 0$ versus $H_{12}: a_1 - R_U a_2 < 0$, using

$$T_2 = \frac{\hat a_1 - R_U \hat a_2}{\sqrt{\widehat{\mathrm{Var}}(\hat a_1) + R_U^2\, \widehat{\mathrm{Var}}(\hat a_2)}}. \tag{12}$$

Because the alternative $H_{12}$ is lower-tailed, it rejects $H_{02}$ if $T_2 < -t_{n-8,\alpha}$. The remaining four tests, of $H_{03}: b_1 - R_L b_2 \le 0$, $H_{04}: b_1 - R_U b_2 \ge 0$, $H_{05}: d_1 - R_L d_2 \le 0$, and $H_{06}: d_1 - R_U d_2 \ge 0$, proceed in analogous fashion. The level-α IUT for parallelism then rejects $H_0$ in (10) only if all six one-sided approximate t-tests reject $H_{01}, \ldots, H_{06}$ at the same α level under the heteroscedastic model (3) and (4) (Berger, 1982; Berger and Hsu, 1996; Jonkman and Sidik, 2009).
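The decision rule of the IUT can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the function names, the dictionary layout, and all numerical values are hypothetical, the parameter estimates and their estimated variances are taken as given from a fitted full model, and the critical value is supplied by the user (here 1.684, the upper 5% point of the t distribution with 40 degrees of freedom, corresponding to n = 48). Lower-limit tests reject for large positive statistics and upper-limit tests reject for large negative statistics.

```python
import math


def ratio_test_stats(est_ref, est_test, var_ref, var_test, r_low, r_upp):
    """Statistics of the form (11) and (12) for one parameter.

    Sample 1 is the reference and sample 2 is the test preparation.
    """
    t_low = (est_ref - r_low * est_test) / math.sqrt(var_ref + r_low ** 2 * var_test)
    t_upp = (est_ref - r_upp * est_test) / math.sqrt(var_ref + r_upp ** 2 * var_test)
    return t_low, t_upp


def iut_parallelism(estimates, variances, limits, t_crit):
    """Conclude practical parallelism only if all six one-sided tests reject."""
    decisions = {}
    for name in ("a", "b", "d"):
        est_ref, est_test = estimates[name]
        if name == "b":
            # Work with absolute values of the slopes, as described in the text.
            est_ref, est_test = abs(est_ref), abs(est_test)
        var_ref, var_test = variances[name]
        r_low, r_upp = limits[name]
        t_low, t_upp = ratio_test_stats(est_ref, est_test, var_ref, var_test, r_low, r_upp)
        # Lower-limit test rejects for large T; upper-limit test for small T.
        decisions[name] = (t_low > t_crit, t_upp < -t_crit)
    parallel = all(low and upp for low, upp in decisions.values())
    return parallel, decisions


# Hypothetical estimates (reference, test) and estimated variances.
estimates = {"a": (1.00, 1.02), "b": (1.50, 1.45), "d": (3.00, 2.90)}
variances = {"a": (4e-4, 4e-4), "b": (1e-2, 1e-2), "d": (2.5e-3, 2.5e-3)}
limits = {name: (0.80, 1.25) for name in ("a", "b", "d")}
T_CRIT = 1.684  # upper 5% point of t with n - 8 = 40 degrees of freedom

parallel, decisions = iut_parallelism(estimates, variances, limits, T_CRIT)
```

Because every component test is run at the same level α and all six must reject, no multiplicity adjustment appears in the code, which mirrors the IUT argument in the text.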

Although we believe that the ratio formulation of the hypotheses is a natural way to test for practical parallelism, we should mention that alternatively the equivalence hypothesis of parallelism may be defined in terms of absolute differences of the three parameters, namely the lower plateau, the upper plateau, and the slope, with predetermined suitable equivalence limits for the differences. Similarly, the test of the hypothesis in terms of absolute differences may be conducted using the usual t-test for the individual difference parameters under an appropriate reparameterization of the model such that the differences appear as the parameters of the model. For determining a relative potency in parallel assays, the equivalence testing approach may be preferred over the F-test under the heteroscedastic model, similar to the practice under the homoscedastic model (USP <1032> and <1034>, 2012). Note that the issues with the F-test described in the literature can also be problematic under heterogeneous variance. In summary, the common drawback of the F-test is that it can lead to rejection of parallelism even for very similar response profiles or practically negligible departures from parallelism when the sample size is large and the variability is small (Callahan and Sajjadi, 2003; Hauck et al., 2007; Jonkman and Sidik, 2009). The equivalence testing approach may indeed provide an improved inference in the case of practically parallel response curves and in dealing with assays of good precision (Jonkman and Sidik, 2009). However, the F-test, as a likelihood ratio criterion based test, has certain desirable properties such as invariance to reparameterization of the model and the hypotheses, and also is accurate for testing parallelism if the true response curves of the reference and test samples are exactly parallel (Jonkman and Sidik, 2009).

Finally, it should be noted that other approaches to testing for parallelism have been proposed and discussed in the literature. In particular, Gottschalk and Dunn (2005) proposed an alternative to the standard F-test based on a weighted extra sum of squares with a chi-squared null distribution (see also USP <1032>, 2012). For details of the chi-square test for parallelism, see Gottschalk and Dunn (2005).

3. Estimation of the variance parameters

Consider the heteroscedastic four-parameter logistic model described in Section 2. We assume the response variable $Y_{ij}$ has a normal distribution with

$$E(Y_{ij}) = f(x_{ij}; \beta_i), \quad \mathrm{Var}(Y_{ij}) = \sigma^2 \{f(x_{ij}; \beta_i)\}^{2\theta},$$

where $\beta_i = (a_i, b_i, c_i, d_i)$ for i = 1, 2 are the mean model parameters, and θ and σ² are the unknown variance function parameters. θ may be viewed as a nuisance parameter. In practice, one may sometimes have a priori knowledge about the value of θ, and hence the parameters $\beta_i$ could be estimated by weighted least squares using the approximate weights $1/Y_{ij}^{2\theta}$ (for example, θ = 1/2 and θ = 1 may be commonly assumed values in practice). However, it should be clear that a better approach in general is to estimate θ based on the data so that appropriate weights can be incorporated in fitting the model (see Giltinan and Ruppert, 1989). Therefore, in testing for parallelism the parameters $\beta_i$, θ, and σ² should be estimated based on the sample data under the heteroscedastic four-parameter logistic model.

In general, a commonly used and appealing method of estimating the parameters under a variance function is the generalized least squares (GLS) approach proposed by Carroll and Ruppert (1982, 1988). Other estimation procedures such as the extended least squares approach, which estimates jointly the mean model and variance parameters for normally distributed data with correct specification of the variance function, have also been proposed (Raab, 1981; Peck et al., 1984). However, it has been noted that the GLS approach is computationally simple and can be implemented using a standard statistical package, and also the extended least squares method can be less appealing in terms of robustness to the normality assumption (e.g., Carroll and Ruppert, 1982; Beal and Sheiner, 1988; Giltinan and Ruppert, 1989; and Belanger, Davidian, and Giltinan, 1996). Therefore, in this paper we adopt the GLS approach to estimate the parameters of the heteroscedastic model for testing parallelism. For an overview of other estimation procedures, see Raab (1981), Davidian and Carroll (1987), and Beal and Sheiner (1988).

The GLS parameter estimation procedure in the context of testing for parallelism under the heteroscedastic model, assuming normality of the response variable $Y_{ij}$, may be implemented as follows. Note that the steps involving $\tilde\beta_i$ under the reduced model are strictly necessary only for estimating relative potency after parallelism has been established, but it may be convenient in general to include them when programming the GLS procedure.

Step 1: First, obtain the preliminary unweighted or OLS estimates of $\beta_i$, that is, $\hat\beta_i^{(0)}$ under the full model (and $\tilde\beta_i^{(0)}$ under the reduced model).

Step 2: Estimate θ and σ² by minimizing the pseudolikelihood in θ and σ² for the full or nonparallel model (3) using $\beta_i = \hat\beta_i^{(0)}$,

$$PL(\theta, \sigma^2) = \sum_{i=1}^{2} \sum_{j=1}^{n_i} \left[ \frac{\{y_{ij} - f(x_{ij}, \hat\beta_i^{(0)})\}^2}{\sigma^2 f^{2\theta}(x_{ij}, \hat\beta_i^{(0)})} + \log\{\sigma^2 f^{2\theta}(x_{ij}, \hat\beta_i^{(0)})\} \right].$$

Step 3: Using the estimate $\hat\theta$ from Step 2, obtain updated weighted least squares estimates $\hat\beta_i$ (and $\tilde\beta_i$) that minimize the residual sums of squares, respectively,

$$\sum_{i=1}^{2} \sum_{j=1}^{n_i} \hat w_{ij} \{y_{ij} - f(x_{ij}; \hat\beta_i)\}^2 \quad \text{and} \quad \sum_{i=1}^{2} \sum_{j=1}^{n_i} \tilde w_{ij} \{y_{ij} - f(x_{ij}; \tilde\beta_i)\}^2,$$

with weights $\hat w_{ij} = 1/f^{2\hat\theta}(x_{ij}, \hat\beta_i^{(0)})$ and $\tilde w_{ij} = 1/f^{2\hat\theta}(x_{ij}, \tilde\beta_i^{(0)})$.

Step 4: Return to Step 2 and iterate Steps 2 and 3 for a few rounds to obtain the final parameter estimates $\hat\beta_i$, ($\tilde\beta_i$), and $\hat\theta$.

Note that an estimate of σ² from the GLS estimation in Step 2 is then adjusted by the degrees of freedom factor $n/(n - 8)$, and hence the final estimate of σ² is

$$\hat\sigma^2 = \frac{1}{n - 8} \sum_{i=1}^{2} \sum_{j=1}^{n_i} \frac{\{y_{ij} - f(x_{ij}, \hat\beta_i)\}^2}{f^{2\hat\theta}(x_{ij}, \hat\beta_i)}.$$

Similarly, a GLS estimator $\tilde\sigma^2$ under the reduced (or parallel) model could also be computed.

The GLS fit for heteroscedastic models can be easily implemented using standard statistical packages such as R and SAS (see Giltinan and Ruppert, 1989; Carroll and Ruppert, 1987).
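Step 2 can be sketched without a nonlinear fitting library if the preliminary fitted values are taken as given. For fixed θ, the pseudolikelihood is minimized in σ² by the mean of the squared residuals divided by the fitted values raised to the power 2θ, so σ² can be profiled out and θ found by a one-dimensional search. The sketch below is an illustration under assumptions that do not come from the paper: the function names are hypothetical, the fitted values are assumed positive, the search interval [−1, 3] is arbitrary, and golden-section search is used as the minimizer. The profiled criterion is a log-sum-exp of terms linear in θ plus a linear term, hence convex in θ, which is what justifies a bracketing search.

```python
import math


def profile_pl(theta, y, fitted):
    """Pseudolikelihood of Step 2 with sigma^2 replaced by its minimizer."""
    n = len(y)
    sigma2 = sum((yi - fi) ** 2 / fi ** (2.0 * theta) for yi, fi in zip(y, fitted)) / n
    return n * math.log(sigma2) + 2.0 * theta * sum(math.log(fi) for fi in fitted) + n


def estimate_variance_params(y, fitted, n_params=8, lo=-1.0, hi=3.0, tol=1e-7):
    """Return (theta_hat, sigma2_hat) given responses and positive fitted values.

    theta_hat minimizes the profiled pseudolikelihood on [lo, hi];
    sigma2_hat uses the divisor n - n_params (n - 8 for the full model).
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        left = hi - inv_phi * (hi - lo)
        right = lo + inv_phi * (hi - lo)
        if profile_pl(left, y, fitted) < profile_pl(right, y, fitted):
            hi = right  # minimum lies in [lo, right]
        else:
            lo = left   # minimum lies in [left, hi]
    theta = (lo + hi) / 2.0
    n = len(y)
    sigma2 = sum(
        (yi - fi) ** 2 / fi ** (2.0 * theta) for yi, fi in zip(y, fitted)
    ) / (n - n_params)
    return theta, sigma2
```

In a full GLS loop this function would be called after each weighted refit (Step 3), with `fitted` replaced by the current fitted values of the full model; the refit itself is left to whichever nonlinear least squares routine is available.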

cr ip

using the IUT described above, may be carried out based on the GLS estimation method by

2

(X

T

WX

)

−1

us

( )

V ar βˆ = σ

βˆ has asymptotic multivariate normal distribution with mean β and covariance (13)

.

an

assuming that

respect to the parameters

wij = 1 / f 2θ ( xij , β i )

(for i = 1, 2 ), and W is a diagonal matrix with the ijth diagonal

. In practice, an estimate of the variance-covariance matrix may be

ce pt ed

element

βi

M

f ( xij ; β i ) with Here, X is the n×8 Jacobian matrix whose ijth row is the partial derivative of

obtained by using the estimate

βˆ along with the variance parameter estimates

θˆ and σˆ 2 (see

Beal and Sheiner, 1988; Giltinan and Ruppert, 1989; Davidian and Haaland, 1990). As noted by Giltinan and Ruppert (1989), estimated standard errors for the parameter estimates

βˆ taken

directly from (13) by treating the weights as fixed and known (despite the use of θˆ to construct

Ac

Downloaded by [University of Colorado - Health Science Library] at 22:24 09 April 2015

t

β = ( β1 , β 2 ) , such as testing the equivalence hypotheses (10) or (6) Inference about β , here

the weights) can be biased due to uncertainty in estimation of θ, but the bias is not severe enough

to cause problems and is negligible for large samples. In the estimation procedure proposed above, we used the same estimate of θ for both the full model and the reduced model in constructing the F-test for parallelism. It should be noted that

the equivalence t-tests for testing (6) are based only on estimation of the non-parallel or full model, and hence estimation of θ under the reduced model is not required for the equivalence test. To construct a valid F-test for testing (5), it is critical to use the same estimated value of θ in forming the appropriate weights for the weighted least squares estimation of the reduced and full models, and it should be noted that the weights (not just the value of θ) under both models must be the same (see Gallant, 1987, p. 124–125, p. 333–334, and p. 388–389). Therefore, when constructing the F-statistic (8) we propose to obtain an estimate θ̂ based on the full model (3) in the GLS procedure described above, and to use the same estimate of θ to form appropriate weights for fitting both the full and reduced models. Specifically, the weights may be formed using the empirical power variance estimate $y_{ij}^{2\hat{\theta}}$ at the covariate value $x_{ij}$ in place of the respective GLS estimated variance functions $f^{2\hat{\theta}}(x_{ij}, \hat{\beta}_i)$ and $f^{2\hat{\theta}}(x_{ij}, \tilde{\beta}_i)$ under the full and reduced models, and treating the weights as if they are fixed and known (see also Gottschalk and Dunn, 2005). If estimation of the relative potency using an estimate of θ under the reduced (parallel) model is desired after establishing parallelism, it can be easily accomplished by including the relevant extra computations shown in parentheses, as well as including the reduced model pseudolikelihood in step 2, when programming the GLS steps. However, in practice there might be little difference in the estimated value of θ, and hence perhaps in the relative potency,

between the full and reduced models based on the results of the small simulation presented at the end of this section. Tests of parallelism can be carried out using the methods described in Section 2 based on the

GLS estimation approach outlined above for the heteroscedastic four-parameter logistic model.
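The requirement that the full and reduced fits share one set of weights can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names are hypothetical, the responses are assumed to be strictly positive, and the fitted values are assumed to come from whatever nonlinear least squares routine is used for each model.

```python
def power_variance_weights(y, theta_hat):
    """Empirical power-variance weights w_ij = 1 / y_ij**(2 * theta_hat).

    Assumes every response is strictly positive (an assumption of this sketch).
    """
    if any(v <= 0 for v in y):
        raise ValueError("responses must be positive to form power-variance weights")
    return [1.0 / (v ** (2.0 * theta_hat)) for v in y]


def weighted_sse(y, fitted, weights):
    """Weighted error sum of squares for one fitted model."""
    return sum(w * (obs - fit) ** 2 for obs, fit, w in zip(y, fitted, weights))


# The weights are computed once, from the theta estimate of the full model,
# and the same list is then passed to both the full and the reduced fit.
y_obs = [2.0, 4.0]
w = power_variance_weights(y_obs, theta_hat=1.0)
sse_example = weighted_sse(y_obs, [1.0, 3.0], w)
```

Because the weights depend only on the observed responses and θ̂, and not on either set of fitted values, they are identical by construction under both models.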

The quality of the model fits or the model parameter estimates, and hence the inference about parallelism, may depend on accurately estimating the variance parameter θ (see Davidian and Carroll, 1987; Davidian and Haaland, 1990). To evaluate the accuracy of the proposed estimation approach regarding θ, particularly in forming the F-test, we conducted a simulation with around 10,000 replicates by generating data under the parallel model (7) with θ = 1.0 and σ = 0.04, as in simulation Case 2 discussed in Section 5. The variance parameter θ was estimated under the full model as described above. The results of the simulation are provided in Table 1. We see that the estimation of θ based on the full model is accurate in terms of the bias, standard error, and mean square error in estimating the true value of θ, even when the data are generated under the reduced model. Therefore, the proposed method of estimating the variance parameter θ using the full or nonparallel model when constructing the approximate F-test for parallelism should be reasonable for obtaining an accurate estimate of θ.

When data from multiple parallel-curve assays are available, more precise estimates of the variance parameters θ and perhaps also σ² of the variance function model (4), and hence more precise estimates of the weights, could be obtained by extending the GLS estimation method under a single assay to the GLS-P method under multiple parallel-curve assays. This method uses the pooled information from all the assays by assuming that the parameters θ and σ² are the

same across the multiple assays under the nonlinear mixed-effects model framework. For details, see Davidian and Giltinan (1993a, 1993b).

4. Examples: illustration and comparisons

We apply the GLS estimation method and the tests of parallelism under the heteroscedastic four-parameter logistic model to two example data sets. We also use OLS estimation to obtain tests of parallelism, in order to compare and comment briefly on the results of the different approaches.

4.1. Macrophage cell lysis assay

The data in this example are from a macrophage cell lysis assay for estimating the potency of a recombinant protective antigen. The data are taken from the US Pharmacopeia Chapter 〈111〉 (2008). In this assay, the reference and test samples each have 12 dilution doses with a single luminescence measurement at each dose, so there are $n_i = 12$ observations for each sample with total sample size n = 24. The data are plotted in Figure 1(a), with the GLS fitted heteroscedastic curves overlaid for each sample, assuming the power variance function (2). Although the nonconstant variance pattern is not visually apparent from the plot because there are no replicates at individual doses, the data do seem to have heteroscedastic variance structure based on the plot of residuals from the OLS estimation (the plot is not included). Furthermore, plotting the log of the absolute values of the residuals against the log of the predicted values from the OLS fit of the

model produces a linear trend (the plot is not included), which may indicate that the power model (2) is an appropriate variance function. For graphical methods of diagnosing variance function patterns, see Davidian and Haaland (1990). Note that in USP Chapter 〈111〉 a log transformation of the response is used for analysis due to the specific relationship between the variance and the mean response.

To assess parallelism, we used the GLS estimation procedure described in the preceding section to fit a reparameterized version of the heteroscedastic four-parameter logistic model (3) and (4) using the ratios of the lower plateaus, the upper plateaus, and the slope parameters for the test and reference samples as the model parameters, i.e.,

$$Y_{ij} = a_2[1 + t_i(r_a - 1)] + \frac{d_2[1 + t_i(r_d - 1)] - a_2[1 + t_i(r_a - 1)]}{1 + \exp\{b_2[1 + t_i(r_b - 1)](c_2 + t_i r - \log(x_{ij}))\}}, \qquad (14)$$

where $t_i$ is a dummy variable with $t_1 = 1$ and $t_2 = 0$ corresponding to the reference and test samples, respectively. Here the three ratio parameters are $r_a = a_1 / a_2$, $r_b = b_1 / b_2$, and $r_d = d_1 / d_2$. Note that $r = c_1 - c_2$ is the log base 10 relative potency of the test sample relative to the reference standard if the response curves are parallel. The parameter estimates from the GLS and OLS fits are provided in Table 2. The variance parameter estimate for the heteroscedastic model is θ̂ = 0.9728 with estimated standard error 0.0843. This lends support to the approach of log-transforming the response in evaluating parallelism as is done in USP Chapter 〈111〉, because the estimate θ̂ = 0.9728 (and thus 2θ̂ ≈ 2) indicates that the variance of the response roughly corresponds to a constant coefficient of variation pattern, a special case of the power variance function that would be expected with sample data from a lognormal distribution.
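The link between θ ≈ 1 and a constant coefficient of variation can be made explicit. The short derivation below is a sketch that assumes the variance function (4) has the usual power-of-the-mean form, with f denoting the four-parameter logistic mean; that form is not restated in this section, so it is an assumption here, and the second expression relies on a first-order (delta method) approximation.

```latex
\operatorname{Var}(Y_{ij}) = \sigma^2 f^{2\theta}(x_{ij}, \beta_i)
\;\Longrightarrow\;
\mathrm{CV}(Y_{ij}) = \frac{\sigma f^{\theta}(x_{ij}, \beta_i)}{f(x_{ij}, \beta_i)}
                    = \sigma f^{\theta - 1}(x_{ij}, \beta_i),
\qquad
\operatorname{Var}(\log Y_{ij}) \approx \frac{\operatorname{Var}(Y_{ij})}{f^{2}(x_{ij}, \beta_i)}
                                = \sigma^2 f^{2\theta - 2}(x_{ij}, \beta_i).
```

At θ = 1 both expressions are free of the mean: the coefficient of variation equals σ, and the log response has approximately constant variance σ², which is why an estimate of θ near 1 is consistent with analyzing log-transformed data.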

To compute the F-statistic (8) for testing (5) under the power variance function, we fitted both the full and reduced models using the weights $\hat{w}_{ij} = 1 / y_{ij}^{2\hat{\theta}}$ formed from the empirical power variances based on the GLS estimate θ̂ = 0.9728, and obtained F = 2.4397 with p-value 0.1021 based on the $F_{3,16}$ distribution. Thus we failed to reject the null hypothesis of (5) at the 5% significance level, and hence the F-test does not provide significant evidence against parallelism. Note that if we take the natural log of the response to stabilize the variance, and conduct the F-test based on the OLS fit of the model, then F = 2.8125 with p = 0.0727, which leads to the same conclusion as the F-test from the heteroscedastic model. If the F-test is carried out using the unweighted or OLS estimation on the data without the log transformation, we obtain F = 2.8221 with p = 0.0721. Hence we would still not reject parallelism at the 5% level, similar to the conclusion reached by the F-test under the heteroscedastic model. This might indicate that the F-statistic is not that sensitive to the weights used in fitting a heteroscedastic model, as noted by Gottschalk and Dunn (2005).

To test parallelism in terms of practical equivalence, we applied the intersection-union test of (6) using t-tests based on the results of the fitted reparameterized heteroscedastic model (14). We selected $R_L = 0.8$ and $R_U = 1.25$ for testing the ratios of the lower and upper plateaus and used $R_L = 0.75$ and $R_U = 1.33$ for the ratio of the slopes as the equivalence limits. Note that we

slightly widen the equivalence limits for testing the slope, following the general strategy used for the same data in USP Chapter 〈111〉 to test parallelism based on the differences of the three parameters. We should mention that the equivalence limits here were selected basically for convenience without much assay knowledge, because our primary goal was to focus on the statistical methodology for performing equivalence tests under the heteroscedastic four-parameter logistic model. In practice the choice of equivalence limits should be carefully considered and specified in advance, as noted previously in Section 2.

The p-values of the six one-sided t-tests for the IUT using the equivalence limits defined above for the GLS fit of the heteroscedastic model (14) are shown in Table 2, along with the p-values of the six one-sided t-tests for the IUT from the OLS fit of the model using the same equivalence limits. As can be seen from Table 2, the null hypotheses $H_{01}, \ldots, H_{06}$ are all rejected at the α = 0.05 level based on the heteroscedastic model. Thus, the IUT provides evidence that the ratios of the lower and upper plateaus are between 0.8 and 1.25 and the ratio of the slopes is between 0.75 and 1.33 for this data set, and hence that the response profiles are shown to be practically parallel based on the heteroscedastic model. On the other hand, if one ignores the heterogeneous variance pattern and carries out the same equivalence test based on OLS estimation as shown in Table 2, practical parallelism would not be established. The conflicting conclusions reached by the IUT based on the two estimation approaches may be indicative of the misleading inference that may result from ignoring the nonconstant variance pattern in the data. Finally, note that after confirming parallelism based on the IUT with the GLS estimation method described in Section 2, and thus using θ̂ = 0.9728, we obtained an estimate of 1.0479 for the

relative potency with 95% confidence interval (1.0013, 1.0967). These estimates differ very little from the point estimate of 1.0512 and the 95% interval estimate (1.0048, 1.0998) obtained from the unweighted fit of the mean model with the log transformed data (see also USP 〈111〉).
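The six one-sided tests referred to above come in pairs, one pair per ratio parameter. As a minimal sketch of one such pair, the following assumes each component statistic is the estimate minus the relevant equivalence limit, divided by its standard error, and compares it with a tabulated t critical value. The function names are hypothetical, and the critical value 1.746 is the standard upper 5% point of the t distribution with 16 degrees of freedom, supplied by hand rather than computed. The numerical inputs are the lower-plateau ratio estimate 0.9676 and standard error 0.0635 reported from Table 2 for the heteroscedastic fit.

```python
def one_sided_t_stats(estimate, std_error, lower, upper):
    """t statistics for the two one-sided tests of lower < ratio < upper."""
    t_lower = (estimate - lower) / std_error   # tests H0: ratio <= lower
    t_upper = (estimate - upper) / std_error   # tests H0: ratio >= upper
    return t_lower, t_upper


def within_limits(estimate, std_error, lower, upper, t_crit):
    """True when both one-sided null hypotheses are rejected."""
    t_lower, t_upper = one_sided_t_stats(estimate, std_error, lower, upper)
    return t_lower > t_crit and t_upper < -t_crit


# Lower-plateau ratio from the heteroscedastic fit, limits 0.8 and 1.25,
# 16 residual degrees of freedom (24 observations, 8 mean parameters).
t_low, t_up = one_sided_t_stats(0.9676, 0.0635, 0.8, 1.25)
ratio_ok = within_limits(0.9676, 0.0635, 0.8, 1.25, t_crit=1.746)
```

The intersection-union test declares practical parallelism only when the corresponding check passes for all three ratios, so the overall decision is the logical AND of three such calls, each with its own limits.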

It may be helpful to illustrate how the IUT is conducted, so we show the calculation for testing $H_{01}: a_1 / a_2 \le R_L$ versus $H_{11}: a_1 / a_2 > R_L$, the first test in the sequence, based on the heteroscedastic fit of the model. From Table 2, $\hat{r}_a = 0.9676$ with estimated standard error 0.0635 obtained from (13) by fitting the reparameterized model (14), so

$$T_1 = \frac{\hat{r}_a - 0.8}{\sqrt{\widehat{\mathrm{var}}(\hat{r}_a)}} = \frac{0.9676 - 0.80}{0.0635} = 2.6394.$$

Because the one-sided p-value is 0.0089 based on the t-distribution with 16 degrees of freedom, we reject $H_{01}$ at the 5% significance level. The other five components of the IUT can be computed similarly.

4.2. Ten dose assay

This example data set has been used to illustrate tests for parallelism under the homoscedastic four-parameter logistic model by Jonkman and Sidik (2009). The data were randomly generated using R statistical software (R Development Core Team 2007) under a normal distribution with constant variance. The data consist of three replicates at each of 10 dose levels for a reference standard and a test sample, so we have $n_i = 30$ observations for each sample. As noted by

Jonkman and Sidik (2009), the data are representative of a case with homogeneous response variance and high assay precision. The data are plotted in Figure 1(b), with fitted heteroscedastic curves overlaid for each sample. Although the data would typically be assessed for parallelism using OLS estimation because the response has constant variance, we want to illustrate the

applicability of the variance function model if one applies the heteroscedastic model routinely without properly diagnosing the variance pattern of the assay data.

We fitted the reparameterized version of the heteroscedastic four-parameter logistic model (14) using the GLS procedure, and applied the tests of parallelism, with the results shown in Table 3. First we note that the estimated variance parameter θ̂ = 0.1189 with standard error 0.1463 indicates that θ is not statistically different from zero, which reflects the true distribution of the data, i.e. that the data do indeed have constant variance. Thus we would expect the parameter estimates, and hence the results of the tests, to be about the same using the weighted and unweighted approaches, as is true for the results in Table 3.

For testing parallelism using the F-test, the full model has $\mathrm{WSSE}(\hat{\beta}_i \mid \hat{\theta}) = 0.7606$ for this data set, with 52 degrees of freedom, while $\mathrm{WSSE}(\tilde{\beta}_i \mid \hat{\theta}) = 1.0470$ with 55 degrees of freedom for the reduced model under (5). The difference between the full and reduced models results in F = 6.5271, with a p-value of 0.0008. Thus, the F-test rejects parallelism at α = 0.05 based on the weighted least squares fits of the two models using the weights formed from the empirical power variances with θ̂ = 0.1189. If we conduct the F-test based on OLS estimation assuming constant variance, the test statistic is F = 7.0714 with p = 0.0004. Thus, the F-tests from both

the homoscedastic and heteroscedastic fits of the model provide evidence against parallelism of the two response curves.
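The reported F value can be reproduced from the two weighted error sums of squares. The sketch below assumes that the F-statistic (8) has the usual extra-sum-of-squares form, which is not restated in this section; the function name is hypothetical. The small discrepancy from 6.5271 comes from using the WSSE values as rounded in the text.

```python
def extra_sum_of_squares_f(sse_reduced, df_reduced, sse_full, df_full):
    """F = [(SSE_reduced - SSE_full) / (df_reduced - df_full)] / (SSE_full / df_full)."""
    numerator_df = df_reduced - df_full
    return ((sse_reduced - sse_full) / numerator_df) / (sse_full / df_full)


# Weighted SSEs and degrees of freedom quoted for the ten dose assay.
f_stat = extra_sum_of_squares_f(sse_reduced=1.0470, df_reduced=55,
                                sse_full=0.7606, df_full=52)
print(round(f_stat, 2))
```

The numerator has 55 − 52 = 3 degrees of freedom, one for each of the three constraints (equal lower plateaus, upper plateaus, and slopes) imposed by the reduced model.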

To illustrate the intersection-union test for parallelism, we selected $R_L = 0.8$ and $R_U = 1.25$ as the equivalence limits in testing (6) for the ratios of all three parameters. Again the equivalence limits were chosen primarily for convenience by adopting the FDA’s (1992) suggested limits for bioequivalence, because our primary focus was on the statistical inference about parallelism under the heteroscedastic model. The p-values of the six one-sided component t-tests for the IUT of (6) based on both the heteroscedastic model and the homoscedastic model are shown in Table 3, along with the parameter estimates and their standard errors using the two estimation approaches. As can be seen from Table 3, the null hypotheses $H_{01}, \ldots, H_{06}$ are all rejected at the α = 0.05 level, and hence the response profiles may be considered practically parallel. We see that the results of the IUT, as well as the parameter estimates and standard errors, are essentially the same for the GLS and OLS approaches, perhaps due to the non-significant variance parameter θ (θ̂ = 0.1189 ± 0.1463) in the heteroscedastic model. This example may suggest that it is a good practice to routinely incorporate the variance function model when testing for parallelism, even for data that appear to have relatively homogeneous variance, especially considering that assay data often have some degree of heteroscedasticity, as noted by Giltinan and Ruppert (1989).

This example also perhaps illustrates that the F-test based on the variance function model shares

the drawbacks of the F-test for parallelism based on OLS estimation, which have been pointed

out by several authors (see Plikaytis et al., 1994; Callahan and Sajjadi, 2003; Gottschalk and Dunn, 2005; Hauck et al., 2005; Jonkman and Sidik, 2009). For very precise assays or for data


with a relatively small degree of heteroscedasticity, the F-test often rejects parallelism for response profiles that appear very close to parallel, as illustrated by the example data, so that the

F-test may penalize the experimenter for performing a high precision assay.

5. Simulation study

To assess and compare the properties of the intersection-union test and the F-test under the variance function model, as well as the same tests of parallelism if one ignores the heterogeneous variance pattern and uses ordinary least squares estimation, we performed a simulation study. The simulation cases considered are based roughly on the ten-dose bioassay example from section 4.2, similar to Jonkman and Sidik (2009), but assuming the power variance function (4). Our goal was to investigate the empirical behavior of the tests when the curves were nearly parallel, exactly parallel, or nonparallel under different simulation cases. We also wanted to assess the effects of the numbers of dose levels and replicates, and the effects of different levels of the variance parameter θ.

5.1. Simulation settings

We considered four basic simulation cases, similar to four of the cases considered by Jonkman and Sidik (2009):

1. A case where the curves for the test and reference samples are approximately parallel; that is, a case in which all three ratios are within the equivalence limits. For this, we set the parameters

equal to the parameter estimates from the example of section 4.2. Specifically, we set a1 = 2.02, b1 = −1.42, c1 = 2.31, and d1 = 10.12 for the reference preparation, and a2 = 2.04, b2 = −1.35, c2 = 2.59, and d2 = 9.86 for the test sample.

2. A case where the curves are exactly parallel, and the only difference is the potency. For this, we used values similar to those from the example: a1 = 2.0, b1 = −1.4, c1 = 2.3, and d1 = 10.0 for the reference sample, and a2 = 2.0, b2 = −1.4, c2 = 2.6, and d2 = 10.0 for the test sample.

3. A case where the ratios for the plateaus and the slope are all on the boundary for the equivalence test; that is, a1/a2 = 0.8 = RL, b1/b2 = 1.25 = RU, and d1/d2 = 0.8 = RL. For this case, we set a1 = 1.6, b1 = −1.5, c1 = 2.3, and d1 = 8.0 for the reference sample, and a2 = 2.0, b2 = −1.2, c2 = 2.6, and d2 = 10.0 for the test sample.

4. A case where the ratio of the slopes is outside the equivalence limits for the IUT, but both plateaus are equal. For this case, we set a1 = 2.0, b1 = −1.5, c1 = 2.3, and d1 = 10.0 for the reference, and a2 = 2.0, b2 = −1.16, c2 = 2.6, and d2 = 10.0 for the test sample.

For each of these cases, we varied three other parameters. First, we let the number of different doses be either 8, 10, or 12. Using the example data from section 4.2 as a reference point, we set the maximum dose to 6400 and used a dilution factor of 1/2 to establish the actual dose levels. Second, we let the number of replicates at each dose level be either 2 or 3. Finally, for a given number of dose levels and number of replicates, we considered three values for the variance function parameter: θ = 0.75, 1.00, and 1.25. The value σ = 0.04 was chosen considering that σ in bioassay is usually relatively small (see Davidian and Carroll, 1987; Davidian, Carroll, and

Smith, 1988). We believe that the three values of the variance parameter θ along with the fixed

value of σ should represent a reasonable range of variance heterogeneity levels and hence a good

overall range of assay precision levels in practice.
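The simulation settings above can be collected into a short data-generation sketch. It is an illustration under stated assumptions rather than the authors' code: the mean is taken to be the four-parameter logistic form implied by the reparameterized model (14) with a base-10 log dose, the standard deviation is taken to be σ times the mean raised to the power θ, and the errors are assumed to be normal. All function and variable names are hypothetical.

```python
import math
import random

# (a, b, c, d) for the reference and test samples in each simulation case.
CASES = {
    1: {"ref": (2.02, -1.42, 2.31, 10.12), "test": (2.04, -1.35, 2.59, 9.86)},
    2: {"ref": (2.0, -1.4, 2.3, 10.0), "test": (2.0, -1.4, 2.6, 10.0)},
    3: {"ref": (1.6, -1.5, 2.3, 8.0), "test": (2.0, -1.2, 2.6, 10.0)},
    4: {"ref": (2.0, -1.5, 2.3, 10.0), "test": (2.0, -1.16, 2.6, 10.0)},
}


def dose_levels(n_doses, max_dose=6400.0, dilution=0.5):
    """Serial dilution starting at the maximum dose."""
    return [max_dose * dilution ** k for k in range(n_doses)]


def four_pl(x, a, b, c, d):
    """Four-parameter logistic mean; log base 10 of dose is an assumption."""
    return a + (d - a) / (1.0 + math.exp(b * (c - math.log10(x))))


def simulate_sample(params, doses, n_reps, theta, sigma=0.04, rng=random):
    """Generate (dose, response) pairs with SD = sigma * mean**theta."""
    data = []
    for x in doses:
        mean = four_pl(x, *params)
        sd = sigma * mean ** theta
        data.extend((x, rng.gauss(mean, sd)) for _ in range(n_reps))
    return data


def true_ratios(case):
    """Ratios a1/a2, b1/b2, d1/d2 of reference to test parameters."""
    (a1, b1, _, d1), (a2, b2, _, d2) = CASES[case]["ref"], CASES[case]["test"]
    return a1 / a2, b1 / b2, d1 / d2


rng = random.Random(2015)  # seeded generator so a run can be repeated
reference = simulate_sample(CASES[2]["ref"], dose_levels(10), 3, theta=1.0, rng=rng)
```

With 10 doses the dilution series runs from 6400 down to 12.5, and `true_ratios` can be used to confirm where each case sits relative to the equivalence limits 0.8 and 1.25.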

We set RL = 0.8 and RU = 1.25 for all the simulation cases. As we noted in our previous paper about testing for parallelism (Jonkman and Sidik, 2009), although this might seem to limit the generality of the simulation results, the parameter settings in cases 3 and 4 provide us with a reasonable idea of what may be expected when the true values of the parameters are at or near the boundaries of the equivalence limits. Again, we note that the choice of appropriate equivalence limits RL and RU is nontrivial and may be affected by several factors. Our goal was to assess the general performance of the IUT and the F-test while controlling the number of simulation cases under the heteroscedastic model, and we felt that trying to vary the equivalence limits in addition to the four parameters of the logistic model, the number of doses, the number of replicates, and the values of θ and σ would lead to an unmanageable number of cases. Furthermore, under the same simulation conditions it should be noted that widening the equivalence limits will increase the rejection rate (thus, the power to declare parallelism in cases 1 and 2, and the type I error rate in cases 3 and 4), while narrowing the equivalence limits will decrease the rejection rate.

For each combination of parameter values, dose level, number of replicates, and the variance parameter θ, we tried to maintain the number of simulation replicates at around 10000 (exactly 10000 replicates per case was not practical due to occasional non-convergence of the GLS fit). For each simulation replicate, data were generated according to the heteroscedastic model (3)

and (4). For each data set, we fit the model, specifically the reparameterized version of the model (14) discussed in section 4, using the GLS procedure described in Section 3, and performed the equivalence t-tests and the F-test. Note that in constructing the F-test the weights were formed using the empirical power variance estimates as shown in (9). In addition, for each set of

simulation data we conducted the tests of parallelism based on OLS estimation (i.e., ignoring the heterogeneous variance patterns) in order to compare this approach to the results from the heteroscedastic model. Since the equivalence t-test and the F-test are tests of different hypotheses, a decision to reject H0 results in different inferences for the two testing approaches. Specifically, the response curves are declared to be parallel if H0 is rejected for the equivalence test, whereas for the F-test the curves are declared to be parallel if the test fails to reject H0. In order to directly compare the two approaches, we recorded the proportion of times that each test declared parallelism out of the total number of simulation replicates; i.e., the proportion of times that H0 in (6) was rejected for the equivalence t-test, and the proportion of times that H0 in (5) was not rejected for the F-test. This gives us a criterion by which the two tests may be compared directly in terms of their conclusions regarding parallelism.

5.2. Simulation results

The results of the simulations are shown in Table 4. For each test and each simulation case, the table value is the proportion of times among the approximately 10000 simulation replicates that each test resulted in a declaration of parallelism. The proportion of times that the corresponding test based on OLS estimation declared parallelism is listed in parentheses. For case 1, the response curves are not exactly parallel but they are well within the equivalence

limits, so we argue that they are practically parallel. In this case, when the overall precision was high with a relatively small degree of heteroscedasticity (θ = 0.75), the F-test declared parallelism relatively less often (between 3.3% and 45.0%), with the percentage decreasing as the number of doses and replicates increased, while the IUT declared parallelism a good majority

of the time (between 53.4% and 97.1%), with the percentage increasing as the number of doses and replicates increased. On the other hand, when the overall precision was low with a relatively large degree of heteroscedasticity (θ = 1.25), the F-test declared parallelism most of the time (between 72.8% and 84.6%), with the percentage decreasing as the number of doses and replicates increased, while the IUT declared parallelism relatively rarely (between 0.3% and 30.3%), but the percentage increased as the number of doses and replicates increased. In the case of moderate variance heterogeneity (θ = 1.00), corresponding to neither low nor high overall precision, both the F-test and the IUT declared parallelism about the same proportion of the time overall. However, the proportions for the two tests followed opposite trends: when the number of doses and replicates was small, the F-test declared parallelism most of the time and the IUT seldom declared parallelism, but when the number of doses and replicates was large, the IUT declared parallelism most of the time and the F-test seldom declared parallelism. Note that in case 1, the alternative hypotheses are true for both tests. In general, when the number of doses and replicates (i.e., the total sample size) was small and the overall precision was poor with a relatively high degree of heterogeneity, particularly for θ = 1.25, the F-test declared parallelism more often while the IUT declared parallelism less often. However, as the numbers of doses and replicates per dose (total sample size) increased and the overall precision improved with a relatively small degree of heterogeneity (θ = 0.75), the IUT declared parallelism more often

while the F-test declared parallelism less often. This is not surprising, as a decrease in overall precision and total sample size represents a loss of power for both tests, while an increase in overall precision and total sample size represents a gain in power for both tests. Because both tests rejected H0 about the same percentage of the time in general, both tests resulted in a

“correct” decision about the same percentage of the time given their respective hypotheses. We argue that the IUT might be the preferred approach considering the similarity of the slopes and the plateaus, along with the fact that the IUT rewards more precise assays by declaring parallelism more often as the precision and the total sample size increase. On the other hand, if one ignores the heterogeneous variance and conducts the IUT based on OLS estimation, the test clearly does not correctly declare parallelism in general, especially as the level of heterogeneity increases. The IUT based on OLS estimation almost never declared parallelism when θ = 1.25, and even for the relatively small level of heterogeneity (θ = 0.75) it declared parallelism up to 18% less often than the IUT using the heteroscedastic model in case 1. Thus, the IUT based on OLS estimation should not be used in cases of heterogeneous variance, particularly if the degree of variance heterogeneity in the data is moderate to large. However, from Table 4 we see that for case 1 the F-test based on OLS estimation behaved similarly to the F-test based on weighted GLS estimation, suggesting that the F-statistic may be insensitive to the variance structure or to the weights in general (see also Gottschalk and Dunn, 2005).

In case 2, the true response curves are exactly parallel, and only the potency differs. In this case, the F-test declared parallelism between 92.3% and 94.2% of the time for the combinations of doses, replicates, and θ considered here. Because the null hypothesis for the F-test is true in this case, the test has an empirical rejection rate somewhat above but reasonably close to the nominal

level of α = 0.05 for the simulation settings considered here. When the variance heterogeneity was relatively small (θ = 0.75), the IUT declared parallelism between 65.0% and 99.8% of the time, and the percentage increased with the numbers of doses and replicates per dose. However, once again the power of the IUT to declare parallelism declined as the level of heterogeneity

increased, especially for θ = 1.25. The effect of the sample size and the replicates per dose on the results of the IUT is striking, particularly for θ = 1.00 and θ = 1.25. Overall, the results from cases 1 and 2 suggest that both tests may be sensitive to the number of dilution levels and the number of replicates (excepting the results of the F-test from case 2), and that the power of the IUT to establish parallelism depends strongly on the level of variance heterogeneity in the assay data. For both cases the power of the IUT to declare parallelism increased substantially with increasing total sample size and replicates per dose, and in case 1 the proportion of simulation replicates in which the F-test declared parallelism (i.e., failed to reject H0) decreased substantially with increasing sample size, especially when the level of heterogeneity was small to moderate (i.e., θ = 0.75, 1.00). This again shows that the ability of the F-test based on the heteroscedastic model to correctly declare parallelism may be strongly influenced by the sample size and the overall precision of the assay, similar to the F-test based on OLS estimation.

In case 3, the ratios of the slopes and the lower and upper plateaus were all set on the boundary of the equivalence limits for the IUT. For this case, the IUT never declared parallelism for any of the simulation settings. Since the null hypothesis is true for the IUT in this case, this represents an empirical type I error rate of about zero. The results confirm that the IUT is extremely conservative. Similarly, the F-test almost never declared parallelism, except when θ = 1.25. In contrast to the IUT, the alternative hypothesis is true for the F-test, so the results represent very

high power of that test to detect non-parallelism for nearly all the simulation settings. The IUT based on OLS estimation also never declared parallelism in case 3, and similarly the F-test based on OLS estimation seldom declared parallelism, except when the number of doses was

small and the level of heterogeneity was large. When θ = 1.25, the F-test obtained by incorrectly ignoring the non-constant variance pattern declared parallelism with a maximum rate of 49.1%.

In case 4, the ratio of the slopes was outside the equivalence limits (b1/b2 = 1.29), but the upper and lower plateaus were equal. The IUT seldom declared parallelism, with a maximum rate of only 2.0%. Since H0 is true for the IUT in this case, the proportions again represent empirical type I error rates. Similarly, in case 4 the F-test did not declare parallelism most of the time, except for the highest level of variance heterogeneity (θ = 1.25). The proportions for the F-test were generally close to zero for θ = 0.75 and θ = 1.00, but were higher when θ = 1.25, in which case the F-test declared parallelism substantially more often than the IUT. For the tests conducted by ignoring the heterogeneous variances and using OLS estimation, the IUT again seldom declared parallelism. However, the F-test using OLS declared parallelism much more often than the F-test based on the heteroscedastic model, especially when the level of variance heterogeneity was large (θ = 1.25), with a proportion as high as 71.2%. The results in cases 3 and 4 indicate that, unlike the cases where the response curves were approximately or exactly parallel, here the F-test based on OLS estimation is not appropriate under heterogeneous variance structures.

In summary, the simulation results indicate that the IUT for equivalence works well in certain

cases under the heteroscedastic model, but that the traditional F-test may be preferred in other

cases. This is similar to the conclusion reached about the two tests under the homoscedastic model (see Jonkman and Sidik, 2009). In particular, the IUT appears to be more effective in the

following cases: relatively small to moderate variance heterogeneity where the true response curves are approximately parallel (e.g. case 1 with θ = 0.75, 1.00), and assay data where the level of heterogeneity is high and the true response curves are not parallel (such as case 4 with θ = 1.25). In most of the cases where the true response curves are not parallel and the degree of heterogeneity is not particularly high (i.e. cases 3 and 4 with θ = 0.75, 1.00), both tests appear highly effective (i.e. neither test declared parallelism very often in the simulations). On the other hand, overall the F-test is superior if the true response curves are exactly parallel. Of course, whether the true response curves are exactly, as opposed to approximately, parallel is unknown in practice. As we noted in our previous paper (Jonkman and Sidik, 2009), the scenario of exact parallelism may be too strict, as it may seldom be true for practical applications. We think it is more likely that the true curves will be only approximately parallel in practical situations, and so we prefer to use the IUT for testing parallelism under the heteroscedastic model in practice. The simulation results also confirm that the IUT for testing parallelism using OLS is not appropriate if it ignores the true heterogeneous variance structure, and it should not be used if the response exhibits non-constant variance. The F-test based on OLS estimation, ignoring the heterogeneous variance, resulted in roughly similar inference about parallelism to the F-test based on the heteroscedastic model for many cases, because the F-statistic may be insensitive to the variance patterns. However, the F-test based on OLS estimation can be unreliable under certain cases,

particularly when the true response curves are not parallel and the level of variance heterogeneity is severe

(θ = 1.25) .

36
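The contrast between the two tests in the simulation summary comes from their opposite null hypotheses, which can be written out as a rough sketch. The ratio parameters r_a, r_b, r_d are those reported in Tables 2 and 3; the limit symbols \delta_{L,k} and \delta_{U,k} are notation assumed here for the lower and upper equivalence limits, and writing the F-test null as all ratios equal to one is an assumed shorthand for exact parallelism.

```latex
% IUT (equivalence) formulation: practical parallelism is the alternative.
H_0^{\mathrm{IUT}}:\ r_k \le \delta_{L,k} \ \text{or}\ r_k \ge \delta_{U,k}
  \ \text{for at least one } k \in \{a, b, d\}
\qquad \text{vs.} \qquad
H_1^{\mathrm{IUT}}:\ \delta_{L,k} < r_k < \delta_{U,k}
  \ \text{for all } k \in \{a, b, d\}

% F-test formulation: exact parallelism is the null.
H_0^{F}:\ r_a = r_b = r_d = 1
\qquad \text{vs.} \qquad
H_1^{F}:\ r_k \ne 1 \ \text{for at least one } k \in \{a, b, d\}
```

Under this reading, a declaration of parallelism is a rejection for the IUT but a non-rejection for the F-test, which is why the IUT proportions in cases 3 and 4 are empirical type I error rates.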

6. Discussion

In this paper, we have discussed the problem of testing for parallelism of the response curves for a test sample and a reference sample in the heteroscedastic four-parameter logistic model, specifically when the heterogeneous variance considered is the typical power variance function model in bioassays (Rodbard and Frazier, 1975; Finney and Phillips, 1977; Raab, 1981; Davidian, Carroll, and Smith, 1988; O’Connell, Belanger, and Haaland, 1993; Gottschalk and Dunn, 2005). This paper may be viewed as an extension of the paper by Jonkman and Sidik (2009) about testing parallelism under constant variance. In particular, we discussed the IUT method for testing practical parallelism using t-tests, and also the traditional F-test for parallelism under the heteroscedastic model, in contrast to the tests of parallelism based on OLS estimation by ignoring the heterogeneous variance of the data. A GLS estimation procedure, which can be conveniently implemented using a standard nonlinear regression package in R or SAS (Carroll and Ruppert, 1987; Giltinan and Ruppert, 1989), was presented in the context of parallelism testing under the heteroscedastic model. In addition, we have outlined a reasonable approach for constructing the approximate F-statistic for testing parallelism under the variance function model (see also Gallant, 1987). We have argued that the tests of parallelism should not be conducted based on OLS estimation when the variance of the response is heterogeneous rather than constant, and that furthermore it may be helpful to routinely implement the heteroscedastic model when testing for parallelism, considering that some degree of heterogeneity is often present in assay data, as noted by Giltinan and Ruppert (1989).
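To make the role of the power variance function concrete, the following is a minimal sketch and not the GLS procedure of this paper. It assumes the variance has the form Var(y) = σ² μ^(2θ), a common way of writing the power-of-the-mean model, and it swaps in a much cruder estimator: a straight-line regression of log sample variance on log sample mean across dose groups, whose slope is 2θ under that assumption. The function names and the example data are hypothetical.

```python
import math


def estimate_theta(replicate_groups):
    """Crude estimate of theta, assuming Var(y) = sigma^2 * mean(y)^(2*theta).

    Regresses log(sample variance) on log(sample mean) across dose groups;
    under the assumed model the slope equals 2*theta.
    """
    log_means, log_vars = [], []
    for reps in replicate_groups:
        n = len(reps)
        mean = sum(reps) / n
        var = sum((y - mean) ** 2 for y in reps) / (n - 1)
        if mean > 0 and var > 0:  # logarithms need positive values
            log_means.append(math.log(mean))
            log_vars.append(math.log(var))
    x_bar = sum(log_means) / len(log_means)
    y_bar = sum(log_vars) / len(log_vars)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(log_means, log_vars))
    sxx = sum((x - x_bar) ** 2 for x in log_means)
    return 0.5 * sxy / sxx


def gls_weights(fitted_means, theta):
    """Weights proportional to 1 / Var(y) under the assumed power-of-mean model."""
    return [m ** (-2.0 * theta) for m in fitted_means]


# Hypothetical responses with a constant coefficient of variation (theta = 1).
dose_means = (2.0, 5.0, 20.0, 80.0)
groups = [[m * 0.9, m, m * 1.1] for m in dose_means]
theta_hat = estimate_theta(groups)
weights = gls_weights(dose_means, theta_hat)
print(round(theta_hat, 4), [round(w, 6) for w in weights])
```

In a weighted nonlinear fit these weights would be recomputed from the current fitted means at each iteration. With θ near zero they are nearly constant and the fit approaches OLS.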

The methods of estimation and testing for parallelism under the power variance function were illustrated using two example data sets. The first example perhaps illustrated a case of relatively moderate heteroscedasticity and approximately parallel responses, and the second was a case of parallel curves with constant variance, which may indicate that the variance function model is flexible and suitable for use even when the true response variance is constant. For both examples, the equivalence t-test correctly declared parallelism by rejecting the null hypothesis of no practical parallelism. The traditional F-test declared parallelism for the first data set but rejected it for the second, perhaps due to a common issue with the test noted by several authors (e.g., Plikaytis et al., 1994; Reeve, 2000; Hauck et al., 2005; Kpamegan, 2005; Jonkman and Sidik, 2009), specifically that the F-test may often reject parallelism for assay data with high precision and large sample sizes even when the response curves appear to be parallel. The results of the examples suggest that the IUT may be a better approach for establishing practical parallelism in general.

The empirical properties of the IUT and the F-test for testing parallelism were assessed by a simulation study. The simulation results indicated that under the heteroscedastic model the IUT may provide better inference when the true response curves are practically parallel, but that the F-test may provide better inference if the true response curves are exactly parallel, similar to the results obtained under the homoscedastic model (Jonkman and Sidik, 2009). As we have noted previously, we think that exact parallelism is unlikely to hold in most practical situations, so the equivalence testing approach may be more appealing in comparison to the F-test. In the simulation cases where the true response curves were clearly non-parallel, both tests failed to support parallelism the vast majority of the time (with the exception of the F-test when general precision was low, i.e. when θ was large). Thus we suggest that the IUT should be employed for testing parallelism in practice, and the F-test can be used if an inference of exact parallelism is desired. Finally, we note that the IUT based on OLS estimation is not reliable when the variance of the response is heterogeneous, and hence it should not be used under non-constant variance. Although the F-test may be less sensitive to the variance patterns or the weights, when the variance is not constant the F-test based on OLS estimation can lead to incorrect inference about parallelism for certain cases, so the approximate F-test based on GLS estimation is a more appropriate approach under the heteroscedastic model.

ACKNOWLEDGMENTS

The authors are grateful to the Editor and to the three referees for comments that helped improve previous versions of the paper.

References

Beal, S. L., and Sheiner, L. B. (1988). Heteroscedastic nonlinear regression. Technometrics 30: 327–338.

Belanger, B., Davidian, M., and Giltinan, D. M. (1996). The effect of variance function estimation on nonlinear calibration inference in immunoassay data. Biometrics 52: 158–175.

Berger, R. L. (1982). Multiparameter hypothesis testing and acceptance sampling. Technometrics 24: 295–300.

Berger, R. L., and Hsu, J. C. (1996). Bioequivalence trials, intersection-union tests, and equivalence confidence sets. Statistical Science 11: 283–319.

Callahan, J. D., and Sajjadi, N. C. (2003). Testing the null hypothesis for a specified difference – the right way to test for parallelism. BioProcessing Journal, Mar/Apr: 71–77.

Casella, G., and Berger, R. L. (1990). Statistical Inference. California: Duxbury Press.

Carroll, R. J., and Ruppert, D. (1982). A comparison between maximum likelihood and generalized least squares in a heteroscedastic linear model. Journal of the American Statistical Association 77: 878–882.

Carroll, R. J., and Ruppert, D. (1988). Transformation and Weighting in Regression. New York: Chapman and Hall.

Carroll, R. J., and Ruppert, D. (1987). Diagnostics and robust estimation when transforming the regression model and the response. Technometrics 29: 287–299.

Davidian, M., and Carroll, R. J. (1987). Variance function estimation. Journal of the American Statistical Association 82: 1079–1091.

Davidian, M., Carroll, R. J., and Smith, W. (1988). Variance functions and the minimum detectable concentration in assays. Biometrika 75: 549–56.

Davidian, M., and Giltinan, D. M. (1993a). Some simple methods for estimating intraindividual variability in nonlinear mixed effects models. Biometrics 49: 59–73.

Davidian, M., and Giltinan, D. M. (1993b). Analysis of repeated measurement data using the nonlinear mixed effect model. Chemometrics and Intelligent Laboratory Systems 20: 1–24.

Davidian, M., and Haaland, P. D. (1990). Regression and calibration with nonconstant error variance. Chemometrics and Intelligent Laboratory Systems 9: 231–248.

European Directorate for the Quality of Medicines (2004). European Pharmacopoeia, Chapter 5.3, Statistical Analysis. EDQM: Strasbourg, France, 473–507.

FDA (1992). Bioavailability and Bioequivalence requirements. In U.S. Code of Federal Regulations, vol. 21, chapter 320. U.S. Government Printing Office.

Finney, D. J. (1976). Radioligand Assay. Biometrics 32: 721–740.

Finney, D. J. (1978). Statistical Method in Biological Assay, 3rd edition. High Wycombe, U.K.: Charles Griffin.

Finney, D. J., and Phillips, P. (1977). The form and estimation of a variance function, with particular reference to immunoassay. Applied Statistics 26: 312–320.

Gallant, A. R. (1987). Nonlinear Statistical Models. New York: Wiley.

Giltinan, D. M., and Ruppert, D. (1989). Fitting heteroscedastic regression models to individual pharmacokinetic data using standard statistical software. Journal of Pharmacokinetics and Biopharmaceutics 17: 601–613.

Gottschalk, P. G., and Dunn, J. R. (2005). Measuring parallelism, linearity, and relative potency in bioassay and immunoassay data. Journal of Biopharmaceutical Statistics 15: 437–463.

Hauck, W. W., Capen, R. C., Callahan, J. D., De Muth, J. E., Hsu, H., Lansky, D., Sajjadi, N. C., Seaver, S. S., Singer, R. H., and Weisman, D. (2005). Assessing parallelism prior to determining relative potency. PDA Journal of Pharmaceutical Science and Technology 59: 127–137.

Hauck, W. W., Singer, R., and Callahan, L. N. (2007). Summary of planned revisions to Design and Analysis of Biological Assays <111>. Pharmacopeial Forum 33: 580–581.

Jonkman, J. N., and Sidik, K. (2009). Equivalence testing for parallelism in the four-parameter logistic model. Journal of Biopharmaceutical Statistics 19: 818–837.

Kpamegan, E. P. (2005). A comparative study of statistical methods to assess dilutional similarity. BioPharm International, October issue.

O’Connell, M. A., Belanger, B. A., and Haaland, P. D. (1993). Calibration and assay development using the four-parameter logistic model. Chemometrics and Intelligent Laboratory Systems 20: 97–114.

Peck, C. C., Beal, S. L., Sheiner, L. B., and Nichols, A. I. (1984). Extended least squares nonlinear regression: a possible solution to the “Choice of Weights” problem in analysis of individual pharmacokinetic data. Journal of Pharmacokinetics and Biopharmaceutics 12: 545–558.

Plikaytis, B. D., Holder, P. F., Pais, L. B., Maslanka, S. E., Gheesling, L. L., and Carlone, G. M. (1994). Determination of parallelism and nonparallelism in bioassay dilution curves. Journal of Clinical Microbiology 32: 2441–2447.

R Development Core Team (2007). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.

Raab, G. M. (1981). Estimation of a variance function, with application to radio-immunoassay. Applied Statistics 30: 32–40.

Reeve, R. (2000). Two statistical methods for estimating relative potency of bioassays. BioPharm 13: 54–60.

Rodbard, D., and Frazier, G. R. (1975). Statistical analysis of radioligand assay data. Methods in Enzymology 37: 3–22.

Sasabuchi, S. (1980). A test of a multivariate normal mean with composite hypotheses determined by linear inequalities. Biometrika 67: 429–439.

Schofield, T. L. (2003). Assay Development. Encyclopedia of Biopharmaceutical Statistics, 2nd ed., 1:1: 55–62.

United States Pharmacopeial Convention. (2008). Design and Analysis of Biological Assays. In United States Pharmacopeia, Chapter <111>.

United States Pharmacopeial Convention. (2012). Design and Development of Biological Assays. In United States Pharmacopeia, Chapter <1032>.

United States Pharmacopeial Convention. (2012). Analysis of Biological Assays. In United States Pharmacopeia, Chapter <1034>.

Fig. 1. The data and fitted heteroscedastic non-parallel curves for: (a) the macrophage cell lysis assay (b) the ten-dose assay.

Table 1. Estimation of θ (with true θ = 1.00) by simulation

Doses   Reps   θ̂        Bias(θ̂)   SD(θ̂)    MSE(θ̂)
8       2      0.9741   −0.0259    0.2920   0.0860
8       3      0.9870   −0.0130    0.2094   0.0440
10      2      1.0443    0.0443    0.2350   0.0572
10      3      1.0280    0.0280    0.1764   0.0319
12      2      1.0704    0.0704    0.2172   0.0521
12      3      1.0397    0.0397    0.1582   0.0266
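The entries of Table 1 can be cross-checked with a short calculation. The sketch below assumes that the unlabeled spread column is the standard deviation of the estimates, so that MSE = Bias² + SD², and that Bias is the mean estimate minus the true value θ = 1.00. Both are readings of the table rather than statements taken from the text, and the tolerances only allow for rounding to four decimals.

```python
# Rows of Table 1: (doses, reps, mean of theta-hat, bias, spread, MSE).
# Reading the fifth value as a standard deviation is an assumption.
TRUE_THETA = 1.00
table1 = [
    (8, 2, 0.9741, -0.0259, 0.2920, 0.0860),
    (8, 3, 0.9870, -0.0130, 0.2094, 0.0440),
    (10, 2, 1.0443, 0.0443, 0.2350, 0.0572),
    (10, 3, 1.0280, 0.0280, 0.1764, 0.0319),
    (12, 2, 1.0704, 0.0704, 0.2172, 0.0521),
    (12, 3, 1.0397, 0.0397, 0.1582, 0.0266),
]


def mse_from_parts(bias, sd):
    """Mean squared error as squared bias plus variance."""
    return bias ** 2 + sd ** 2


checks = []
for doses, reps, mean_hat, bias, sd, mse in table1:
    bias_ok = abs((mean_hat - TRUE_THETA) - bias) < 5e-5
    mse_ok = abs(mse_from_parts(bias, sd) - mse) < 5e-4
    checks.append((doses, reps, bias_ok, mse_ok))
    print(doses, reps, round(mse_from_parts(bias, sd), 4), mse)
```

Every row passes both checks under these assumptions, which supports reading the spread column as a standard deviation.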

Table 2. Estimation and test of parallelism for macrophage cell lysis assay

            GLS                  OLS                  GLS P-value        OLS P-value
Parameter   Estimate (SE)        Estimate (SE)        L        U         L        U
a2          177.85 (7.7872)      158.68 (188.54)
ra          0.9676 (0.0635)      1.8203 (2.4451)      0.0089   0.0002    0.3410   0.5907
b2          −4.4191 (0.1952)     −3.9054 (0.3336)
rb          0.8527 (0.0510)      1.1710 (0.1419)      0.0306   0.0000    0.0045   0.1395
d2          13716.4 (401.71)     13953.2 (189.80)
rd          1.0748 (0.0452)      1.0411 (0.0194)      0.0000   0.0007    0.0000   0.0000
c2          1.1624 (0.0123)      1.1445 (0.0116)
r           −0.0261 (0.0182)     −0.0007 (0.0155)
θ           0.9728 (0.0843)      –
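The intersection-union decision for the macrophage cell lysis assay can be sketched from the one-sided P-values in Table 2. This is a minimal illustration, assuming a significance level of 0.05 (not stated in the table) and with P-values matched to the ratio parameters ra, rb, and rd as we read the table. The function and variable names are hypothetical.

```python
ALPHA = 0.05  # assumed significance level; not taken from the table


def iut_declares_parallelism(one_sided_p_values, alpha=ALPHA):
    """Intersection-union rule: every one-sided test must reject at level alpha."""
    return max(one_sided_p_values) < alpha


# One-sided P-values (lower limit L, upper limit U) for the ratio parameters.
gls = {"ra": (0.0089, 0.0002), "rb": (0.0306, 0.0000), "rd": (0.0000, 0.0007)}
ols = {"ra": (0.3410, 0.5907), "rb": (0.0045, 0.1395), "rd": (0.0000, 0.0000)}


def flatten(p_table):
    """Collect all one-sided P-values from a {parameter: (L, U)} mapping."""
    return [p for pair in p_table.values() for p in pair]


gls_parallel = iut_declares_parallelism(flatten(gls))
ols_parallel = iut_declares_parallelism(flatten(ols))
print("GLS:", gls_parallel, "OLS:", ols_parallel)
```

Because the rule takes the largest P-value, one imprecisely estimated ratio is enough to block a declaration of parallelism. Under OLS that ratio is ra, whose standard error is far larger than under GLS.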

Table 3. Estimation and test of parallelism for the ten-dose assay

            GLS                  OLS                  GLS P-value        OLS P-value
Parameter   Estimate (SE)        Estimate (SE)        L        U         L        U
a2          2.0198 (0.0606)      2.0203 (0.0661)
ra          1.0099 (0.0514)      1.0106 (0.0554)      0.0001   0.0000    0.0002   0.0000
b2          −1.4145 (0.0556)     −1.4155 (0.0560)
rb          0.9532 (0.0549)      0.9537 (0.0551)      0.0036   0.0000    0.0037   0.0000
d2          10.123 (0.0892)      10.122 (0.0846)
rd          0.9737 (0.0112)      0.9737 (0.0106)      0.0000   0.0000    0.0000   0.0000
c2          2.3125 (0.0127)      2.3126 (0.0125)
r           0.2818 (0.0187)      0.2815 (0.0186)
θ           0.1189 (0.1463)      –

Table 4. Simulation Results. The table value is the proportion of times out of the total simulation replicates that the specified test resulted in a declaration of parallelism (proportion based on OLS estimation in parentheses).

                     Case 1:              Case 2:              Case 3:              Case 4:
                     Approximately        Exactly              Boundary             Unequal
                     Parallel             Parallel             Values               Slopes
θ      Doses  Reps   F-test    IUT        F-test    IUT        F-test    IUT        F-test    IUT
0.75   8      2      0.450     0.534      0.923     0.650      0.000     0.000      0.000     0.020
                     (0.432)   (0.394)    (0.909)   (0.467)    (0.001)   (0.000)    (0.001)   (0.014)
       8      3      0.261     0.744      0.938     0.876      0.000     0.000      0.000     0.018
                     (0.263)   (0.617)    (0.914)   (0.734)    (0.000)   (0.000)    (0.000)   (0.019)
       10     2      0.271     0.806      0.927     0.922      0.000     0.000      0.000     0.015
                     (0.275)   (0.623)    (0.927)   (0.753)    (0.000)   (0.000)    (0.000)   (0.009)
       10     3      0.094     0.934      0.940     0.989      0.000     0.000      0.000     0.013
                     (0.102)   (0.830)    (0.930)   (0.940)    (0.000)   (0.000)    (0.000)   (0.011)
       12     2      0.156     0.882      0.930     0.968      0.000     0.000      0.000     0.015
                     (0.177)   (0.705)    (0.949)   (0.817)    (0.000)   (0.000)    (0.002)   (0.006)
       12     3      0.033     0.971      0.936     0.998      0.000     0.000      0.000     0.009
                     (0.041)   (0.886)    (0.946)   (0.966)    (0.000)   (0.000)    (0.000)   (0.008)
1.00   8      2      0.729     0.113      0.923     0.150      0.000     0.000      0.021     0.008
                     (0.698)   (0.036)    (0.889)   (0.037)    (0.090)   (0.000)    (0.105)   (0.002)
       8      3      0.616     0.323      0.936     0.416      0.000     0.000      0.001     0.014
                     (0.607)   (0.109)    (0.894)   (0.123)    (0.011)   (0.000)    (0.014)   (0.004)
       10     2      0.644     0.408      0.929     0.498      0.000     0.000      0.011     0.016
                     (0.663)   (0.073)    (0.918)   (0.088)    (0.000)   (0.000)    (0.155)   (0.000)
       10     3      0.491     0.627      0.937     0.777      0.000     0.000      0.000     0.017
                     (0.517)   (0.253)    (0.920)   (0.306)    (0.000)   (0.000)    (0.033)   (0.002)
       12     2      0.565     0.538      0.929     0.652      0.000     0.000      0.010     0.020
                     (0.629)   (0.055)    (0.946)   (0.075)    (0.000)   (0.000)    (0.236)   (0.000)
       12     3      0.384     0.740      0.942     0.872      0.000     0.000      0.000     0.017
                     (0.451)   (0.262)    (0.945)   (0.335)    (0.000)   (0.000)    (0.055)   (0.000)
1.25   8      2      0.846     0.003      0.928     0.003      0.020     0.000      0.248     0.000
                     (0.799)   (0.001)    (0.874)   (0.000)    (0.491)   (0.000)    (0.477)   (0.000)
       8      3      0.800     0.014      0.940     0.024      0.000     0.000      0.069     0.001
                     (0.769)   (0.000)    (0.887)   (0.001)    (0.296)   (0.000)    (0.299)   (0.000)
       10     2      0.819     0.035      0.928     0.047      0.001     0.000      0.200     0.004
                     (0.820)   (0.000)    (0.912)   (0.000)    (0.032)   (0.000)    (0.609)   (0.000)
       10     3      0.760     0.180      0.938     0.230      0.000     0.000      0.058     0.013
                     (0.775)   (0.000)    (0.913)   (0.001)    (0.002)   (0.000)    (0.444)   (0.000)
       12     2      0.782     0.087      0.923     0.112      0.000     0.000      0.191     0.009
                     (0.837)   (0.000)    (0.940)   (0.000)    (0.000)   (0.000)    (0.712)   (0.000)
       12     3      0.728     0.303      0.938     0.375      0.000     0.000      0.049     0.017
                     (0.783)   (0.000)    (0.937)   (0.000)    (0.000)   (0.000)    (0.553)   (0.000)
