STATISTICS IN MEDICINE, VOL. 9, 549-558 (1990)

OUTLIER DETECTION IN BIOAVAILABILITY/BIOEQUIVALENCE STUDIES

SHEIN-CHUNG CHOW
Biostatistics, Bristol-Myers, U.S. Pharmaceutical Group, Evansville, IN 47721, U.S.A.

AND

SIU-KEUNG TSE
Department of Applied Mathematics, City Polytechnic of Hong Kong, Hong Kong

SUMMARY

This paper concerns techniques for detection of a potential outlier or extreme observation in a bioavailability/bioequivalence study. A bioavailability analysis that includes possible outlying values may affect the decision on bioequivalence. We consider a general crossover model that takes into account period and formulation effects. We derive two test procedures, the likelihood distance and the estimates distance, to detect potential outliers. We show that the two procedures relate to a chi-square distribution with three degrees of freedom. The main purpose of this paper is to exhibit and discuss these two general approaches to outlier detection in the context of a bioavailability/bioequivalence study. To illustrate these approaches, we use data from a three-way crossover experiment in the pharmaceutical industry that compared the bioavailability of two test formulations and a standard (reference) formulation of a drug. This example demonstrates the influence of an outlying value in the study of bioequivalence.

1. INTRODUCTION

Studies of bioequivalence between formulations of a drug commonly encounter the problem that the data set contains some outlying or extreme observations. These outlying observations may occur either as unexpected observations in the plasma concentration-time curve or as an unusual subject who has extremely high or low bioavailability with respect to the reference formulation. Rodda suggests that unexpected observations in the plasma concentration-time curve will generally have little effect on the comparison of bioavailability and that one should include them in the analysis. The observed extremely high or low bioavailability in subjects, however, may indicate that the variability of subject response to a formulation is not homogeneous. These outlying observations may have dramatic effects on the bioequivalence test. A usual bioavailability analysis which includes the possible outlying subject may lead to the rejection of bioequivalence when, in fact, the formulations are bioequivalent. In addition, the inclusion of an outlier may increase the estimate of the variability to the point that the sensitivity of the statistical methodology falls below the level necessary for detection of clinically important differences. In summary, the presence of an outlying subject may negate the conclusion of a bioequivalence study. It is therefore important to examine the outlying observations carefully.

0277-6715/90/050549-10$05.00 © 1990 by John Wiley & Sons, Ltd.

Received March 1989 Revised November 1989


The study of the detection of a potential outlier in a set of data has received much attention in past decades. Recently, Beckman and Cook gave an extensive review of the literature. In particular, the development of various methods for handling outliers in the linear regression problem has in recent years been an active research area. Many techniques have been proposed for the treatment of outlying or influential observations in the linear regression model. Often, however, the data available in a bioavailability study do not fit into the framework of a linear regression model. Therefore, it is essential to develop new techniques that provide a general approach to the identification of outlying observations in a broader class of problems. The idea of likelihood distance proposed by Cook and Weisberg was an attempt in this direction. The likelihood distance (LD) provides a general approach to identifying outlying observations that does not depend on the structure of a linear model. In this study we have developed another general approach, termed the estimates distance (ED), to the outlier detection problem. A main purpose of this paper is to exhibit and discuss these two general approaches to outlier detection in the context of a bioavailability study. We note that these ideas apply as well to other outlier detection problems.

In the next section, we describe a commonly used crossover model for bioavailability studies. We derive two test procedures, using the likelihood distance approach and the estimates distance approach, for the detection of outliers under the hypothesis that all the formulations are bioequivalent. In Section 3, we present an example from the pharmaceutical industry that concerned a bioavailability study of a drug. A brief discussion appears in Section 4.

2. THE MODEL

The comparison of $n$ drug formulations in a bioavailability study usually involves an $n$-period crossover experiment to assess bioequivalence. Consider the following crossover model:

$$X_{ijl} = \mu + S_i + F_j + P_l + e_{ijl}, \qquad j, l = 1, \ldots, n; \; i = 1, \ldots, k, \tag{1}$$

where $\mu$ is the overall mean; $F_j$ is the fixed effect of the $j$th formulation, with $\sum_j F_j = 0$; $P_l$ is the fixed effect of the $l$th period, with $\sum_l P_l = 0$; $S_i$ is the random effect of the $i$th subject; $e_{ijl}$ is the error term; and $X_{ijl}$ is the response variable on the $i$th subject in the $l$th period under the $j$th formulation. Note that a factor for the sequence effect is ordinarily present in the model. For brevity, however, we pooled the sequence effect with the subject effect. That is, $S_i$ comprises the 'between sequences' and 'subjects within sequence' effects. In model (1) we assume that $\{S_i\}$ and $\{e_{ijl}\}$ are independently and normally distributed with means 0 and variances $\sigma_S^2$ and $\sigma_e^2$, respectively.

The response variables of primary interest in a bioavailability/bioequivalence study are the extent of absorption and the rate of absorption. The former is usually measured as the area under the plasma concentration-time curve (AUC), and the latter in terms of $C_{\max}$ (peak concentration) and $t_{\max}$ (time to peak concentration). In practice the distribution of the response variable, say AUC, is often skewed. In this case, one usually applies a log transformation on AUC to remove the skewness and then analyses the transformed AUCs with use of model (1).

In the following, we describe the two procedures for detection of an outlier. For illustration, we focus on a simplified model (1) with the assumption that there are no period and formulation effects. Our purpose is not to overwhelm the discussion with complicated mathematical details. The ideas extend readily to more general models. Further discussion on the treatment of a more general model, which may have different period effects and/or formulation effects, appears briefly in Section 4.
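To make the structure of model (1) concrete, the following is a minimal simulation sketch, not part of the original analysis. It assumes that responses are generated on the log-AUC scale and that subjects are assigned to sequences from a cyclic Latin square; the function name `simulate_crossover`, the sequence assignment, and all parameter values in the usage note are hypothetical choices made for illustration.

```python
import numpy as np

def simulate_crossover(k, n, mu, F, P, sigma_s, sigma_e, seed=0):
    """Draw responses from model (1): X = mu + S_i + F_j + P_l + e_ijl.

    Returns X (k subjects by n periods) and form, where form[i, l] is the
    index of the formulation given to subject i in period l.
    """
    rng = np.random.default_rng(seed)
    F = np.asarray(F, dtype=float)   # formulation effects, should sum to 0
    P = np.asarray(P, dtype=float)   # period effects, should sum to 0
    # Assumption: cyclic Latin square, so a subject in sequence r receives
    # formulation (r + l) mod n in period l.
    seq = np.arange(k) % n
    form = (seq[:, None] + np.arange(n)[None, :]) % n
    S = rng.normal(0.0, sigma_s, size=k)        # random subject effects
    e = rng.normal(0.0, sigma_e, size=(k, n))   # within-subject errors
    X = mu + S[:, None] + F[form] + P[None, :] + e
    return X, form
```

For example, `simulate_crossover(k=12, n=3, mu=5.0, F=[0.0, 0.0, 0.0], P=[0.0, 0.0, 0.0], sigma_s=0.3, sigma_e=0.1)` draws data under the simplified setting with no period and formulation effects, and `np.exp(X)` would return the values to the original AUC scale.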


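Under the hypotheses of no period and formulation effects, model (1) reduces to

$$X_{ij} = \mu + S_i + e_{ij}, \qquad j = 1, \ldots, n; \; i = 1, \ldots, k. \tag{2}$$

The parameters of interest are $\mu$, $\sigma_e^2$ and $\sigma_S^2$. Define

$$m_1 = \frac{1}{k(n-1)} \sum_{i=1}^{k} \sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2 \quad \text{and} \quad m_2 = \frac{n}{k-1} \sum_{i=1}^{k} (\bar{X}_i - \bar{X})^2,$$

where $\bar{X}_i$ is the mean of the $n$ observations on the $i$th subject and $\bar{X}$ is the overall mean.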
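The two mean squares can be computed directly from a $k \times n$ array of responses. The sketch below is illustrative only; it assumes the data are held in a NumPy array with one row per subject and one column per observation on that subject, and the function name `mean_squares` is hypothetical.

```python
import numpy as np

def mean_squares(X):
    """Return (m1, m2) for a k-by-n array X with one row per subject."""
    X = np.asarray(X, dtype=float)
    k, n = X.shape
    subj_means = X.mean(axis=1)   # subject means
    grand_mean = X.mean()         # overall mean
    # m1: within-subject mean square on k(n-1) degrees of freedom
    m1 = ((X - subj_means[:, None]) ** 2).sum() / (k * (n - 1))
    # m2: between-subject mean square on k-1 degrees of freedom
    m2 = n * ((subj_means - grand_mean) ** 2).sum() / (k - 1)
    return m1, m2
```

Under model (2), $m_1$ has expectation $\sigma_e^2$ and $m_2$ has expectation $\sigma_e^2 + n\sigma_S^2$, which is why the pair is a natural basis for estimating the variance components.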

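Let $\theta$ be the parameter vector $(\theta_1, \theta_2, \theta_3)^T$, where $\theta_1 = \mu$, $\theta_2 = \sigma_e^2$ and $\theta_3 = \sigma_e^2 + n\sigma_S^2$. We refer to the log-likelihood function $L(\theta)$ for (2) as (3). The maximum likelihood estimator (MLE) $\hat{\theta}$ of $\theta$ obtains by maximizing (3) with respect to $\theta$ under the condition $\theta_3 \geq \theta_2$. The MLE $\hat{\theta} = (\hat{\theta}_1, \hat{\theta}_2, \hat{\theta}_3)^T$ of $\theta$ is then

$$\hat{\theta}_1 = \bar{X}, \qquad \hat{\theta}_2 = m_1, \qquad \hat{\theta}_3 = (k-1)m_2/k. \tag{4}$$

... than $\hat{\theta}_3$ in (4) (that is, $(k-1)m_2$ ...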
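As a check on the estimates in (4), the sketch below evaluates a log-likelihood for model (2) and the closed-form estimates. It assumes the standard log-likelihood of the balanced one-way random-effects model, which in the notation above reads $L(\theta) = -\tfrac{kn}{2}\log(2\pi) - \tfrac{k(n-1)}{2}\log\theta_2 - \tfrac{k}{2}\log\theta_3 - \tfrac{k(n-1)m_1}{2\theta_2} - \tfrac{(k-1)m_2 + kn(\bar{X}-\theta_1)^2}{2\theta_3}$. This form is our assumption, chosen because its maximizer reproduces (4); the function names `log_likelihood` and `mle` are hypothetical.

```python
import numpy as np

def log_likelihood(theta, X):
    """Assumed log-likelihood of model (2).

    theta = (mu, sigma_e^2, sigma_e^2 + n * sigma_S^2); X is k-by-n.
    """
    X = np.asarray(X, dtype=float)
    k, n = X.shape
    t1, t2, t3 = theta
    subj_means = X.mean(axis=1)
    # Within-subject sum of squares, equal to k(n-1) * m1
    ss_within = ((X - subj_means[:, None]) ** 2).sum()
    # Equal to (k-1) * m2 + k * n * (grand mean - t1)^2
    ss_between = n * ((subj_means - t1) ** 2).sum()
    return (-0.5 * k * n * np.log(2.0 * np.pi)
            - 0.5 * k * (n - 1) * np.log(t2)
            - 0.5 * k * np.log(t3)
            - 0.5 * ss_within / t2
            - 0.5 * ss_between / t3)

def mle(X):
    """Closed-form estimates (4); applicable when theta3_hat >= theta2_hat."""
    X = np.asarray(X, dtype=float)
    k, n = X.shape
    subj_means = X.mean(axis=1)
    m1 = ((X - subj_means[:, None]) ** 2).sum() / (k * (n - 1))
    m2 = n * ((subj_means - X.mean()) ** 2).sum() / (k - 1)
    return np.array([X.mean(), m1, (k - 1) * m2 / k])
```

The estimates returned by `mle` should be used only when the third component is at least as large as the second, since the maximization is carried out under the condition $\theta_3 \geq \theta_2$; the sketch does not handle the boundary case.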
