Power and Sample Size Calculations: A Review and Computer Program

William D. Dupont, PhD, and Walton D. Plummer Jr., BS

Department of Preventive Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee

ABSTRACT: Methods of sample size and power calculations are reviewed for the most common study designs. The sample size and power equations for these designs are shown to be special cases of two generic formulae for sample size and power calculations. A computer program is available that can be used for studies with dichotomous, continuous, or survival response measures. The alternative hypotheses of interest may be specified either in terms of differing response rates, means, or survival times, or in terms of relative risks or odds ratios. Studies with dichotomous or continuous outcomes may involve either a matched or independent study design. The program can determine the sample size needed to detect a specified alternative hypothesis with the required power, the power with which a specific alternative hypothesis can be detected with a given sample size, or the specific alternative hypotheses that can be detected with a given power and sample size. The program can generate help messages on request that facilitate the use of this software. It writes a log file of all calculated estimates and can produce an output file for plotting power curves. It is written in FORTRAN-77 and is in the public domain.

KEY WORDS: Power and sample size calculations, cohort studies, case-control studies, dichotomous or continuous outcomes

INTRODUCTION

Sample size and power calculations for clinical trials and observational studies are typically performed either by hand [1-5], through the use of published graphs or tables [6-10], or through the use of specialized computer programs [11-17]. Selecting the sample size for a study inevitably requires a compromise balancing the needs for power, economy, and timeliness. Investigators must determine their study's sample size, power, and detectable alternative hypotheses. To do this, it is useful to have a program that, given any two of the preceding parameters, can calculate the third. The purpose of this article is to introduce such a program (POWER) and to review the power and sample size calculations that are required for the most common study designs.

For each design considered in this article, POWER calculates the sample size needed to detect a particular difference in treatment efficacy with a specified power, the power with which a particular difference

Address reprint requests to: William D. Dupont, S-3301 Medical Center North, Department of Preventive Medicine, Vanderbilt University School of Medicine, Nashville, TN 37232-2637. Received May 10, 1989; revised October 11, 1989.

Controlled Clinical Trials 11:116-128 (1990)
© Elsevier Science Publishing Co., Inc. 1990, 655 Avenue of Americas, New York, New York 10010
0197-2456/1990/$3.50



can be detected with a given sample size, and the difference that can be detected with a specified power and sample size.

The study designs that can be evaluated by this program are summarized in Table 1. In this table, independent study designs refer to those in which subjects are independently selected at random from some target population. Matched designs are ones in which one or more control subjects are matched to each case patient with respect to certain attributes. Paired designs are matched designs with one control per case. In cohort studies, subjects are followed forward in time until some event occurs [18]. All clinical trials are cohort studies. Case-control studies look for risk factors in samples of case patients with a specific disease and control patients who do not have this disease [2]. A survival outcome variable consists of the time until death or some morbid event occurs, or the total follow-up time for a patient who does not suffer this event. Continuous outcome variables like weight or serum creatinine may take a wide range of values. Dichotomous outcomes take only two values such as success or failure, or the presence or absence of some risk factor.

In justifying a study design, the actual power that will be achieved with the selected sample size is more relevant than the power that would have been achieved with other sample sizes that were considered but ultimately rejected. The power associated with the selected sample size can be most effectively demonstrated by plotting the power curve as a function of the true value of the parameter of interest under different alternative hypotheses. The coordinates for such curves can be generated by POWER for input into graphics software packages.

POWER is in the public domain and is available from the authors on request for the cost of distribution. It is written in ANSI standard FORTRAN-77 and has been run successfully on both personal computers running MS-DOS and VAX computers running VMS.
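To make the idea of power-curve coordinates concrete, the following Python sketch tabulates power against the true standardized difference for a fixed sample size. It is only an illustration, not POWER's FORTRAN-77 code or its output file format, neither of which is specified here. It assumes a two-sided test with equal standard deviations under the null and alternative hypotheses and uses the normal-approximation power formula given as Eq. 2 in the Methods. The names `power_curve` and `to_csv`, the grid of differences, and the CSV layout are all hypothetical.

```python
import csv
import io
from math import sqrt
from statistics import NormalDist


def power_curve(n, alpha=0.05, deltas=None):
    """Return (delta, power) pairs for a two-sided test of size alpha.

    Assumes sigma_0 = sigma_a, so power depends only on the standardized
    difference delta and the sample size n (hypothetical helper).
    """
    nd = NormalDist()
    crit = nd.inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    if deltas is None:
        deltas = [i / 20 for i in range(21)]  # assumed grid: 0.00, 0.05, ..., 1.00
    rows = []
    for d in deltas:
        shift = d * sqrt(n)
        # Sum of the two rejection-region probabilities under the alternative.
        rows.append((d, nd.cdf(shift - crit) + nd.cdf(-shift - crit)))
    return rows


def to_csv(rows):
    """Format the coordinates as CSV text for a plotting package."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["delta", "power"])
    for d, p in rows:
        writer.writerow([f"{d:.2f}", f"{p:.4f}"])
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(power_curve(n=32)))
```

At a true difference of zero the tabulated power equals the type I error probability, which is a convenient sanity check on any such curve.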

METHODS

Generic Power and Sample Size Formulas

All the methods discussed in this paper are variations on a familiar theme [3, sect. 5.2]. Suppose that we observe responses on $n$ patients (or groups of patients) that are dependent on some parameter $\theta$. Let $f(\theta)$ be a known monotonic function of $\theta$ and let $S$ be a statistic derived from the $n$ responses that has a normal distribution with mean $\sqrt{n}\,f(\theta)$ and standard deviation $\sigma(\theta)$. Let $\Phi[z]$ be the cumulative probability distribution for a standard normal random variable and let $z_\alpha = \Phi^{-1}[1 - \alpha]$ denote the critical value that is exceeded by a standard normal random variable with probability $\alpha$. Let $\theta_0$ and $\theta_a$ denote the values of $\theta$ under the null and a specific alternative hypothesis, respectively. Let $\sigma_0 = \sigma(\theta_0)$, $\sigma_a = \sigma(\theta_a)$, and let $\delta = \{f(\theta_a) - f(\theta_0)\}/\sigma_a$ denote the difference between $f(\theta_a)$ and $f(\theta_0)$ expressed in standard deviations of $S$ under the alternative hypothesis. Testing the null hypothesis against a two-sided alternative hypothesis with type I error probability $\alpha$ leads to rejection of the null hypothesis when

$$|S - \sqrt{n}\,f(\theta_0)| > \sigma_0 z_{\alpha/2}. \tag{1}$$
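The rejection rule in Eq. 1 can be written out directly. The Python sketch below is a minimal illustration, assuming the caller supplies the observed statistic $S$, the sample size $n$, the value $f(\theta_0)$, and $\sigma_0$. The function name `rejects_null` and its argument names are hypothetical and are not part of the POWER program.

```python
from math import sqrt
from statistics import NormalDist


def rejects_null(s, n, f_theta0, sigma0, alpha=0.05):
    """Two-sided test of Eq. 1: reject when |S - sqrt(n) f(theta_0)| > sigma_0 z_{alpha/2}."""
    # z_{alpha/2} is the value exceeded by a standard normal with probability alpha/2.
    z_half_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return abs(s - sqrt(n) * f_theta0) > sigma0 * z_half_alpha
```

For example, with $n = 25$, $f(\theta_0) = 0$, and $\sigma_0 = 1$, an observed $S = 2.5$ exceeds the critical value of about 1.96 and leads to rejection at $\alpha = 0.05$, whereas $S = 1.0$ does not.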

Table 1. Study Designs That Can Be Evaluated by the POWER Computer Program (a)

Method   Study Design:   Study Design:   Outcome       Test Statistic                            Reference
Number   Cohort vs.      Independent     Variable
         Case-Control    vs. Matched
1        Cohort          Independent     Survival      log rank                                  Schoenfeld and Richter [10]
2        Either          Paired          Continuous    t                                         Pearson and Hartley [9]
3        Either          Independent     Continuous    t                                         Pearson and Hartley [9]
4        Case-control    Matched         Dichotomous   $\chi^2$                                  Dupont [6]
5        Cohort          Paired          Dichotomous   $\chi^2$                                  This paper
6        Case-control    Independent     Dichotomous   uncorrected $\chi^2$                      Schlesselman [2]
7        Either          Independent     Dichotomous   Fisher's exact or corrected $\chi^2$      Casagrande et al. [5]
8        Cohort          Independent     Dichotomous   uncorrected $\chi^2$                      Meinert [1]

(a) The terms used in this table are defined in the Introduction.


When $\theta = \theta_a$, the probability that $S$ will satisfy Eq. 1 equals the power associated with this alternative hypothesis, which is

$$1 - \beta = \Phi[\delta\sqrt{n} - (\sigma_0/\sigma_a) z_{\alpha/2}] + \Phi[-\delta\sqrt{n} - (\sigma_0/\sigma_a) z_{\alpha/2}]. \tag{2}$$
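Eq. 2 translates almost line for line into code. The following Python sketch evaluates the power for a given standardized difference $\delta$, sample size $n$, type I error probability $\alpha$, and ratio $\sigma_0/\sigma_a$. It is an illustration rather than POWER's own implementation; the names `z_upper`, `power_two_sided`, and `sigma_ratio` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_upper(alpha):
    """Critical value z_alpha = Phi^{-1}(1 - alpha)."""
    return _STD_NORMAL.inv_cdf(1.0 - alpha)


def power_two_sided(delta, n, alpha=0.05, sigma_ratio=1.0):
    """Power 1 - beta from Eq. 2, where sigma_ratio = sigma_0 / sigma_a."""
    crit = sigma_ratio * z_upper(alpha / 2.0)
    shift = delta * sqrt(n)
    upper_tail = _STD_NORMAL.cdf(shift - crit)   # first term of Eq. 2
    lower_tail = _STD_NORMAL.cdf(-shift - crit)  # second term of Eq. 2
    return upper_tail + lower_tail
```

Two properties follow from the formula and are easy to confirm numerically: the power at $\delta = 0$ equals $\alpha$ when $\sigma_0 = \sigma_a$, and the power is unchanged when the sign of $\delta$ is reversed.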

The first and second terms on the right-hand side of Eq. 2 give the probabilities under the alternative hypothesis of obtaining $S > \sqrt{n}\,f(\theta_0) + \sigma_0 z_{\alpha/2}$ and $S < \sqrt{n}\,f(\theta_0) - \sigma_0 z_{\alpha/2}$, respectively. One or the other of these terms is usually negligible for relevant values of $\alpha$ and $\beta$. The smaller of these two terms will be less than 0.001 as long as $2(\sigma_0/\sigma_a) z_{\alpha/2} + z_\beta \ge 3.1$ (see Appendix). Hence, approximating this probability by zero in Eq. 2 yields

$$n = [(\sigma_0/\sigma_a) z_{\alpha/2} + z_\beta]^2/\delta^2. \tag{3}$$
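A sample size calculation based on Eq. 3 can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not POWER's code: the function name `sample_size`, its argument names, and the choice to return a flag for the $\ge 3.1$ condition are all hypothetical.

```python
from statistics import NormalDist


def sample_size(delta, alpha=0.05, power=0.80, sigma_ratio=1.0):
    """Sample size n from Eq. 3, where sigma_ratio = sigma_0 / sigma_a.

    Returns (n, negligible). The flag is True when
    2 (sigma_0/sigma_a) z_{alpha/2} + z_beta >= 3.1, the condition under
    which the term dropped from Eq. 2 is below 0.001.
    """
    nd = NormalDist()
    z_half_alpha = nd.inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    z_beta = nd.inv_cdf(power)                    # z_beta, with power = 1 - beta
    n = ((sigma_ratio * z_half_alpha + z_beta) / delta) ** 2
    negligible = 2.0 * sigma_ratio * z_half_alpha + z_beta >= 3.1
    return n, negligible
```

With $\delta = 0.5$, $\alpha = 0.05$, power 0.80, and $\sigma_0 = \sigma_a$, this gives $n \approx 31.4$, which would be rounded up to 32 in practice, and the condition on the neglected term is satisfied.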

To illustrate the use of Eq. 3, consider a sample of size $n$ randomly drawn from a normal population with mean $\mu$ and known variance $\sigma^2$. Let $\bar{y}$ denote the sample mean, $S = \sqrt{n}\,\bar{y}$, $f(\mu) = \mu$ be the identity function, $\sigma_0 = \sigma_a = \sigma$, and $\delta = \{f(\mu_a) - f(\mu_0)\}/\sigma_a = (\mu_a - \mu_0)/\sigma$. Then $S$ has a normal distribution with mean $\sqrt{n}\,\mu$ and standard deviation $\sigma$. Substituting $\sigma_0$, $\sigma_a$, and $\delta$ into Eq. 3 gives $n = (z_{\alpha/2} + z_\beta)^2 \sigma^2/(\mu_a - \mu_0)^2$, which is Eq. 5.34 in Ref. 3.

The power and sample size formulas for the study designs considered in this article can now be obtained by substituting the appropriate definitions of $\sigma_0$, $\sigma_a$, and $\delta$ into Eqs. 2 and 3. Equation 3 can also be used to find the specific alternative hypothesis that can be detected with power $1 - \beta$ and sample size $n$. These equations do not yield a closed solution when $\sigma_0 \ne \sigma_a$, but can be readily solved by a computer using iterative methods [19]. POWER provides a warning message with sample size estimates whenever $2(\sigma_0/\sigma_a) z_{\alpha/2} + z_\beta$
