Review

CLINICAL TRIALS IN DERMATOLOGY, PART 2: NUMBERS OF PATIENTS REQUIRED

ALFRED M. ALLEN, M.D., M.P.H.

From the Department of Dermatology Research, Letterman Army Institute of Research, San Francisco, California

The object of most clinical trials is to determine whether one treatment is better than another. The results are analyzed to determine whether observed differences between groups of patients represent real underlying differences between treatments or are simply artifacts due to limited sample size. The last-mentioned phenomenon is known as sampling error; its effects on the outcome of a clinical trial have been calculated and can be used to plan and evaluate trials. Because the number of patients tested can play a crucial role in deciding whether one treatment is truly superior to another, the choice of sample size is one of the most important aspects of designing clinical trials. Improper choice of sample size often carries with it severe penalties: uncertainties in interpretation of results, poor utilization of time and resources, and the frustration of wasted effort. The basic design of most clinical trials is fairly simple; therefore, an estimate of an appropriate sample size is easily obtained by consulting published tables. Using these tables, one can decide on the sample sizes required when 2 different treatments (or a placebo and a treatment) are being compared in trials in which the outcome can be measured in terms of percentages of patients or lesions that respond to treatment.

Address for reprints: Alfred M. Allen, M.D., Chief, Health and Environment Activity, Fort Ord, CA 93941. Part one of this four-part series appears in January-February, 1978, 17:42. The title of that article should have read: Clinical Trials in Dermatology, Part 1: Experimental Design.

In this paper, I will show (1) how to determine an appropriate sample size when one can specify the magnitude of the differences between treatments one is interested in detecting, (2) what magnitude of differences between treatments can be demonstrated when only a limited number of patients is available for study, and (3) how to express the relative magnitude of the difference between treatments in a way that is useful for planning and evaluating trials carried out under a variety of circumstances. But first it may be helpful to introduce a few statistical considerations that are involved in using sample size tables.

Preliminary Considerations

The medical considerations in a clinical trial are clear-cut: Is one drug better than another? For example, is a newer antibiotic better than tetracycline for treating acne? To arrive at the answer it is usually necessary to conduct clinical trials to see which drug works best. If the patients given a new drug have a higher percentage improvement rate than those given tetracycline, the results suggest that the new drug is truly superior to tetracycline. How confident we can be that the new drug is superior necessarily involves the statistical concept of sampling error. An appreciation of sampling error is fundamental to understanding the role of sample sizes in clinical trials and to the use of sample size tables. Sampling error refers to the fact that random samples from a large population seldom have exactly the same characteristics as one another or as the population from which they were drawn.



For example, random samples of acne patients on treatment would seldom have exactly the same percentage improvement rates as the whole population of acne patients from which they were drawn. In fact, if the population improvement rate was 50%, one would expect to see improvement rates as high as 69% or as low as 31% in about one of every 20 sample groups containing 30 patients each. This 19% error (69 - 50 = 19; 50 - 31 = 19) in estimating the percentage of remissions in the population is the sampling error, and is attributable solely to chance. Chance variation can create 2 types of errors in randomized clinical trials: (1) the results may suggest a difference between 2 treatments when there is none; and (2) the results may not indicate a difference between treatments when one actually exists. The first kind of error is known technically as a Type 1, or alpha, error, and the second is called a Type 2, or beta, error. Consideration of both kinds of error is necessary to understand the role of sample size in clinical trials. Finding a "statistically significant difference" between 2 treatments means that the probability of making a Type 1 error (declaring a real difference to exist when in fact there is none) is small. Setting the "p" value for statistical significance at .05 means that the chances of making a Type 1 error are 5% or less. The ability to achieve statistical significance in a trial is directly dependent on sample size: the larger the number of patients, the more likely statistical significance will be reached. In the case of acne trials, for example, an observed difference in improvement rate of 50% versus 65% would be statistically significant if the number of patients in each treatment group was 100, but would not be if the number was only 50 (assuming "p" less than .05). A frequently neglected aspect of drug trials is the meaning of "no significant difference" between treatments. This seems to imply that there are no differences between the drugs; but what it may mean is that the differences are not large enough to have registered as "statistically significant."


It does not necessarily mean that there are no clinically important differences between the treatments being compared, especially when sample sizes are small. The ability to detect, with statistical significance, a real underlying difference between treatments is known as power. The greater the power, the greater the ability to detect differences between treatments based on information from groups of patients. Power is directly related to sample size. For instance, if we wanted a 95% probability of finding a significant difference between 2 treatments when the real underlying differences were represented by 50% and 70%, respectively, we would need 172 patients in each group (assuming "p" less than .05). With only 65 patients in each group, the chances of finding a significant difference would be only 50%. In other words, the power of the trial would drop from 95% to 50% by reducing the sample sizes from 172 to 65. In considering sample sizes for trials, it is necessary to specify three things: (1) the size of the difference between treatments one does not wish to miss; (2) the power one wants to detect a difference of this size or greater; and (3) the significance level desired so as to avoid attributing a difference between treatments when none exists. The specification of these 3 things requires a judgment in each instance. The judgment is tempered by the nature of the problem at hand and the degree of certainty required by the investigator or his critics. It is conventional to use .05 as a significance level, and to declare any difference that reaches this level "statistically significant." Some require greater assurance that they will not make a Type 1 error (state that a difference exists when in fact it does not); therefore, they use a second convention, which is to adopt a higher level of significance, usually .01. In the latter instance, the difference is said to be "highly significant." The selection of a power is much less conventionalized than is the choice of significance levels.


However, one authority (reference 1) suggests that making a Type 2 error (failing to detect a difference when one exists) is about 4 times less serious than making a Type 1 error, and recommends a power of 80% when the significance level is .05 and a power of 95% when the significance level is .01. The choice of a power will ultimately depend on how sure the investigator wants to be of detecting specified differences between treatments. A power of 80% to 95% seems reasonable for most clinical work in which it is deemed important to discover more effective treatments.* Probably the most difficult judgment comes in specifying what difference between treatments one does not want to miss. An important consideration is that the smaller the difference, the greater the cost of detecting the difference. For example, the number of patients required to detect a difference between drugs with true efficacies of 50% and 55% is 4 times greater than the number required to detect a difference when true efficacies are 50% and 60%. If the significance level is set at .01 and the power at 95%, 3630 patients will be required in each treatment group when the true efficacies are 50% and 55%, and 918 will be required when the true efficacies are 50% and 60%. Obviously, these sample sizes are far beyond those ordinarily used in clinical trials, suggesting that the potential accomplishments of most trials are far more modest than those indicated by the example.
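Figures such as these can be checked against the usual normal-approximation formula for comparing 2 proportions, together with a standard continuity correction that appears to reproduce the published tables (adapted from reference 2) closely. The following Python sketch is an illustration only: the function name sample_size is my own, and the rounding convention is chosen to match the tables, so results may occasionally differ from them by a patient or so.

    from math import sqrt
    from statistics import NormalDist  # Python standard library (3.8+)

    def sample_size(p1, p2, alpha=0.05, power=0.80):
        """Patients needed per group to compare 2 proportions, 2-tailed test.

        Normal approximation plus a continuity correction; an approximation
        to the published tables, not a replacement for them.
        """
        z_a = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 when alpha = .05
        z_b = NormalDist().inv_cdf(power)          # 0.84 when power = 80%
        p_bar = (p1 + p2) / 2
        delta = abs(p2 - p1)
        # Uncorrected sample size per treatment group.
        n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
             + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / delta ** 2
        # Continuity correction.
        return round(n / 4 * (1 + sqrt(1 + 8 / (n * delta))) ** 2)

    print(sample_size(0.50, 0.55, alpha=0.01, power=0.95))  # 3630
    print(sample_size(0.50, 0.60, alpha=0.01, power=0.95))  # 918
    print(sample_size(0.50, 0.70, alpha=0.05, power=0.95))  # 172 (the earlier power example)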

Sample Size: Numbers Required

The value of a clinical trial bears a direct relationship to the number of patients enrolled: in general, the larger the number of patients, the more reliable is the information provided. To cite an extreme example, a trial which shows that drug A cures 500 of 1000 patients would provide more convincing evidence that the true efficacy of drug A is about 50% than would a trial in which drug A cures 1 of 2 patients.

* For ease of comprehension, we use percentage figures to express power rather than the decimal fractions used in statistical texts. For example, our expression "power of 80%" would ordinarily read "power = .80." Technically, power is equivalent to the probability of not making a Type 2 (or beta) error. If the probability of making a beta error is 20% (viz., p = .20), then power = 1 - p = 1 - .20 = .80.


Likewise, if drug B cures 0 of 1000 similar patients, the evidence that drug A is more effective than drug B is more convincing than if drug B cures 0 of 2 patients. Few trials include this many or this few patients; almost invariably the number is somewhere between the numbers cited. Naturally, this is because very small numbers are unconvincing and very large numbers are unobtainable. These considerations raise the obvious question: What is the minimum number of patients required to produce convincing evidence about the efficacy of a drug? There is no stock answer to this question because the minimum numbers of patients required will vary depending upon the circumstances. If the criteria for demonstrating efficacy are strict, the numbers required will be greater than if they are lax. Statistical as well as medical considerations dictate the minimum number of patients required in any trial in which the results will be subjected to statistical analysis and appropriate "p" values obtained. To illustrate, suppose a clinical investigator wishes to determine whether a new drug is significantly better than tetracycline in the treatment of acne. In planning the trial to compare the new drug to tetracycline, the investigator specifies the following conditions:

1. Based on previous experience and a careful reading of the literature, the expected improvement rate with tetracycline is about 60%.

2. The new drug would be considered significantly better than tetracycline if the expected improvement rate would be as great as, or greater than, 70%.

3. If the true differences in efficacy are of the magnitude specified, there must be no less than a 95% probability of detecting them in the trial.

4. The new drug will be declared significantly better than tetracycline if the significance level (the "p" value) is less than .01.


Table 1. Sample Sizes Required in Each of 2 Treatment Groups to Ensure an 80% Probability of Finding a Statistically Significant Difference when "P" Is Less Than .05*

Smaller of 2 percentages    Larger of 2 percentages expected to respond
expected to respond        30    40    50    60    70    80    90
         10                80    44    29    20    15    12     9
         20                 -   100    51    32    22    16    12
         30                       -     -    54    33    22    15
         40                             -     -    54    32    20
         50                                   -     -    51    29
         60                                         -   100    44
         70                                               -    80

* For sample sizes up to 100 in a 2-tailed test. Adapted from reference 2. A dash indicates that the required sample size is greater than 100.

Under these conditions, the minimum number of patients required in each of the two treatment groups in the trial is 847. By changing the conditions and adopting less stringent criteria for demonstrating efficacy, the investigator can reduce the minimum number of patients required. For example, if the new drug would be considered significantly more effective than tetracycline when the hoped-for improvement rate was at least 80% instead of 70%, the number of patients required would drop to 203. If, in addition, the significance level was changed from .01 to .05, the number required would fall to 153. Finally, if only an 80% instead of a 95% probability of detecting the specified differences is required, the number would be even lower, namely 100. The penalty for reducing the number of patients enrolled in the trial is that there is a greater chance of not finding a significant difference between the treatments when the new drug is better than tetracycline. Thus, it is clear that the medical need to know whether one treatment is better than another is inextricably bound up with statistical considerations of sample size. Arriving at an appropriate number of patients to enroll in a trial is easily done with sample size tables once the necessary conditions are specified.
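The progression from 847 down to 100 can be traced with the sample_size sketch introduced earlier (again an illustration under the stated assumptions, not a substitute for the tables):

    print(sample_size(0.60, 0.70, alpha=0.01, power=0.95))  # 847
    print(sample_size(0.60, 0.80, alpha=0.01, power=0.95))  # 203
    print(sample_size(0.60, 0.80, alpha=0.05, power=0.95))  # 153
    print(sample_size(0.60, 0.80, alpha=0.05, power=0.80))  # 100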

As in the example, the conditions are (1) estimating the expected response rate for the standard or established treatment, (2) stating how much greater the expected response rate from the new drug must be in order to be considered medically significant, (3) specifying the probability desired of detecting a difference between treatments if one actually exists, and (4) deciding on a level of statistical significance. In some instances, there will be no standard or well-established treatment comparable to tetracycline for treating acne; hence, the expected placebo response rate may be substituted for the standard drug rate in a placebo-controlled trial. The sample size tables (Tables 1-3) show the number of patients required in each treatment group when the expected responses to treatment are estimated to the closest 10%. Table 1 shows the numbers required when the significance level is .05 and the probability of finding a significant difference (the power) is set at 80%; Table 2 applies to the situation wherein the significance level is still .05 but the probability of finding a significant difference is increased to 95%; and Table 3 deals with conditions when the significance level is raised from .05 to .01 and the probability of finding a significant difference is kept at 95%.


Table 2. Sample Sizes Required in Each of 2 Treatment Groups to Ensure a 95% Probability of Finding a Statistically Significant Difference when "P" Is Less Than .05*

Smaller of 2 percentages    Larger of 2 percentages expected to respond
expected to respond        40    50    60    70    80    90
         10                64    40    28    20    15    11
         20                 -    75    45    30    21    15
         30                 -     -    81    47    30    20
         40                       -     -    81    45    28
         50                             -     -    75    40
         60                                   -     -    64

* For sample sizes up to 100 in a 2-tailed test. Adapted from reference 2. A dash indicates that the required sample size is greater than 100.

To illustrate the use of these tables, suppose that the expected response rate with the standard drug is 50% and that pilot studies on a new drug suggest a response rate of about 90%. Assume further that the investigator will be satisfied with an 80% probability of detecting a significant difference between the standard drug and the new drug when the level of significance is .05. The number of patients required in each treatment group is shown in Table 1. The required sample size is seen to be 29, which means that there must be at least 29 patients in each treatment group, or a total of 58 (2 x 29 = 58) patients in the trial, in order to have an 80% probability of finding a significant difference. An 80% probability of finding a significant difference also means that there is a 20% chance of failure. In other words, with sample sizes of 29, one of every 5 trials, on average, will not show a significant result when the real difference between treatments is expressed by response rates as different as 50% and 90%. To improve the chances of finding a significant difference between drugs that are as inherently different as this, a power of 95% can be adopted. Under this condition, Table 2 shows that the required sample size would be 40. The required number of patients for the trial would therefore be 80 (2 x 40 = 80).

With the significance level kept at .05, there is still a 5% chance of finding a significant difference between the 2 drugs if they happen to be equivalent in effectiveness. For example, the efficacy of the new drug might also be 50%, the same as the standard drug. To reduce the chances of calling similar drugs different while at the same time retaining a power of 95%, a significance level of .01 could be adopted. Table 3 shows that the required sample size would then be 53, and consequently there would have to be 106 patients in the trial (2 x 53 = 106). These examples illustrate a fundamental principle: required sample sizes invariably increase whenever an investigator wishes to increase his chances of finding a significant difference. Required sample sizes also increase whenever the investigator wishes to decrease his chances of inadvertently attributing significance to a difference that is purely due to sampling variation, while at the same time retaining a given chance of finding a difference. Sample sizes larger than 100 are not shown in Tables 1-3 because of the unlikelihood of obtaining more than 200 patients for a trial. If sample sizes larger than 100 are available, or if the conditions are otherwise different from those in Tables 1-3, either Table 4 or the more extensive sample size tables published in statistical texts can be consulted.
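The 29-40-53 progression can likewise be reproduced with the earlier sample_size sketch by tightening first the power and then the significance level (illustrative only, under the assumptions already noted):

    print(sample_size(0.50, 0.90, alpha=0.05, power=0.80))  # 29, as in Table 1
    print(sample_size(0.50, 0.90, alpha=0.05, power=0.95))  # 40, as in Table 2
    print(sample_size(0.50, 0.90, alpha=0.01, power=0.95))  # 53, as in Table 3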


Table 3. Sample Sizes Required in Each of 2 Treatment Groups to Ensure a 95% Probability of Finding a Statistically Significant Difference when "P" Is Less Than .01*

Smaller of 2 percentages    Larger of 2 percentages expected to respond
expected to respond        40    50    60    70    80    90
         10                83    53    36    26    19    14
         20                 -    99    59    39    27    19
         30                 -     -     -    62    39    26
         40                       -     -     -    59    36
         50                             -     -    99    53
         60                                   -     -    83

* For sample sizes up to 100 in a 2-tailed test. Adapted from reference 2. A dash indicates that the required sample size is greater than 100.

Limited Sample Sizes

It is often the case that only a limited number of patients is available for a clinical trial. In this situation the question is: "What can be shown with the available number of patients?" Sample size tables can also be used to answer this kind of question. Suppose the object is to compare the efficacy of a new antibiotic to that of tetracycline in the treatment of acne. The investigator estimates that about 60% of his patients will show clinically meaningful improvement after 6 weeks of treatment with tetracycline. He expects to be able to enroll about 90 to 100 patients in the trial and thinks the new drug may be superior. How big a difference must there be between the improvement rates before the investigator has a reasonable chance of finding a statistically significant difference between the 2 antibiotics? The investigator stipulates that a statistically significant difference will be considered to exist if "p" is less than .05. He also indicates that, for his purposes, a reasonable chance of finding a statistically significant difference can be defined as at least an 80% probability of finding a difference. Table 1 shows the sample sizes required when the significance level is .05 and the probability of finding a significant difference is 80%.

With 100 patients, the largest available sample size would be 50 (100/2 = 50). Table 1 indicates that, with a sample size of 50, the new antibiotic must be expected to produce an improvement rate of nearly 90% before the investigator will have a reasonable chance of finding a significant difference between tetracycline and the new drug. Table 1 also shows that exactly 44 patients will be required in each treatment group when the expected improvement rate with the new drug is a full 90%, whereas 100 patients will be required in each group when the expected rate is only 80%. The implication is that if the expected improvement rate with the new drug is more than a few percentage points below 90%, the investigator will have less than his "reasonable chance" of finding a statistically significant difference when tetracycline is expected to be effective in only 60% of patients. If the criteria for demonstrating a significant difference in efficacy are more stringent than those in Table 1, then even a hundred patients may not suffice for a trial in which the expected improvement rate with the new drug is at least 90%. For example, under the more stringent conditions of Table 2 at least 128 (2 x 64) patients would be required, and under the even more stringent conditions of Table 3 no less than 166 (2 x 83) patients would be needed.
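The same question can be put directly to the earlier sample_size sketch by scanning upward from the tetracycline rate for the smallest new-drug response rate whose required group size fits within the 50 available patients per group. The 1% step is an arbitrary choice for this illustration.

    # Smallest new-drug improvement rate detectable with 50 patients per
    # group, given a 60% tetracycline rate, alpha = .05, and power = 80%.
    for p2 in range(61, 100):
        if sample_size(0.60, p2 / 100, alpha=0.05, power=0.80) <= 50:
            print(f"{p2}%")  # prints 89% -- "nearly 90%", as stated above
            break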

Table 4. Sample Sizes Required in Each of 2 Treatment Groups to Ensure an 80% Probability of Finding a Statistically Significant Difference when "P" Is Less Than .05*

Smaller of two
percentages          Larger of two percentages expected to respond
expected to
respond      10   15   20   25   30   35   40   45   50   55   60   65   70   75   80   85   90   95
    5       511  178  100   67   50   39   32   26   22   19   17   15   13   11   10    9    8    7
   10            764  237  125   80   57   44   35   29   24   20   17   15   13   12   10    9    8
   15                 984  289  146   91   64   48   37   30   25   21   18   15   13   12   10    9
   20                     1172  332  163  100   69   51   39   32   26   22   18   16   13   12   10
   25                          1330  367  178  107   73   53   41   32   26   22   18   15   13   11
   30                               1455  395  188  112   75   54   41   33   26   22   18   15   13
   35                                    1549  415  195  115   77   55   41   32   26   21   17   15
   40                                         1612  426  199  116   77   54   41   32   25   20   17
   45                                              1644  430  199  115   75   53   39   30   24   19
   50                                                   1644  426  195  112   73   51   37   29   22
   55                                                        1612  415  188  107   69   48   35   26
   60                                                             1549  395  178  100   64   44   32
   65                                                                  1455  367  163   91   57   39
   70                                                                       1330  332  146   80   50
   75                                                                            1172  289  125   67
   80                                                                                 984  237  100
   85                                                                                      764  178
   90                                                                                           511

* In a 2-tailed test. Adapted from reference 2.


These examples illustrate a most sobering and perhaps unfortunate fact, namely, that the numbers of patients commonly enrolled in clinical trials are usually insufficient to provide good insurance against failing to find statistically significant differences between treatments that differ enormously in efficacy.

The Relative Difference

It is common to want a clinically meaningful way of expressing the magnitude of the difference between one treatment and another. It would be convenient to state that a particular drug is a certain percentage more effective than another drug used to treat the same condition. For example, if among a large group of acne patients the true response rate to tetracycline is 60% and to a new drug is 90%, one might say that the new drug is 30% (90 - 60 = 30) more effective than tetracycline for treating acne. But what if we consider a different population of acne patients in whom the true response rate to tetracycline is 80%? If we still consider the new drug to be 30% better than tetracycline, we arrive at the impossible situation of having a drug that is 110% effective (80 + 30 = 110). To avoid this kind of difficulty, we must modify our approach to calculating a percentage figure that expresses the difference between treatments. A more versatile and clinically meaningful way of expressing the difference between treatments is called the relative difference. The relative difference is a value that indicates the percentage of patients who would be expected to improve with the new drug but who would not be expected to benefit from tetracycline. In the example (60% vs. 90%), the relative difference is 75%. This is calculated by subtracting the percentage of patients who respond to tetracycline from the percentage of responders to the new drug, and dividing this figure by 100% minus the percentage of tetracycline responders:

relative difference = (90% - 60%) / (100% - 60%) = 30/40 = 75%


It means that three-fourths of the patients who would not respond to tetracycline would show improvement with the new drug. Table 5 shows the values of the relative differences for various absolute differences. When expected response rates of 20% and 40% are compared (20% absolute difference), the relative difference amounts to 25%. The relative difference is twice as great, namely 50%, when the improvement rates are 60% and 80%, and the absolute difference is still 20%. Thus, the absolute value of the difference in improvement rates may be less clinically meaningful than the relative difference. To realize how knowledge of the relative difference might be of use in planning a clinical trial, consider a trial in which the object is to try to confirm the results of a previous trial. In the first study, conducted in the winter, acne improvement rates were 40% for tetracycline and 70% for a new drug. The confirmatory study was planned for the summer months, when acne customarily improves. Hence, the percentage of patients responding to tetracycline was expected to increase from 40% to 60%. Since the relative difference between tetracycline and the new drug might be expected to remain the same (viz., 50%) regardless of seasonal change, the anticipated percentage of patients who would respond to the new drug can be determined from Table 5, and is 80%. The sample size tables reveal the implications of this seasonal change in terms of the numbers of patients required to show the same relative difference between tetracycline and the new drug. Table 1 shows that 54 patients would be needed in each treatment group when the relative difference is 50%, based on improvement rates of 40% and 70%. However, when the relative difference is still 50% but is based on improvement rates of 60% and 80%, 100 patients are required in each treatment group. Thus, the higher the improvement rate with the standard treatment, the more patients are required to show a given relative difference between the standard treatment and a new treatment.
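Both calculations, the relative difference itself and the new-drug rate it implies when the standard-treatment rate shifts, reduce to one line each. The sketch below is illustrative, the function names are my own, and sample_size is the sketch from earlier:

    def relative_difference(p_std, p_new):
        """Fraction of standard-drug non-responders expected to respond to the new drug."""
        return (p_new - p_std) / (1.0 - p_std)

    def projected_new_rate(p_std, rel_diff):
        """New-drug rate implied by a standard rate and a fixed relative difference."""
        return p_std + rel_diff * (1.0 - p_std)

    print(relative_difference(0.40, 0.70))  # 0.5 -- the winter trial
    print(projected_new_rate(0.60, 0.5))    # 0.8 -- the summer projection
    print(sample_size(0.40, 0.70))          # 54 per group in winter (Table 1)
    print(sample_size(0.60, 0.80))          # 100 per group in summer (Table 1)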


Table 5. Percentage Values for the Relative Difference Between 2 Treatments*

Smaller of 2 percentages    Larger of 2 percentages expected to respond
expected to respond         20     30     40     50     60     70     80     90
         10                (11)    22     33     44     56     67     78     89
         20                      (13)    25     38     50     63     75     88
         30                             (14)   (29)    43     57     71     86
         40                                    (17)   (33)    50     67     83
         50                                           (20)   (40)    60     80
         60                                                  (25)    50     75
         70                                                         (33)    67
         80                                                                (50)

* Relative differences shown in parentheses require sample sizes in excess of 100 to satisfy the conditions of Table 1.

Comment

Finding a statistically significant difference between one treatment and another provides good assurance that there are true differences in efficacy. But, as the sample size tables indicate, the reverse is unfortunately not true. The lack of a significant difference does not at all indicate that there are no true and important differences between treatments when the sample sizes are small. For example, if 2 drugs produce true cure rates of 40% and 80%, respectively, the chance of finding a significant difference between the two treatments is only 50:50 when just 20 patients are in each group (assuming "p" less than .05). Statistical considerations are therefore of great importance in designing clinical trials. The chances of demonstrating a significant difference between one treatment and another are heavily influenced by the number of patients, the size of the relative difference between treatments, and the expected rates of response to treatment. Estimates or even educated guesses as to what these might be will go far to prevent initiation of trials that have little chance of demonstrating clinically meaningful differences between treatments. Knowledge of the effect of sample sizes is valuable in interpreting results reported in the medical literature. Little difficulty arises when it is reported that one treatment is significantly better than another.

But if significant differences have not been demonstrated, the conclusion that no important differences exist must take into account the possible effects of sample size. If sample sizes are small, say less than 30 in each group, it will be common not to find significant differences between treatments even when the real underlying differences are very large and of great clinical importance. For example, Tables 1 and 5 show that statistically significant differences will not be found in over 20% of trials with 30 patients per group when the true absolute difference is as high as 40% with a corresponding relative difference as high as 70%. With larger sample sizes, it will still be common for the results of a trial to lack significance when one treatment is appreciably more effective than another. For instance, Tables 1 and 5 indicate that significant differences will still not be found in more than 1 of every 5 trials even when there are 100 patients in each group and the true absolute differences are as great as 20% with a corresponding relative difference as great as 40%. The quandary posed by limited numbers of patients may be resolved by such means as cooperative multicentric trials, comparisons of several separate trials, and cumulative clinical experience. The fallibility of the last-mentioned is widely recognized, but the difficulties with the first 2 are perhaps not so widely known.


A common error when pooling the results of several independently performed trials is to create a condition known as "inappropriate combination of evidence from fourfold tables." A discussion of this error is beyond the scope of this paper; however, the means to avoid it are explained in several statistical texts (references 3, 4). This paper has been concerned with the simplest form of comparative clinical trial and has not covered the application of sample size or relative difference tables to such trial variants as crossover studies or trials in which the response is measured in terms of gradations (e.g., poor, fair, good, excellent; 0 to 3 plus). Since the patient is used as his own control in crossover studies, the number of patients required is half the number needed for the simple comparative trial. When the response to treatment is measured on a graded (ordinal) scale, the sample size tables can still be used if the investigator is willing to temporarily ignore the graded scale and think in terms of yes or no responses (quantal scale). For example, if the principal focus of interest is on whether a treatment produces a good or excellent response as compared to a fair or poor response, it would be logical to think of "good" or "excellent" as a yes response and "fair" or "poor" as a no response and to use the sample size tables accordingly. This would in no way preclude a more complete analysis of the data once the trial had been performed. The choice of an appropriate sample size is only one aspect of the design of a good clinical trial. Bias in the allocation of patients to treatment groups may reduce or negate the value of a trial; a formal system of randomization will help eliminate this problem.


However, randomization does not provide assurance that the treatment groups will be balanced in terms of such factors as severity of disease, location of lesions, and the like. Balance can be achieved in a randomized design by blocking and other intricacies of design. If other than a simple comparative or crossover study is envisaged, much time and effort can be saved by consulting a statistician before the start of a trial. Sample size and relative difference tables are just as applicable to comparing the incidence of side effects of treatment or the performance of diagnostic tests as they are to comparing the beneficial effects of treatment. The role of chance is equally important in all. Judicious use of sample size and relative difference tables will help in understanding and perhaps even overcoming the vagaries of chance.

Acknowledgment

Margaret R. Wrensch, M.S., and Albert M. Kligman, M.D., Ph.D., criticized the manuscript and made numerous suggestions for improvement.

References

1. Cohen, J.: Statistical Power Analysis for the Behavioral Sciences. New York, Academic Press, 1969, p. 54.

2. Fleiss, J. L.: Statistical Methods for Rates and Proportions. New York, John Wiley and Sons, 1973, pp. 36, 69, 70, 112-124.

3. Maxwell, A. E.: Analysing Qualitative Data. London, Methuen, 1961, pp. 73-83.

4. Armitage, P.: Statistical Methods in Medical Research. Oxford, Blackwell, 1971, pp. 369-373.
