Journal of Hospital Infection (1991) 18, 255-260

LEADING ARTICLE

Infection and failure rates: is my infection (failure) rate too high?; is this (simplified) new procedure as good?

O. M. Lidwell

Rosemary Cottage, Tarrant Monkton, Blandford Forum, Dorset DT11 8RY

Accepted for publication 14 January 1991

Summary: Some tables and methods are presented for the ready evaluation of data when incidence rates are low and the observed numbers small. The difficulty inherent in confirming the equivalence of a changed regimen is discussed.

Keywords: Statistics; small numbers; changes in procedure.

Introduction

The success of medical techniques and of methods for the control of infection has resulted in a very low incidence of failure for many major procedures. A good example is total joint replacement, where postoperative sepsis rates were often as high as 10% or more but where figures below 1% are now regularly reported. However, the consequences of failure after this procedure, among others, are such that there is compelling reason to ensure that these low rates are maintained or reduced still further. With the small number of failures to be expected, even from large series of procedures, the statistical problems of confirming that the outcome is within the limits of established good practice, or even better, and that any changes in treatment have had no untoward consequences, are formidable. The methods appropriate for this are well established but are not always accessible or intelligible to clinical personnel, nor available in an easily utilizable form. In particular, the more generally used formulae are only strictly applicable when the number of events being considered is large enough for the probability distribution to be assumed to follow the arithmetic normal. In the situations considered here this is by no means always the case.

Monitoring

0195-6701/91/080255+06 © 1991 The Hospital Infection Society

The first area of concern is the monitoring of the performance of a particular technique, either by reference to an arbitrary standard or, more likely, by comparison with the results reported from established centres of excellence. Either way a target figure, e.g. a percentage incidence, is determined. Applying this to the number of procedures undertaken over a given period leads to an 'expected' number of failures which can be compared with the number actually observed. Since we are, by definition, considering incidence rates of no more than a few percent, the probability distribution of the number of events which might arise from the 'expected' number closely approximates to the Poisson distribution. Table I gives the limits to the 'expected' number above or below which the observed number has no more than a 10% or a 5% probability of arising by chance. The method of use of the table is most easily shown by an example. A reasonable standard for the incidence of sepsis in the joint

Table I. Limiting values of the Poisson expectation for 10% and 5% probability that an observed number of events is higher or lower than that to be expected for a 'standard' incidence

Observed     Higher than probable if        Fewer than probable if
number of    expectation is less than:      expectation is greater than:
events        (1) 10%      (2) 5%            (1) 10%      (2) 5%
 0              -            -                 2.4          3.0
 1             0.1          0.05               3.9          4.8
 2             0.5          0.3                5.4          6.4
 3             1.1          0.8                6.8          7.8
 4             1.7          1.4                8.0          9.2
 5             2.4          2.0                9.2         10.6
 6             3.2          2.7               10.5         11.8
 7             3.9          3.3               11.8         13.2
 8             4.7          4.0               13.0         14.5
 9             5.4          4.7               14.2         15.7
10             6.2          5.4               15.4         17.0
11             7.0          6.2               16.6         18.2
12             7.8          6.9               17.7         19.4
13             8.6          7.7               19.0         20.6
14             9.5          8.4               20.1         21.8
15            10.3          9.3               21.3         23.1
16            11.2         10.1               22.5         24.3
17            12.0         10.9               23.6         25.4
18            12.8         11.7               24.8         26.6
19            13.6         12.5               25.9         27.9
20            14.5         13.3               27.0         29.1

There is no more than a 1 in 10 probability that the observed number of events is either higher or lower than the assumed standard, as indicated by the figures in columns (1), and a 1 in 20 probability in respect of those in columns (2). The figures in the table have been obtained by computing the cumulative probabilities of the Poisson distribution up to the (k + 1)th term for integral values of m, the expectation, i.e.:

    sum from x = 0 to k of (m^x / x!) e^(-m)

These have then been plotted on probability paper against m for k = 0 up to k = 20 and the values of m corresponding to the 5% and 10% tails read off.


following operation for total replacement might be taken as 0.5% over two years. Five cases of sepsis have been observed after 650 operations. This can be compared with the expectation of 3.25 (650 x 0.005). From the table it can be seen that 5 cases are not more than the number which might reasonably be anticipated even at the 10% level, since the 'expected' number, 3.25, is greater than 2.4, the value of the expectation above which 5 or more would occur on one occasion in 10. Had the observed number been 7, as many as this would have been expected on fewer than one occasion in 10, since 3.25 is less than 3.9, and indeed would have been expected on fewer than one occasion in 20, since 3.25 is just less than 3.3.

Similar comparisons can be made, using the right-hand columns, to assess whether the observed experience might represent an improvement. It is clear that this is unlikely unless the expected value is higher than in the example above. For example, had the assumed target rate for the procedure been 1.5% the expected number would have been 9.75 (650 x 0.015). Five or fewer cases would then have been expected more rarely than one occasion in 10, since 9.75 is greater than 9.2, but more frequently than once in 20 occasions, since 9.75 is less than 10.6. Had only 4 cases of sepsis been observed, then a number as low as this, or lower, would have been expected less often than once in 20 occasions, since 9.75 is greater than 9.2 (in the right-hand 5% column). A one in 10 probability of differing from the assumed standard should only be taken as a warning signal or a mild encouragement, and even less than one in 20, the conventional 5% level, may not be altogether conclusive.
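The comparisons above, and the tabulated limiting expectations, can be reproduced with a short script. The following is a minimal sketch (the function names are ours, not the paper's); it evaluates the Poisson tail probabilities for the worked example and recovers two entries of Table I by bisection:

```python
from math import exp, factorial

def poisson_cdf(k, m):
    """P(X <= k) for X ~ Poisson(m)."""
    return sum(m**x / factorial(x) * exp(-m) for x in range(k + 1))

def tail(k, m):
    """P(X >= k): the chance of observing k or more events."""
    return 1.0 - poisson_cdf(k - 1, m)

def limiting_expectation(k, p):
    """Expectation m at which observing k or more events has probability p,
    found by bisection (the tail probability grows with m)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if tail(k, mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Worked example: 650 operations at a 0.5% standard, expectation 3.25
m = 650 * 0.005
print(tail(5, m) > 0.10)   # True: 5 cases unremarkable even at the 10% level
print(tail(7, m) < 0.05)   # True: 7 cases would be rare at the 5% level

# Reproducing two entries of Table I
print(round(limiting_expectation(5, 0.10), 1))  # 2.4
print(round(limiting_expectation(7, 0.05), 1))  # 3.3
```

The bisection replaces the paper's graphical interpolation on probability paper, but gives the same limiting values to one decimal place.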

Procedural changes

Prospective studies

Assessment of the acceptability, or otherwise, of a procedural change, e.g. a simplified course of antibiotic prophylaxis, in terms of the consequences for the patients, requires comparison of the results of a series of trials under the proposed new regimen with those observed under the current system. Before a prospective study is undertaken it is desirable to know the chance of obtaining a useful result; e.g. if the two regimens are indeed equivalent, what are the likely limits of the observed difference (or apparent incidence ratio) between them, and what are the likely confidence limits of this? Figure 1 shows both of these as a function of the number of events (infections or failures) expected, on the basis of previous experience, in each of the two equal groups; i.e. the expected incidence rate multiplied by the number of trials. It is apparent that there are severe practical limitations on the extent to which the null hypothesis can be expected to be validated. For example, to have a 90% probability that the upper 95% limit (two-tail test)




[Figure 1: curves A, B and C plotted against the expected number of events (logarithmic scale, 1-100)]

Figure 1. The upper limits of the probability ranges for the observed incidence ratio and its 95% confidence limits, in relation to the expected number of events in each of two equal groups of observations. A. There is a 5% probability that the observed incidence ratio will be equal to or greater than these values, in either direction. B. There is a 10% probability that the upper limit of the 95% probability range of the observed incidence ratio will be equal to or greater than these values, in either direction. C. There is a 5% probability that the upper limit of the 95% probability range of the observed incidence ratio will be equal to or greater than these values, in either direction. For each of a series of values of n, the number of events expected in each of two equal groups, an 11 × 11 probability matrix was constructed, taking the variance as n, and defining the limits of each cell by n ± a√n, where a = 0.25; 0.75; 1.25; 1.75; 2.25; the outermost cells being open-ended. Cumulative probability distributions were then compiled from the values in these 121 cells for (1) the apparent incidence ratio and (2) the 95% probability range of this ratio. From these distributions the limits defining the 10% and 5% tails were read off. This was done for values of n of 4, 9, 16, 25 and 100. The probability distributions for the two smallest numbers, 4 and 9, were adjusted to allow for the departure from an arithmetic normal distribution with these small numbers. The 95% probability ranges for a particular cell were calculated using the logit transformation, except when the numbers were very small, when values from Table II were employed.

of the observed incidence ratio will not exceed 2, the expected number of events in each group must be at least 12 (curve A). At an expected incidence of 1% this means that the planned number of observations needs to be at least 1200 in each group. With this number of events expected in each group there would, however, be a 10% chance that the outcome would produce a 95% upper limit for the incidence ratio of four or more (curve B); to reduce this upper limit to 2 would require an expectation of some 50 events in each group.

Subsequent assessment

When two series of observations have been completed then the confidence limits of the observed incidence ratio, or difference, can be evaluated. A convenient approximation is given by employing the logit transformation.¹

Table II. The 95% probability range for the ratio, r, of two small numbers of observed events

              n2
n1      0         1          2         3          4         5         6
1     0.03-∞   0.013-78
2     0.2-∞    0.11-120   0.08-13
3     0.4-∞    0.25-160   0.18-16   0.14-7.2
4     0.7-∞    0.41-190   0.31-21   0.24-9.1   0.19-5.2
5     1.0-∞    0.57-230   0.42-27   0.33-10.5  0.27-5.7   0.24-4.2
6     1.5-∞    0.76-270   0.56-33   0.45-11.9  0.38-6.0   0.32-4.4   0.28-3.5

If P1 is the probability of an event occurring in one series of observations and P2 that in another comparable series, and the numbers of events observed are n1 and n2, n1 ≥ n2, then the table gives the probability range of r = P1/P2, the number of trials in each group being large in relation to n1 or n2. These limits diverge progressively from those calculated by the logit transformation as n1 or n2 become smaller. They are about 15% more extreme at n = 6, rising to nearly 100% at n = 2. This applies even if n1 is greater than 6: although the lower limit of r will then be close to that given by the logit method, the upper limit will be about 15% higher than the logit figure for n2 = 6, 20% for n2 = 5, 30% for n2 = 4, 50% for n2 = 3, 100% for n2 = 2 and around five times the logit value for n2 = 1. If n2 = 0 the upper limit is infinite. When n1 < n2 the limits for P1/P2, the blank upper right-hand section of the table, are the reciprocals of those given for the corresponding values of n2 and n1. The data given by Thomas & Gart² are for binomial distributions but, generally, the values extracted for the compilation of Table II involve only low incidence rates and are, therefore, close to the limiting values. The derivation of the complete 6 × 6 table has involved some interpolation.
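The divergence described in the footnote is easy to check numerically. A minimal sketch (the helper name is ours) of the logit approximation for the 95% limits of the incidence ratio, assuming equal numbers of trials in the two groups, compared against the exact Table II entry for n1 = n2 = 6:

```python
from math import exp, log, sqrt

def ratio_limits(n1, n2, a=1.96):
    """Approximate 95% limits for the incidence ratio r = n1/n2 via the
    logit transformation: var(ln r) is taken as 1/n1 + 1/n2."""
    r = n1 / n2
    half_width = a * sqrt(1.0 / n1 + 1.0 / n2)
    return exp(log(r) - half_width), exp(log(r) + half_width)

low, high = ratio_limits(6, 6)
print(round(low, 2), round(high, 2))  # 0.32 3.1
```

The logit range 0.32-3.1 is indeed about 13-15% narrower at each end than the exact 0.28-3.5 given in Table II, in line with the footnote's statement for n = 6.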

If n1 and n2 are the numbers of events recorded in the two series, the actual incidence rates being low, and r is the apparent incidence ratio, then the variance of ln r approximates to 1/n1 + 1/n2. The confidence limits of ln r are then given by ln r ± a√(1/n1 + 1/n2), where a = 1.64 for a single-tail test and 1.96 for a two-tail test, at the 0.05 level; taking exponentials gives the limits for r. Both the logit transformation and the application of an arithmetic normal distribution, with a variance of (n1 + n2), to the observed numbers give results which depart progressively from the true values as the numbers n1 and n2 become smaller. In the case of the logit transformation the deviation exceeds 10% when either is as few as six or less. A set of exact values for the confidence limits, when small numbers are involved, has been given by Thomas & Gart.² Table II has been compiled from these. Again it is apparent that, with small numbers, the range of uncertainty is very wide, even if the actual observed numbers are identical in the two series.

It has, from time to time, been claimed that the null hypothesis is confirmed if the results show no significant difference between the incidence rates in the two series.³ The fallacy in this becomes apparent when we consider that the smaller the study the less likely it is to show a significant difference, even if one exists.

Discussion

There is no way out of the dilemma inherent in the variability of randomly occurring events when the incidence is low. This poses particularly difficult questions when it is proposed to change procedures in the interest of simplicity or economy. Very often no practicable trial can establish that the new regimen is as good as the one that it supersedes within other than very wide, and surely unacceptable, limits; e.g. that the incidence of infection or failure may be more than twice as great. To establish the kind of equivalence that might be regarded as wholly satisfactory, e.g. no more than, say, a 25% increase in the risk, is beyond the bounds of possibility.

If it is enough to compare the proposed new regimen with past experience then the range of uncertainty can be reduced: possibly large numbers of earlier observations will reduce the variance for the comparison with the new data, which will be accumulated more rapidly. This advantage will, however, be at the expense of uncertainty as to whether it is the change of immediate concern, or some other difference in the current situation, that has contributed to the observed result.

Strict statistical validation of the equivalence, or even improvement, of a modified procedural regimen in situations of the kind considered here is clearly often unlikely to be obtained. Acceptance of the change can then only be based on confirmation that it has not been followed by any very marked worsening of the results, coupled with rational grounds for believing that it should be satisfactory.

References

1. Morris JA, Gardner MJ. Calculating confidence intervals for relative risks (odds ratios) and standardised ratios and rates. Br Med J 1988; 296: 1313-1316.
2. Thomas RG, Gart J. A table of exact confidence limits for differences and ratios of two proportions and their odds ratios. J Am Stat Assoc 1977; 72: 73-76.
3. Lidwell OM. The problem of showing that a changed treatment is no worse than the old. J Hosp Infect 1991; 17: 307-308.
