Error Rate, Precision, and Accuracy in Immunohematology

A. J. GRINDON AND P. L. ESKA

From the Department of Pathology (Laboratory Medicine), Johns Hopkins Hospital, Baltimore, Maryland.

Received for publication June 17, 1976; accepted October 7, 1976.

Quality control in the blood bank has traditionally been concentrated in the areas of reagents, equipment, and components. We have found that it can be extended to the measurement of error rate, accuracy, and reproducibility as well. We propose the use of a "correction rate" as a correlate of the actual error rate, since errors in final ABO interpretations are infrequent and difficult to accumulate in meaningful numbers. We have measured accuracy as the frequency of false positive and false negative results, using weakly active antibodies both in the test situation and in actual practice. Finally, reproducibility can be measured with a series of coded duplicate samples covering the range of immunohematologic reactivity. By using these methods, a laboratory can regularly measure error rate, accuracy, and reproducibility as part of its quality control program.

Quality control in clinical chemistry has been based upon concepts of precision and accuracy. In the blood bank, however, quality control has been directed more toward the adequacy of products, reagents, and equipment than toward ongoing assessment of technical proficiency. This tendency may reflect the difficulty of measuring accuracy and precision in immunohematologic reactions, or the belief that the rarity of hemolytic transfusion reactions makes such measurement unnecessary. Although not often quantitated, technical errors do occur in blood banks, and at rates much higher than is commonly believed if all errors are included. To establish what constitutes an "acceptable" error rate, the type of error must be considered. For example, failing to find a weak antibody in a unit of blood is less serious than mislabeling as "Group O" a unit of blood that is truly Group A, since such a unit might be released in an emergency without crossmatch.

Similarly, the acceptable error rate for a crossmatch will depend on the circumstances. For instance, missing weak agglutination at room temperature might not have serious consequences, since antibodies causing agglutination of this type will generally not cause immediate massive destruction of transfused red blood cells. It would be more serious to miss strong agglutination in the antihuman globulin phase, since antibodies active in this phase may cause acute hemolysis of transfused cells. Even here, however, more errors could be tolerated than with ABO grouping. Most attempts to evaluate error rates have focused on errors in ABO grouping (Table 1). Analysis of the military experience during World War II showed that actual field errors in ABO grouping were as high as 14.3 per cent,5 or 143 per thousand. More recent data based upon proficiency test survey results (presumably well checked for potential clerical error before reporting) have shown net ABO error rates as high as 90 per thousand.9 When one searches beyond reported error rates in ABO interpretation for an assessment of precision and accuracy in the performance of graded immunohematologic reactions, data are lacking. We present here methods developed to estimate error rates in ABO and Rh testing, and to assess accuracy and reproducibility in graded immunohematologic reactions.

Methods and Results

Blood Group Errors




Table 1. Published Net ABO Errors/1,000 Tests

                              Reference    Errors/1,000 Tests
  Military, 1944                  14               88
  Military, 1944                   5              143
  CAP survey, 1966                 9               90
  CAP survey, 1969                 6               16
  CAP survey, 1971                 8                1.7
  BOB licensed banks, 1972        10                0.6
  CAP survey, 1973                13                7
  CDC proficiency,* 1973           4               37
  CDC proficiency, 1974            4               25
  CDC proficiency, 1975            4               12

  *CDC proficiency rates were recalculated as the mean annual error rate for ABO testing, excluding as errors a failure to subgroup.

All testing described here was performed by a staff of registered medical technologists, each of whom had had at least six months' experience in our blood bank, which transfuses 15,000 units of whole blood and red blood cells annually. All immunohematologic tests are performed according to the Technical Methods and Procedures of the American Association of Blood Banks,2 which specifies the recording of results as they are read. If an error is discovered, the technologist must circle the incorrect recording and label it as an error, rather than write over it. A part of our quality control procedure each morning is the review of the previous day's results of ABO and Rh determinations by a senior technologist. During a one-month period, the senior technologist reviewing these records listed every error found on the original reports. Only rarely were errors detected for the first time by the senior technologist. Since all results are recorded in ink at the time of performing the test, it was frequently found that a technologist would make a "slip of the pen" and would immediately realize that an error in recording had been made. Less frequently, a technologist doing a cell or serum ABO group would obtain an incorrect result or an incorrect interpretation, but such an error would be detected only after comparison with the results of another technologist. The initially incorrect result would be labeled as an error and the correct results entered below it after retesting. The analysis of these results the next day, therefore, usually consisted of a tabulation of errors previously noted and corrected.

Table 2. Types of Corrections

                            Number    Per Cent
  Result changed               23         44
  Interpretation changed       14         27
  Both changed                  9         17
  Others                        6         12
  Totals                       52        100


During a one-month period, 52 "corrections" were noted in the performance of 1,411 ABO and Rh determinations. Each ABO and Rh determination, however, represented 12 chances for error. The ABO cell group (with the addition of anti-A,B) and the Rh type (with albumin control) represented five chances for an erroneous result to be recorded and one possibility for an erroneous interpretation of the results. Our system involves the repetition of the Rh type and albumin control at the time the serum grouping is being performed, so this procedure offers four chances for an error in results and one additional chance for an error in interpretation. Finally, it is possible for an error to be made in comparing the results obtained on the direct and reverse groupings, for a total of 12 chances for error. If each one of the 1,411 ABO and Rh determinations had a potential for 12 errors, the total number of potential errors was 16,932. The actual number of "corrections" found was 52, or three per thousand. The corrections could be divided into four types (Table 2). The most frequent type of correction was the change of one of the hemagglutination grading results, which occurred 23 times, or 44 per cent of the total. These corrections most likely represented "slip of the pen" errors that were immediately corrected, since the interpretation recorded originally was not changed (Fig. 1, Example A). The next most frequent type of correction was the change of interpretation, occurring 14 times, or 27 per cent of the total. In this situation, the hemagglutination grading result was correctly recorded, and while some of the interpretations may have been true errors detected only when cell and serum grouping results were compared, many of these also may represent "slip of the pen" errors (Fig. 1, Example B). The nine instances where both result and interpretation were changed, and the "others," probably represent the real errors that were not detected until after comparison with the other worker's interpretation. Since no answer can be erased or obliterated, and "slips of the pen" are apparently frequent, the detected "error" rate may be only half of the detected "correction" rate of three per thousand in the performance and interpretation of ABO and Rh determinations. Even if the error rate for each step of this procedure were 3 per 1,000, the chance that two different people performing the same procedure will make an error of the same type is much more rare (9 per 1 million, or roughly 1 per 100,000), since the chance of simultaneous occurrence of two rare events is the product of the chances of those events.16 Furthermore, if we can assume that the chance of missing an ABO incompatibility in a major crossmatch will be of the same magnitude, the net chance of a patient's receiving ABO-incompatible blood because of a series of errors occurring in the laboratory becomes infinitesimally small.
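The arithmetic behind these figures can be written out explicitly, assuming (as above) that the errors of two workers are independent:

```latex
\frac{52 \text{ corrections}}{1{,}411 \times 12 \text{ chances}} = \frac{52}{16{,}932} \approx \frac{3}{1{,}000},
\qquad
P(\text{both workers err}) \approx \left(\frac{3}{1{,}000}\right)^{2} = \frac{9}{1{,}000{,}000} \approx \frac{1}{100{,}000}.
```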




FIG. 1. Examples of errors in (A) recording results and (B) recording interpretation.

Accuracy of Reading Hemagglutination

Test. Accuracy can be determined as the percentage of false positive or false negative results. To determine accuracy in reading hemagglutination, we prepared dilutions of anti-D which gave weak to 2+ reactions with the addition of antihuman globulin serum (Table 3). Each dilution (along with a number of sera containing no antibody) was divided into two separate samples. Each of the samples was labeled with only a code number or letter. These coded samples were sent out in batches of 20 samples (representing ten different sera) to be tested by a given technologist for the presence of antibody activity. The results are shown in Table 4. Of 501 negative tests, 12, or 2.4 per cent, were incorrectly called positive. Of 414 positive tests, 16, or 3.9 per cent, were incorrectly called negative. All 16 of the false negative tests were in the group of samples which gave true weak reactions, rather than stronger reactions. Further, of the 16 false negative determinations, one technologist was responsible for four, and another for three.
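Expressed as rates, these accuracy figures (also listed in Table 4) are simply the fraction of truly negative samples called positive and the fraction of truly positive samples called negative:

```latex
\text{false positive rate} = \frac{12}{501} \approx 2.4\%,
\qquad
\text{false negative rate} = \frac{16}{414} \approx 3.9\%.
```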

Practice. We compared this test false positive and false negative rate with the rates obtained in actual practice. All antibodies detected in the main laboratory are regularly referred to the reference laboratory for further evaluation. As part of this evaluation, the reference laboratory repeats the initial antibody detection test. The number of antibody detection tests called positive in the main laboratory and then found to be negative in the reference laboratory can be considered an index of the number of false positive determinations actually found in practice. Over the course of a 12-month period, there were 20,205 negative antibody detection tests. One hundred thirteen were called positive when tested in the routine laboratory, but were found to be negative upon repeat testing in the reference laboratory; this represents a false positive rate of 0.6 per cent. We also measured the false negative rate that occurred in actual practice. This was accomplished by having an experienced technologist repeat 481 sequentially tested antibody screens which had been performed in the main laboratory. Of 22 true positives, two had been originally called negative, for a false negative frequency of 9 per cent. Both positive samples missed in the main laboratory had weak activity at room temperature only. This figure of 9 per cent is relatively imprecise because of the small number of true positives in the sample, and because minor temperature variability between the main laboratory and the reference area may account for the variable detection of two antibodies weakly active at room temperature.

Reproducibility of Graded Immunohematologic Reactions

We next assessed the reproducibility of serologic results. For this evaluation, sera containing antibody activity (typically anti-D) were diluted to give antiglobulin reactions ranging from weak to 4+.2 Sera containing no antibody activity were added to make up 25 per cent of the total. Each dilution was divided into two samples, as for the previous study, and then distributed as coded duplicates. Each set of coded specimens delivered to a technologist contained 20 samples which represented duplicates of ten different serum dilutions. Table 5 shows typical data obtained. The left-hand column shows the code numbers for the paired duplicate samples, and across the top are numbers indicating the individual technologists.

Table 3. Sera Used for Determination of Accuracy

                         Number of Samples    Strength of Agglutination
  Dilutions of anti-D           10                      2+
                                20                      1+
                                 5                      Weak
  Normal sera                   45                      Negative



Table 4. Determination of Accuracy

                                              True Negative    Incorrectly Called Positive    % False Positive
  False positives, Test                             501                    12                       2.4
  False positives, Practice                      20,205                   113                       0.6

                                              True Positive    Incorrectly Called Negative    % False Negative
  False negatives, Test                             414                    16                       3.9
  False negatives, Practice (481 samples tested)     22                     2                       9.1

A technologist will frequently call duplicates of the same sample 1+ and 2+. Less frequently, a technologist will call the same sample 2+ and 4+. Table 6 shows the results of testing 288 samples. In 4.9 per cent, a sample varied by more than one grade from its duplicate; that is, the sample would be called 1+ and its duplicate, which had been tested simultaneously, called 3+. Duplicate variation of one grade occurred in 25.7 per cent; that is, a technologist would obtain 2+ on one sample and 3+ on its duplicate. Since the reactivity of the sample could truly lie between 2+ and 3+, and the sample could reasonably be called either of two different degrees of agglutination, we decided to quantitate reproducibility in terms of how many results varied by more than one grade from the mean result obtained by all technologists for a given sample. That is, if a sample were found to range between 2+ and 3+ in reactivity by most technologists, how many times would the sample be called 1+ or 4+? This variation was found 1.7 per cent of the time.
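The tally described in this paragraph can be illustrated with a short computational sketch. The data layout, the numeric coding of grades (0 = negative, 0.5 = weak, 1 through 4 = 1+ through 4+), and all values below are hypothetical; they are not taken from the study and serve only to show how the three measures reported in Table 6 could be computed.

```python
from statistics import mean

# Hypothetical readings: {sample_code: {technologist: (grade of sample, grade of its coded duplicate)}}.
# Grades are coded numerically: 0 = negative, 0.5 = weak, 1-4 = 1+ to 4+.
results = {
    "A": {1: (2, 3), 2: (2, 2), 3: (3, 4), 4: (2, 3), 5: (1, 3), 6: (2, 2)},
    "B": {1: (1, 1), 2: (1, 2), 3: (0.5, 1), 4: (1, 1), 5: (1, 1), 6: (2, 1)},
}

# Flatten to one record per technologist per duplicate pair.
pairs = [(g1, g2) for by_tech in results.values() for (g1, g2) in by_tech.values()]
n = len(pairs)

gt_one = sum(abs(g1 - g2) > 1 for g1, g2 in pairs) / n    # duplicates differing by more than one grade
one = sum(abs(g1 - g2) == 1 for g1, g2 in pairs) / n      # duplicates differing by exactly one grade

# Readings differing by more than one grade from the mean grade
# obtained by all technologists on the same serum dilution.
outliers, readings = 0, 0
for by_tech in results.values():
    grades = [g for pair in by_tech.values() for g in pair]
    m = mean(grades)
    outliers += sum(abs(g - m) > 1 for g in grades)
    readings += len(grades)

print(f"duplicate grade variation > 1: {gt_one:.1%}")
print(f"duplicate grade variation of 1: {one:.1%}")
print(f"grade variation from mean > 1: {outliers / readings:.1%}")
```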

Table 5. Typical Reproducibility Assessment*

  [Grid of graded agglutination readings (0 to 4+) for each coded duplicate sample pair, as recorded by technologists 1 through 6.]

  *At least 10 sera as 20 coded duplicate samples tested in one batch.

Table 6. Assessment of Reproducibility

  Duplicate grade variation > 1       4.9%
  Duplicate grade variation of 1     25.7%
  Grade variation from mean > 1       1.7%

Discussion

Blood bankers have instinctively decided what tests are most important, and have performed those tests two or sometimes three times. Because the determination of ABO group is probably the most important of all pretransfusion tests, it is usually performed by testing both cells and serum and comparing the results. In larger facilities, these tests may be performed independently by two different workers to provide the greatest possible accuracy. Furthermore, crossmatching will detect major ABO incompatibilities, providing yet a third independent check of this most important determination. If an error in ABO grouping is made once in 1,000 determinations, and a net error rate of that magnitude is felt to be unacceptable, it would be important to have a second person perform the same test independently and then have the results compared. If the error rate remains constant under these conditions, the net error rate becomes one per million.16 Net ABO errors in proficiency test samples in the last ten years have ranged from 0.6 to 90 per 1,000 tests. Lacerte10 found two units with incorrect ABO labels in 3,322 units of blood sent to the Bureau of Biologics from licensed blood banks. Our results suggest a rate of three detected "corrections" per thousand chances for error. With two independent workers doing the same kind of testing, even with a detected error rate of this magnitude, the derived net error rate should be in the range of 0.01 per 1,000 "tests" (donor ABO and Rh). With Rh typing, the net results are not quite as good. Although CAP Proficiency Testing showed a range of error rates of five to 16 per 1,000 samples, the licensed blood banks monitored by the Bureau of Biologics had a net Rh error rate in 1972 of 15 per 1,000 units of blood submitted.10 Net errors in smaller institutions may occur so infrequently that they are difficult to measure reliably. Such institutions could, however, measure correction rates. It is important to note that if the net (detected) error rate for a test is one per 1,000, and the test is performed by two workers independently, the detected correction rate must be about one in 32.16
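A sketch of the arithmetic behind this last statement, on the simplifying assumption that each of the two workers errs independently at the same rate p and that a net (undetected) error requires both to err:

```latex
p^{2} = \frac{1}{1{,}000}
\quad\Longrightarrow\quad
p = \sqrt{\tfrac{1}{1{,}000}} \approx \frac{1}{32},
```

so a net error rate of one per 1,000 tests implies that each worker's individual errors, which surface as corrections when the two results are compared, occur at roughly one in 32 tests.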



Net errors should be relatively infrequent with ABO and Rh determinations, where the reactions are usually negative or strongly positive. Published error rates with antibody detection tests, where agglutination can be weakly positive, are understandably higher. Some proficiency testing surveys have evaluated the ability of laboratories to detect antibodies. Five per cent of all laboratories participating in one proficiency survey in 1969 missed strong antibodies.6 The ability to detect strong antibodies active in the antiglobulin phase had improved by 1973, so that the participants in the same survey program had a false negative rate of 0.5 per cent.13 Antibodies that are only weakly active are more difficult to detect. An anti-Fya active only weakly in the antiglobulin phase was distributed to laboratories in a CDC proficiency survey4 and was missed by 27 per cent. Weak cold antibodies are more difficult still. In a 1973 CAP survey, a weakly reactive anti-P1 antibody was not detected by 28 per cent of laboratories. Similarly, an anti-P1 distributed by the CDC was missed by 61 per cent of laboratories participating in its survey.4 In testing an antibody diluted to give weaker reactions in the antiglobulin phase, we found that 2.4 per cent of our samples were falsely called positive. In practice, only 0.6 per cent were false positive. We attribute this difference to the overreading often seen, even with skilled technologists, in a test situation. Of greater concern, we found that 4 per cent of our samples were falsely called negative. In actual practice, however, our false negative rate may be as high as 9 per cent. This 9 per cent figure is based upon two samples with weak antibodies active only at room temperature that were missed in the crossmatching laboratory.


In a test situation where half the samples tested are positive, one would expect the rate of false negatives to be lower than in a clinical situation where positive reactions are infrequent. In our facility, only 5 to 8 per cent of all antibody screens are positive, and this low percentage of positive results may make consistent detection of weak antibodies more difficult. Nevertheless, the fact that false negative determinations were as high as 4 per cent even in the test situation is a cause for concern. Review of the technique, particularly when one or a few technologists are responsible, may result in subsequent improvement. The first and most important step, however, is recognition of the extent of the problem.

Reproducibility is more difficult to measure than accuracy. Some laboratories have displayed pictures of hemagglutination reactions to improve standardization in grading. Unfortunately, the final grading is based not only on the hemagglutination pattern seen, but also upon the vigor used to resuspend the button of centrifuged red blood cells. Allan and co-workers1 have developed an automatic shaker to provide uniform resuspension of cell buttons. Although this machine has not been widely adopted, it provides a dramatic illustration that resuspension is an important variable. In our institution, a technologist's results on coded duplicate samples would differ by more than one grade almost five per cent of the time. Performance of quality control testing of this type will reveal those technologists who are inconsistent in their reading, or who consistently overread or underread compared with the rest of the technologists in the laboratory. While accuracy is generally more important, high precision is often helpful, for instance, in antibody identification.



Although not related to immunohematologic accuracy and precision, other errors outside the laboratory may have equally dangerous consequences in transfusion practice. In many university hospitals, crossmatch samples are drawn and transfusions started by interns and residents, who are concerned with many aspects of patient care and are often sleep-deprived. Patient samples in any hospital may be received that are subsequently discovered to contain blood from a patient other than the one indicated on the label. In one study,3 0.6 per cent of mother-newborn sample pairs had been exchanged and consequently mislabeled. In addition, correctly prepared and labeled units of blood may be transfused to the wrong patient. Errors of this type could be minimized by having a transfusion team draw crossmatch samples and start transfusions. Even with such a team, dangerous errors may still occur. Koepke7 has shown that there is a 4.6 per cent error rate merely in copying an 11-digit number. Many other workers have commented on the dangers and the frequency of clerical errors in the clinical laboratory.11,15,17 Particularly disturbing are errors in the distribution of blood. After all the work of obtaining the sample and preparing the proper unit of blood for the patient has been completed, units may be handed to a messenger for distribution to the wrong patient as often as once per hundred units.15 It is therefore important to recognize that laboratory quality control is only a part of safe and effective transfusion practice. Traditional immunohematology laboratory quality control has focused on products, reagents, and equipment, rather than on precision and accuracy in test performance by blood bank personnel. Reproducibility and accuracy can be improved only when they are measured regularly, and even then the need for improvement may not be recognized unless actively sought.


Acknowledgments

The authors would like to acknowledge the expert technical assistance of Sandra Ah, MT(ASCP)SBB.

References

1. Allan, T. E., F. R. Camp, and C. E. Shields: Newly Designed Equipment for the Blood Bank. Program, 23rd Annual Meeting, AABB, San Francisco, 1970.
2. American Association of Blood Banks: Technical Methods and Procedures of the American Association of Blood Banks, 6th ed. Washington, D.C., 1974.
3. Brocteur, J., A. Andre, M. Otto-Servais, J. C. Bouilleene, and E. Nicolas: Utilisation d'un serum anti-I dans la serologie de routine de l'isoimmunisation foeto-maternelle. Proc. 10th Congr. Eur. Soc. Haematol., Strasbourg, 1965. (Cited in reference 12.)
4. Center for Disease Control: Immunohematology Summary Analyses-Proficiency Testing. USDHEW, PHS, Atlanta. Nos. 1 through 4 for 1973, 1974, and 1975.
5. Coates, J. B.: Blood Program in World War II. Washington, D.C., 1964.
6. Koepke, J. A.: Immunohematology proficiency testing, 1966-1969. Am. J. Clin. Pathol. 54:508, 1970.
7. -: Clerical errors in surveys. Bull. Coll. Am. Pathol., June 1971, p. 193.
8. -, and D. V. Cicchetti: Laboratory proficiency in blood banking: The variability of blood bank reagents-1971 (a follow-up study). Transfusion 13:41, 1973.
9. -, and R. J. Eilers: A survey of clinical laboratory immunohematology performance. Transfusion 7:316, 1967.
10. Lacerte, J. M., V. M. Kane, and P. Cohen: U.S. licensed blood banks. Evaluation of whole blood and its components. Transfusion 12:339, 1972.
11. McSwiney, R. R., and D. A. Woodrow: Types of error within a clinical laboratory. J. Med. Lab. Technol. 26:340, 1969.
12. Myhre, B. A.: Quality Control in Blood Banking. New York, John Wiley and Sons, 1974.
13. -, and J. A. Koepke: The College of American Pathologists comprehensive blood bank survey program, 1973. Am. J. Clin. Pathol. 63:995, 1975.
14. Richmond, A. M., F. W. Chorpenning, J. W. Moose, and G. B. Edmonson: Blood grouping discrepancies. U.S. Armed Forces Med. J. 69350, 1955.
15. Schmidt, P. J., and S. V. Kevy: Sources of error in a hospital blood bank. Transfusion 3:198, 1963.
16. Stratton, F., and P. H. Renton: Practical Blood Grouping. Oxford, Blackwell Scientific, 1958, p. 148.
17. Taswell, H. F., A. M. Smith, M. A. Sweatt, and K. J. Pfaff: Quality control in the blood bank-A new approach. Am. J. Clin. Pathol. 62:491, 1974.

Alfred J. Grindon, M.D., Director, Atlanta Regional Red Cross Blood Program, 1925 Monroe Drive N.E., Atlanta, Georgia 30324.

P. L. Eska, MT(ASCP)SBB, Chief Technologist, Blood Bank, Johns Hopkins Hospital, 601 N. Broadway, Baltimore, Maryland 21205.
