Clin Chem Lab Med 2015; aop

Opinion Paper

Rainer Haeckel*, Werner Wosniok, Ebrhard Gurr and Burkhard Peil

Permissible limits for uncertainty of measurement in laboratory medicine

DOI 10.1515/cclm-2014-0874
Received September 1, 2014; accepted November 13, 2014

Abstract: The international standard ISO 15189 requires that medical laboratories estimate the uncertainty of their quantitative test results obtained from patients’ specimens. The standard does not specify how, and within which limits, the measurement uncertainty should be determined. The most common concepts for establishing permissible uncertainty limits are either to relate them to biological variation, which defines the rate of false positive results, or to base them on the state-of-the-art. The state-of-the-art is usually derived from data provided by a group of selected medical laboratories. The approach based on biological variation should be preferred because of its transparency and scientific basis. Hitherto, all recommendations were based on a linear relationship between biological and analytical variation, leading to limits which are sometimes too stringent or too permissive for routine testing in laboratory medicine. In contrast, the present proposal is based on a non-linear relationship between biological and analytical variation, leading to more realistic limits. The proposed algorithms can be applied to all measurands and consider any quantity to be assured. The suggested approach tries to provide the above-mentioned details and is a compromise between the biological variation concept, the GUM uncertainty model and the technical state-of-the-art.

*Corresponding author: Rainer Haeckel, Bremer Zentrum für Laboratoriumsmedizin, Klinikum Bremen Mitte, 28305 Bremen, Germany, Phone: +49 412 273448, E-mail: [email protected]
Werner Wosniok: Institut für Statistik, Universität Bremen, Bremen, Germany
Ebrhard Gurr: Abteilung Klinische Chemie und Zentrallabor am Klinikum Links der Weser, Bremen, Germany
Burkhard Peil: Deutsche Akkreditierungsstelle GmbH, Frankfurt, Germany

Keywords: measurement uncertainty; permissible bias; permissible imprecision.

Abbreviations: B, bias; BU, bias combined with uncertainty of bias estimation; GUM, Guide to measurement uncertainty; MU, measurement uncertainty; CVA, analytical coefficient of variation; CVB, biological coefficient of variation; CVC, combined intra- and inter-individual variation; CVE, empirical CVB (linear scale from normal distribution); CVE∗, CVE derived from the ln-scale from non-symmetrical distribution; CVG, inter-individual variation; CVW, intra-individual variation; EQAS, external quality assessment schemes; sA, analytical standard deviation; psA, permissible analytical standard deviation; pCVA, permissible analytical CVA; sE, empirical standard deviation (linear scale from Gaussian distribution); sE,ln, empirical standard deviation (logarithmic scale); RI, reference interval; RL, reference limit; RL1, lower reference limit (RL2.5); RL2, upper reference limit (RL97.5); RMSD, root mean square of measurement deviation; TE, total error; uP, pre-examination uncertainty; uB, bias uncertainty; uS, standard uncertainty; uC, combined uncertainty; puC, permissible combined uncertainty; U, expanded uncertainty; pU, permissible expanded uncertainty; pUEQAS, permissible expanded uncertainty for external quality assessment (proficiency testing).

Introduction

Variation of laboratory results is unavoidable with present technologies. This variation causes uncertainty, which laboratories must estimate and regularly review. Information on the uncertainty must be provided to customers on request [1]. Uncertainty has been defined by the measurement uncertainty approach [2]: standard uncertainty [uS], combined uncertainty [$u_C = (u_1^2 + u_2^2 + \dots + u_n^2)^{0.5}$] and expanded uncertainty [$U = k \cdot u_C$; with a coverage factor $k = 1.96$, the level of confidence is 95%]. Whereas the standard uncertainty,

Brought to you by | Purdue University Libraries Authenticated Download Date | 5/24/15 4:47 AM

2      Haeckel et al.: Permissible uncertainty limits

that is, the imprecision determined in the laboratory (usually as part of the internal quality assurance scheme), is easily understood, the combined uncertainty is a matter of controversy. Because modeling the measurement procedure is complex, the influences of various input quantities have to be considered as possible sources of uncertainty, e.g., incomplete definition of the measurand, sample heterogeneity, uncertainty of calibrators, matrix differences between calibrators and samples, instability of the sample, the measurand or the reagents used, presence of interfering compounds in the sample (lack of specificity), and random variability inherent in the measurement process. According to GUM [2], all components of measurement uncertainty (MU) must be expressed as uncertainties. A large combined uncertainty, however, may not be very valuable in the treatment of an individual patient. The GUM concept allows different approaches to establishing uncertainty components and thus a high degree of flexibility in quantifying measurement uncertainty. Medical laboratories should take responsibility for a minimal combined uncertainty by considering imprecision uS, bias uB and pre-examination errors uP:

$$u_C = (u_S^2 + u_B^2 + u_P^2)^{0.5} \tag{1}$$

Pre-examination errors are partly the responsibility of the clinicians and partly the responsibility of the laboratory. The laboratory should be involved in preparing standard operating procedures dealing with all critical steps and in developing an educational program, which has been proven to improve the quality of the pre-examination phase [3, 4]. Thus, pre-examination errors may be neglected for the daily quality assurance scheme of the laboratory and Equation (1) may be reduced to:

$$u_C = (u_S^2 + u_B^2)^{0.5} \tag{2}$$
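The root-sum-of-squares combination in Equations (1) and (2) can be sketched in a few lines of Python. This is only an illustration: the function name `combined_uncertainty` and the numerical values are assumptions made here, not part of the proposal, and all components are assumed to be expressed in the same unit (either all absolute or all in percent).

```python
import math


def combined_uncertainty(*components: float) -> float:
    """Combine uncertainty components as the square root of the sum of squares.

    All components must share one unit (e.g., all in percent).
    """
    return math.sqrt(sum(u * u for u in components))


# Illustrative values only (hypothetical, not taken from the paper).
u_s, u_b, u_p = 2.0, 1.0, 0.5

u_c_full = combined_uncertainty(u_s, u_b, u_p)  # Equation (1)
u_c_reduced = combined_uncertainty(u_s, u_b)    # Equation (2), uP neglected

print(f"uC with pre-examination term:    {u_c_full:.3f}")
print(f"uC without pre-examination term: {u_c_reduced:.3f}")
```

Because the components add in quadrature, dropping a small term such as uP changes uC only slightly in this example.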

For the purpose of quality assurance, assessment of the standard uncertainty (imprecision) and of bias should be obligatory and regular. The corresponding permissible limits are discussed in the present communication. The proposed algorithm is applicable to all continuous measurands (‘stetige Messgrößen’) determined in medical laboratories. The German guideline lists limits for only a limited number of measurands, but these cover the majority of the workload. For the remaining measurands, it refers to limits published by the manufacturers. However, it is preferable that the limits are derived independently of industry.

Permissible limits for standard uncertainty

The imprecision is the ‘numerical expression of precision’ [5]. According to GUM, imprecision is termed standard uncertainty of a single measurement (uS):

$$u_S = s_A \tag{3}$$

In the RiliBÄK guideline, the term empirical standard deviation is used [6]. The standard uncertainty is usually determined with control materials (commutable with human samples as much as possible) as the day-to-day standard deviation sA (1-month period, about 20 days, intra-laboratory reproducibility, replicate measurements under intermediate precision conditions). The intra-laboratory reproducibility may also be determined as cumulative control limits over a period of 3–6 months as suggested by CLSI [7]. Two general approaches to defining permissible imprecision have been reported in the literature, based either on biological variation or on technical achievability. Benefits and disadvantages of both approaches have been reported elsewhere [8]. The approach based on biological variation should be preferred because of its scientific grounds and its transparency. If the biological variation is multiplied by a fixed factor, the permissible limits are too stringent for measurands with relatively small biological variation and too permissive for measurands with relatively large biological variation [9]. Therefore, several quality classes with different factors were proposed. Fraser [10] was the first to propose three classes (three-level model). He considered a fraction of the biological variation of 0.25 as optimum, 0.5 as desirable and 0.75 as the minimum quality specification. Haeckel and Wosniok suggested five classes [8]. All classifications are rather artificial and indicate that a non-linear relation exists between biological variation and permissible analytical variation. Measurands with a relatively small biological variation require a higher factor than measurands with a large biological variation [8]. Therefore, an algorithm was developed which allows calculation of the permissible analytical coefficient of variation pCVA (permissible relative standard uncertainty) for any measurand.

Biological variation is defined as intra-individual (CVW), inter-individual (CVG) or as a combination of both (CVC). As a surrogate for biological variation, an empirical coefficient of variation (CVE) can be derived from the 95% reference interval (RI) as reported elsewhere [8] and summarized in the Appendix (Equation 18).


Most quantities in laboratory medicine have a skewed distribution. If the true distribution is not known, the best assumption is a logarithmic normal distribution [11]. Therefore, CVE should be derived from the logarithms of the lower (RL1) and upper reference limit (RL2) as explained in the Appendix [Equation (20)]. This CVE∗ is inserted in Equation (4) to obtain the permissible CVA:

$$pCV_A = (CV_E^{*} - 0.25)^{0.5} \tag{4}$$

This equation approximately describes the empirical relation underlying the major part of the presently used permissible imprecision values. Equation (4) describes a curved relation between pCVA and CVE∗. It can easily be adapted to future technical improvements by modifying its parameters. The exponent in Equation (4) may be reduced from 0.5 to, e.g., 0.45 if more stringent permissible limits are desired. Applying Equation (4), it is assumed that CVA is constant and that sA increases linearly over the measurement interval (Figure 1A). Due to detection limits, the linear relation between sA and the measured value xi [12, 13] usually has a positive intercept (Figure 1B). The detection limit is seldom known for matrices occurring in control materials or patients’ samples. Thus, it is assumed that sA at zero quantity approximately corresponds to the standard deviation (sA,RL1) at the lower reference limit (Figure 1C). Then, the connecting straight line between the standard deviation at x = 0 (sA,RL1) and at x = median value of the reference interval in Figure 1C defines the non-constant CVA. The slope a and the intercept b of the connecting straight line are

$$a = \frac{CV_A \cdot 0.01 \cdot Med - CV_A \cdot 0.01 \cdot RL_1}{Med} = \frac{CV_A \cdot 0.01 \cdot (Med - RL_1)}{Med} \tag{5}$$

$$b = s_{A,RL_1} \tag{6}$$

[Figure 1, panels A, B and C. Axes: standard deviation and coefficient of variation versus quantity (arbitrary units).]

Med in Equation (5) is the median of the reference interval, $(RL_1 \cdot RL_2)^{0.5}$. Slope and intercept can be applied to calculate the standard deviation (sA,xi or psA,xi) and the coefficient of variation (CVA,xi or pCVA,xi) for any quantity xi:

$$ps_{A,x_i} = a \cdot x_i + pCV_A \cdot 0.01 \cdot RL_1 \tag{7}$$

$$pCV_{A,x_i} = \frac{(a \cdot x_i + pCV_A \cdot 0.01 \cdot RL_1) \cdot 100}{x_i} = \frac{100 \cdot a \cdot x_i + pCV_A \cdot RL_1}{x_i} \tag{8}$$

pCVA is calculated by Equation 4. [Correction added after online publication on January 23, 2015: In the ahead of print publication, Equations (7) and (8) mistakenly contain the intercept (b) instead of the slope (a)]
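The chain of Equations (4), (5), (7) and (8) may be easier to follow as a short calculation. The following Python sketch is not the authors' Excel program; the function name `permissible_cv_at` is hypothetical, CVE∗ is taken as an input rather than derived from the reference limits, and it is assumed that the CVA appearing in Equation (5) is the permissible value pCVA from Equation (4).

```python
import math


def permissible_cv_at(x_i: float, rl1: float, rl2: float, cve_star: float) -> float:
    """Permissible analytical CV (in %) at quantity x_i, following Eqs (4), (5), (7), (8)."""
    if x_i <= 0:
        raise ValueError("x_i must be positive")
    p_cva = math.sqrt(cve_star - 0.25)      # Equation (4)
    med = math.sqrt(rl1 * rl2)              # median of the reference interval
    a = p_cva * 0.01 * (med - rl1) / med    # Equation (5), assuming CVA = pCVA
    ps_a = a * x_i + p_cva * 0.01 * rl1     # Equation (7): slope term plus intercept
    return 100.0 * ps_a / x_i               # Equation (8)


# Plasma glucose in mmol/L: reference limits 3.9 and 6.4, CVE* = 12.72 (values from the text).
for x in (2.2, 3.9, 5.15, 7.0):
    print(f"x_i = {x:5.2f} mmol/L -> pCVA = {permissible_cv_at(x, 3.9, 6.4, 12.72):.2f}%")
```

With these inputs the sketch gives about 4.3% at the lower reference limit and a value close to the 3.44% listed for glucose in Table 1 at 5.15 mmol/L. The permissible CV falls as the quantity rises, because the constant intercept term is divided by a larger xi.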

Figure 1 The relation between (permissible) standard deviation (brown rectangles), (permissible) coefficient of variation (blue rhombs) and measured quantity. Left vertical line represents the lower reference limit (RL1) and right vertical line the upper reference limit (RL2). The blue broken line is the standard deviation line for sA at the median of the reference interval which crosses the RL1 line at the assumed theoretical standard deviation at zero quantity. (A) constant CVA, (B) non-constant CVA (intercept of the sA vs. quantity xi line), (C) intercept of the sA vs. xi line = sA at x = RL1. The orange line represents the derived sA,xi vs. quantity line according to Equation (8).

Thus, the present concept yields higher pCVA values at low quantities than at high quantities. Using Equation (8) for plasma glucose (CVE = 12.72, pCVA = 3.53), pCVA is 7.1% at a concentration of 2.2 mmol/L (40 mg/dL), 4.3% at RL1 = 3.9 mmol/L (70 mg/dL), and 2.8% at 7.0 mmol/L (126 mg/dL). The calculation steps defined by Equation (8) can be performed by an Excel program which is available on the home page of the German Society of Clinical Chemistry and Laboratory Medicine [14] or from the authors. Some common examples of pCVA values derived by Equation (8) at the mean of the reference interval are listed in Table 1. In most cases, the permissible imprecision is smaller than the requirements of the RiliBÄK 2003 ([15], column Q in Table 1). The latest version of the RiliBÄK 2008 [6] does not provide pCVA values. If several RIs exist for a particular quantity, the RI with the smallest range may be chosen for the estimation of pCVA (Table 1). If a lower RL (RL2.5) is not known, RL2.5 may be set tentatively at 15% of the upper reference limit (RL97.5).

Permissible limits for bias

Bias (B) is the ‘numerical expression of trueness’ [5] and is defined as the difference between the target value and the mean value of replicate measurements of control materials. It is a systematic measurement deviation, an error type called δ in the RiliBÄK guideline [6]. A typical example of a source of bias is the application of an inaccurate calibrator. Further sources of bias [16] and five main methods of determining bias have been summarized elsewhere [17]. In the present approach, the permissible bias is set to 0.5·psA (see Appendix), because 0.5·sA leads to about the same percentage of false positive results as one sA [8]. Fraser [10] suggested 0.56·sA as a desirable goal for permissible bias (see Appendix). In the former RiliBÄK [15], the mean bias for all measurands was given as about 1.25·psA. According to the GUM concept [2], the uncertainty of the bias estimation should be considered. This uncertainty is also set to uB = 0.5·psA (see Appendix). Therefore, the permissible bias is expanded to

$$pB_u = [(0.5 \cdot pCV_A)^2 + (0.5 \cdot pCV_A)^2]^{0.5} = 0.7 \cdot pCV_A \tag{9}$$

If B > 0.7·sA, bias should not be included in the uncertainty estimation but must be eliminated by corrective action (e.g., by recalibration or even by estimating a correction factor). However, any trimming of the dataset should be carefully justified [5]. Unspecificities of measurement procedures caused by interferences specific to patients’ samples have been called ‘random error’ [17], and are usually not considered as bias in quality assurance schemes. If they are quasi-constant, they may have the characteristics of a bias. A typical example is pseudocreatinines (a mixture of interferents). A fixed factor of 27 μmol/L has been proposed for adults to correct creatinine results obtained by an unspecific Jaffé reaction [8]. Such unspecificities should be identified and eliminated (e.g., using a factor or by specific reference limits). Target values are preferably determined by a reference method (reference method values) or by the manufacturer with a standardized procedure (method assigned values). If a control material has reference method values, they should be taken as target values; otherwise assigned values must be used.

Permissible limits for combined uncertainty

Combined uncertainty (uC) means a combination of several components of uncertainty, such as uS, bias, pre-examination uncertainty and others. For the purpose of intra-laboratory quality assurance, only standard uncertainty and bias are considered as long as the uncertainty of other components cannot easily be assessed. Several ways of combination have been applied: the sum of both, the root of the sum of the squared values (‘variance model’, ‘uncertainty model’) or the ‘total error model’ TE = B+k·sA [10, 18]. For the present approach, the variance model is chosen because bias also has a variable component and, in practice, variable and systematic components may overlap and are often difficult to separate [19, 20]. The permissible combined uncertainty (puC) is

$$pu_C = (ps_A^2 + pB_u^2)^{0.5} \tag{10}$$

$$pu_C = (ps_A^2 + 0.7^2 \cdot ps_A^2)^{0.5} = 1.22 \cdot ps_A \tag{11}$$

The permissible combined uncertainty can also be expressed in percent of the measured value (pCVC):

$$pCV_C = (pCV_A^2 + p\%B_u^2)^{0.5} = 1.22 \cdot pCV_A \tag{12}$$
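The factors 0.7 and 1.22 in Equations (9) to (12) follow directly from the variance model, as the Python sketch below illustrates. The function names are hypothetical and introduced only for this illustration. Note that the sketch does not round the intermediate factor, so it yields about 0.707 and 1.225, whereas the text rounds 0.707 to 0.7 before squaring; both routes give 1.22 after rounding to two decimals.

```python
import math


def permissible_bias_with_uncertainty(p_cva: float) -> float:
    """Equation (9): permissible bias (0.5*pCVA) combined with its uncertainty (0.5*pCVA)."""
    return math.hypot(0.5 * p_cva, 0.5 * p_cva)


def permissible_combined_cv(p_cva: float) -> float:
    """Equations (10)-(12): variance model combining pCVA with the permissible bias term."""
    return math.hypot(p_cva, permissible_bias_with_uncertainty(p_cva))


# Factors relative to pCVA = 1 (unrounded).
print(f"pBu / pCVA  = {permissible_bias_with_uncertainty(1.0):.3f}")
print(f"pCVC / pCVA = {permissible_combined_cv(1.0):.3f}")
```

Since every term scales linearly with pCVA, the same factors apply whether the quantities are expressed as standard deviations (psA) or as coefficients of variation (pCVA).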

If a laboratory result is composed of several measured values (e.g., anion gap), the MU is calculated by combining the MUs of all measurand results (e.g., in the anion gap equation). For the creatinine clearance, MUs must be expressed as coefficients of variation because the clearance calculation uses multiplication and division [21].
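How the MUs of the input results are combined depends on the form of the equation. The Python sketch below uses the usual first-order propagation rules for uncorrelated inputs, which is an assumption made here rather than a procedure spelled out in the text: for sums and differences (anion gap) the absolute standard uncertainties are combined in quadrature, and for products and quotients (creatinine clearance) the coefficients of variation are combined in quadrature. All names and numerical values are hypothetical and serve only as an illustration.

```python
import math


def u_of_sum_or_difference(*absolute_u: float) -> float:
    """Standard uncertainty of a sum/difference: absolute uncertainties add in quadrature."""
    return math.sqrt(sum(u * u for u in absolute_u))


def cv_of_product_or_quotient(*cvs_percent: float) -> float:
    """CV (%) of a product/quotient: relative uncertainties add in quadrature."""
    return math.sqrt(sum(cv * cv for cv in cvs_percent))


# Anion gap = Na - (Cl + HCO3); illustrative results in mmol/L with assumed CVs in %.
na, cl, hco3 = 140.0, 100.0, 24.0
u_na, u_cl, u_hco3 = na * 0.01, cl * 0.015, hco3 * 0.04
anion_gap = na - (cl + hco3)
u_gap = u_of_sum_or_difference(u_na, u_cl, u_hco3)
print(f"anion gap = {anion_gap:.1f} mmol/L, u = {u_gap:.2f} mmol/L "
      f"({100 * u_gap / anion_gap:.1f}%)")

# Creatinine clearance = (urine creatinine x urine volume) / (serum creatinine x time);
# assumed CVs in % for the three measured inputs (collection time taken as exact).
cv_clearance = cv_of_product_or_quotient(3.0, 2.0, 4.0)
print(f"CV of creatinine clearance = {cv_clearance:.1f}%")
```

The anion gap example shows why a calculated difference can carry a large relative uncertainty: the absolute uncertainties of large input values are carried over to a much smaller result.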

Permissible limits for expanded uncertainty

Considering a 95% probability, the expanded uncertainty calculated according to Equation (11) is:

$$pU = 1.96 \cdot pu_C = 1.96 \cdot 1.22 \cdot ps_A = 2.39 \cdot ps_A \tag{13}$$

Table 1 Examples of permissible uncertainties. Permissible imprecision (pCVA) and combined uncertainty (pU%) for a particular measurand (xi).

Plasma, serum, whole blood

| Quantity | Remark (c) | Unit | Lower RL (a) | Upper RL (a) | xi (b) | pCVA (xi) | pU% (xi) (d) | pUEQAS, % (e) | RiliBÄK 2003 pCVA | RiliBÄK 2008 RMSD (f) | RiliBÄK 2008 EQAS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Activated PTT | | s | 26 | 36 | 31 | 2.81 | 6.71 | 13.2 | 6.0 | 10.5 | 18.0 |
| Albumin | > 60 years | g/L | 35 | 53 | 44 | 3.16 | 7.56 | 14.8 | 7.00 | 12.5 | 20.0 |
| Alcaline phosphatase | Women | U/L | 30 | 80 | 55 | 4.68 | 11.19 | 21.9 | 6.00 | 13.0 | 21.0 |
| Aldosteron | Standing | pmol/L | 180 | 790 | 485 | 5.57 | 13.32 | 26.1 | 14.00 | | |
| α-Fetoprotein [44] | | μg/L | 0.9 | 6 | 3.45 | 6.25 | 14.93 | 29.3 | 10.00 | 17.0 | 24.0 |
| Aspartate transferase | Women | U/L | 10 | 35 | 22.5 | 5.19 | 12.41 | 24.3 | 6.0 | 11.5 | 21.0 |
| Alanine transferase | Women | U/L | 10 | 35 | 22.5 | 5.19 | 12.41 | 24.3 | 6.0 | 11.5 | 21.0 |
| Bilirubine, total | | μmol/L | 3.4 | 18.8 | 11.1 | 5.95 | 14.21 | 27.8 | | 13.0 | 22.0 |
| Ca 19-9 | | kU/L | 6 | 40 | 23 | 6.25 | 14.93 | 29.3 | | 14.0 | 27.0 |
| Calcium | | mmol/L | 2.2 | 2.65 | 2.425 | 2.11 | 5.05 | 9.9 | 3.0 | 6.0 | 10.0 |
| Calcium, ionized | | mmol/L | 1.15 | 1.45 | 1.3 | 2.37 | 5.66 | 11.1 | 5.0 | 7.5 | 15.0 |
| Carbamazepin | | mg/L | 4 | 10 | 7 | 4.55 | 10.87 | 21.3 | 7.0 | 12.0 | 20.0 |
| CEA | | μg/L | 0.75 | 5 | 2.875 | 6.25 | 14.93 | 29.3 | | 14.0 | |
| Chloride | | mmol/L | 95 | 106 | 100.5 | 1.59 | 3.81 | 7.5 | 2.5 | 4.5 | 8.0 |
| Cholesterol [45] | | mmol/L | 3.90 | 5.90 | 4.9 | 3.16 | 7.55 | 14.8 | 4.0 | 7.0 | 13.0 |
| Cholinesterase | Women | U/L | 3.93 | 10.8 | 7.365 | 4.74 | 11.33 | 22.2 | 6.0 | | |
| Cortisol | 8 o’clock | nmol/L | 138 | 690 | 414 | 5.78 | 13.82 | 27.1 | 11.0 | 16.0 | 30.0 |
| Creatinine | Men | μmol/L | 49 | 97 | 73 | 4.00 | 9.55 | 18.7 | 6.0 | 11.5 | 20.0 |
| Creatinkinase | Women | U/L | 25 | 150 | 87.5 | 6.08 | 14.52 | 28.6 | 5.0 | 11.0 | 20.0 |
| C-reactive protein | | mg/L | 0.75 | 5 | 2.875 | 6.25 | 14.93 | 29.3 | 8.0 | 13.5 | 20.0 |
| Digoxin | | mg/L | 0.8 | 2 | 1.4 | 4.55 | 10.87 | 21.3 | 9.0 | 14.0 | 30.0 |
| Digitoxin | | mg/L | 10 | 25 | 17.5 | 4.55 | 10.87 | 21.3 | 10.0 | 15.5 | 30.0 |
| Erythrocytes | | 10^12/L | 31.4 | 41.2 | 36.3 | 2.57 | 6.13 | 10.7 | | | |
| Estradiol, 17-β | Follicle phase | pmol/L | 110 | 1100 | 605 | 6.92 | 16.54 | 32.4 | 13.0 | 22.0 | 35.0 |
| Ferritin | w, 20–50 years | μg/L | 22 | 112 | 67 | 5.81 | 13.89 | 27.2 | 9.0 | 13.5 | 25.0 |
| Glucose | Venous plasma | mmol/L | 3.9 | 6.4 | 5.15 | 3.44 | 8.23 | 16.1 | 5.0 | 11.0 | 15.0 |
| Glucose | Venous plasma | mg/dL | 70 | 115.0 | 92.5 | 3.45 | 8.24 | 16.2 | | | |
| γ-Glutamyltransferase | Women | U/L | 9 | 36 | 22.5 | 5.42 | 12.95 | 25.4 | 6.0 | 11.5 | 21.0 |
| Hemoglobin | Women | g/L | 125 | 153 | 139 | 2.21 | 5.27 | 10.3 | 2.0 | 4.0 | 6.0 |
| Hemoglobin A1c (g) | | % | 3.4 | 4.7 | 4.05 | 2.80 | 6.70 | 13.1 | 6.0 | | |
| Hemoglobin A1c (g) | | mmol/mol | 14 | 28 | 21 | 4.02 | 9.61 | 18.8 | | 10.0 | 18.0 |

(a) Reference limits. Italics: 15% of the upper reference limit; (b) measured value or target value of the control material; (c) m, men; w, women; SU, 24 h urine; (d) expanded uncertainty, pU% = 2.39·pCVA; (e) expanded uncertainty for external quality assessment; (f) RMSD, root mean square of measurement deviation; (g) % = 0.09148 × IFCC (mmol/mol) + 2.152 [33]. The reference limits were taken from the textbooks of Thomas [43] and of Gressner and Arndt [44], and from the common reference limits recommended by the NORDIC group [45]. An Excel program is available [14] with an extended list of quantities covering most of the measurands of the RiliBÄK guideline [6] and with a comparison of the present proposal with the limits of the RiliBÄK; it calculates permissible uncertainties automatically if the target value xi is inserted. Yellow cells (reference limits and target values) can be changed by the user; green cells contain the automatically calculated results [14].


$$pU\% = 2.39 \cdot pCV_A \tag{14}$$

Like pCVA, the expanded uncertainty follows a curved relation with CVE∗ (Figure 2). The pU% values correspond to the RMSD values (root mean square of measurement deviation, [20]) of the German Guideline 2008 [6]. In Table 1, the proposed pU% values are compared with the RMSD values of the German Guideline, because they are the most comprehensive list based on the present state-of-the-art [22]. For several measurands, the proposed approach and the RiliBÄK requirements yield almost identical permissible limits (e.g., aspartate aminotransferase, sodium and prostate-specific antigen in plasma). In other cases, the present limits are smaller or larger than the RMSD values. The mean proposed pU% for 63 plasma quantities was about 3% less than the corresponding RMSD values of the RiliBÄK 2008. The pU% values were derived for the mean of the reference interval for comparison with the RMSD values, which are applied equally to quantities near the lower and upper reference limits. Positive discrepancies mean that the proposed pU% values (blue rhombs in Figure 2) are above the RiliBÄK limits (brown rectangles in Figure 2). The greatest positive discrepancy was observed with testosterone (CVE∗ = 32.6), a measurand which often causes problems in ring trials and, in any case, leads the organizers of external quality assessment schemes (EQAS) to extend the permissible limits. The greatest negative divergences appear with creatine kinase and immunoglobulin A in plasma (for which the derived permissible limits may be acceptable due to a relatively large biological variation indicated by a high CVE∗ value).

Permissible limits for EQAS

Recently, we have proposed that permissible limits in external quality assessment schemes (EQAS) may be derived from pCVA or from pU% values [23]. In analogy to this approach, the 90% interval of the permissible limits for EQAS may be:

$$pU_{EQAS}\% = 1.64 \cdot pU\% = 3.92 \cdot pCV_A \tag{15}$$

and the 95% interval may be

$$pU_{EQAS}\% = 1.96 \cdot pU\% = 4.68 \cdot pCV_A \tag{16}$$
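Equations (14) to (16) are simple multiples of pCVA, so the limits in Table 1 can be checked with a few lines of Python. The function names below are hypothetical, and the sketch multiplies the unrounded factors (1.96 × 1.22, then × 1.96 or × 1.64), so results may differ from the rounded factors 2.39, 3.92 and 4.68 in the second decimal.

```python
COVERAGE_95 = 1.96      # coverage factor for a 95% interval
COVERAGE_90 = 1.64      # coverage factor for a 90% interval
COMBINED_FACTOR = 1.22  # pCVC / pCVA from Equation (12)


def permissible_expanded_uncertainty(p_cva: float) -> float:
    """Equation (14): pU% = 1.96 * 1.22 * pCVA (about 2.39 * pCVA)."""
    return COVERAGE_95 * COMBINED_FACTOR * p_cva


def permissible_eqas_limit(p_cva: float, coverage: float = COVERAGE_95) -> float:
    """Equations (15) and (16): pUEQAS% = coverage * pU%."""
    return coverage * permissible_expanded_uncertainty(p_cva)


# Glucose row of Table 1: pCVA = 3.44% at xi = 5.15 mmol/L.
p_cva_glucose = 3.44
print(f"pU%          = {permissible_expanded_uncertainty(p_cva_glucose):.2f}")
print(f"pUEQAS% (95) = {permissible_eqas_limit(p_cva_glucose):.1f}")
print(f"pUEQAS% (90) = {permissible_eqas_limit(p_cva_glucose, COVERAGE_90):.1f}")
```

For the glucose row this reproduces the tabulated pU% of 8.23 and pUEQAS of 16.1, which suggests that the pUEQAS column of Table 1 uses the 95% interval of Equation (16).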

The relation between biological variation CVE∗ and the permissible UEQAS% also follows a curved function (Figure 3). The examples listed in Table 1 are given in percent of the target value (xi in Table 1). Mean limits of the plasma quantities listed are similar to those required by the new RiliBÄK [6]. Providers of proficiency testing schemes can select the RIs to be used in Equation (4) from common RIs published by consensus groups, IFCC-supported recommendations and RIs given by manufacturers for particular analytical procedures (assigned target values).

Benefits of the present approach

The present proposal
1. is based on CVE∗, derived from the reference interval as a surrogate of the biological variation (scientific reasoning);
2. avoids assuming constant biological and analytical variation over the entire measurement interval;
3. also considers technical achievability (feasibility), leading to realistic limits for most quantities under the present technologies;
4. is transparent and can be applied to any quantity, not only to those listed by regulatory bodies (independent of industry);
5. is flexible (can be adapted to future developments);
6. considers skewed distributions;
7. is simple (easy to understand and easy to estimate on an Excel spreadsheet).

Figure 2 The relation between empirical biological variation CVE∗, the permissible extended combined uncertainty pU% and the permissible RiliBÄK requirements (RMSD values, rectangles). The straight line represents the desirable total error (0.5·CVE∗). The data are taken from the Table and ref. 14. Axes: pU% / RMSD versus empirical biological variation.

Figure 3 The relation between empirical biological variation (CVE∗), the permissible UEQAS% (blue rhombs) and the permissible RiliBÄK requirements for ring trials (brown rectangles). The data are taken from the Table and ref. 14. Axes: pUEQAS% / EQAS RiliBÄK 2008 versus biological variation, CVE.

Limitations

1. Problems of commutability are not considered by the present concept (nor by any other known concept). If such problems arise (especially with immunological assays), a relatively high percentage of laboratories may not be able to stay within the permissible limits. EQAS providers then often use higher limits [6]. Manufacturers of control materials should inform their customers if they are aware of non-commutabilities.
2. The present concept was developed for measurands for which biological variation can be assumed. However, it can also be applied to some therapeutic drugs. It is not applicable to lithium and toxic substances, e.g., ethanol. In this case, forensic experts may define the limits (action limits) within which a measurement value should vary. If, for example, the uncertainty at an ethanol concentration of 0.5‰ (or a lithium concentration of 1.2 mmol/L) should not exceed 5%, the permissible CVA = 2.55 (psA = 0.025/1.96 = 0.012755), pU% = 6.09 (pCVA × 2.39), and pUEQAS = 11.9 (pU% × 1.96). The EQAS limit of the RiliBÄK 2008 [6] is 12.0% for both quantities.
3. Action limits (e.g., 4 μg/L for prostate-specific antigen) are not suited for estimating CVE (see also limitation 2).
4. In the proposed approach, bias is limited to one control cycle (e.g., 15–20 days). Bias may also occur within a control cycle, e.g., as a short-term trend or as a sudden ‘jump’ deviation (shift). Causes may be lot-to-lot variation of reagents, instability of reagents, antibody differences, and inaccurate standards. These bias types can easily be detected on a Shewhart-type control chart. In the former RiliBÄK [15], a continuous increase or decrease of seven subsequent measurement values was considered a significant indicator of a trend. Bias may also occur over a longer time period (at least two control cycles) and may appear as a long-term trend (long-term bias). A long-term trend has been observed over years for glucose measurements [24], with a 5-year increase of 0.55 mmol/L (11%). Long-term instability may lead to serious problems which are often neglected in practice. It can cause erroneous medical decisions if reference limits are not adjusted accordingly [25, 26]. Such trends are not included in the present MU concept. However, these bias effects may be considered if reference limits are periodically estimated or reviewed by an indirect procedure [27, 28]. Long-term drift effects can also be detected, e.g., by the r-ratio test [29] or from EQAS. If suspected, this type of bias must be investigated and eliminated.
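The short-term trend criterion mentioned under limitation 4 (a continuous increase or decrease over seven subsequent control values) lends itself to a simple automated check. The Python sketch below is one possible reading of that rule, not an implementation taken from the RiliBÄK: it assumes that "continuous" means strictly increasing or strictly decreasing, so that equal neighbouring values interrupt a run, and the function name `has_monotonic_run` is hypothetical.

```python
from typing import Sequence


def has_monotonic_run(values: Sequence[float], run_length: int = 7) -> bool:
    """Return True if `values` contains `run_length` consecutive results that are
    strictly increasing or strictly decreasing."""
    if run_length < 2:
        raise ValueError("run_length must be at least 2")
    up = down = 1  # lengths of the current increasing / decreasing runs
    for previous, current in zip(values, values[1:]):
        up = up + 1 if current > previous else 1
        down = down + 1 if current < previous else 1
        if up >= run_length or down >= run_length:
            return True
    return False


# Illustrative control results (hypothetical values, arbitrary units).
stable = [5.0, 5.1, 4.9, 5.0, 5.2, 5.1, 5.0, 4.9]
drifting = [5.0, 5.1, 5.2, 5.3, 5.4, 5.5, 5.6]
print(has_monotonic_run(stable))
print(has_monotonic_run(drifting))
```

A check of this kind only flags short-term trends within the inspected series; the long-term bias described above would require comparison across control cycles, for example against EQAS results.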

Discussion

In a long tradition starting with Tonks [30], Cotlove and Harris [31], Fraser et al. [10] and a European expert group [32], recommendations for permissible quality limits have been based on biological variation. Cotlove’s rule (multiplication of biological CV by 0.5) broadens the reference interval by 11.8% [31]. Cotlove used a combined biological CV (CVC = square root of the sum of the squared intra-individual CV and the squared inter-individual CV). Biological CVC closely correlates with CVE∗ [8]. These authors applied one fixed factor to be multiplied with the biological variation, which was later replaced by several fixed factors to reconcile the biological variation model with technical feasibility [10]. The present approach proposes a continuous and non-linear relationship between biological and analytical variation (Figure 2), corresponding to factors CVA/CVE∗ covering a range from 0.15 to 0.69. The proposed model requires knowledge of the reference interval, which must in any case be reported by the laboratory for each measurand. In former decades, the scientific community devoted much effort to improving the assurance of analytical quality, but neglected the quality of reference limits (RLs). In the last decade, more attention seems to have been paid to RLs. Several recommendations have been published by consensus groups for so-called common RLs. Laboratories are requested to put more effort into considering transferability if external RLs are used [1, 5]. Reference intervals are often method-dependent. An example is hemoglobin A1c. In Table 1, the reference interval of an ion-exchange chromatographic procedure is chosen [33, 34]. The pU% value is not only method-dependent, but also unit-dependent as already pointed out by Braga and Panteghini [33], because the conversion function IFCC-NGSP master equation (Table 1)


has an intercept (additive term). The permissible total error published by Ricos et al. [35] is 3.0% (which appears too restrictive) and by the RiliBÄK 2008 is 10.0%. With the present approach, pU% is 6.1 (NGSP) and 8.7 (IFCC). The empirical Equation (14) clearly demonstrates that analytical improvements are primarily required for measurands with relatively narrow reference intervals (small biological variation) and less for measurands with larger reference intervals. Klee [36] proposed basing permissible uncertainty goals on the increase of false positive results. Clinical specificity generally is more affected than clinical sensitivity because the test distribution in the disease population usually is broader than the distribution in the control population. Also, there are more control subjects than disease subjects in most populations tested. Klee postulated that the increase of false positive results should not exceed 50%. The majority of measurands listed in Table 1 is below 50%. Only two measurands (arterial pH and plasma sodium) are above 50% with the present technologies (not shown). The GUM approach [2] principally opens two possibilities for handling bias: either tolerating bias by including it in an uncertainty budget (extending permissible uncertainty limits) or not tolerating bias by excluding it from an uncertainty budget. Then, it must be corrected somehow (e.g., by elimination or by establishing a correction algorithm). The choice between both alternatives depends on the magnitude of the bias and its relevance for medical decisions. In the present model, a permissible limit of 0.7·psA was chosen. If a relevant bias to be corrected is found (e.g., by proficiency testing), it is good laboratory practice to review the laboratory operations and to eliminate the cause, and not to apply an arbitrary correction [17]. In many cases, however, bias is not exactly known [18, 21]. 
Furthermore, the GUM recognizes that the value used for bias correction has an associated uncertainty, which may include the uncertainty of the reference value itself and the uncertainty of the bias determination; the significance of the bias value should also be estimated. All these considerations may overburden many routine laboratories. Therefore, a fixed bias amount is proposed, which is arbitrarily set to about 0.7·pCVA. The fixed bias amount is composed of two components, one with a random characteristic and one with a systematic characteristic. If the permissible uC combines pCVA and pB, both components can compensate each other; that is, a very low (minimized) bias allows a larger pCVA (and vice versa). Then, the pCVA may become too large for diagnostic purposes. Therefore, it is necessary that pCVA and pB are limited independently, e.g., as recommended in the former RiliBÄK [15] and in the present proposal.

Imprecision and bias have different and complex effects on the diagnostic specificity (false positive rates). Therefore, they should be controlled (assessed) separately [8, 37, 38]. Physicians usually accept that laboratory data vary and have unavoidable random errors, but they are not accustomed to considering bias. Analytic bias can have profound effects near clinical decision points [39]. The consideration of bias is only required for comparisons between laboratories (comparability) and if reference limits are taken from external sources. Comparability between laboratories may be sufficiently assured by regular participation in external surveys and by using CE-marked analytical procedures. For the majority of their measurements, most routine laboratories in Europe apply standardized analytical procedures which follow the strict requirements of the EC Directive 98/79/EC. Total comparability between laboratories probably remains an elusive goal, as recently pointed out [8, 40].

The ideal situation is to keep systematic bias as low as possible. Two ways are possible to keep bias negligible [7]: 1) achieving a high degree of comparability by standardization, which makes surveillance of bias less relevant or even superfluous; 2) using intra-laboratory reference limits (RLs) to circumvent the numerous problems with transferability. Single laboratories may establish their own reference limits, allowing the assumption that these are constant or remain stable as long as the methodology is not modified [28]. We prefer using a laboratory's own reference limits to correcting measured values for an established bias. Arzideh et al. [27, 28] have developed an algorithm to derive intra-laboratory RLs from the large data pools usually stored in laboratory information systems. A program accessible from a Microsoft Excel spreadsheet is available from the home page of the German Society of Clinical Chemistry and Laboratory Medicine [14].
Both ways may complement rather than exclude each other. If systematic bias can be neglected, only imprecision must be assured. Then, only the random components of the permissible expanded uncertainty are to be considered, and internal quality assurance could be simplified considerably.

The MU approach in the present proposal is restricted to the measurement procedure and does not include pre-examination variability. However, uncertainties of the pre-examination phase and/or the post-examination phase can easily be added if the corresponding data are available and add relevant information for the requesting party.

The model presented simplifies the GUM concept to make it practical for all quantities tested with the resources presently affordable for most medical laboratories. The present proposal leads to permissible limits similar to those of the RiliBÄK guideline [6] for most quantities. Mean intra-laboratory limits were about 3% lower than the RiliBÄK limits, and mean EQAS limits were nearly identical. The limits of the RiliBÄK 2008 [6] were set to cover 90% of a large survey with about 46–640 laboratories for most measurands [22], and 95% for only a few measurands (e.g., glucose and sodium in plasma). If the limits proposed are higher, they can be justified because the underlying biological variation is relatively high and therefore does not require more stringent limits.

Acknowledgments: Suggestions from Dr W. J. Geilenkeuser, Referenzinstitut für Bioanalytik, are gratefully acknowledged.

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.

Financial support: None declared.

Employment or leadership: None declared.

Honorarium: None declared.

Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

Appendix

Calculation of CVE in the case of a normal distribution

$$s_E = (RL_2 - RL_1)/3.92 \tag{17}$$

$$CV_E = s_E \cdot 100/M \tag{18}$$

In Equation (18), M is the arithmetic mean: (RL1 + RL2)/2.

Calculation of CVE∗ in the case of a log-normal distribution

a) For an RI covering a 95% interval (RL1 = RL2.5 and RL2 = RL97.5 are known): on the logarithmic scale, sE and the median (Med) can be calculated by the following equations:

$$s_{E,\ln} = (\ln RL_2 - \ln RL_1)/3.92, \qquad Med_{\ln} = (\ln RL_1 + \ln RL_2)/2 \tag{19}$$

The CVE derived from sE,ln (CVE∗) can be calculated by Equation (20) according to Aitchison [41]:

$$CV_E^{*} = 100\,\bigl(\exp(s_{E,\ln}^{2}) - 1\bigr)^{0.5} \tag{20}$$

b) If information other than RL2.5 and RL97.5 is given, e.g., RL2 = RL99 and Mln (which is identical to the median on the ln-scale), then RL2.5 and RL97.5 can be obtained by Equations (22) and (23):

$$s_{E,\ln} = (\ln RL_{99} - M_{\ln})/2.33 \tag{21}$$

$$\ln RL_{2.5} = M_{\ln} - s_{E,\ln} \cdot 1.96 \tag{22}$$

$$\ln RL_{97.5} = M_{\ln} + s_{E,\ln} \cdot 1.96 \tag{23}$$

ln RL2.5 and ln RL97.5 [Equations (22) and (23)] are inserted in Equation (19) to obtain the standard deviation on the ln scale, which is needed to calculate CVE∗ by Equation (20).

Permissible bias (pB) as fraction of analytical variation according to Fraser [10]

$$pB = 0.25\,(s_A^{2} + s_B^{2})^{0.5}$$

In this equation, sB means the inter-individual variation. Assuming sA = 0.5·sB or sB = 2·sA:

$$pB = 0.25\,(s_A^{2} + 4 \cdot s_A^{2})^{0.5} = 0.25\,(5 \cdot s_A^{2})^{0.5} = 0.56 \cdot s_A$$

Calculation of the random variation uB of pB estimation [42]

The random variation of uB is derived from the estimation of a confidence interval (x) and amounts to:

$$u_B = t_{1-\alpha/2,\,n-1} \cdot ps_A / n^{0.5}$$

If the mean value of a control material is determined from n = 15 (or n = 20) measurements, the t-value of the two-sided t-distribution (α = 0.05) is 2.14 (2.09) and uB becomes 0.55·psA (or 0.47·psA). If 15–20 measurements are used, an average value of uB = 0.5·psA is appropriate.
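The appendix calculations can be traced numerically with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' program: the function names are hypothetical, the reference interval 3.5–5.0 is an invented example value, and the t-values are passed in by hand (2.14 and 2.09, as quoted above) rather than computed, so that only the standard library is needed.

```python
import math

Z_95 = 1.96  # normal quantile used in Equations (17), (19), (22), (23); 3.92 = 2 * 1.96
Z_99 = 2.33  # normal quantile used in Equation (21)


def cv_e_normal(rl_low, rl_high):
    """Equations (17) and (18): CVE in % from a 95% reference interval."""
    s_e = (rl_high - rl_low) / (2 * Z_95)
    mean = (rl_low + rl_high) / 2
    return 100 * s_e / mean


def cv_e_lognormal(rl_low, rl_high):
    """Equations (19) and (20): CVE* in % assuming a log-normal distribution."""
    s_e_ln = (math.log(rl_high) - math.log(rl_low)) / (2 * Z_95)
    return 100 * math.sqrt(math.exp(s_e_ln ** 2) - 1)


def limits_from_rl99(m_ln, rl_99):
    """Equations (21)-(23): RL2.5 and RL97.5 (original scale) from Mln and RL99."""
    s_e_ln = (math.log(rl_99) - m_ln) / Z_99
    return math.exp(m_ln - Z_95 * s_e_ln), math.exp(m_ln + Z_95 * s_e_ln)


def fraser_pb_factor(sb_over_sa=2.0):
    """pB / sA for pB = 0.25 * (sA^2 + sB^2)^0.5 with sB = sb_over_sa * sA."""
    return 0.25 * math.sqrt(1 + sb_over_sa ** 2)


def u_b_factor(t_value, n):
    """uB / psA = t / n^0.5; t_value is taken from a t-table, not computed here."""
    return t_value / math.sqrt(n)


# Hypothetical reference interval, chosen only for illustration.
print(f"CVE  (normal):     {cv_e_normal(3.5, 5.0):.2f} %")
print(f"CVE* (log-normal): {cv_e_lognormal(3.5, 5.0):.2f} %")

# Factors quoted in the appendix text.
print(f"pB / sA  = {fraser_pb_factor():.2f}")        # 0.56
print(f"uB / psA = {u_b_factor(2.14, 15):.2f}")      # 0.55 for n = 15
print(f"uB / psA = {u_b_factor(2.09, 20):.2f}")      # 0.47 for n = 20
```

For case b), `limits_from_rl99` returns the limits on the original scale so that they can be passed directly to `cv_e_lognormal`, which mirrors the instruction to insert Equations (22) and (23) into Equation (19).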

References

1. International Standard. Medical laboratories – Requirements for quality and competence. ISO 15189:2012(E), 1–39.
2. International Organisation for Standardisation. ISO/IEC Guide 98-3:2008 Uncertainty of measurement. Part 3: guide to the expression of uncertainty in measurement (GUM:1995). Geneva, ISBN 92-67-10188-9.


3. Lillo R, Salinas M, Lopez-Carrigos M, Naranjo-Santana Y, Gutierrez M, Marin MD, et al. Reducing preanalytical laboratory sampling errors through educational and technological interventions. Clin Lab 2012;38:911–7.
4. Gurr E, Arzideh F, Brandhorst G, Gröning A, Haeckel R, Hoff T, et al. Exemplary standard operating procedure pre-examination. J Lab Med 2011;35:55–60.
5. Clinical and Laboratory Standards Institute. Expression of measurement uncertainty in laboratory medicine; approved guideline, vol. 32. CLSI document C51-A. Wayne, PA: CLSI, 2012.
6. Richtlinie der Bundesaerztekammer zur Qualitätssicherung laboratoriumsmedizinischer Untersuchungen. Dt Aerzteblatt 2008;105:C301–13. Available from: http://www.aerzteblatt.de/plus1308.
7. Clinical and Laboratory Standards Institute. Statistical quality control for quantitative measurement procedures: principles and definitions. CLSI document C24-A3. Wayne, PA: CLSI, 2006.
8. Haeckel R, Wosniok W. A new concept to derive permissible limits for analytical imprecision and bias considering diagnostic requirements and technical state-of-the-art. Clin Chem Lab Med 2011;49:623–35.
9. Oosterhuis WP. Gross overestimation of total allowable error based on biological variation. Clin Chem 2011;57:1334–6.
10. Fraser CG. Biological variation: from principles to practice. Washington DC: AACC Press, 2001:1–151.
11. Haeckel R, Wosniok W. Observed, unknown distributions of clinical chemical quantities should be considered to be lognormal: a proposal. Clin Chem Lab Med 2010;48:1393–6.
12. Haeckel R, Haeckel H. The determination of glucose concentration in 20 microliter capillary blood, liquor and urine by the hexokinase method with the endpoint analyzer 5030 (Eppendorf). Z Klin Chem Klin Biochem 1972;10:453–61.
13. Haeckel R, Mathias D. A two-point method for the determination of urea with the Gemsaec analyzer. Z Klin Chem Klin Biochem 1974;12:515–20.
14. Permissible imprecision (pCVA) and combined uncertainty (pU%) for a particular measurand (xi). Available from: http://www.dgkl.de. Accessed 10 December, 2014.
15. Richtlinie der Bundesärztekammer zur Qualitätssicherung quantitativer laboratoriumsmedizinischer Untersuchungen. Dt Aerzteblatt 2003;100:B2775–8. Available from: www.aerzteblatt.de/plus1308.
16. Mina A, Favaloro EJ, Koutts J. A practical approach to instrument selection, evaluation, basic financial management and implementation in pathology and research. Clin Chem Lab Med 2008;46:1223–9.
17. Krouwer JS. Setting performance goals and evaluating total analytical error for diagnostic assays. Clin Chem 2002;48:919–27.
18. Westgard JO. Update on measurement uncertainty: new CLSI C51A guidance. Available from: www.westgard.com/clsi-c51.htm. Accessed 24 February, 2012.
19. Klee GG. Tolerance limits for short-term analytical bias and analytical imprecision derived from clinical assay specificity. Clin Chem 1993;39:1514–8.
20. Macdonald R. Quality assessment of quantitative analytical results in laboratory medicine by root mean square of measurement deviation. J Lab Med 2000;30:111–7.
21. White GH. Basics of estimating measurement uncertainty. Clin Biochem Rev 2008;29:S53–60.
22. Geilenkeuser WJ. Precision and accuracy in internal quality control of German laboratories – a survey performed by DGKL. J Lab Med 2005;29:11–6.
23. Haeckel R, Wosniok W, Kratochvila J, Carobene A. A pragmatic approach for permissible limits in external assessment schemes with a compromise between biological variation and the state of the art. Clin Chem Lab Med 2012;50:833–9.
24. Froslie KF, Godang K, Bollerslev J, Henriksen T, Roislien J, Veierod MB, et al. Correction of unexpected increasing trend in glucose measurements during 7 years recruitment to a cohort study. Clin Biochem 2011;44:1483–6.
25. Magnusson B, Ellison SL. Treatment of uncorrected measurement bias in uncertainty estimation for chemical measurements. Anal Bioanal Chem 2008;390:201–13.
26. Coucke W, van Blerk M, Libeer JC, van Campenhout C, Albert A. A new statistical method for evaluating long-term analytical performance of laboratories applied to an external quality assessment scheme for flow cytometry. Clin Chem Lab Med 2010;48:645–50.
27. Arzideh F, Wosniok W, Gurr E, Hinsch W, Schumann G, Weinstock N, et al. A plea for intra-laboratory decision limits. Part 2. A bimodal deductive concept for determining decision limits from intra-laboratory data bases demonstrated by catalytic activity concentrations of enzymes. Clin Chem Lab Med 2007;45:1043–57.
28. Arzideh F, Wosniok W, Haeckel R. Reference limits of plasma and serum creatinine concentrations from intra-laboratory data bases of several German and Italian medical centres. Comparison between direct and indirect procedures. Clin Chim Acta 2010;411:215–21.
29. Haeckel R, Schneider B. Detection of drift effects before calculating the standard deviation as a measure of analytical imprecision. J Clin Chem Clin Biochem 1983;21:491–7.
30. Tonks DB. A study of the accuracy and precision of clinical chemistry determinations in 170 Canadian laboratories. Clin Chem 1963;9:217–31.
31. Cotlove E, Harris EK, Williams GZ. Biological and analytic components of variation in long-term studies of serum constituents in normal subjects. Clin Chem 1970;16:1028–32.
32. Stöckl D, Baadenhuijsen H, Fraser CG, Libeer JC, Hyltoft Petersen P, Ricos C. Desirable routine analytical goals for quantities assayed in serum. Eur J Clin Chem Clin Biochem 1995;33:157–69.
33. Braga F, Panteghini M. Standardization and analytical goals for glycated hemoglobin measurement. Clin Chem Lab Med 2013;51:1719–26.
34. Niederau CM, Reinauer H. Evaluating a new, fully automated HPLC-ion exchange system (Merck-Hitachi L-9100) for determination of glycated hemoglobin. J Lab Med 1993;17:388–94.
35. Ricos C, et al. Biological variation database. The 2014 update. Available from: www.westgard.com.
36. Klee G. A conceptual model for establishing tolerance limits for analytic bias and imprecision based on variations in population test distributions. Clin Chim Acta 1997;260:175–88.
37. Haeckel R, Wosniok W. Benefits of combining bias and imprecision in quality assurance of clinical chemistry procedures. J Lab Med 2007;31:87–9.


38. Hyltoft Petersen P, Klee P. Influence of analytical bias and imprecision on the number of false positive results using Guideline-Driven Medical Decision Limits. Clin Chim Acta 2014;430:1–8.
39. Klee GG. Establishment of outcome-related analytic performance goals. Clin Chem 2010;56:714–22.
40. Boyd JC. Cautions in the adoption of common reference intervals. Clin Chem 2008;54:238–9.
41. Aitchison J, Brown JA. The lognormal distribution. Cambridge: Cambridge University Press, 1969:1–176.
42. Moore DS, McCabe GP. Introduction to the practice of statistics. New York: W. H. Freeman and Company, 1999:1–825.
43. Thomas L. Clinical laboratory diagnostics. Frankfurt, Germany: TH-Books GmbH, 1998.
44. Gressner AM, Arndt T. Lexikon der Medizinischen Laboratoriumsdiagnostik. Heidelberg: Springer Medizin Verlag, 2007:1–1411.
45. Rustad P, Felding P, Lahti A, Hyltoft Petersen P. Descriptive analytical data and consequences for calculation of common reference intervals in the Nordic reference interval project 2000. Scand J Clin Lab Invest 2004;64:343–70.
