Clin Chem Lab Med 2015; 53(3): 377–382

Opinion Paper

Aldo Clerico*, Andrea Ripoli, Gian Carlo Zucchelli and Mario Plebani

Harmonization protocols for thyroid stimulating hormone (TSH) immunoassays: different approaches based on the consensus mean value

DOI 10.1515/cclm-2014-0586
Received June 2, 2014; accepted August 24, 2014; previously published online September 20, 2014

Abstract: The lack of interchangeable laboratory results and of consensus in current practices has drawn greater attention to standardization and harmonization projects. In the area of method standardization and harmonization, there is considerable debate about how best to achieve comparability of measurement for immunoassays, in particular those for heterogeneous proteins. The term standardization should be used only when comparable results among measurement procedures are based on calibration traceability to the International System of Units (SI) using a reference measurement procedure (RMP). Recently, the harmonization of methods has been promoted for many immunoassays, and in particular for thyrotropin (TSH), as accepted RMPs are not available. In a recent paper published in this journal, a group of well-recognized authors used a complex statistical approach to reduce the variability between the results observed with the 14 TSH immunoassay methods tested in their study. Here we provide data demonstrating that data from an external quality assessment (EQA) study yield results similar to those obtained using the reported statistical approach.

Keywords: harmonization; immunoassay methods; quality control; quality specification; standardization; thyroid stimulating hormone (TSH).

*Corresponding author: Prof. Aldo Clerico, MD, Laboratory of Cardiovascular Endocrinology and Cell Biology, Department of Laboratory Medicine, Fondazione Toscana G. Monasterio, Scuola Superiore Sant’Anna, Via Trieste 41, 56126 Pisa, Italy, E-mail: [email protected]
Andrea Ripoli: Scuola Superiore Sant’Anna and Fondazione CNR-Regione Toscana G. Monasterio, Pisa, Italy
Gian Carlo Zucchelli: CNR Institute of Clinical Physiology and QualiMedLab, Pisa, Italy
Mario Plebani: Department of Laboratory Medicine, University-Hospital, Padua, Italy

Introduction

The IFCC Working Group for Standardization of Thyroid Function Tests published three reports on the standardization of thyroid function tests in 2010. The first report relates to thyroid stimulating hormone (TSH) [1], the second to free thyroxine (FT4) and free triiodothyronine (FT3) [2], and the third to total T4 and total T3 [3]. More recently, the same group reported the results of another study concerning the evaluation of TSH immunoassay methods [4]. According to the authors’ statement [4], the aim of the study was to promote harmonization of immunoassay methods for TSH because, at present, standardization is not possible owing to the lack of an accepted reference measurement procedure (RMP) for this hormone. Indeed, the term standardization should be used only when comparable results among measurement procedures are based on calibration traceability to SI units using an RMP [5]. Recently, projects for harmonizing laboratory testing, particularly in the field of immunoassays, have received considerable attention, stressing the need to improve both the quality of the sera and the statistical methods to be used [6–8]. In addition, a body of evidence has accumulated that highlights the importance of a global approach (‘the complete picture’) to harmonization in laboratory medicine [9–11].

Harmonization of the TSH immunoassays using the consensus value method

Since a standardized protocol was not practicable, Stockl et al. [4] used a complex statistical approach to reduce the variability between the results observed with the 14 TSH immunoassay methods tested in their study. This approach was based on the factor analysis (FA) model, a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called ‘factors’. Stockl et al. [4] applied the FA model to TSH values measured in 94 native human samples with 14 different immunoassays. It is important to note that these samples covered the complete working clinical range, i.e., from hyperthyroid to hypothyroid status (concentration range: 0.0005 mIU/L to 78 mIU/L). According to this statistical approach, the main objective of the study was to recalibrate a panel of new measurement standard samples (i.e., native sera with targets statistically derived from a multi-method inter-comparison study) against ‘consensus values’ estimated by an FA model, in order to bring multiple sets of results into closer mutual alignment. In particular, Stockl et al. [4] verified whether recalibration to the estimated target values is actually able to remove the major part of the method-specific biases, so that the remaining dispersion could be attributed almost entirely to within-method effects. The most important result of this study was that mathematical recalibration, according to the statistical model, reduced the between-method variability (expressed as CV) from 11% to 6% (on average a reduction of 40%). However, the dataset, including all the TSH values derived from the 14 immunoassays, did not fulfil the assumption of a homogeneous sample from an elliptically symmetric distribution, as required by the statistical analysis. Furthermore, several outliers and missing values also occurred. Therefore, the original data required a further normalization step prior to application of the statistical model. Considering these limitations and methodological difficulties, the reduction obtained in the between-method variability should be considered a very good result.

As far as the experimental approach is concerned, we observe that the original project of the IFCC Working Group regarding the standardization of thyroid tests [1–3] seems, at least at present, to have been set aside and replaced by a less ambitious program concerning harmonization. It is important to note that standardization or harmonization of the TSH assay should in theory be less difficult than that of the free thyroid hormones (FT4 and FT3), considering the lower between-method variability of TSH immunoassays compared with that of FT4 and FT3 immunoassays [12, 13]. Moreover, the immunoassay methods for FT3 and FT4 pose additional obstacles to a reliable standardization project, owing to theoretical limitations and analytical problems related to the estimation of the free hormone fraction [13]. Therefore, the principal aim of this article is to discuss the experimental protocol and the statistical analysis used by Stockl et al. [4] with respect to other approaches based on consensus values, in particular the external quality assessment (EQA) programs.

Harmonization protocols based on the consensus mean values

For their harmonization study, Stockl et al. [4] used a statistical approach based on the ‘consensus values’ estimated by an FA model. The consensus value method (in particular the consensus mean) is generally used in EQA schemes to compare different assay methods for thyroid hormones [12] or other biomarkers [14, 15]. According to the IFCC guidelines [16], practical experience has shown that the consensus value usually agrees closely with the ‘true value’ in schemes with a large number of participants using several different methods. For example, Figure 1 reports the imprecision profile derived from the data of the most popular TSH immunoassays evaluated in the 2013 cycle of the Italian EQA scheme, called the ImmunoCheck study. In this EQA cycle, about 80 laboratories using 10 TSH immunoassay methods measured 18 control samples, yielding 12,600 results (Table 1).

Figure 1 Variability of TSH immunoassay results evaluated in the ImmunoCheck study according to hormone concentration. About 80 laboratories using 10 different TSH immunoassay methods measured 18 control samples, yielding 12,600 results in the ImmunoCheck EQA study (data including only the results of the 2013 cycle). In the figure, the mean TSH concentrations (consensus means) of the 18 control samples measured by the participant laboratories are reported on the x-axis (TSH measured, mIU/L), while the total relative variability values (including both intra- and between-method components), expressed as CV (%), are reported on the y-axis. Fitted curve shown in the plot: y = 11.653 x^(−0.165).

Brought to you by | Karolinska Institute Authenticated Download Date | 5/25/15 5:07 AM

Table 1 TSH values (mIU/L) measured by the laboratories participating in the EQA ImmunoCheck study with 10 different immunoassay methods.

Sample  CM       ROCX     ARC      ACC      CENT     AIA      VID      VIST     IMM2     VIT      LSN
        (14496)  (4424)   (2526)   (2369)   (1598)   (839)    (606)    (487)    (339)    (319)    (166)
1       0.133    0.157    0.125    0.121    0.133    0.175    0.090    0.120    0.128    –        0.151
2       0.138    0.155    0.128    0.128    0.130    0.158    0.092    0.119    0.125    0.169    0.143
3       0.138    0.157    0.127    0.121    0.134    0.163    0.088    0.118    0.132    0.191    0.146
4       0.461    0.546    0.395    0.418    0.423    0.470    0.438    0.397    0.436    0.396    0.518
5       0.467    0.556    0.401    0.423    0.431    0.466    0.451    0.401    0.429    0.402    0.567
6       0.727    0.839    0.620    0.654    0.686    0.734    0.747    0.639    0.697    0.692    0.862
7       0.730    0.839    0.626    0.667    0.691    0.740    0.743    0.650    0.686    0.679    0.882
8       1.740    1.870    1.593    1.536    1.704    1.995    1.793    1.526    1.780    2.328    1.895
9       1.743    1.880    1.571    1.586    1.735    1.932    1.751    1.528    1.798    2.268    1.938
10      2.632    2.759    2.421    2.509    2.516    3.016    2.595    2.075    2.583    3.520    2.866
11      3.316    3.497    2.956    3.087    3.330    3.836    3.372    3.220    3.289    4.000    3.377
12      4.553    4.749    4.386    4.257    4.571    5.432    4.666    4.290    4.528    –        4.816
13      4.558    4.720    4.297    4.232    4.590    5.502    4.748    4.342    4.981    6.419    4.776
14      4.645    4.821    4.357    4.300    4.795    5.258    4.783    4.381    4.903    6.500    4.818
15      10.677   10.919   9.940    9.542    11.408   13.190   10.749   10.300   11.027   15.875   10.232
16      15.229   15.504   14.563   14.329   15.993   18.444   15.399   15.065   16.225   23.936   16.090
17      15.296   15.558   14.383   14.350   16.339   18.573   15.590   15.294   16.524   23.537   15.934
18      15.527   15.555   14.763   14.168   16.039   19.250   16.59    15.530   17.063   –        15.950

The number of results available for each method is reported in parentheses. ACC, Access and DxI platforms, Beckman-Coulter Diagnostics; AIA, AIA 600II, 900, 1800 and 2000 platforms, Tosoh Bioscience; ARC, Architect platform, Abbott Diagnostics; CENT, Advia Centaur, Siemens Healthcare; CM, consensus mean among methods; IMM2, Immulite platform, Siemens Healthcare; LSN, Liaison/XL platform, DiaSorin; ROCX, Modular, Elecsys and Cobas platforms, Roche Diagnostics; VID, Vidas platform, bioMérieux Clinical Diagnostics; VIST, Dimension VISTA platform, Siemens Healthcare; VIT, Vitros ECIVIT platform, Ortho Diagnostics.
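As an illustration of how the between-method spread can be read off Table 1, the following minimal sketch computes, for each control sample, the unweighted mean of the method means and the corresponding CV. It is only an approximation under stated assumptions: the CM column of Table 1 is computed over all results and so weights methods by their number of participants, whereas this sketch weights the 10 methods equally, and it captures only the between-method component, not the total variability plotted in Figure 1. The names `TABLE1` and `between_method_cv` are hypothetical.

```python
from statistics import mean, stdev

# Method means (mIU/L) for the 18 control samples, transcribed from Table 1.
# None marks a sample for which the method reported no result.
TABLE1 = {
    "ROCX": [0.157, 0.155, 0.157, 0.546, 0.556, 0.839, 0.839, 1.870, 1.880,
             2.759, 3.497, 4.749, 4.720, 4.821, 10.919, 15.504, 15.558, 15.555],
    "ARC":  [0.125, 0.128, 0.127, 0.395, 0.401, 0.620, 0.626, 1.593, 1.571,
             2.421, 2.956, 4.386, 4.297, 4.357, 9.940, 14.563, 14.383, 14.763],
    "ACC":  [0.121, 0.128, 0.121, 0.418, 0.423, 0.654, 0.667, 1.536, 1.586,
             2.509, 3.087, 4.257, 4.232, 4.300, 9.542, 14.329, 14.350, 14.168],
    "CENT": [0.133, 0.130, 0.134, 0.423, 0.431, 0.686, 0.691, 1.704, 1.735,
             2.516, 3.330, 4.571, 4.590, 4.795, 11.408, 15.993, 16.339, 16.039],
    "AIA":  [0.175, 0.158, 0.163, 0.470, 0.466, 0.734, 0.740, 1.995, 1.932,
             3.016, 3.836, 5.432, 5.502, 5.258, 13.190, 18.444, 18.573, 19.250],
    "VID":  [0.090, 0.092, 0.088, 0.438, 0.451, 0.747, 0.743, 1.793, 1.751,
             2.595, 3.372, 4.666, 4.748, 4.783, 10.749, 15.399, 15.590, 16.59],
    "VIST": [0.120, 0.119, 0.118, 0.397, 0.401, 0.639, 0.650, 1.526, 1.528,
             2.075, 3.220, 4.290, 4.342, 4.381, 10.300, 15.065, 15.294, 15.530],
    "IMM2": [0.128, 0.125, 0.132, 0.436, 0.429, 0.697, 0.686, 1.780, 1.798,
             2.583, 3.289, 4.528, 4.981, 4.903, 11.027, 16.225, 16.524, 17.063],
    "VIT":  [None, 0.169, 0.191, 0.396, 0.402, 0.692, 0.679, 2.328, 2.268,
             3.520, 4.000, None, 6.419, 6.500, 15.875, 23.936, 23.537, None],
    "LSN":  [0.151, 0.143, 0.146, 0.518, 0.567, 0.862, 0.882, 1.895, 1.938,
             2.866, 3.377, 4.816, 4.776, 4.818, 10.232, 16.090, 15.934, 15.950],
}


def between_method_cv(table):
    """Return one (unweighted mean, CV%) pair per sample across method means."""
    n_samples = len(next(iter(table.values())))
    results = []
    for i in range(n_samples):
        # Skip methods that did not report a value for this sample.
        values = [column[i] for column in table.values() if column[i] is not None]
        m = mean(values)
        results.append((m, 100.0 * stdev(values) / m))
    return results


results = between_method_cv(TABLE1)
for sample, (m, cv) in enumerate(results, start=1):
    print(f"sample {sample:2d}: mean = {m:7.3f} mIU/L, between-method CV = {cv:5.1f}%")
```

Because the weighting differs, the unweighted means printed here are not expected to coincide exactly with the CM column.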

In Figure 1, the mean TSH concentrations (consensus means) of the 18 control samples measured by the participant laboratories are reported on the x-axis, while the total relative variability values (including both intra- and between-method components), expressed as CV%, are reported on the y-axis. According to the statistical approach used by Stockl et al. [4], it is possible to verify that these CV values are linearly related to the logarithm of the consensus means of the TSH concentrations measured by the laboratories participating in the EQA (CV = 12.172 − 4.316 log TSH; R = 0.871, p […] > 5–15 mIU/L, and > 15–20 mIU/L before and after the harmonization approach are reported in Table 2. These data demonstrate a mean reduction in the between-method CV of about 57% (i.e., from 10.7% before to 4.6% after the recalibration, p […]

Table 2 Between-method CV (%) of TSH results by concentration range, before and after the harmonization approach.

TSH range        Before        After
[…]              18.4 ± 8.3    6.5 ± 1.7
> 5–15 mIU/L     7.6 ± 1.3     3.8 ± 0.7
> 15–20 mIU/L    6.1 ± 0.6     3.5 ± 0.5
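The two quantitative statements in this passage can be checked with a few lines of arithmetic. The sketch below averages the three before and three after CV values of Table 2 and evaluates the regression equation quoted in the text. The base of the logarithm is not stated in the text; base 10 is assumed here, and the function name `cv_from_tsh` is hypothetical.

```python
import math

# Between-method CV (%) for the three TSH ranges, transcribed from Table 2.
cv_before = [18.4, 7.6, 6.1]
cv_after = [6.5, 3.8, 3.5]

mean_before = sum(cv_before) / len(cv_before)
mean_after = sum(cv_after) / len(cv_after)
reduction_pct = 100.0 * (mean_before - mean_after) / mean_before

print(f"mean CV before: {mean_before:.1f}%")
print(f"mean CV after:  {mean_after:.1f}%")
print(f"relative reduction: {reduction_pct:.0f}%")


def cv_from_tsh(tsh_miu_l):
    """Evaluate CV = 12.172 - 4.316 * log(TSH).

    Assumption: 'log' is taken as the base-10 logarithm, which the text
    does not state explicitly.
    """
    return 12.172 - 4.316 * math.log10(tsh_miu_l)


for tsh in (0.1, 1.0, 10.0):
    print(f"TSH = {tsh:5.1f} mIU/L -> predicted CV = {cv_from_tsh(tsh):.1f}%")
```

The averages reproduce the 10.7% and 4.6% quoted in the text, and their ratio gives the reduction of about 57%.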

Figure 3 Regression between the consensus mean values. The figure reports the linear regression between the TSH concentration values of the consensus mean observed in the EQA study (Table 1; x-axis: TSH consensus mean evaluated in the EQA study, mIU/L) and those recalculated (y-axis) using the same FA approach previously described by Stockl et al. [4]. A very close relationship was found between these TSH concentration values.
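The general idea of harmonizing methods against a consensus mean can be illustrated with a deliberately simple sketch. It is not the FA model of Stockl et al. [4], and it is not necessarily the procedure used for Table 2: it assumes a purely proportional recalibration, in which each method is rescaled by the least-squares slope through the origin against the unweighted mean of the methods. The data are a small subset of Table 1 (samples 8, 10, 11, 15 and 16; four methods), chosen only to keep the example short, and all function names are hypothetical.

```python
from statistics import mean, stdev

# Subset of Table 1 (mIU/L): samples 8, 10, 11, 15 and 16 for four methods.
DATA = {
    "ROCX": [1.870, 2.759, 3.497, 10.919, 15.504],
    "ARC":  [1.593, 2.421, 2.956, 9.940, 14.563],
    "AIA":  [1.995, 3.016, 3.836, 13.190, 18.444],
    "VIT":  [2.328, 3.520, 4.000, 15.875, 23.936],
}


def consensus(data):
    """Unweighted consensus mean across methods, one value per sample."""
    return [mean(sample_values) for sample_values in zip(*data.values())]


def slope_through_origin(x, target):
    """Least-squares slope b minimising sum((b * x - target) ** 2)."""
    return sum(xi * ti for xi, ti in zip(x, target)) / sum(xi * xi for xi in x)


def recalibrate(data):
    """Rescale every method by its slope against the consensus mean."""
    target = consensus(data)
    return {
        method: [slope_through_origin(x, target) * xi for xi in x]
        for method, x in data.items()
    }


def cv_percent(data):
    """Between-method CV (%) for each sample."""
    return [100.0 * stdev(v) / mean(v) for v in zip(*data.values())]


cv_before = cv_percent(DATA)
cv_after = cv_percent(recalibrate(DATA))
for before, after in zip(cv_before, cv_after):
    print(f"between-method CV: {before:5.1f}% -> {after:5.1f}%")
print(f"mean CV: {mean(cv_before):.1f}% -> {mean(cv_after):.1f}%")
```

A proportional correction removes only the multiplicative part of each method's bias, so some residual dispersion is expected to remain, particularly where a method's bias changes with concentration.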

Conclusions

Generally speaking, the most important limitation of some EQA schemes is that some of the samples distributed to the participant laboratories may be ‘artificial’ samples instead of true human serum or plasma samples, and so these samples may not be commutable. Other EQA schemes used only plasma samples collected from healthy subjects or patients for comparative studies on immunoassays for thyroid hormones [12] and natriuretic peptides [14, 19, 20]. It is important to stress that the commutability of a sample (whether ‘natural’ or ‘artificial’) cannot be foreseen a priori, but must always be tested in the laboratory. It is also important to remember that current TSH methods usually claim to be calibrated against the World Health Organization (WHO) TSH International Reference Preparation 80/558, yet they do not all produce comparable results, probably because this WHO standard is not commutable [20]. In this regard, the most important rule for (and duty of) the reference laboratory that organizes the EQA scheme is to test the commutability of all samples distributed in the EQA cycles, at least for the more common methods used by the participant laboratories. Moreover, the laboratories participating in an EQA scheme should be informed by the EQA organizers of the source (matrix) and commutability of the samples distributed. Of course, only the results of well-designed EQA schemes, which use only samples with tested commutability (as in the case of the present study), should be considered to improve harmonization. In conclusion, the take-home message is that sample commutability is an essential prerequisite of any EQA scheme, and only EQA schemes that comply with this requirement allow clinical laboratories not only to use the data as a benchmark but, even more importantly, to use them for harmonization projects. This, in turn, may improve the patient-centered focus of laboratory services on quality and patient safety.

Author contributions: All the authors have accepted responsibility for the entire content of this submitted manuscript and approved submission.
Financial support: None declared.
Employment or leadership: None declared.
Honorarium: None declared.
Competing interests: The funding organization(s) played no role in the study design; in the collection, analysis, and interpretation of data; in the writing of the report; or in the decision to submit the report for publication.

References

1. Thienpont LM, Van Uytfanghe K, Beastall G, Faix JD, Ieiri T, Miller WG, et al. Report of the IFCC Working Group for Standardization of Thyroid Function Tests (WG-STFT) – part 1: thyroid stimulating hormone. Clin Chem 2010;56:902–11.
2. Thienpont LM, Van Uytfanghe K, Beastall G, Faix JD, Ieiri T, Miller WG, et al. Report of the IFCC Working Group for Standardization of Thyroid Function Tests (WG-STFT) – part 2: free thyroxine and free triiodothyronine. Clin Chem 2010;56:912–20.
3. Thienpont LM, Van Uytfanghe K, Beastall G, Faix JD, Ieiri T, Miller WG, et al. Report of the IFCC Working Group for Standardization of Thyroid Function Tests (WG-STFT) – part 3: total thyroxine and total triiodothyronine. Clin Chem 2010;56:921–9.
4. Stockl D, Van Uytfanghe K, Van Aelst S, Thienpont L. A statistical basis for harmonization of thyroid stimulating hormone immunoassays using a robust factor analysis model. Clin Chem Lab Med 2014;52:956–72.
5. Gantzer ML, Miller WG. Harmonization of measurement procedures: how do we get it done? Clin Biochem Rev 2012;33:95–100.
6. Van Houcke SK, Thienpont LM. “Good samples make good assays” – the problem of sourcing clinical samples for a standardization project. Clin Chem Lab Med 2013;51:967–72.
7. Van Houcke SK, Van Aelst S, Van Uytfanghe K, Thienpont LM. Harmonization of immunoassays to the all-procedure trimmed mean – proof of concept by use of data from the insulin standardization project. Clin Chem Lab Med 2013;51:e103–5.
8. Van Uytfanghe K, De Grande LA, Thienpont LM. A “Step-Up” approach for harmonization. Clin Chim Acta 2014;432:62–7.
9. Plebani M. Harmonization in laboratory medicine: the complete picture. Clin Chem Lab Med 2013;51:741–51.
10. Plebani M, Panteghini M. Promoting clinical and laboratory interaction by harmonization. Clin Chim Acta 2014;432:15–21.
11. Plebani M, Astion ML, Barth JH, Chen W, de Oliveira Galoro CA, Escuer MI, et al. Harmonization of quality indicators in laboratory medicine. A preliminary consensus. Clin Chem Lab Med 2014;52:951–8.
12. Giovannini S, Zucchelli GC, Iervasi G, Iervasi A, Chiesa MR, Mercuri A, et al. Multicentre comparison of free thyroid hormones immunoassays: the Immunocheck study. Clin Chem Lab Med 2011;49:1669–76.
13. Iervasi G, Clerico A. Harmonization of free thyroid hormone test: a mission impossible? Clin Chem Lab Med 2011;49:43–8.
14. Prontera C, Zaninotto M, Giovannini S, Zucchelli GC, Pilo A, Sciacovelli L, et al. Proficiency testing project for brain natriuretic peptide (BNP) and the N-terminal part of the propeptide of BNP (NT-proBNP) immunoassays: the CardioOrmocheck study. Clin Chem Lab Med 2009;47:762–8.
15. Clerico A, Zaninotto M, Prontera C, Giovannini S, Ndreu R, Franzini M, et al. State of the art of BNP and NT-proBNP immunoassays: the CardioOrmoCheck study. Clin Chim Acta 2012;414:112–9.
16. Hill P, Uldall A, Wilding P. Fundamentals for external quality assessment (EQA): guidelines for improving analytical quality by establishing and managing EQA schemes; examples from basic clinical chemistry using limited resources. Milano: IFCC, 1996.
17. Fabrigar LR, Wegener DT. Exploratory factor analysis. New York: Oxford University Press, Inc., 2001:1–176.
18. Jolliffe IT. Principal component analysis. New York: Springer, 2013:1–520.
19. Franzini M, Masotti S, Prontera C, Ripoli A, Passino C, Giovannini S, et al. Systematic differences between BNP immunoassays: comparison of methods using standard protocols and quality control materials. Clin Chim Acta 2013;424:287–91.
20. Faix JD, Thienpont LM. Thyroid-stimulating hormone. Why efforts to harmonize testing are critical to patient care. Clin Lab News 2013;39:1–7.
