Inter-Rater Agreement of Sputum Cytology for Lung Cancer Screening in Japan

Chiaki Endo, M.D.,1* Ryutaro Nakashima, C.T.,2 Akemi Taguchi, C.T.,3 Kazunobu Yahata, C.T.,4 Ei Kawahara, M.D.,5 Nikako Shimagaki, C.T.,6 Junko Kamio, C.T.,7 Yasuki Saito, M.D.,8 Norihiko Ikeda, M.D.,9 and Masami Sato, M.D.10

Background: To compare lung cancer detection rates by sputum cytology, we need some assurance that the estimates do not vary widely if different observers evaluate the same specimens. The aim of this study was to determine inter-rater agreement of sputum cytology diagnoses.

Methods: Slides of sputum cytology from 150 subjects were selected from a pool of slides held by six of the laboratories that had participated in a population-based lung cancer screening program over the last ten years in Japan. The cytotechnologists in these laboratories had considerable experience with sputum cytology. Each case was re-evaluated six times. Cases that were diagnosed as the same category by all six laboratories were selected as consensus cases to serve as standardized sputum cytology cases. Thirty-seven cytotechnologists with various levels of experience in sputum cytology then re-evaluated these consensus cases. Inter-rater agreement was calculated by kappa statistics including Fleiss’ kappa.

Results: All pairs of interlaboratory agreement for the 150 cases showed statistically significant kappa values, most pairs showing substantial agreement. Fleiss’ kappa value across the six laboratories was 0.5. Fourteen cases were identified as the consensus cases, and the agreement among observers with less experience of sputum cytology was significantly lower than the agreement among those with considerable experience (Fleiss’ kappa value 0.27 vs. 0.45, P < 0.05). Moreover, cytotechnologists with less experience under-diagnosed the slides significantly more often than those with considerable experience.

Conclusion: When the observers have considerable experience with sputum cytology, inter-observer agreement is good. Diagn. Cytopathol. 2015;43:545–550. © 2015 Wiley Periodicals, Inc.

Key Words: sputum cytology; lung cancer screening; quality assurance; inter-rater agreement; kappa statistics; Fleiss’ kappa

1 Department of Thoracic Surgery, Tohoku University Hospital, Sendai, Japan
2 Department of Cytology, Miyagi Cancer Society, Sendai, Japan
3 Department of Pathology and Cytology, Chiba Foundation for Health Promotion and Disease Prevention, Chiba, Japan
4 Department of Cytology, Osaka Medical Association, Osaka, Japan
5 Department of Clinical Laboratory Science, Kanazawa University, Kanazawa, Japan
6 Department of Cytology, Niigata Health Service Center, Niigata, Japan
7 Department of Cytology, Fukushima Preservative Service Association of Health, Fukushima, Japan
8 Department of Thoracic Surgery, Sendai Medical Center, Sendai, Japan
9 Department of Surgery, Tokyo Medical University, Tokyo, Japan
10 Department of General Thoracic Surgery, Graduate School of Medical and Dental Sciences, Kagoshima University, Kagoshima, Japan

*Correspondence to: Chiaki Endo, MD, Department of Thoracic Surgery, Tohoku University Hospital, 4-1 Seiryo-machi, Aoba-ku, Sendai 9808575, Japan. E-mail: [email protected]

Received 16 October 2014; Revised 25 November 2014; Accepted 17 December 2014
DOI: 10.1002/dc.23253
Published online 30 January 2015 in Wiley Online Library (wileyonlinelibrary.com).

A joint committee of the Japan Lung Cancer Society and the Japanese Society of Clinical Cytology has discussed quality assurance for sputum cytology in a population-based lung cancer screening program that has been running for more than 30 years in Japan.1–3 Screening is offered to residents aged more than 39 years and comprises annual chest X-rays for all screenees plus sputum cytology for screenees with a more than 30 pack-year smoking history. Each prefecture in Japan manages this program for its residents, with designated cytology laboratories in each prefecture assessing the sputum cytology of screenees. Recently, it has been debated whether all such laboratories have similar capabilities in sputum cytology diagnosis, because there is considerable inter-prefectural variability in the lung cancer detection rate by sputum cytology in this screening program. However, to date, inter-rater agreement of sputum cytology has not been studied in Japan, except for one small study4 in which five laboratories evaluated sputum cytology slides of only eleven cases without statistical analysis. Thus, we designed this study, in which standardized sputum cytology cases were selected and the agreement of cytotechnologists with less experience was compared with that of those with considerable experience.

Diagnostic Cytopathology, Vol. 43, No 7

Materials and Methods

Materials

Light microscopy sputum cytology slides, stained by the Papanicolaou technique,5 were selected from a pool of slides held by six of the laboratories that had participated in a population-based lung cancer screening program over the last 10 years in Japan. These six laboratories were selected simply because the committee members worked in them. All cytotechnologists in these six laboratories had been assessing sputum cytology slides from the lung cancer screening program for more than ten years, each laboratory evaluating sputum cytology slides of about 6,000 cases per year. The six laboratories were located in six prefectures in Japan, namely Chiba, Fukushima, Ishikawa, Miyagi, Niigata, and Osaka.

In the Japanese lung cancer screening program, sputum cytology slides are classified into five categories: unsatisfactory specimen (A), normal or mild atypical cells (B), moderate atypical cells (C), severe atypical cells (D), and malignant cells (E). Test-positivity is defined as category D or E. The specimens used in the current study had all originally been evaluated as category C, D, or E.

This study was reviewed and approved by the institutional ethics committees of all the participating institutes. Because this study utilized a pool of sputum cytology slides from a previously conducted lung cancer screening program, the screenees whose sputum cytology slides were used in this study could not give informed consent. A notice about the content of this study was posted at all participating institutes before commencement of this study. All personal information about the providers of the sputum specimens used in this study was concealed, in accordance with Japanese government guidelines and with the approval of the institutional ethics committees of all the participating institutes.

Methods

by each laboratory. Finally, sputum cytology cases for which all six laboratories had made the same diagnoses were selected as consensus cases to serve as standardized sputum cytology cases.

The second part of the study. Thirty-seven cytotechnologists who attended a cytology meeting held in Miyagi prefecture on March 15, 2014 participated in this study as volunteers. Ten of the 37 had less than 10 years of experience of cytology and were not proficient in sputum cytology. Two of the 37 had less than 10 years of experience of cytology but their specialty was respiratory cytology. Twelve of the 37 had more than 10 years of experience of cytology but were not proficient in sputum cytology. The remaining 13 had more than 10 years of experience of cytology and their specialty was respiratory cytology. These 37 cytotechnologists evaluated the consensus cases selected in the first part of this study.
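The consensus-case selection described for the first part of the study can be illustrated with a minimal sketch. The function name, the case identifiers, and the ratings below are hypothetical and are not taken from the study; the sketch only assumes that each case carries one category label per laboratory.

```python
from typing import Dict, List


def select_consensus_cases(ratings: Dict[str, List[str]]) -> List[str]:
    """Return the IDs of cases that every laboratory placed in the same category."""
    # A case is a consensus case when its ratings collapse to a single distinct label.
    return [case_id for case_id, cats in ratings.items() if len(set(cats)) == 1]


# Hypothetical ratings: one category per laboratory, six laboratories per case.
example_ratings = {
    "case-001": ["D", "D", "D", "D", "D", "D"],
    "case-002": ["C", "D", "C", "C", "D", "C"],
    "case-003": ["E", "E", "E", "E", "E", "E"],
}

consensus_ids = select_consensus_cases(example_ratings)
print(consensus_ids)
```

Under this rule a single dissenting laboratory is enough to exclude a case, which is consistent with only 14 of the 150 cases being retained as consensus cases.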

Statistical Analysis

The chi-square test was used to investigate distributions of categorical variables, P values of < 0.05 being considered statistically significant. Kappa statistics were used to assess agreement between two raters.6 Linear weighted kappa6 was calculated for the ordered cytology categories. Thus, disagreement between adjacent categories resulted in a smaller reduction in the kappa value than disagreement between non-adjacent categories. Non-weighted kappa was used to assess agreement for dichotomous categories. A kappa value was considered statistically significant if the P value was less than 0.05. Fleiss’ kappa7,8 was also calculated to assess the overall agreement among more than two raters. Using Fleiss’ kappa allows statistical comparison of overall agreement of one group with another group. Kappa values of
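As an illustration of the two agreement measures named in this section, the following is a minimal sketch using the standard textbook definitions of linear weighted Cohen's kappa and Fleiss' kappa. It is not the authors' analysis code: the function names and the toy ratings are assumptions, and the significance tests and between-group comparisons mentioned in the text are not shown.

```python
from collections import Counter
from typing import Sequence

# Ordered categories of the screening program (A = unsatisfactory ... E = malignant).
CATEGORIES = ["A", "B", "C", "D", "E"]


def linear_weighted_kappa(r1: Sequence[str], r2: Sequence[str],
                          cats: Sequence[str] = CATEGORIES) -> float:
    """Cohen's kappa for two raters with linear weights w_ij = 1 - |i - j| / (k - 1).

    With two categories the weights reduce to 1 on the diagonal and 0 elsewhere,
    so the result equals the non-weighted kappa. Undefined (division by zero)
    when the expected agreement is exactly 1.
    """
    k = len(cats)
    idx = {c: i for i, c in enumerate(cats)}
    n = len(r1)
    # Joint proportion table: obs[i][j] = share of cases rated i by rater 1 and j by rater 2.
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(r1, r2):
        obs[idx[a]][idx[b]] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    p_o = p_e = 0.0
    for i in range(k):
        for j in range(k):
            w = 1.0 - abs(i - j) / (k - 1)
            p_o += w * obs[i][j]          # weighted observed agreement
            p_e += w * row[i] * col[j]    # weighted agreement expected by chance
    return (p_o - p_e) / (1.0 - p_e)


def fleiss_kappa(ratings: Sequence[Sequence[str]]) -> float:
    """Fleiss' kappa; ratings[i] holds the labels all raters gave to case i.

    Assumes every case was rated by the same number of raters (at least two).
    """
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    totals: Counter = Counter()
    p_bar = 0.0
    for case in ratings:
        counts = Counter(case)
        totals.update(counts)
        # Share of agreeing rater pairs for this case.
        p_bar += (sum(c * c for c in counts.values()) - n_raters) / (n_raters * (n_raters - 1))
    p_bar /= n_cases
    # Chance agreement from the overall category proportions.
    p_e = sum((t / (n_cases * n_raters)) ** 2 for t in totals.values())
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (hypothetical): a disagreement by one step vs. by two steps.
rater_1 = ["C", "D", "E", "C"]
adjacent = ["C", "D", "E", "D"]
non_adjacent = ["C", "D", "E", "E"]
ordered = ["C", "D", "E"]
print(linear_weighted_kappa(rater_1, adjacent, ordered))
print(linear_weighted_kappa(rater_1, non_adjacent, ordered))
print(fleiss_kappa([["C", "C", "D"], ["D", "D", "D"], ["E", "D", "E"]]))
```

In the toy data the one-step disagreement yields a higher weighted kappa than the two-step disagreement, which is the behaviour the text describes for adjacent versus non-adjacent categories.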
