Original Article

Digital Gene Expression Profiling of a Series of Cytologically Indeterminate Thyroid Nodules

Riccardo Giannini, PhD1; Liborio Torregrossa, MD2; Stefano Gottardi, PhD3; Lorenzo Fregoli, MD1; Nicla Borrelli, PhD1; Mauro Savino, PhD3; Elisabetta Macerola, PhD1; Paolo Vitti, MD4; Paolo Miccoli, MD1; and Fulvio Basolo, MD1

BACKGROUND: Fine-needle aspiration cytology (FNAC) has been widely accepted as the most crucial step in the preoperative assessment of thyroid nodules. Testing for the expression of specific genes should improve the accuracy of FNAC diagnosis, especially when it is performed in samples with indeterminate cytology. METHODS: In total, 69 consecutive FNACs that had both cytologic and histologic diagnoses were collected, and expression levels of 34 genes were determined in RNA extracted from FNAC cells by using a custom digital mRNA counting assay. A supervised k-nearest neighbor (K-nn) learning approach was used to build a 2-class prediction model based on a subset of 27 benign and 26 malignant FNAC samples. Then, the K-nn models were used to classify the 16 indeterminate FNAC samples. RESULTS: Malignant and benign thyroid nodules had different gene expression profiles. The K-nn approach was able to correctly classify 10 FNAC samples as benign, whereas only 1 sample was grouped in the malignant class. Two malignant FNAC samples were incorrectly classified as benign, and 3 of 16 samples were unclassified. CONCLUSIONS: Although the current data will require further confirmation in a larger number of cases, the preliminary results indicate that testing for specific gene expression appears to be useful for distinguishing between benign and malignant lesions. The results from this study indicate that, in indeterminate FNAC samples, testing for cancer-specific gene expression signatures, together with mutational analyses, could improve diagnostic accuracy for patients with thyroid nodules. Cancer (Cancer Cytopathol) 2015;123:461-70. © 2015 American Cancer Society.

KEY WORDS: fine-needle aspiration cytology; gene expression profiling; messenger RNA signature; thyroid cancer; thyroid nodule.

Corresponding author: Fulvio Basolo, MD, Department of Surgical, Medical, Molecular Pathology and Critical Area, University of Pisa, via Roma 57, 56126 Pisa, Italy; Fax: (011) 39 050 992481; [email protected]

1 Department of Surgical, Medical, and Molecular Pathology and Critical Care, University of Pisa, Pisa, Italy; 2 Department of Anatomic Pathology, Hospital of Pisa, Pisa, Italy; 3 Diatech Pharmacogenetics srl, Jesi, Italy; 4 Department of Experimental and Clinical Medicine, University of Pisa, Pisa, Italy

Received: March 3, 2015; Revised: April 24, 2015; Accepted: April 30, 2015 Published online May 29, 2015 in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/cncy.21564, wileyonlinelibrary.com

Cancer Cytopathology, August 2015

INTRODUCTION

Thyroid cancer is the most common endocrine malignancy in the United States, accounting for approximately 6% of cancers in women and 2% to 3% of cancers in men. Relative to other cancers, patients who have thyroid cancers generally have a good prognosis: approximately 10% to 15% of patients who develop recurrent disease and approximately 5% of those who develop metastatic disease that does not respond to radioactive iodine ultimately die. However, thyroid cancer diagnoses worldwide have been rapidly increasing over the past few years,1 creating a need for the early diagnosis of thyroid cancer in patients who present with thyroid nodules. Ultrasonography and fine-needle aspiration (FNA) cytology (FNAC) are the primary methods for discriminating between benign and malignant thyroid nodules.2 FNAC is routinely used in the preoperative evaluation of thyroid nodules, but patient management is often complicated by the inability to determine malignancy based on cytology alone. The demonstration of malignancy by FNAC is uncommon; only 3% to 8% of samples are decisively malignant, 70% of results demonstrate benign cytologic features, and as many as 30% of FNAC sample results are indeterminate because of the inability to distinguish between malignant and benign cytologic features. In these patients, surgery is usually necessary to obtain a definitive histologic diagnosis. To avoid invasive, cumbersome, and expensive surgery, additional methods are required to improve the sensitivity and specificity of FNAC diagnosis, and such methods would have a significant impact on clinical care.3–5

Recent advances in the molecular genetics of thyroid cancer can be applied to the development of new diagnostic markers for FNAC samples. Papillary thyroid carcinoma (PTC), the most common thyroid malignancy, frequently harbors v-raf murine sarcoma viral oncogene homolog B (BRAF), rearranged in transformation/PTC1 (RET/PTC), or rat sarcoma (RAS) mutations. These mutually exclusive somatic mutations are observed in >70% of papillary carcinomas, some of which are associated with more aggressive tumor behavior. Several studies have detected BRAF, RET/PTC, or RAS mutations in thyroid FNAC samples and have suggested that the detection of these genes may improve the conclusiveness of FNAC diagnoses.6–9 In 2012, Alexander et al10 demonstrated that, in combination with FNAC, a gene expression classifier (GEC) was useful for identifying patients who had a low risk of cancer among those for whom diagnostic surgery was otherwise recommended.

In the current study, we performed gene expression profiling in a series of benign and malignant thyroid FNAC samples with NanoString technology using a customized messenger RNA (mRNA) panel of 34 genes that are differentially expressed in malignant and benign thyroid nodules.11–23 The objective of this study was to identify a molecular signature capable of classifying thyroid tumors as benign or malignant in cytologically indeterminate samples using a k-nearest neighbor (K-nn) machine-learning algorithm.24

MATERIALS AND METHODS

Samples

Seventy patients (54 women and 16 men) who presented with either a solitary nodule or a dominant thyroid nodule and were undergoing thyroid surgery were selected from our institution. The nodules ranged in size from 6 to 47 mm (mean ± standard deviation nodule size, 20.2 ± 10.9 mm). The mean ± standard deviation age of the patients was 43.6 ± 13.3 years (range, 18-68 years). Preoperatively, all nodules had undergone at least 1 FNAC and had been grouped into cytologic classifications according to the Italian Society of Anatomic Pathology and Diagnostic Cytology-Italian Division of the International Academy of Pathology, Italian Consensus Working Group.25 According to this classification, the cytologic diagnosis was benign nodule (TIR2) in 26 patients, indeterminate lesion (TIR3) in 17 patients (12 TIR3A and 5 TIR3B), and suspicion of malignancy (TIR4) or malignant (TIR5) in 27 patients (Table 1). In the first (TIR2) group, the motivation for surgery was the presence of compressive symptoms caused by the large size of the nodule. All patients underwent a total thyroidectomy. Immediately after surgery, each surgical specimen was macroscopically examined, and an ex vivo FNAC was performed on all nodules. Cells obtained from ex vivo FNAC were placed in 1.5-mL tubes marked with an identification number, immediately frozen in liquid nitrogen, and stored at −80°C. The surgical specimens were then formalin-fixed and paraffin-embedded for histologic examination according to the current World Health Organization thyroid tumor classification system. This was an institutional review board-approved study, and informed consent was obtained from all patients.

TABLE 1. Demographics of the Patients and Their Cytohistologic Diagnoses

                           Diagnosis
Variable                   Benign          Malignant
Sex, no. of patients
  Men                      9               7
  Women                    29              25
Age: Mean ± SD, y          49.9 ± 11.7     38.5 ± 13.7
Size: Mean ± SD, mm        23.6 ± 10.0     21.1 ± 12.1
Histology
  CVPTC                                    27
  FVPTC                                    10
  Adenoma                  34
  Nodular Goiter           5
Cytology
  TIR2                     26              0
  TIR3                     12              5
  TIR4/5                   0               27

Abbreviations: CVPTC, classical variant of papillary thyroid carcinoma; FVPTC, follicular variant of papillary thyroid carcinoma; SD, standard deviation; TIR2, benign nodule; TIR3, indeterminate lesion; TIR4/5, suspicion of malignancy/malignant.

NanoString nCounter Assay

The nCounter custom code set used in this study was designed and synthesized by NanoString Technologies (Seattle, Wash). This code set consisted of 82 reporter and capture probe pairs directed against 39 genes (2 pairs for each target) and included 34 genes that are differentially expressed in malignant and benign thyroid nodules, thyroglobulin (TG) to confirm the follicular origin of the cells obtained from FNAC, and 4 housekeeping genes for reference (β-actin [ACTB], β2-microglobulin [B2M], hypoxanthine phosphoribosyltransferase [HPRT], and glyceraldehyde-3-phosphate dehydrogenase [GAPDH]) (Table 2). Total RNA was isolated from the ex vivo FNAC cells using the RNeasy Mini Kit (Qiagen Inc, Valencia, Calif). The concentration of total RNA was assessed using a NanoDrop spectrophotometer (ThermoScientific, Wilmington, Del). Each hybridization reaction contained 100 ng of total RNA together with the capture and reporter probes. Hybridization was performed for 16 hours at 65°C in a SensoQuest thermal cycler (SensoQuest, Göttingen, Germany). Sample cleanup and counts of digital reports were performed according to the manufacturer’s protocol.

Data Normalization and TG Cutoff Calculation

Raw data were collected and exported into an Excel worksheet using NanoString nSolver software (version 1.1; NanoString Technologies). Data were normalized according to NanoString Technologies’ guidelines in 2 steps. First, 6 spike-in positive (POS) controls were used to normalize samples against any systematic differences in sample preparation or hybridization efficiency between individual hybridization experiments. Specifically, the geometric mean of the intensity (POSi) of the 6 positive control probes was calculated for the sample from patient i, and the individual probe intensity for sample i was then scaled by the normalization factor POS/POSi, where POS is the geometric mean of all POSi. Second, the internal negative controls (the housekeeping [HK] genes ACTB, B2M, and GAPDH) were used to further normalize the scaled intensities against any effect of differences in the amount of input RNA. This normalization was performed by determining a background threshold value and subtracting it from the raw count values for each probe; counts at or below the background threshold were set to 1. If HKi is the geometric mean of the intensity of the 3 housekeeping genes for sample i, then the normalization factor was defined as HK/HKi, where HK is the geometric mean of all HKi. All samples with a scaling factor outside a specific range of values (0.3-3.0 and 0.1-10.0 for positive controls and housekeeping genes, respectively) were excluded from further analysis. Indeterminate FNAC samples were further filtered on the basis of a TG cutoff value (1688 ± 750 normalized counts), defined as the mean of the 5 lowest TG expression values observed in the benign and malignant postsurgically confirmed populations.

Statistical Analysis

Once the RNA hybridization data had been preprocessed, the data were subjected to 2-way hierarchical clustering analysis (HCA) using MultiExperiment Viewer 4.9 (The Institute for Genomic Research, Rockville, Md; available at: http://www.tm4.org). HCA was applied independently to the samples (represented in columns) and to the genes (represented in rows). The distance measurement selected for this evaluation was based either on the Pearson correlation (PE) coefficient (1-r) or on the Euclidean metric. Differential expression was determined by applying the t test using STATISTICA software (StatSoft Inc, Tulsa, Okla). The K-nn machine-learning algorithm was applied using MultiExperiment Viewer software to classify the samples. With this method, a sample is classified by a majority vote of its neighbors: it is assigned to the most common class among its k nearest neighbors. Here, k is a user-defined positive integer, which was set to 3 in our study.
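The majority-vote rule described above can be illustrated with a minimal sketch. It is not the MultiExperiment Viewer implementation: the function name `knn_classify`, the choice of Euclidean distance for the neighbor search, and the two-gene toy profiles are all assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_classify(train_profiles, train_labels, sample, k=3):
    """Return the majority label among the k training profiles closest to `sample`."""
    if k < 1 or k > len(train_profiles):
        raise ValueError("k must be between 1 and the number of training samples")
    # Pair each training profile's Euclidean distance to the sample with its label
    # (Euclidean distance is an assumption; the text does not state the K-nn metric).
    ranked = sorted(
        (math.dist(profile, sample), label)
        for profile, label in zip(train_profiles, train_labels)
    )
    # Majority vote over the k nearest neighbors.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized counts for two genes (invented values, not study data).
train_profiles = [
    (120.0, 900.0), (150.0, 850.0), (100.0, 950.0),  # benign training samples
    (800.0, 200.0), (760.0, 260.0), (900.0, 150.0),  # malignant training samples
]
train_labels = ["benign"] * 3 + ["malignant"] * 3

print(knn_classify(train_profiles, train_labels, (130.0, 880.0)))
print(knn_classify(train_profiles, train_labels, (820.0, 210.0)))
```

With 2 classes and k = 3, the vote cannot tie, so this sketch always returns a label. It therefore does not model the "unclassified" outcome reported for some samples, whose decision rule is not described in this section.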

RESULTS

Cytologic Features of the Sample Nodules

The FNAC result was matched to the thyroidectomy histologic report, which is considered the “gold standard” for determining a thyroid cancer diagnosis. After histologic examination, all samples with a suspicious or malignant FNAC were diagnosed as PTC; among the indeterminate FNAC lesions, 12 of 17 (70.6%) were diagnosed as follicular adenomas, and 5 (29.4%) were diagnosed as PTC (1 of which was a classic variant [CVPTC] and 4 of which were follicular variants [FVPTCs]). All non-neoplastic FNACs were confirmed as goiter or adenomatous nodules.

Gene Expression Profiling of Benign and Malignant Thyroid Carcinoma FNAC Samples: The Training Model

Gene expression profiling using NanoString technology was performed on RNA from 70 thyroid FNAC samples

TABLE 2. Sequence-Specific Probes Constructed for the Analysis of 34 Differentially Expressed Genes in Thyroid Tumors Using the nCounter System, Thyroglobulin, and 4 Housekeeping Genes

Gene Symbol

mRNA/Var1

BRAF CDH3 CHI3L1

NM_004333 NM_001793 NM_001276

Up-regulated Up-regulated

CITED1

NM_004143

Up-regulated

CKBB COL9A3 CXCR4 DPP4

NM_001823.4 NM_001853.3 NM_001008540.1 NM_001935

Down-regulated Up-regulated Up-regulated

NM_000043.4

Up-regulated

Griffith 200617 Castellone 2004,12 Torregrossa 201222 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Basolo 200011

NM_000639.1

Up-regulated

Basolo 200011

NM_212482 NM_024051 NM_003483.4

Up-regulated Up-regulated Up-regulated Up-regulated Down-regulated Up-regulated

FAS (CD95) FASL (CD95L) FN1 GGCT HMGA2 hTERT IPCEF1 KRT19

mRNA/Var2

mRNA/Var3

NM_001193376 NM_015553 NM_002276

LAMB3 LGALS3 LRP4 MET

NM_000228

MPPED2 NPC2 PDLIM4 PROS1

NM_001584 NM_006432 NM_003687 NM_000313 2.12

Down-regulated Up-regulated Up-regulated Up-regulated

NM_007173 NM_012413 NM_002999 NM_000450.2 NM_000655.4 NM_003005 NM_000295

Up-regulated Up-regulated Up-regulated Up-regulated Up-regulated Up-regulated Up-regulated

PRSS23 QPCT SDC4 SEL-E SEL-L SEL-P SERPINA1

NM_001017402 2.46

Expression in Malignancy

NM_001177388.1 NM_002334 NM_000245

NM_001002236

NM_001002235

Up-regulated Up-regulated Up-regulated Up-regulated

TFF3

NM_003226

Down-regulated

TIMP1

NM_003254

Up-regulated

TPO NM_000547.5 TG NM_003235.4 Housekeeping genes GAPDH HPRT NM_000194 ACTB NM_001101 B2M NM_004048

Down-regulated

Reference(s)

Chung & Kim 2012,15 Vierlinger 201123 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Griffith 2006,17 Huang 2001,18 Vierlinger 201123

Griffith 2006,17 Huang 200118 Vierlinger 201123 Chiappetta 200813 Chou 200114 Vierlinger 201123 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Vierlinger 201123 Griffith 2006,17 Huang 200118 Griffith 2006,17 Huang 200118 Chung & Kim 2012,15 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Vierlinger 201123 Vierlinger 201123 Vierlinger 201123 Chung & Kim 2012,15 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Vierlinger 201123 Griffith 2006,17 Vierlinger 201123 Huang 2001,18 Vierlinger 201123 Bal 200826 Muzza 201027 Bal 200826 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Griffith 2006,17 Huang 2001,18 Vierlinger 201123 Griffith 2006,17 Huang 200118

NM_001256799

Abbreviations: ACTB, actin, β; B2M, β-2-microglobulin; BRAF, v-raf murine sarcoma viral oncogene homolog B; CDH3, cadherin 3, type 1, P-cadherin (placental); CHI3L1, chitinase 3-like 1 (cartilage glycoprotein-39); CITED1, cbp/p300-interacting transactivator, with Glu/Asp-rich carboxy-terminal domain, 1; CKBB, brain-type creatine kinase; COL9A3, collagen type IX, α3; CXCR4, chemokine (C-X-C motif) receptor 4; DPP4, dipeptidyl-peptidase 4; FAS (CD95), fas cell surface death receptor; FASL (CD95L), fas ligand (tumor necrosis factor superfamily, member 6); FN1, fibronectin 1; GAPDH, glyceraldehyde-3-phosphate dehydrogenase; GGCT, γ-glutamylcyclotransferase; HMGA2, high-mobility group AT hook 2; HPRT, hypoxanthine phosphoribosyltransferase; hTERT, human telomerase reverse transcriptase; IPCEF1, interaction protein for cytohesion exchange factors 1; KRT19, keratin 19; LAMB3, laminin β3; LGALS3, lectin, galactoside-binding, soluble 3; LRP4, low-density lipoprotein receptor-related protein 4; MET, met proto-oncogene; MPPED2, metallophosphoesterase domain containing 2; NPC2, Niemann-Pick disease, type C2; mRNA, messenger RNA; NM, National Center for Biotechnology Information Reference Sequence database messenger RNA number; PDLIM4, PDZ and LIM domain 4; PROS1, protein S (α); PRSS23, protease serine 23; QPCT, glutaminyl-peptide cyclotransferase; SDC4, syndecan 4; SEL-E, selectin E; SEL-L, selectin L; SEL-P, selectin P (granule membrane protein 140 kDa, antigen CD62); SERPINA1, serpin peptidase inhibitor, clade A (α-1 antiproteinase, antitrypsin), member 1; TFF3, trefoil factor 3 (intestinal); TG, thyroglobulin; TIMP1, tissue inhibitor of metalloproteinase metallopeptidase inhibitor 1; TPO, thyroid peroxidase; Var, variant.


representing postsurgically confirmed cases, including 26 benign samples, 27 malignant samples, and 17 indeterminate samples. The raw data normalization was performed in 2 steps. The first normalization was based on the positive-spike controls, and the second was based on the housekeeping gene expression data. After data normalization, 3 samples (1 benign and 2 malignant) were discarded on the basis of NanoString POS and HK scaling ranges. Indeterminate FNAC samples were then filtered using a TG expression level cutoff (1688 ± 750 normalized counts), defined as the mean of the 5 lowest TG expression values observed in known benign or malignant samples. Consistently, nearly all of the indeterminate samples (16 of 17) expressed TG levels above the cutoff value, confirming the follicular origin of the cells obtained from FNAC. Consequently, the 1 sample with a TG expression level under the cutoff value was excluded. After data processing and filtering, 66 of our initial 70 thyroid FNAC samples (94.3%; corresponding to 25 benign samples, 25 malignant samples, and 16 indeterminate samples) were selected for investigation of the correlation between gene expression profiles and the true benign or malignant status of indeterminate FNAC samples. To model the gene expression profiles of benign and malignant FNAC samples, cluster analyses of the genes and of the samples were performed using the Euclidean distance between samples. Figure 1A summarizes the results obtained from the expression analysis of the 34-gene panel (Table 2) for the 25 benign lesions and 25 malignant lesions. Cluster analysis

Figure 1. Unsupervised hierarchical clustering of malignant (M) and benign (B) fine-needle aspiration cytology samples is illustrated according to gene expression levels as measured by the nCounter System (NanoString Technologies, Seattle, Wash). Each column represents a single sample, and each row represents a single gene. Red indicates a high level of expression relative to the mean expression, and green indicates a low level of expression relative to the mean expression (A) for each tested gene and (B) only for those genes that exhibited a statistically significant difference in expression (P
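The scaling-factor calculation, the acceptance ranges, and the TG cutoff summarized in this section can be sketched as follows. This is a simplified illustration under stated assumptions, not the nSolver implementation: the function names (`scaling_factors`, `keep_sample`, `tg_cutoff`) and all counts are hypothetical, and background subtraction is omitted.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def scaling_factors(control_counts):
    """Per-sample factor: geometric mean of all per-sample means / that sample's mean.

    `control_counts` holds one list of control-probe counts per sample
    (positive controls for the POS step, housekeeping genes for the HK step).
    """
    per_sample = [geometric_mean(counts) for counts in control_counts]
    overall = geometric_mean(per_sample)
    return [overall / g for g in per_sample]


def keep_sample(pos_factor, hk_factor):
    """Apply the acceptance ranges quoted in the text (0.3-3.0 and 0.1-10.0)."""
    return 0.3 <= pos_factor <= 3.0 and 0.1 <= hk_factor <= 10.0


def tg_cutoff(tg_values, n_lowest=5):
    """Mean of the n lowest TG values among histologically confirmed samples."""
    lowest = sorted(tg_values)[:n_lowest]
    return sum(lowest) / len(lowest)


# Invented positive-control counts for two samples (not study data).
pos_counts = [[100.0, 200.0, 400.0], [200.0, 400.0, 800.0]]
pos_factors = scaling_factors(pos_counts)

# Scale the (hypothetical) probe counts of the first sample by its POS factor.
scaled_sample_0 = [count * pos_factors[0] for count in [50.0, 1000.0]]
print(pos_factors, scaled_sample_0)
```

The same `scaling_factors` helper would be applied a second time to the housekeeping-gene counts of the POS-scaled data; a sample is retained only if both factors fall inside the stated ranges.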
