Journal of Pharmaceutical and Biomedical Analysis 100 (2014) 175–183



Liquid chromatography–mass spectrometry based serum peptidomic approach for renal clear cell carcinoma diagnosis

Zhenzhen Huang a,1, Shudi Zhang a,1, Wei Hang a,b,∗, Yuedong Chen c, Jiaxin Zheng c, Wei Li c, Jinchun Xing c, Jie Zhang d, Eryi Zhu a, Xiaomei Yan a

a Department of Chemistry, MOE Key Lab of Spectrochemical Analysis & Instrumentation, College of Chemistry and Chemical Engineering, Xiamen University, China
b State Key Laboratory of Marine Environmental Science, Xiamen University, China
c Department of Urology, Xiamen First Hospital, China
d Key Lab of Urban Environment and Health, Institute of Urban Environment, Chinese Academy of Sciences, China

Article info

Article history: Received 9 May 2014; received in revised form 30 June 2014; accepted 24 July 2014; available online 2 August 2014

Keywords: Serum peptidomics; Renal clear cell carcinoma; Liquid chromatography–mass spectrometry; Peptide biomarkers; Unsupervised hierarchical cluster analysis

Abstract

A serum peptidomic approach was applied to investigate the peptidomic signature and to discover clinical biomarkers and biomarker patterns for renal clear cell carcinoma (RCC) patients. The holistic orthogonal partial least-squares-discriminant analysis (OPLS-DA) based on qualified profile data successfully separated RCC patients from healthy controls, showing 100% sensitivity and specificity. Following strict criteria, several peptides with significantly different serum levels were selected. Unsupervised hierarchical cluster analysis of those peptides showed 100% sensitivity and 93.3% specificity for RCC diagnosis in the present samples. In addition, receiver–operating characteristic (ROC) analysis was applied to single peptide biomarkers, with four peptides showing excellent predictive power. Among them, IYQLNSKLV and AGISMRSGDSPQD are reported for the first time for cancer detection. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Kidney cancer (KC) is typically asymptomatic and frequently fatal [1]. Among all the subtypes of KC, renal clear cell carcinoma (RCC) is the most prevalent, accounting for nearly 70% of all KC surgical cases [2]. Early detection is essential for successful treatment [3]. Computed tomography, magnetic resonance imaging, and positron emission tomography are the commonly used diagnostic imaging techniques [4]. Even when these techniques are combined, it is still difficult to identify tumors at an early stage [5]. Therefore, RCC-specific biomarkers would be of great clinical value. Previous RCC biomarker discovery has focused on genomic, proteomic, and metabonomic approaches [5,6]. Currently, there

Abbreviations: AS, advanced stage; AUC, area under the ROC curve; BPC, base peak chromatogram; ES, early stage; FA, formic acid; HMW, high molecular weight; KC, kidney cancer; LMW, low molecular weight; OPLS-DA, holistic orthogonal partial least-squares-discriminant analysis; QC, quality control; RCC, renal clear cell carcinoma; ROC, receiver–operating characteristic analysis; TNM, Tumor Nodes Metastasis Staging System.
∗ Corresponding author at: Department of Chemistry, Xiamen University, 422 Simingnan Road, 361005, China. Tel.: +86 592 2184618; fax: +86 592 2185610. E-mail address: [email protected] (W. Hang).
1 These authors contributed equally to this work.
http://dx.doi.org/10.1016/j.jpba.2014.07.028
0731-7085/© 2014 Elsevier B.V. All rights reserved.

are no biomarkers that are approved, or that have gained wide consensus in the medical community, for reliable RCC screening [7]. The peptidome refers to naturally occurring low molecular weight (LMW) peptides and proteolytic fragments with molecular weights below 10 kDa [8]. The corresponding term “peptidomics” was defined as the systematic, holistic, qualitative and quantitative analysis of these endogenous peptides and small proteins at a defined time point and location [9,10]. Unlike proteomics, peptidomics studies peptides in their native forms, without in vitro digestion by specific enzymes [11,12]. In terms of molecular weight, peptidomics can fill the gap between proteomics and metabonomics [13]. LMW peptides associated with the disease microenvironment are readily shed into the extracellular interstitium and easily cross the endothelial cell barrier of the vasculature, finally entering the circulation [14]. Considering that blood constantly flows through the kidney, the onset of KC might be detected by characterizing the altered abundance of LMW peptides in serum. In this study, serum peptidomic analysis of the endogenous LMW mixtures was carried out using a reversed phase liquid chromatography–mass spectrometry (RPLC–MS) platform, followed by supervised holistic-variate orthogonal partial least-squares-discriminant analysis (OPLS-DA) to discriminate 30 RCC


Z. Huang et al. / Journal of Pharmaceutical and Biomedical Analysis 100 (2014) 175–183

patients and 30 healthy controls. Unsupervised hierarchical cluster analysis of the chosen markers was also performed as a multi-variate approach. The clinical utility of single markers for RCC diagnosis was evaluated using receiver–operating characteristic (ROC) curve analysis. The purpose of this study is to evaluate the capability of peptidomic features for RCC diagnosis and to pinpoint serum peptidomic biomarkers with holistic-variate and multi-variate approaches.

2. Materials and methods

2.1. Materials

20% Tricine gel was obtained from Bio-Rad (Hercules, CA); Coomassie Brilliant Blue-G250 and unstained low range protein ladder were obtained from Thermo Fisher (Rockford, USA); HPLC-grade acetonitrile (ACN) was obtained from Tedia (Fairfield, OH, USA). HPLC-grade formic acid (FA) was purchased from Sigma/Fluka (Switzerland).

2.2. Serum collection and storage

Thirty RCC patients and 30 healthy controls from Xiamen First Hospital were enrolled in this study. All the patients were diagnosed by histopathological examination, and none had received chemotherapy or radiation before blood collection. Blood samples were collected in the morning before breakfast following a standard clinical protocol, and written informed consent was obtained from each subject [15]. Briefly, venous blood was collected into 8.5 mL BD Vacutainer SST “tiger-top” tubes (BD Biosciences catalog number 367988) with protease inhibitor added, clotted at room temperature for 1 h in a vertical position in the rack, and centrifuged at 2000 × g for 10 min at room temperature. The obtained sera were immediately stored at −80 °C until analysis. The study protocols and procedures were approved by the local ethics committee, and the analysis was carried out in agreement with the Declaration of Helsinki. Detailed information on the clinicopathological characteristics of the tumor patients and healthy controls is provided in Table 1. Tumor stages were established according to the 2002 Tumor Nodes Metastasis (TNM) Staging System [16]. For simplicity in our study, tumors confined to the kidney (T1-2, without metastases) were considered early stage (ES), while tumors beyond the kidney (T3-4, with metastases) were considered advanced stage (AS) [6].

2.3. Sample treatment for LC–MS

150 μL human serum was diluted with 600 μL 20% ACN (V/V), then incubated at 4 °C for 30 min to completely disrupt the protein–peptide interactions. In order to precipitate abundant proteins such as albumin and immunoglobulin, 2.4 mL pure ACN was added afterwards. After vortexing, the mixture was set aside at 4 °C for another 30 min and then centrifuged at 3000 × g for 10 min at 4 °C. The serum supernatant was transferred to a speed-vac (Scan Speed Maxi Vac, Labogene, Denmark) and dried at 4 °C. The dried LMW mixture was redissolved in 600 μL 0.01 M phosphate buffer solution (PBS) and subsequently analyzed by LC–MS. An in-house quality control (QC) was prepared by pooling and mixing the same volume of each sample.

Table 1
Demographic and clinical chemistry characteristics data.

Characteristics | RCC patients | Healthy controls
No. of subjects | 30 | 30
Age (mean, range) | 53, 35–71 | 51, 33–74
Male | 17 | 16
Female | 13 | 14
ES | 7 | –
AS | 23 | –
BMI | 22.3, 18–27 | 23.2, 19–29
Hematuria | 0 | 0
Medications | 0 | 0
Smoking habit: Non smokers | 15 | 16
Smoking habit: Ex smokers | 3 | 0
Smoking habit: Smokers | 12 | 14
Race | Chinese | Chinese

2.4. LC–MS/MS analysis

All chromatographic separations were performed using an Ultimate 3000 HPLC system (Dionex, USA). A 2.1 mm × 150 mm Kinetex C18 2.6 μm analysis column (Phenomenex, USA) along with a guard column (AQ0-8503, Phenomenex, USA) was used for RP separation at 40 °C. The mobile phase was a mixture of (A) H2O with 0.1% FA and (B) ACN with 0.1% FA, with a programmed gradient as follows: initial 10% B maintained for 3 min, then increased to 50% in 22 min, increased to 95% in 5 min, held at 98% for 5 min, decreased to 10% in 0.1 min, and finally maintained at 10% for 5 min. The injection volume was 15 μL. The analysis column was eluted at a flow rate of 200 μL min−1. A high resolution electrospray ionization (ESI) mass spectrometer (MicrOTOF QII, Bruker Daltonics, USA) was operated in positive ion mode over the mass scan range of 200–2000 to collect peptidomic profile data. The acquisition rate was set at 1 spectrum per second. In the source, the electrospray needle was grounded, the interface capillary voltage was set at −4500 V, the end plate offset potential was set at −500 V, the nebulizer gas pressure was set at 0.7 bar, and the dry gas flow rate was set at 6 L min−1 at a temperature of 200 °C. The MS data acquisition and the subsequent MS/MS of selected peptides were performed in a data-dependent manner. Each full MS scan was followed by three MS/MS experiments using collision induced dissociation (CID), with smart exclusion of background noise and of previously fragmented precursors for a period of 60 s. Argon was employed as the collision gas, and the collision energy was swept from 20 to 70 eV. All samples were analyzed in random order to avoid systematic artifacts. A blank and a QC sample were injected every 7 samples to monitor the stability of the system.

2.5. Data processing and analysis

The raw data acquired from LC–MS were pretreated with DataAnalysis 4.0 software (Bruker Daltonics, Billerica, USA) to find compounds with molecular features.
Ions were finally grouped into “compounds” by their molecular features. The “compounds” were evaluated with ProfileAnalysis 1.1 software (Bruker Daltonics, Billerica, USA), which performed peak alignment, background noise subtraction, and data reduction in an automated and unbiased way. Only peaks with a signal-to-noise ratio (S/N) greater than 5 were used in further analysis. All LC–MS detected peaks were identified by comparing both the MS spectra and the retention time. The main parameters were set as follows: retention time range 6–35 min, mass range 200–1000, mass window 0.5 Da, and retention window 1 min. To correct for the MS response shift over the long analysis duration and for the different enrichment factors of the serum samples, the data of each sample were normalized to total intensity before the multivariate data analysis. After correction, the data were exported to SIMCA-P v12.0 software (Umetrics AB, Sweden) for principal component analysis (PCA) to test the reproducibility of separations. Holistic orthogonal partial least-squares-discriminant analysis (OPLS-DA) is an extension of PCA that makes use of class information to maximize the separation among classes. It was applied to these complex spectral data to aid in the characterization of profile changes related


to RCC [17–19]. Each OPLS-DA model was evaluated by both an internal permutation test and an external validation test. Prior to PCA and OPLS-DA analysis, the profile data were mean-centered and scaled to unit variance. Unpaired Student’s t-tests, with a Bonferroni correction for multiple comparisons, were employed to ensure that the peptide biomarkers extracted with the holistic OPLS-DA analysis were significantly differentially expressed between the RCC patients and the controls. A P value threshold of 0.001 was used to define significance. Furthermore, to understand the potential relationships among the biomarkers and to evaluate the predictive power of such a multi-variate approach, hierarchical cluster analysis was performed using Cluster software (Stanford University, USA). Univariate receiver–operating characteristic (ROC) curve analysis was performed using SPSS Statistics 18 (SPSS Inc., USA) to evaluate the predictive power of those peptide biomarkers in discriminating the RCC patients from the controls.

2.6. Peptide identification

Candidate peptide identification was conducted using MASCOT (version 2.1.0; Matrix Science) to search against the SwissProt 56.8 human database. The search parameters were as follows: no enzyme; no fixed modification; peptide mass tolerance 0.05 amu; fragment ion mass tolerance 0.6 amu; peptide charges +1, +2, +3; instrument type ESI-QUAD-TOF. The variable modifications were Acetyl (K), Acetyl (N-terminal), Amidated (C-terminal), and Oxidation (M). In this study, a Mascot ion with a significant score (P < 0.05) and a rank of 1 was accepted as a positive identification after manual verification of the presence of matched b- and/or y-ion sequences in the fragmentation masses [20,21]. After the sequences were identified, they were BLASTed against the UniProtKB protein knowledgebase to find the proteins from which they originated.

3. Results and discussion

For cancer peptidomic analysis, the stability of the analytical method is very important for obtaining valid data. Thus, QC samples were used here to monitor and evaluate the stability of the analysis [22]. PCA, an unbiased statistical method, was performed on all samples, which included 8 QC samples. Since the two coordinates of the PCA plot reflect differences in the chemical information of the samples, a tendency of sample groups to cluster separately indicates that the groups are inherently different. Additionally, in the PCA plot, the tighter the QC samples cluster, the more reproducible the analytical method is. As shown in Fig. 1(A), all QC samples are clustered together, which demonstrates the stability and reproducibility of the measurement. Although complete separation between RCC patients and healthy controls is not observed, the PCA result indicates inherent peptidomic changes in RCC patients compared with the controls. In order to find peptide RCC biomarkers, a holistic OPLS-DA model containing only the clinical samples of the study was then generated, with the result shown in Fig. 1(B). Distinct clustering of the RCC patients and the controls is achieved. The RCC samples show a tighter pattern than the healthy controls, and no samples fall outside the T2 ellipse. The characteristics of the holistic OPLS-DA model indicate that cancer cells possess a unique peptidomic phenotype [14,23]. The model parameters for the explained variation, R2Y(cum), and the predictive capability, Q2(cum), are high [R2Y(cum) = 0.981; Q2(cum) = 0.846]. To guard against model overfitting, permutation tests with 100 iterations were performed. These permutation tests compared the fitting performance of the original model with that of randomly permuted models [24].
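The logic of a label-permutation test can be illustrated with a minimal sketch. It does not reproduce the SIMCA-P procedure or the OPLS-DA model used in the study: a nearest-centroid classifier scored by leave-one-out accuracy stands in for the model, and the data `X_toy` and `y_toy` are invented for illustration. Only the idea is the same, namely that the score obtained with the true class labels is compared with the scores obtained after the labels are randomly shuffled.

```python
import random

def centroid(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def loo_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in model)."""
    hits = 0
    for i in range(len(X)):
        cents = {}
        for label in sorted(set(y)):
            members = [x for j, x in enumerate(X) if j != i and y[j] == label]
            if members:
                cents[label] = centroid(members)
        predicted = min(cents, key=lambda lab: sq_dist(X[i], cents[lab]))
        hits += predicted == y[i]
    return hits / len(X)

def permutation_test(X, y, n_perm=100, seed=0):
    """Compare the score with true labels against scores with shuffled labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(X, y)
    permuted = []
    for _ in range(n_perm):
        shuffled = list(y)
        rng.shuffle(shuffled)
        permuted.append(loo_accuracy(X, shuffled))
    # Add-one correction so the estimated p value is never exactly zero.
    p_value = (1 + sum(s >= observed for s in permuted)) / (n_perm + 1)
    return observed, permuted, p_value

# Hypothetical two-feature data: six "RCC" and six "control" samples.
X_toy = [[1.0, 0.9], [1.2, 1.1], [0.8, 1.0], [1.1, 0.8], [0.9, 1.2], [1.0, 1.1],
         [3.0, 3.1], [2.9, 3.2], [3.2, 2.8], [3.1, 3.0], [2.8, 2.9], [3.0, 2.7]]
y_toy = ["RCC"] * 6 + ["control"] * 6
observed, permuted, p_value = permutation_test(X_toy, y_toy)
```

If the true-label score is clearly better than almost all permuted scores, the separation is unlikely to be an artifact of overfitting; a model that scores equally well on shuffled labels would be suspect.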


The validation plot strongly indicates that the model is valid, as shown in Fig. 1(C). To further evaluate the predictive ability of the model, an independent test set consisting of 12 individuals (6 RCC patients and 6 healthy controls) was used. None of these individuals had been used in the generation of the supervised model, which therefore allowed the true predictive accuracy to be estimated. As depicted in Fig. 1(D), the OPLS-DA model correctly predicts all RCC patients (including 2 ES cases) and healthy controls, showing 100% sensitivity and 100% specificity for the samples involved in our study. This result shows the great potential of the holistic OPLS-DA model for RCC screening. Peptides were carefully screened before being accepted as potential RCC biomarkers. Firstly, the S-plot of the first component, which explains the largest share of the variation in the dataset, was derived. By combining the covariance (X-axis) and correlation (Y-axis) loading profiles, the S-plot helps to filter interesting variables (peptides) while lowering false positives [25]. As shown in Fig. 2(A), the 63 variables enclosed in squares, which have both high p and high p(corr) values, are the most relevant peptides for the separation between RCC patients and healthy controls [26]. Variables were also selected based on the criterion of a VIP value greater than 1 [27]. Then, variables without the support of the necessary confidence intervals were rejected [28]. As shown in the VIP column plot (Fig. 2(B)), some variables highlighted with red stars could not be considered significant because of their negative confidence intervals. Fifteen unqualified variables were eliminated in this step. Unpaired Student’s t-tests were performed as the final testing procedure, and variables that did not differ significantly between RCC patients and the controls (p value not below 0.001) were eliminated.
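The numerical part of this screening (VIP above 1, a confidence interval that does not reach zero, and a t-test p value below 0.001) can be written as a simple sequential filter. The sketch below is illustrative only: the `Candidate` record, its field names, and the toy values are hypothetical, the S-plot selection step is not modelled, and the p values are assumed to have been computed beforehand.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str          # hypothetical variable label
    vip: float         # variable importance in projection
    vip_ci_low: float  # lower bound of the jack-knifed confidence interval
    p_value: float     # p value from the unpaired t-test

def screen(candidates, vip_min=1.0, alpha=0.001):
    """Keep variables that pass all three criteria, in the order described in the text."""
    kept = []
    for c in candidates:
        if c.vip <= vip_min:      # VIP must exceed 1
            continue
        if c.vip_ci_low <= 0:     # interval reaching zero or below: not supported
            continue
        if c.p_value >= alpha:    # must differ significantly between groups
            continue
        kept.append(c)
    return kept

# Invented values, one candidate failing each criterion.
toy = [
    Candidate("pep_A", 10.4, 6.1, 9.3e-11),  # passes all filters
    Candidate("pep_B", 0.8, 0.2, 1.0e-6),    # fails VIP > 1
    Candidate("pep_C", 3.5, -0.4, 2.0e-5),   # fails confidence interval check
    Candidate("pep_D", 4.2, 1.0, 0.02),      # fails p < 0.001
]
selected = [c.name for c in screen(toy)]
```

Applying the filters one after another mirrors the stepwise elimination described above, so the number of variables surviving each stage can be counted if needed.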
Overall, 19 variables passed the stringent threshold and are considered meaningful biomarkers for discriminating the RCC patients from the controls. To identify the 19 peptides differentially expressed between the RCC patients and the healthy controls, a two-dimensional candidate list (Rt, m/z) was generated manually and applied to data-dependent MS/MS experiments. The QTOF mass spectrometer was set to the automatic precursor-presetting mode (only peptides in the candidate list are fragmented using CID) during the LC runs for sequence identification. The raw MS/MS spectra were processed with noise filtering (S/N = 5), deconvoluted using DataAnalysis 4.0 (Bruker Daltonics, USA), and then exported into MASCOT for sequence searching. Compared with the identification of proteins from tryptic digests, the identification of endogenous peptides by sequence database searching is more challenging because no enzyme cleavage is specified. For the 19 RCC candidate biomarkers, we first performed a MASCOT search against the SwissProt database, with the parameter settings described in the experimental section. Subsequently, we manually verified the top-ranked MASCOT result (peptide rank of 1), with the confirmation criterion of at least 3 successive b ions (N-terminal) or y ions (C-terminal) matching between the theoretical and experimental spectra. Fig. 3 shows an interpretation of the types of fragments observed for the peptide DSGEGDFLAEGGGVR, which was identified as a doubly-charged ion with a mass accuracy of 1.1 ppm. The existence of successive series of b ions and y ions validates the identification, providing increased confidence in the Mascot results. Among the 19 candidate biomarkers, 2 were not identified due to their low abundance as well as their relatively large size. Cancer is a complex disease involving systemic deregulation of cell proliferation, apoptosis, and cell cycle. 
A “biomarker pattern” containing a group of biomarkers might be effective for the discrimination of cancer [29]. Inspired by this viewpoint, and also to understand the potential relationships among the biomarkers, hierarchical cluster analysis was performed as a “multi-variate approach” using Cluster software (Stanford University, USA). The 19 candidate peptides in the 60 peptidome profiles (all the samples) were analyzed


Fig. 1. (A) PCA score plots based on all samples including 8 QCs (•, RCC patients; , Healthy controls; , QC samples). (B) OPLS-DA model with 24 RCC patients and 24 healthy controls. (C) Permutation validation of the OPLS-DA model. (D) OPLS-DA prediction of 6 independent RCC patients and 6 independent healthy controls (, RCC patient prediction set; , Healthy controls prediction set).
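The score plots in Fig. 1 rest on the preprocessing described in Section 2.5: each sample is normalized to its total intensity, and the profile data are then mean-centered and scaled to unit variance. A minimal sketch of these two steps follows; the function names and the small intensity matrix `raw` are invented for illustration, and the sample standard deviation is assumed for the scaling.

```python
from statistics import mean, stdev

def normalize_total_intensity(row):
    """Divide every peak intensity of one sample by that sample's total intensity."""
    total = sum(row)
    return [v / total for v in row]

def autoscale(matrix):
    """Mean-center each variable (column) and scale it to unit variance."""
    columns = list(zip(*matrix))
    mus = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]  # sample standard deviation (assumption)
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mus, sds)]
        for row in matrix
    ]

# Hypothetical matrix: 3 samples (rows) x 3 peptide features (columns).
raw = [[10.0, 30.0, 60.0],
       [20.0, 20.0, 60.0],
       [5.0, 20.0, 25.0]]
normalized = [normalize_total_intensity(r) for r in raw]
scaled = autoscale(normalized)
```

Normalizing per sample compensates for differences in overall signal between injections, whereas the column-wise scaling gives low- and high-abundance peptides equal weight in PCA and OPLS-DA.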

by average-linkage hierarchical clustering, and two separate clusters are observed in the heat map, as shown in Fig. 4. The upper cluster (Y-axis) contains 30 of the 30 patients, showing 100% sensitivity, whereas the lower cluster contains 29 of the 30 healthy volunteers. As highlighted by a green star, one healthy volunteer falls outside the lower cluster and is grouped with the upper cluster. The heat map indicates that data reduction by 97% (19 from 690

variables) does not adversely affect the separation of the groups. Closely related peptides are clustered together, and two major peptide clusters (on the top) are observed. Cluster I includes Var.02, 01, 03, 10, 06, 12 and 15. Cluster II consists of Var.05, 09, 07, 13, 04, 08, 11, 18, 14, 17, 16, and 19. The correlation coefficient among Var.16, 19 and 17 is high (r = 0.87), while the correlation coefficient between Var.16 and 19 is as high as 0.98. It must be noted that the early stage

Fig. 2. (A) S-plot derived from the first orthogonal component. (B) VIP column plot with jack-knifed confidence intervals.


Fig. 3. MS/MS spectra for peptide DSGEGDFLAEGGGVR. The b and y ion masses are shown, as well as the matched amino acid sequence (top right) from MS/MS spectra.
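The b- and y-ion series annotated in Fig. 3 follow from simple mass arithmetic, sketched below under stated assumptions: standard monoisotopic residue masses, singly charged fragments, and no modifications. The function names are illustrative and are not part of MASCOT or DataAnalysis.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O
PROTON = 1.00728   # mass of one proton

def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER

def mz(mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (mass + charge * PROTON) / charge

def b_ions(seq):
    """Singly charged b1..b(n-1): N-terminal fragments."""
    return [sum(MONO[aa] for aa in seq[:i]) + PROTON for i in range(1, len(seq))]

def y_ions(seq):
    """Singly charged y1..y(n-1): C-terminal fragments."""
    return [sum(MONO[aa] for aa in seq[-i:]) + WATER + PROTON
            for i in range(1, len(seq))]

def ppm_error(observed, theoretical):
    """Mass accuracy in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

seq = "DSGEGDFLAEGGGVR"
mass = peptide_mass(seq)       # neutral mass of the peptide shown in Fig. 3
precursor = mz(mass, 2)        # doubly charged precursor
```

Under these assumptions the computed neutral mass agrees with the value listed for Var.07 in Table 2, and a run of consecutive matches between such theoretical fragment masses and observed peaks is what the manual verification criterion checks.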

Fig. 4. Unsupervised hierarchical clustering of the identified peptides from OPLS-DA model. The misclassified individual is highlighted using a green star. The males are labeled by blue squares, and the early stage patients are labeled by brown triangles. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
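Average-linkage hierarchical clustering of the kind shown in Fig. 4 can be sketched in a few lines. The study used the Cluster software; the code below is not that program. It assumes a distance of 1 minus the Pearson correlation, which the text does not specify, and the six sample profiles are invented for illustration.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)  # undefined for constant profiles

def average_linkage(profiles, n_clusters=2):
    """Agglomerative clustering; merge the pair with the smallest mean distance."""
    clusters = [[i] for i in range(len(profiles))]

    def dist(i, j):
        return 1.0 - pearson(profiles[i], profiles[j])  # assumed metric

    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(dist(i, j) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical intensities of 4 peptides in 6 samples: the first three samples
# share one pattern, the last three the opposite pattern.
profiles = [[2.0, 1.8, 0.2, 0.1], [1.9, 2.1, 0.3, 0.2], [2.2, 1.7, 0.1, 0.3],
            [0.2, 0.1, 1.9, 2.0], [0.3, 0.2, 2.1, 1.8], [0.1, 0.3, 1.7, 2.2]]
groups = average_linkage(profiles)
```

Because no class labels enter the computation, any agreement between the resulting groups and the clinical diagnosis reflects structure in the peptide intensities themselves.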

patients, labeled by blue squares, are separated into different subgroups. The reason may be that the hierarchical clustering behind the heat map is an unsupervised method that takes all 19 peptides into consideration together; moreover, the peptides that indicate the occurrence of cancer may be expressed differently in each patient due to individual differences. Hence the early stage samples could not be distinguished from the advanced

stage samples. Additionally, male (labeled by brown triangles) and female patients are distributed across different subgroups. This indicates that the 19 candidate peptides are highly correlated with the occurrence of RCC but have no close relationship with gender. In conclusion, unsupervised multi-variate clustering using the 19-peptide signature offers potential for predicting the presence of RCC with high confidence (sensitivity = 100%, specificity = 93.3%). Using all the individuals in the training set and the prediction set, we performed univariate ROC analysis for each peptide. The area under the ROC curve (AUC) was calculated and served as a measure of the prediction performance of the peptide biomarkers. It combines the sensitivity and specificity of a given marker for disease diagnosis [30]. An AUC of 0.5 indicates that the peptide does not predict better than chance. The discrimination of a diagnostic biomarker is considered good if the AUC is 0.9–1, moderate if the AUC is 0.7–0.9, and poor if the AUC is 0.5–0.7. Table 2 shows the result of the ROC analysis, with the variables ranked according to their AUC. The proteins to which the peptide sequences belong are also listed in Table 2. Four peptides (unidentified Var.03 included) are considered the most promising RCC diagnostic biomarkers because their AUCs are larger than 0.9. The MS/MS spectra of IYQLNSKLV and AGISMRSGDSPQD are shown in Fig. S1 and Fig. S2 in the Supplementary Information. For Cluster I (Fig. 4), Var.01 has the highest AUC of 0.997, followed by Var.02 with AUC = 0.996. Var.03 has an AUC of 0.972; unfortunately, the identity of the peptide is still unknown due to its large molecular weight. For Cluster II, Var.04 achieves the highest AUC of 0.964. The high sensitivities and specificities achieved by those serum peptide biomarkers suggest that they could be used in clinical applications for diagnosing RCC. Future studies will include much larger sample sets to verify the results. 
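As a hedged illustration of the AUC measure used above (the study itself used SPSS), the AUC of a single marker equals the probability that a randomly chosen case has a higher marker value than a randomly chosen control, with ties counted as one half. The function names below are illustrative, and the category boundaries follow the ranges quoted in the text, with the lower bound of each range assumed to be inclusive.

```python
def auc(cases, controls):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))

def rate(auc_value):
    """Map an AUC of at least 0.5 to the categories quoted in the text."""
    if auc_value >= 0.9:
        return "good"
    if auc_value >= 0.7:
        return "moderate"
    return "poor"

# Invented intensities of an up-regulated marker in cases and controls.
toy_auc = auc([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
```

For a down-regulated peptide, the same function can be applied with the two groups swapped, so that lower values count as evidence for disease.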
The signal intensities and variations of the 3 identified peptides (Var.01, 02, and 04) are presented in Fig. 5. Several studies based on SELDI (Surface Enhanced Laser Desorption/Ionization) or other techniques have attempted to identify peptides and short proteins (less than 10 kDa) in serum as markers of cancers, including breast cancer [31–35], colorectal cancer [36], gastric cancer [37,38], hepatocellular carcinoma [39], ovarian cancer [40], pancreatic cancer [41], prostate cancer [42], RCC [43,44], thyroid cancer [45] and so on. Won et al. identified five biomarkers with masses of 3900, 4107, 4153, 5352 and 5987 Da which could correctly distinguish RCC samples from healthy and non-RCC samples [43]. Xu et al. discovered 35 biomarkers which were differently expressed between normal


Table 2
Potential serum peptide biomarkers for RCC.

No. | Mass (Da) | P value a | VIP | Trend b | Charge | AUC | Ratio (K/C) c | Sequence | Protein name
Var.01 | 1076.62 | 9.32E−11 | 10.4 | ↓ | 2 | 0.997 | – | IYQLNSKLV | Cubilin
Var.02 | 1377.58 | 7.08E−10 | 8.1 | ↓ | 2 | 0.996 | 0.08 | AGISMRSGDSPQD + Acetyl (N-term) + Oxidation [M]* | – d
Var.03 | 5333.41 | 8.97E−11 | 14.0 | ↓ | 6 | 0.972 | 0.29 | Unidentified | –
Var.04 | 3164.83 | 8.59E−05 | 6.2 | ↑ | 3 | 0.964 | 14.89 | QGLLPVLESFKVSFLSALEEYTKKLNTQ | Apolipoprotein A-I
Var.05 | 2377.20 | 2.99E−09 | 5.4 | ↑ | 3 | 0.891 | 12.23 | DDPDAPLQPVTPLQLFEGRRN | Complement C4
Var.06 | 985.55 | 2.78E−07 | 4.6 | ↓ | 3 | 0.882 | 0.12 | TRHTFGRI + Amidated (C-term)* | –
Var.07 | 1464.65 | 3.76E−04 | 8.0 | ↑ | 2 | 0.858 | 5.69 | DSGEGDFLAEGGGVR | Fibrinogen alpha chain
Var.08 | 1018.00 | 4.05E−06 | 3.2 | ↑ | 2 | 0.853 | 1.43 | LPILKIIPI + Amidated (C-term) | Dynein, axonemal, heavy chain 10
Var.09 | 732.44 | 2.69E−05 | 4.1 | ↑ | 4 | 0.853 | 2.35 | RALAFR | –
Var.10 | 1771.80 | 4.80E−05 | 4.7 | ↑ | 2 | 0.850 | – | SLAELGGHLDQQVEEF | Apolipoprotein A-IV
Var.11 | 3016.74 | 1.76E−06 | 3.3 | ↓ | 3 | 0.848 | 2.98 | QPVLVGLFLSMYLITVLGNLLIILAVSC + Amidated (C-term) + Oxidation [M]* | Olfactory receptor
Var.12 | 1236.69 | 1.86E−05 | 8.0 |  | 2 | 0.832 | 5.69 | APKPHAFVGSVK | Putative Polycomb group Protein ASXL1
Var.13 | 4150.17 | 7.59E−06 | 6.7 | ↓ | 5 | 0.829 | 0.11 | Unidentified | –
Var.14 | 1036.00 | 1.29E−04 | 4.9 | ↑ | 2 | 0.799 | 1.17 | IILILAILR + Amidated (C-term)* | Major facilitator superfamily domain-containing protein 8
Var.15 | 1555.76 | 4.82E−04 | 4.6 | ↑ | 3 | 0.748 | 1.12 | DFWRKMYLREP + Oxidation [M] | Zinc finger protein 233
Var.16 | 1262.59 | 6.16E−04 | 5.2 | ↑ | 2 | 0.734 | 1.26 | GEGDFLAEGGGVR | Fibrinogen alpha chain
Var.17 | 1205.58 | 8.86E−04 | 17.2 | ↑ | 2 | 0.730 | 1.08 | EGDFLAEGGGVR | Fibrinogen alpha chain
Var.18 | 922.65 | 3.24E−04 | 5.4 | ↑ | 2 | 0.717 | 13.90 | LKPIIKVL | –
Var.19 | 1349.62 | 9.82E−04 | 3.8 | ↑ | 2 | 0.706 | 22.62 | SGEGDFLAEGGGVR | Fibrinogen alpha chain

P value e (column values in source order; row alignment not recoverable): 1.40E−02, 3.90E−02; 4.50E−02, 3.20E−02, 3.10E−02, 1.70E−02, 1.20E−03, 3.50E−03, 4.00E−03; 1.10E−03; 3.80E−02, 4.00E−02; 2.80E−02, 4.60E−03, 1.90E−03; 6.30E−03, 2.20E−03, 1.70E−03, 3.30E−02, 1.90E−02

* The peptides which were identified tentatively.
a Indexes evaluating the significant differences between the healthy and the cancer groups.
b Change trend compared with controls. (↑): up-regulated. (↓): down-regulated.
c Ratio was calculated from the arithmetic mean values of each group.
d The proteins which are uncharacterized or couldn’t be found in the database of UniprotKB.
e Mascot score that is used to distinguish correct peptide identifications from incorrect ones.


Fig. 5. Values and variations of serum peptides with AUC > 0.9. Panels A, B, and C correspond to Var.01, 02, and 04 in Table 2, respectively. Boxes are drawn from the 25th to the 75th percentile of the intensity distribution. The median, or 50th percentile, is drawn as a horizontal line inside the box.

controls and patients with small RCC tumors, and the diagnostic decision tree generated from them efficiently diagnosed early RCC samples with a sensitivity of 81.8% and a specificity of 100% [44]. Villanueva et al. found 17 biomarkers of thyroid cancer, among which 4 peptides are present in our study (Var.07, 16, 17, and 19), suggesting that these peptides may be viewed as common biomarkers of carcinoma [45]. When the peptides identified in our study are compared with those identified in the serum KC peptidomic study by Gianazza et al., which compared benign and malignant kidney tumors, only 4 of the peptides identified in our study (Var.05, 07, 16, and 19) were present in their dataset [46]. The large difference between the datasets might be attributed to the different controls used and the much stricter biomarker selection criteria in our study. Of the most predictive peptides (AUC > 0.9) in Table 2, Var.01 with the sequence IYQLNSKLV and Var.02 with the sequence AGISMRSGDSPQD + Acetyl(N-term) + Oxidation[M] are reported for the first time for cancer detection. The biological significance of the candidate peptides and proteins is of great interest. Cubilin (Var.01), which facilitates endocytosis of HDL [47], is down-regulated in RCC patients. This may indicate that less HDL is taken up into cells to form HDL cholesterol, and a low concentration of HDL cholesterol is known to be closely associated with an increased incidence of cancer [48]. As a result, more HDL may be left in the blood without being absorbed, which is consistent with the observation that Apolipoprotein A-I (Var.04) and Apolipoprotein A-IV (Var.10), two main proteins of HDL, are up-regulated in RCC patients. Complement C4 (Var.05) enhances the solubilization and clearance of immune aggregates [49,50], and its up-regulated level indicates the existence of a disorder in the body. 
Putative polycomb group protein ASXL1 (Var.12) serves as a corepressor for peroxisome proliferator-activated receptor gamma (PPARG) and hinders its adipocyte differentiation-inducing activity [51,52], which is highly associated with cell proliferation and tumor formation. Four peptides of potential significance exhibit laddering (Var.07, DSGEGDFLAEGGGVR; Var.19, SGEGDFLAEGGGVR; Var.16, GEGDFLAEGGGVR; Var.17, EGDFLAEGGGVR). These laddering peptides may be generated by enzymatic cleavage of the fibrinogen alpha chain in the human body. An elevated level of those peptides usually indicates an abnormal clotting process, which may be caused by disseminated intravascular coagulation, cellulitis, ovarian cancer, and systemic lupus erythematosus [53]. The results of our peptidomic study imply that diagnosing diseases by measuring the laddering peptides may be complicated, and their expression levels may also differ among diseases. For KC, early diagnosis has the potential to markedly improve patient survival [54]. Given that most physiological and pathological processes possess a serum peptidomic “signature,” peptidomics is well suited for the development of a serum diagnostic assay for RCC. In the discovery phase of our study, we sorted through hundreds of peptides to identify the several that are most predictive of RCC. Despite the limited samples for this pilot

study, we found a clear and significant separation between RCC patients and healthy controls in both the supervised holistic OPLS-DA model and the unsupervised hierarchical cluster analysis. Furthermore, ROC analysis determined the most promising candidates for early diagnosis of RCC. It should be noted that all the patients enrolled in the present study are Chinese, which may limit the generalizability of the study. Currently, the model mainly detects advanced stage cancer. Further investigations should be undertaken to develop fully validated analytical procedures to confirm that these peptides accurately reflect the difference.

4. Conclusions

Genomic and proteomic tools have been used for years in studies attempting to discover clinical markers for a variety of cancers. However, they are still not readily amenable to the discovery of diagnostic biomarkers because large genes and proteins are not generally secreted into biofluids [14]. Peptidomics, a relatively new omics technology, is better suited for such an analysis because small peptides are frequently secreted into serum. Thus, peptidomics research may be more clinically translatable than genomic and proteomic endeavors. However, the peptidomic approach has its own shortcomings. There are very few specific internal cleavages that can be used for significant identification with databases; as a consequence, peptides are identified by their characteristic MS/MS fragment patterns [10]. In addition, many peptides can hardly be identified due to their low internal concentration and the low quality of some fragmentation spectra [55]. In our study, we applied an LC–MS based peptidomic approach to investigate the serum peptidomic signature and discovered clinical biomarkers for RCC. With all qualified profile data, a holistic OPLS-DA model was constructed with 100% predictive accuracy. 
The unsupervised hierarchical cluster analysis was performed as a “multi-variate” approach, showing 100% sensitivity and 93.3% specificity for the samples involved in our study. Three identified peptides with AUC > 0.9 were discovered in this study and might be the most promising peptide biomarkers for RCC detection. Among them, IYQLNSKLV and AGISMRSGDSPQD + Acetyl(N-term) + Oxidation[M] are reported for the first time for cancer detection. The elegance of the method reported here is that, rather than focusing on single peptides, we capitalized on the entire human peptidome to construct the OPLS-DA model and searched for RCC-specific patterns of expression (hierarchical cluster). In the future, a greater number of patients and controls will be needed to verify and validate our findings before the ultimate development of a clinically applicable peptidomic test for RCC diagnosis.

Acknowledgements

We gratefully acknowledge the financial support from the Research Fund for the Doctoral Program of Higher Education of China (20120121110011), the Program for Changjiang Scholars and


Innovative Research Team in University (IRT13036), and the Medical Center Construction Foundation of Xiamen.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jpba.2014.07.028.

References

[1] W.M. Linehan, M.M. Walther, B. Zbar, The genetic basis of cancer of the kidney, J. Urol. 170 (2003) 2163–2172.
[2] S. Störkel, J.N. Eble, K. Adlakha, M. Amin, M.L. Blute, D.G. Bostwick, M. Darson, B. Delahunt, K. Iczkowski, Classification of renal cell carcinoma, Cancer 80 (1997) 987–989.
[3] Y. Won, H.-J. Song, T.W. Kang, J.-J. Kim, B.-D. Han, S.-W. Lee, Pattern analysis of serum proteome distinguishes renal cell carcinoma from other urologic diseases and healthy persons, Proteomics 3 (2003) 2310–2316.
[4] K.W.M. Siu, L.V. DeSouza, A. Scorilas, A.D. Romaschin, R.J. Honey, R. Stewart, K. Pace, Y. Youssef, T.-f.F. Chow, G.M. Yousef, Differential protein expressions in renal cell carcinoma: new biomarker discovery by mass spectrometry, J. Proteome Res. 8 (2009) 3797–3807.
[5] L. Lin, Z. Huang, Y. Gao, Y. Chen, W. Hang, J. Xing, X. Yan, LC–MS-based serum metabolic profiling for genitourinary cancer classification and cancer type-specific biomarker discovery, Proteomics 12 (2012) 2238–2246.
[6] L. Lin, Z. Huang, Y. Gao, X. Yan, J. Xing, W. Hang, LC–MS based serum metabonomic analysis for renal cell carcinoma diagnosis, staging, and biomarker discovery, J. Proteome Res. 10 (2011) 1396–1405.
[7] C. Wingren, A. Sandström, R. Segersvärd, A. Carlsson, R. Andersson, M. Löhr, C.A.K. Borrebaeck, Identification of serum biomarker signatures associated with pancreatic cancer, Cancer Res. 72 (2012) 2481–2490.
[8] R.S. Tirumalai, K.C. Chan, D.A. Prieto, H.J. Issaq, T.P. Conrads, T.D. Veenstra, Characterization of the low molecular weight human serum proteome, Mol. Cell. Proteomics 2 (2003) 1096–1103.
[9] L. Hu, M. Ye, H. Zou, Recent advances in mass spectrometry-based peptidome analysis, Expert Rev. Proteomics 6 (2009) 433–447.
[10] M. Schrader, P. Schulz-Knappe, Peptidomics technologies for human body fluids, Trends Biotechnol. 19 (2001) S55–S60.
[11] L.D. Fricker, J. Lim, H. Pan, F.-Y. Che, Peptidomics identification and quantification of endogenous peptides in neuroendocrine tissues, Mass Spectrom. Rev. 25 (2006) 327–344.
[12] K. Sasaki, Y. Satomi, T. Takao, N. Minamino, Snapshot peptidomics of the regulated secretory pathway, Mol. Cell. Proteomics 8 (2009) 1638–1647.
[13] Y. Gao, L. Lin, Z. Huang, Y. Chen, W. Hang, Peptidome workflow of serum and urine samples for biomarker discovery, Anal. Methods 3 (2011) 773–779.
[14] C. Fredolini, F. Meani, A. Luchini, W. Zhou, P. Russo, M. Ross, A. Patanarut, D. Tamburro, G. Gambara, D. Ornstein, F. Odicino, M. Ragnoli, A. Ravaggi, F. Novelli, D. Collura, L. D’Urso, G. Muto, C. Belluco, S. Pecorelli, L. Liotta, E. Petricoin, Investigation of the ovarian and prostate cancer peptidome for candidate early detection markers using a novel nanoparticle biomarker capture technology, AAPS J. 12 (2010) 504–518.
[15] J. Villanueva, J. Philip, C.A. Chaparro, Y. Li, R. Toledo-Crow, L. DeNoyer, M. Fleisher, R.J. Robbins, P. Tempst, Correcting common errors in identifying cancer-specific serum peptide signatures, J. Proteome Res. 4 (2005) 1060–1072.
[16] F.L. Greene, D.L. Page, I.D. Fleming, A. Fritz, C.M. Balch, D.G. Haller, M.A. Morrow, AJCC Cancer Staging Manual, sixth ed., Springer, New York, 2002.
[17] S.J. Bruce, P. Jonsson, H. Antti, O. Cloarec, J. Trygg, S.L. Marklund, T. Moritz, Evaluation of a protocol for metabolic profiling studies on human blood plasma by combined ultra-performance liquid chromatography/mass spectrometry: from extraction to data analysis, Anal. Biochem. 372 (2008) 237–249.
[18] M. Bylesjö, M. Rantalainen, O. Cloarec, J.K. Nicholson, E. Holmes, J. Trygg, OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification, J. Chemom. 20 (2006) 341–351.
[19] M. Bylesjö, D. Eriksson, M. Kusano, T. Moritz, J. Trygg, Data integration in plant biology: the O2PLS method for combined modeling of transcript and metabolite data, Plant J. 52 (2007) 1181–1191.
[20] L.P. Aristoteli, M.P. Molloy, M.S. Baker, Evaluation of endogenous plasma peptide extraction methods for mass spectrometric biomarker discovery, J. Proteome Res. 6 (2007) 571–581.
[21] P. Nanni, F. Levander, G. Roda, A. Caponi, P. James, A. Roda, A label-free nano-liquid chromatography–mass spectrometry approach for quantitative serum peptidomics in Crohn’s disease patients, J. Chromatogr. B 877 (2009) 3127–3136.
[22] H.G. Gika, G.A. Theodoridis, J.E. Wingate, I.D. Wilson, Within-day reproducibility of an HPLC-MS-Based method for metabonomic analysis: application to human urine, J. Proteome Res. 6 (2007) 3291–3303.
[23] J. Villanueva, D.R. Shaffer, J. Philip, C.A. Chaparro, H. Erdjument-Bromage, A.B. Olshen, M. Fleisher, H. Lilja, E. Brogi, J. Boyd, M. Sanchez-Carbayo, E.C. Holland, C. Cordon-Cardo, H.I. Scher, P. Tempst, Differential exoprotease activities confer tumor-specific serum peptidome patterns, J. Clin. Invest. 116 (2006) 271–284.
[24] S. Wiklund, D. Nilsson, L. Eriksson, M. Sjöström, S. Wold, K. Faber, A randomization test for PLS component selection, J. Chemom. 21 (2007) 427–439.
[25] S. Wiklund, E. Johansson, L. Sjostrom, E.J. Mellerowicz, U. Edlund, J.P. Shockcor, J. Gottfries, T. Moritz, J. Trygg, Visualization of GC/TOF-MS-based metabolomics data for identification of biochemically interesting compounds using OPLS class models, Anal. Chem. 80 (2008) 115–122.
[26] Z. Huang, L. Lin, Y. Gao, Y. Chen, X. Yan, J. Xing, W. Hang, Bladder cancer determination via two urinary metabolites: a biomarker pattern approach, Mol. Cell. Proteomics 10 (2011), M111.007922.
[27] A.B. Umetrics, User’s Guide to SIMCA-P, SIMCA-P+, version 12.0, UMetrics, Umeå, Sweden, 2005.
[28] P. Yin, D. Wan, C. Zhao, J. Chen, X. Zhao, W. Wang, X. Lu, S. Yang, J. Gu, G. Xu, A metabonomic study of hepatitis B-induced liver cirrhosis and hepatocellular carcinoma by using RP-LC and HILIC coupled with mass spectrometry, Mol. Biosyst. 5 (2009) 868–876.
[29] E.F. Petricoin III, A.M. Ardekani, B.A. Hitt, P.J. Levine, V.A. Fusaro, S.M. Steinberg, G.B. Mills, C. Simone, D.A. Fishman, E.C. Kohn, Use of proteomic patterns in serum to identify ovarian cancer, Lancet 359 (2002) 572–577.
[30] T. Poynard, P. Halfon, L. Castera, M. Munteanu, F. Imbert-Bismut, V. Ratziu, Y. Benhamou, M. Bourliere, V. de Ledinghen, FibroPaca Group, Standardization of ROC curve areas for diagnostic evaluation of liver fibrosis markers based on prevalences of fibrosis stages, Clin. Chem. 53 (2007) 1615–1622.
[31] S. Becker, L.H. Cazares, P. Watson, H. Lynch, O.J. Semmes, R.R. Drake, C. Laronga, Surfaced-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) differentiation of serum protein profiles of BRCA-1 and sporadic breast cancer, Ann. Surg. Oncol. 11 (2004) 907–914.
[32] J. Li, Z. Zhang, J. Rosenzweig, Y.Y. Wang, D.W. Chan, Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer, Clin. Chem. 48 (2002) 1296–1304.
[33] C. Mathelin, A. Cromer, C. Wendling, C. Tomasetto, M.-C. Rio, Serum biomarkers for detection of breast cancers: a prospective study, Breast Cancer Res. Treat. 96 (2006) 83–90.
[34] A.W. van Winden, M.-C.W. Gast, J.H. Beijnen, E.J. Rutgers, D.E. Grobbee, P.H. Peeters, C.H. van Gils, Validation of previously identified serum biomarkers for breast cancer with SELDI-TOF-MS: a case–control study, BMC Med. Genet. 2 (2009) 4.
[35] Y. Shen, N. Tolić, T. Liu, R. Zhao, B.O. Petritis, M.A. Gritsenko, D.G. Camp, R.J. Moore, S.O. Purvine, F.J. Esteva, Blood peptidome-degradome profile of breast cancer, PLoS ONE 5 (2010) e13133.
[36] Y.-d. Chen, S. Zheng, J.-k. Yu, X. Hu, Artificial neural networks analysis of surface-enhanced laser desorption/ionization mass spectra of serum protein pattern distinguishes colorectal cancer from healthy population, Clin. Cancer Res. 10 (2004) 8380–8385.
[37] Y. Liang, M. Fang, J. Li, C.B. Liu, J.A. Rudd, H. Kung, D.T. Yew, Serum proteomic patterns for gastric lesions as revealed by SELDI mass spectrometry, Exp. Mol. Pathol. 81 (2006) 176–180.
[38] H.-B. Lu, J.-H. Zhou, Y.-Y. Ma, H.-L. Lu, Y.-L. Tang, Q.-Y. Zhang, C.-H. Zhao, Five serum proteins identified using SELDI-TOF-MS as potential biomarkers of gastric cancer, Jpn. J. Clin. Oncol. 40 (2010) 336–342.
[39] J.-F. Cui, Y.-K. Liu, H.-J. Zhou, X.-N. Kang, C. Huang, Y.-F. He, Z.-Y. Tang, T. Uemura, Screening serum hepatocellular carcinoma-associated proteins by SELDI-based protein spectrum analysis, World J. Gastroenterol. 14 (2008) 1257.
[40] M.F. Lopez, A. Mikulskis, S. Kuzdzal, E. Golenko, E.F. Petricoin, L.A. Liotta, W.F. Patton, G.R. Whiteley, K. Rosenblatt, P. Gurnani, A novel, high-throughput workflow for discovery and identification of serum carrier protein-bound peptide biomarker candidates in ovarian cancer samples, Clin. Chem. 53 (2007) 1067–1074.
[41] G.M. Fiedler, A.B. Leichtle, J. Kase, S. Baumann, U. Ceglarek, K. Felix, T. Conrad, H. Witzigmann, A. Weimann, C. Schütte, Serum peptidome profiling revealed platelet factor 4 as a potential discriminating peptide associated with pancreatic cancer, Clin. Cancer Res. 15 (2009) 3812–3819.
[42] B.-L. Adam, Y. Qu, J.W. Davis, M.D. Ward, M.A. Clements, L.H. Cazares, O.J. Semmes, P.F. Schellhammer, Y. Yasui, Z. Feng, Serum protein fingerprinting coupled with a pattern-matching algorithm distinguishes prostate cancer from benign prostate hyperplasia and healthy men, Cancer Res. 62 (2002) 3609–3614.
[43] Y. Won, H.J. Song, T.W. Kang, J.J. Kim, B.D. Han, S.w. Lee, Pattern analysis of serum proteome distinguishes renal cell carcinoma from other urologic diseases and healthy persons, Proteomics 3 (2003) 2310–2316.
[44] G. Xu, C.Q. Xiang, Y. Lu, X.N. Kang, P. Liao, Q. Ding, Y.F. Zhang, Application of SELDI-TOF-MS to identify serum biomarkers for renal cell carcinoma, Cancer Lett. 282 (2009) 205–213.
[45] J. Villanueva, A.J. Martorella, K. Lawlor, J. Philip, M. Fleisher, R.J. Robbins, P. Tempst, Serum peptidome patterns that distinguish metastatic thyroid carcinoma from cancer-free controls are unbiased by gender and age, Mol. Cell. Proteomics 5 (2006) 1840–1852.
[46] E. Gianazza, C. Chinello, V. Mainini, M. Cazzaniga, V. Squeo, G. Albo, S. Signorini, S.S. Di Pierro, S. Ferrero, S. Nicolardi, Y.E.M. van der Burgt, A.M. Deelder, F. Magni, Alterations of the serum peptidome in renal cell carcinoma discriminating benign and malignant kidney tumors, J. Proteomics 76 (2012) 125–140.
[47] R. Kozyraki, J. Fyfe, M. Kristiansen, C. Gerdes, C. Jacobsen, S. Cui, E.I. Christensen, M. Aminoff, A. de la Chapelle, R. Krahe, The intrinsic factor – vitamin B12 receptor, cubilin, is a high-affinity apolipoprotein AI receptor facilitating endocytosis of high-density lipoprotein, Nat. Med. 5 (1999) 656–661.
[48] A.-S. Furberg, M.B. Veierød, T. Wilsgaard, L. Bernstein, I. Thune, Serum high-density lipoprotein cholesterol, metabolic profile, and breast cancer risk, J. Natl. Cancer Inst. 96 (2004) 1152–1160.
[49] M.C. Carroll, D.M. Fathallah, L. Bergamaschini, E.M. Alicot, D.E. Isenman, Substitution of a single amino acid (aspartic acid for histidine) converts the functional activity of human complement C4B to C4A, Proc. Natl. Acad. Sci. U. S. A. 87 (1990) 6868–6872.
[50] A.W. Dodds, X.-D. Ren, A.C. Willis, S.A. Law, The reaction mechanism of the internal thioester in the human complement component C4, Nature 379 (1996) 177–179.
[51] Y.-S. Cho, E.-J. Kim, U.-H. Park, H.-S. Sin, S.-J. Um, Additional sex comb-like 1 (ASXL1), in cooperation with SRC-1, acts as a ligand-dependent coactivator for retinoic acid receptor, J. Biol. Chem. 281 (2006) 17588–17598.
[52] J.C. Scheuermann, A.G. de Ayala Alonso, K. Oktaba, N. Ly-Hartig, R.K. McGinty, S. Fraterman, M. Wilm, T.W. Muir, J. Müller, Histone H2A deubiquitinase activity of the Polycomb repressive complex PR-DUB, Nature 465 (2010) 243–247.
[53] X. Zheng, H. Baker, W.S. Hancock, Analysis of the low molecular weight serum peptidome using ultrafiltration and a hybrid ion trap-Fourier transform mass spectrometer, J. Chromatogr. A 1120 (2006) 173–184.
[54] R.H. Weiss, P.Y. Lin, Kidney cancer: identification of novel targets for therapy, Kidney Int. 69 (2006) 224–232.
[55] G. Baggerman, P. Verleyen, E. Clynen, J. Huybrechts, A. De Loof, L. Schoofs, Peptidomics, J. Chromatogr. B 803 (2004) 3–16.
