
Electrophoresis 2013, 00, 1–5

Julie Klein1, Theofilos Papadopoulos2, Harald Mischak1,3, William Mullen3

1 Mosaiques Diagnostics, Hannover, Germany
2 Biomedical Research Foundation, Academy of Athens, Greece
3 BHF Glasgow Cardiovascular Research Centre, University of Glasgow, UK

Received July 16, 2013 Revised August 21, 2013 Accepted August 23, 2013

Short Communication

Comparison of CE-MS/MS and LC-MS/MS sequencing demonstrates significant complementarity in natural peptide identification in human urine

Clinical proteomics has led to the identification of biomarkers specifically associated with a clinical condition that can serve for diagnostic or prognostic purposes. Learning more about the origin of these protein fragments would lead to a better insight into the pathology, and this requires improved identification of the peptide sequences. The aim of this study is to assess the complementarity of LC-MS/MS and CE-MS/MS as techniques in peptide sequence identification of the urinary low-molecular weight proteome. A male standard human urine sample was analyzed using LC- and CE-MS/MS (n = 10 per technique), identifying 905 unique peptide sequences with high confidence; 50% of those were identified only with LC, 20% only with CE and 30% with both techniques. Higher LC coverage might be due in part to the higher amount of sample that can be loaded onto an LC column. Peptides uniquely identified in CE are generally small and highly charged, and are likely unable to bind to the LC column. In conclusion, we showed that LC-MS/MS and CE-MS/MS are highly complementary in identifying peptide sequences. The combination of both technologies results in significantly increased sequence coverage.

Keywords: Biomarkers / Clinical proteomics / MS / Peptide sequencing

DOI 10.1002/elps.201300327



Additional supporting information may be found in the online version of this article at the publisher’s web-site

Clinical proteomics has made significant progress in the last decade, leading to the identification of peptide biomarkers that can be used for diagnostic or prognostic purposes [1–4]. It is becoming evident that learning more about the origin of these protein fragments (i.e. local or systemic production) and how they were generated (i.e. proteases) will lead to a better understanding of the pathophysiological mechanisms associated with the disease [5–7]. To achieve this goal, a systematic identification of the peptides is mandatory. The discovery of peptide biomarkers has mostly been performed using CE-MS, while LC-MS/MS has been the main technology employed in the identification of these peptide sequences [8, 9]. Here, we compare the performance of CE-MS/MS and LC-MS/MS in

Correspondence: Dr. William Mullen, Biomarker Research, BHF Glasgow Cardiovascular Research Centre, Lab B 3.33a/Room 2.21, Joseph Black Building, University of Glasgow, Glasgow G12 8QQ, United Kingdom E-mail: [email protected] Fax: +44-141-330-3360

Abbreviation: HCD, higher-energy collision dissociation

© 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim

the analysis of a complex sample, and investigate whether a combination of LC-MS/MS and CE-MS/MS can increase sequence coverage. If the two technologies are complementary, then they may, when applied in combination, significantly increase the rate of identification of peptide biomarker sequences.

A male standard human urine sample was analyzed ten times with LC-MS/MS and ten times with CE-MS/MS. This sample has already been described extensively in [10]. The collection in all cases followed the procedure that was used in several recent studies [11–13]. Collected samples were frozen immediately at −20°C. Upon completion of collection, all frozen samples were thawed on ice, sonicated, combined, divided into aliquots, and frozen again at −80°C. Urine samples were prepared as described by Zürbig et al. [14]. After extraction, all samples were lyophilized and stored at 4°C until further use. Samples for CE-MS/MS analysis were reconstituted in 10 µL of HPLC-grade water, while LC-MS/MS samples were reconstituted in 50 µL of HPLC-grade water.

For LC analysis, a 5 µL sample (1:10) was loaded for separation in 0.1% formic acid and acetonitrile (98:2) onto a Dionex Ultimate 3000 RSLC nano flow system (Dionex, Camberley, UK) at a flow rate of 5 µL/min. Elution was performed


J. Klein et al.


Figure 1. Comparison of the average number of spectra acquired per unit time (A) and per run (B), and of the average number of high-confidence sequences identified per unit time (C) and per run (D), in LC-MS/MS (white bar) and in CE-MS/MS (full bar) (*p < 0.05).
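The "high-confidence" sequences counted in Figure 1C and 1D are defined in the text by five criteria: Xcorr ≥ 1.9, a mass deviation within ±5 ppm, no cysteine in the sequence, oxidized proline accepted only for collagen or elastin precursors, and peptide rank 1. A minimal Python sketch of such a filter is given here for illustration only. The `PSM` record, its field names, the gene-symbol prefix test used to recognise collagens and elastin, and the example sequences are all assumptions made for this sketch; they are not Proteome Discoverer output fields and not the authors' actual workflow.

```python
from dataclasses import dataclass

# Assumption: collagen/elastin precursors are recognised by gene-symbol prefix.
COLLAGEN_OR_ELASTIN_PREFIXES = ("COL", "ELN")


@dataclass
class PSM:
    """Hypothetical peptide-spectrum match record (field names are illustrative)."""
    sequence: str            # plain one-letter amino acid sequence
    xcorr: float             # search-engine cross-correlation score
    delta_ppm: float         # experimental minus theoretical mass, in ppm
    rank: int                # peptide rank for the spectrum (1 = best)
    precursor_symbol: str    # gene symbol of the parent protein
    has_oxidized_proline: bool = False


def is_high_confidence(psm: PSM) -> bool:
    """Apply the five criteria listed in the text to one match."""
    if psm.xcorr < 1.9:
        return False
    if abs(psm.delta_ppm) > 5.0:
        return False
    if psm.rank != 1:
        return False
    if "C" in psm.sequence:  # cysteine excluded (no reduction/alkylation)
        return False
    if psm.has_oxidized_proline and not psm.precursor_symbol.startswith(
        COLLAGEN_OR_ELASTIN_PREFIXES
    ):
        return False
    return True


# Made-up example matches, not data from the study.
examples = [
    PSM("GPPGPPGKNGDDGEAGKPG", 2.4, 1.2, 1, "COL1A1", has_oxidized_proline=True),
    PSM("ACDEFGHIK", 3.0, 0.5, 1, "ALB"),                             # contains cysteine
    PSM("DEFGHIKLM", 1.5, 0.5, 1, "UMOD"),                            # Xcorr too low
    PSM("SPPGKPQGPP", 2.5, -2.0, 1, "UMOD", has_oxidized_proline=True),  # oxidized Pro, not collagen
]
kept = [p.sequence for p in examples if is_high_confidence(p)]
print(kept)
```

Applying the filter per match keeps each criterion separately testable, which helps when checking how many identifications a single threshold removes.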

on an Acclaim PepMap C18 nano column with a linear gradient of A (0.1% formic acid and acetonitrile, 98:2) against 0.1% formic acid and acetonitrile (20:80), starting at 5–50% A over 100 min. For CE, a 230 nL aliquot (1:50) was injected under constant flow and pressure conditions at a pH of 2.2 to ensure that all peptides are positively charged. Both CE and LC were directly interfaced with an Orbitrap Velos FTMS (Thermo Finnigan, Bremen, Germany), using data-dependent higher-energy collision dissociation (HCD) MS/MS sequencing of a maximum of the top 20 ions.

Data files were analyzed using Proteome Discoverer 1.2 (activation type: HCD; min–max precursor mass: 790–6000; precursor mass tolerance: 10 ppm; fragment mass tolerance: 0.05 Da; S/N threshold: 1) and were searched against the Uniprot human nonredundant database without enzyme specificity. No fixed modifications were selected; oxidation of methionine and proline and deamidation of aspartic acid and glutamine were selected as variable modifications. High-confidence peptides were extracted using the following filters: an Xcorr ≥ 1.9; a delta mass between experimental and theoretical mass within ±5 ppm; absence of cysteine in the sequence, because without reduction and alkylation it forms disulphide bonds; absence of oxidized proline in protein precursors other than collagens or elastin; and peptide rank 1.

The object of this investigation was to compare how complementary the two technologies are and to ascertain their practical contributions to peptide sequence identification. The sample cycle time for both technologies is similar, in this case 80 min for LC and 60 min for CE. However, the actual time available for analysis of peptides was 60 min for LC and 19 min for CE. The average number of spectra acquired per unit time was slightly higher in CE than LC (126.3/min ± 2.4 versus 94.7/min ± 4.3, respectively, p < 0.0001) (Fig. 1A), while the total number of spectra per run was more than two times lower in CE than LC (2388 ± 137 versus 6014 ± 1089, respectively, p < 0.05; Fig. 1B). To assess the quality of the spectra acquired in each analysis, the number of high-confidence sequences returned was studied. The average number of high-confidence sequences returned was 9.4/min (±1.4) and 14.9/min (±1.7) (p < 0.05) for LC and CE, respectively (Fig. 1C). However, the elevated rate of high-confidence sequence identification for CE must be balanced against the shorter duration of the actual analysis time (19 min versus 60 min). When calculating the average number of unique sequences per run, the difference between CE and LC became nonsignificant (162 ± 15 versus 211 ± 35, respectively, p = 0.49; Fig. 1D).

By further combining the LC and CE sequences obtained in each of the ten analyses, we could identify a total of 905 peptide sequences, with LC generating significantly more unique high-confidence sequences than CE (727 versus 463, respectively; Fig. 2A and Supporting Information Table). However, when analyzing the reproducibility of the sequence identification, we found that CE demonstrated a higher reproducibility: almost 20% of the CE sequences were reproducibly detected in 70–100% of the ten technical replicates, whereas only 7% of the LC sequences were detected with similar reproducibility (Fig. 2B and Supporting Information Table). LC and CE technologies showed clear complementary selectivity, as only 30% (285) of the peptide sequences were identified by both techniques (Fig. 2A and Supporting Information Table).

There may be several reasons for the low overlap between LC and CE identification. A significant limitation of

www.electrophoresis-journal.com


General


Figure 2. (A) Venn diagram depicting the number of unique peptides found in LC-MS/MS, in CE-MS/MS and in both methods of analysis. (B) Cumulative percentage of consistently identified peaks found in CE and in LC analysis. (C) Comparison of the average normalized-ppm area of the unique LC-MS/MS sequences and the sequences found in both LC- and CE-MS/MS (*p < 0.05). (D) Distribution of the normalized-ppm area of the sequences identified in LC-MS/MS depending on their identification also in CE-MS/MS.
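The three regions of the Venn diagram in Figure 2A fix all the totals quoted in the text. As a short arithmetic sketch in Python, using only the counts reported in the article (442 sequences found by LC only, 178 by CE only, 285 by both), the per-technique totals and the rounded shares can be recomputed. The variable names and the "gain from CE" figure are introduced here for illustration.

```python
# Region counts reported in the article for Figure 2A.
lc_only, ce_only, both = 442, 178, 285

total = lc_only + ce_only + both   # all unique sequences
lc_total = lc_only + both          # everything seen by LC-MS/MS
ce_total = ce_only + both          # everything seen by CE-MS/MS


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


shares = {
    "LC only": pct(lc_only, total),
    "CE only": pct(ce_only, total),
    "both": pct(both, total),
}

# Illustrative extra figure: sequences added by CE relative to LC alone.
gain_from_ce = pct(ce_only, lc_total)

print(total, lc_total, ce_total)
print(shares, gain_from_ce)
```

The shares come out close to the rounded 50%, 20% and 30% given in the abstract, and the totals match the 905, 727 and 463 sequences quoted in the text.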

LC compared to CE is the loss of peptides due to either elution in the flow-through (especially small and highly charged peptides) or strong interaction with the LC column (large and hydrophobic peptides). As we only scored the presence of high-confidence identifications, we subsequently investigated whether the peptides were really absent in the LC runs (i.e. due to LC separation bias) or whether they could be found but with spectra of lower quality. Of the 178 unique CE sequences, 78 (44%) were truly unique and 100 (56%) were detectable in LC runs but with lower confidence, not passing the rigid threshold applied (see above). On the other hand, of the 442 unique LC sequences, only 94 (21%) could also be found in CE with lower quality spectra, and 348 (almost 80%) appeared to be unique.

Three main hypotheses could explain this very high number of LC sequences that cannot be found in CE. First, in LC the amount of sample loaded onto the column was substantially higher than in CE, enabling identification of less abundant peptides. We used the standard dilution factor employed in our laboratory for CE-MS (1:50); this is the maximum amount that can be loaded without compromising resolution. For LC-MS/MS analysis, the dilution factor used (1:10) ensures that neither the column nor the detector is overloaded. If CE-MS/MS analysis is to add to the overall sequence identification data, it must be compared with the optimum concentration used for LC-MS/MS analysis in our laboratory. However, this difference in loading may be one explanation for the low number of overlapping peptides. The second hypothesis is that the ionisation efficiency of a number of hydrophobic peptides may be greatly enhanced in the higher organic content mobile phase of the LC gradient system. The third hypothesis is that a major negative factor of LC is the presence of carryover. In LC, the column can never be fully restored and will always present a substantial amount of “contaminating” peptides from the preceding sample runs; in this case, peptides not actually present in the sample may be detected, effectively originating from previously run samples. On the other hand, the CE capillary can be perfectly cleaned and reconditioned with NaOH after each run.

We tested the hypothesis that the unique LC sequences could not be found in CE due to their lower abundance. To do so, we compared the average normalized-ppm area of the 442 unique LC sequences to that of the 285 sequences that could be found in both LC and CE. The peptides identified in LC and not in CE had an average area about ten times lower than the peptides that could be found by both techniques (94 ± 12 versus 1119 ± 193, respectively, p < 0.0001) (Fig. 2C). To confirm this result, we analyzed the area distribution of the 727 LC peptides and showed that the lower the area, the lower the probability of the sequence being found in CE (Fig. 2D). Therefore, although we cannot rule out that some of the 442 high-confidence unique LC peptides are contaminants and not really part of the standard urine low-molecular weight proteome, the high number of peptides that could uniquely be identified by LC is likely due in part to the higher volume and concentration of sample that were used for analysis.

We investigated which were the main protein precursors represented in the standard urine low-molecular weight proteome and found that the large majority of the peptides were fragments of collagens (Table 1 and Supporting Information Table). Interestingly, when looking for the top 20 proteins associated with the urinary peptides in an extensive, open-source database of omics data associated with kidney pathophysiology (http://www.kupkb.org/) [15] and in the Human Protein Atlas (http://www.proteinatlas.org/), 14 of the precursor proteins had already been identified as also


Table 1. Representation of the top 20 proteins in the standard urinary low-molecular weight proteome

Symbol     Name (UniProt accession)                     Detect. in urine   Expr. in kidney
COL1A1     Collagen alpha-1(I) chain (P02452)           +                  +
COL3A1     Collagen alpha-1(III) chain (P02461)         –                  +
COL1A2     Collagen alpha-2(I) chain (P08123)           +                  +
UMOD       Uromodulin (P07911-2)                        +                  +
S100A9     Protein S100-A9 (P06702)                     +                  +
COL2A1     Collagen alpha-1(II) chain (P02458-1)        –                  –
SEMG1      Semenogelin-1 (P04279)                       +                  –
FGA        Fibrinogen alpha chain (P02671)              +                  +
COL16A1    Collagen alpha-1(XVI) chain (A6NDR9)         +                  –
COL5A1     Collagen alpha-1(V) chain (P20908)           –                  –
COL5A2     Collagen alpha-2(V) chain (P05997)           +                  –
S100A8     Protein S100-A8 (P05109)                     –                  ±
COL11A1    Collagen alpha-1(XI) chain (P12107-3)        –                  +
COL4A5     Collagen alpha-5(IV) chain (P29400)          –                  +
SPP1       Osteopontin (P10451-4)                       +                  +
ALB        Serum albumin (P02768-2)                     +                  +
COL11A2    Collagen alpha-2(XI) chain (P13942-5)        –                  ±
COL4A2     Collagen alpha-2(IV) chain (P08572)          +                  +
COL5A3     Collagen alpha-3(V) chain (P25940)           +                  –
GSN        Gelsolin (Q5T0H9)                            +                  +
IGKC       Ig kappa chain C region (P01834)             +                  –

#Sequences (CE, LC, Both)

41 11 8 1 19 4 7 2 2 2 3 9 2 2

90 30 18 13 3 11 4 3 7 8 6

152 53 13 16 6 4 6 8 1

1 4 3 6

6 4 8 6 3 3 7 7

1 1 2

1 1

For each protein, the number of identified peptides with either CE- or LC-MS/MS or both are displayed. Using the KUPKB and Human Protein Atlas in combination, previous detection of the protein in healthy bladder urine and expression in the kidney is shown. +: detected/expressed; –: not detected/expressed; ±: uncertain.
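The detection and expression flags in Table 1 can be tallied directly. The Python sketch below transcribes the two flag columns in table order and counts them; ASCII stand-ins are used for the symbols ("-" for "–" and "+/-" for "±"), and the variable names are illustrative. Note that the table lists 21 symbols.

```python
from collections import Counter

# Symbols and flags transcribed from Table 1, in table order.
symbols = (
    "COL1A1 COL3A1 COL1A2 UMOD S100A9 COL2A1 SEMG1 FGA COL16A1 COL5A1 COL5A2 "
    "S100A8 COL11A1 COL4A5 SPP1 ALB COL11A2 COL4A2 COL5A3 GSN IGKC"
).split()
urine = "+ - + + + - + + + - + - - - + + - + + + +".split()
kidney = "+ + + + + - - + - - - +/- + + + + +/- + - + -".split()

# Each flag column must have one entry per protein.
assert len(symbols) == len(urine) == len(kidney)

urine_counts = Counter(urine)
kidney_counts = Counter(kidney)

# Proteins without a clear "+" for kidney expression (includes the uncertain ones).
not_clearly_renal = [s for s, k in zip(symbols, kidney) if k != "+"]
# Proteins not previously detected in urine according to the table.
not_in_urine = [s for s, u in zip(symbols, urine) if u == "-"]

print(len(symbols), dict(urine_counts), dict(kidney_counts))
print(not_clearly_renal)
print(not_in_urine)
```

The tallies reproduce the 14 proteins previously detected in urine and the twelve expressed in the kidney that are cited in the text; the nine proteins without a clear kidney "+" include the two marked as uncertain.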

being present in the healthy human urinary proteome, and twelve have been shown to be expressed in the kidney (Table 1). The remaining nine proteins might thus originate from an extra-renal source (i.e. systemic production, or the urinary system, such as bladder or ureter). Finding a number of proteins of systemic origin supports the use of urine as a source of biomarkers for a range of extra-renal diseases. Two such proteins, S100 A8 and A9 (Table 1), were better detected in CE than in LC. These peptides are predominantly polar and highly charged, with multiple histidine and lysine residues, and, as mentioned above, may elute in the void volume of the LC column and go undetected. This may also explain why protein S100 A8 has not been previously reported in urine.

In conclusion, in this study we demonstrated that CE-MS/MS and LC-MS/MS are highly complementary approaches, and that employing both techniques results in substantially increased sequence coverage. LC-MS/MS gives a higher number of identified peptide sequences per analytical run. This may to a large degree be a result of the substantially increased time available for analysis, as well as of the ability to load significantly more sample on an LC column, in comparison to CE. Because in the CE analysis all peptides migrate through the capillary based on their mass and charge state, they must all elute from the capillary and hence should be detectable. Therefore, not detecting them must be due to technical issues, whether sensitivity or sampling speed. Increasing the sample concentration in CE may improve sequencing efficiency but has the potential to negatively impact the separation quality.

However, the reason for the missing peptides may not just be the higher sample loading capacity in LC. CE produces much narrower peaks, on the order of 4–6 s wide, compared with 40–60 s in LC. Unlike LC, CE cannot be run at a lower flow rate to increase the time peaks are available for MS/MS analysis. The Orbitrap mass spectrometer, when running in HCD and top 20 MS/MS mode, as was the case here, has a cycle time of around 6 s when running the full top 20 MS/MS scans. This could result in a number of peaks being missed in the CE-MS/MS analysis. Decreasing the number of MS/MS scans to the top 10 may reduce this problem but may also result in low-abundance peptides being missed. Further optimisation of the time required to obtain high-resolution MS/MS spectra may produce improved results. On the other hand, in the LC analysis we cannot fully rule out the possibility that a number of sequences in the LC results arise from carryover from previous miscellaneous samples, which may increase the number of unique peptides reported and explain the lower reproducibility of peptide identifications compared to CE.

Despite the aforementioned limitations, CE-MS/MS led to the identification of a further 20% of the total number of sequences identified, a significant proportion when trying to extensively map any proteome. Increasing the sequence coverage will help to identify peptide biomarkers of disease and unravel the mechanistic pathways associated with the pathophysiology of diseases.
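The sampling argument made in the conclusions can be put into numbers. The short Python sketch below divides the quoted peak widths (4–6 s for CE, 40–60 s for LC) by the quoted cycle time of about 6 s for a full top 20 HCD cycle, giving the approximate number of complete MS/MS cycles that fit within one peak. The function name and the idealisation that a peak is sampled only by whole, evenly spaced cycles are assumptions made for this sketch.

```python
def cycles_per_peak(peak_width_s: float, cycle_time_s: float) -> float:
    """Approximate number of full MS/MS acquisition cycles within one peak."""
    return peak_width_s / cycle_time_s


CYCLE_TIME_S = 6.0  # approximate full top 20 HCD cycle, as stated in the text

# Peak width ranges quoted in the text, in seconds.
peak_widths = {"CE": (4.0, 6.0), "LC": (40.0, 60.0)}

for technique, (narrow, wide) in peak_widths.items():
    low = cycles_per_peak(narrow, CYCLE_TIME_S)
    high = cycles_per_peak(wide, CYCLE_TIME_S)
    print(f"{technique}: {low:.1f} to {high:.1f} cycles per peak")
```

Under these figures a CE peak is covered by at most about one full cycle, whereas an LC peak is covered by roughly seven to ten, which is consistent with the suggestion that some CE peaks are missed entirely.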


H.M. acknowledges support from the FP7-PEOPLE-2009-IAPP program Protoclin (GA 251368) and from HEALTH-2011-278249, EU-MASCARA.

The authors have declared the following potential conflict of interest: J.K. is an employee and H.M. is the founder and co-owner of Mosaiques Diagnostics, who developed the CE-MS technology for clinical application. The other authors do not have competing interests.

References

[1] Mischak, H., Schanstra, J. P., Proteomics Clin. Appl. 2011, 5, 9–23.

[2] Roscioni, S. S., de Zeeuw, D., Hellemons, M. E., Mischak, H., Zurbig, P., Bakker, S. J., Gansevoort, R. T., Reinhard, H., Persson, F., Lajer, M., Rossing, P., Lambers Heerspink, H. J., Diabetologia 2013, 56, 259–267.

[3] Metzger, J., Negm, A. A., Plentz, R. R., Weismuller, T. J., Wedemeyer, J., Karlsen, T. H., Dakna, M., Mullen, W., Mischak, H., Manns, M. P., Lankisch, T. O., Gut 2013, 62, 122–130.

[4] Theodorescu, D., Schiffer, E., Bauer, H. W., Douwes, F., Eichhorn, F., Polley, R., Schmidt, T., Schofer, W., Zurbig, P., Good, D. M., Coon, J. J., Mischak, H., Proteomics Clin. Appl. 2008, 2, 556–570.

[5] Klein, J., Eales, J., Zurbig, P., Vlahou, A., Mischak, H., Stevens, R., Proteomics 2013, 13, 1077–1082.

[6] Metzger, J., Chatzikyrkou, C., Broecker, V., Schiffer, E., Jaensch, L., Iphoefer, A., Mengel, M., Mullen, W., Mischak, H., Haller, H., Gwinner, W., Proteomics Clin. Appl. 2011, 5, 322–333.

[7] Siwy, J., Zoja, C., Klein, J., Benigni, A., Mullen, W., Mayer, B., Mischak, H., Jankowski, J., Stevens, R., Vlahou, A., Kossida, S., Perco, P., Bahlmann, F. H., PLoS One 2012, 7, e51334.

[8] Chalmers, M. J., Mackay, C. L., Hendrickson, C. L., Wittke, S., Walden, M., Mischak, H., Fliser, D., Just, I., Marshall, A. G., Anal. Chem. 2005, 77, 7163–7171.

[9] Zurbig, P., Renfrow, M. B., Schiffer, E., Novak, J., Walden, M., Wittke, S., Just, I., Pelzing, M., Neususs, C., Theodorescu, D., Root, K. E., Ross, M. M., Mischak, H., Electrophoresis 2006, 27, 2111–2125.

[10] Mischak, H., Kolch, W., Aivaliotis, M., Bouyssie, D., Court, M., Dihazi, H., Dihazi, G. H., Franke, J., Garin, J., Gonzalez de Peredo, A., Iphofer, A., Jansch, L., Lacroix, C., Makridakis, M., Masselon, C., Metzger, J., Monsarrat, B., Mrug, M., Norling, M., Novak, J., Pich, A., Pitt, A., Bongcam-Rudloff, E., Siwy, J., Suzuki, H., Thongboonkerd, V., Wang, L. S., Zoidakis, J., Zurbig, P., Schanstra, J. P., Vlahou, A., Proteomics Clin. Appl. 2010, 4, 464–478.

[11] Haubitz, M., Good, D. M., Woywodt, A., Haller, H., Rupprecht, H., Theodorescu, D., Dakna, M., Coon, J. J., Mischak, H., Mol. Cell Proteomics 2009, 8, 2296–2307.

[12] Jantos-Siwy, J., Schiffer, E., Brand, K., Schumann, G., Rossing, K., Delles, C., Mischak, H., Metzger, J., J. Proteome. Res. 2009, 8, 268–281.

[13] Kistler, A. D., Mischak, H., Poster, D., Dakna, M., Wuthrich, R. P., Serra, A. L., Kidney Int. 2009, 76, 89–96.

[14] Zurbig, P., Schiffer, E., Mischak, H., Methods Mol. Biol. 2009, 564, 105–121.

[15] Klein, J., Jupp, S., Moulos, P., Fernandez, M., Buffin-Meyer, B., Casemayou, A., Chaaya, R., Charonis, A., Bascands, J. L., Stevens, R., Schanstra, J. P., Faseb. J. 2012, 26, 2145–2153.
