Journal of Proteomics 109 (2014) 26–37

Available online at www.sciencedirect.com

ScienceDirect www.elsevier.com/locate/jprot

Statistical characterization of HCD fragmentation patterns of tryptic peptides on an LTQ Orbitrap Velos mass spectrometer

Chen Shao a,1, Yang Zhang b,1, Wei Sun c,⁎

a National Key Laboratory of Medical Molecular Biology, Department of Physiology and Pathophysiology, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, 5 Dong Dan San Tiao, Beijing 100005, China
b Department of Neurosurgery/China National Clinical Research Center for Neurological Diseases, Beijing Tiantan Hospital, Capital Medical University, 6 Tian Tan Xi Li, Beijing 100050, China
c Core Facility of Instrument, Institute of Basic Medical Sciences Chinese Academy of Medical Sciences, School of Basic Medicine, Peking Union Medical College, 5 Dong Dan San Tiao, Beijing 100005, China

ARTICLE INFO

Article history:
Received 22 November 2013
Accepted 16 June 2014

Keywords:
Proteomics
HCD
Fragmentation pattern
CID

ABSTRACT

High-energy collisional dissociation (HCD) is an efficient peptide fragmentation method that is widely used in Orbitrap mass spectrometers. A greater understanding of HCD fragmentation patterns would benefit the development of proteomic data analysis algorithms. In this study, b and y ion fragmentation patterns and residue-specific cleavage effects in HCD mode were statistically characterized on a LTQ Orbitrap Velos mass spectrometer. We compared HCD and CID spectra collected in an Orbitrap for the same doubly and triply charged tryptic peptides. Our analytical results revealed novel statistical features of HCD spectra. The intensity of y ions reached a maximum in the 60–70% and 40–50% relative mass bins of HCD spectra from doubly and triply charged peptides, respectively. The HCD mode showed a slight preference for generating y ions with lower charges than did CID mode. Singly charged fragment ions dominated the five fragment ions with the highest intensity in HCD spectra. Hydrophobic residues for b ions were the primary differences in cleavage selectivity between the two modes, while residues for y ions showed a similar cleavage preference. These results will assist with the development of database search engines and the design of proper transitions for targeted proteomic analysis.

Biological significance

Orbitrap mass spectrometry is becoming the popular instrument for proteomic analysis, and HCD mode is becoming the main analysis mode for its high resolution and high sensitivity. This study characterizes the features of HCD spectrum by comparing with CID mode on LTQ Velos Orbitrap. The patterns of b and y ions in HCD and CID modes were systematically compared for the first time, including the charge state, ion frequency and intensity and cleavage selectivity. These results will help develop database search engines and design proper transitions for targeted proteomic analysis in the future.

© 2014 Elsevier B.V. All rights reserved.

⁎ Corresponding author. E-mail addresses: [email protected] (C. Shao), [email protected] (Y. Zhang), [email protected] (W. Sun). 1 These authors contributed equally to this work.

http://dx.doi.org/10.1016/j.jprot.2014.06.012 1874-3919/© 2014 Elsevier B.V. All rights reserved.


1. Introduction

Tandem mass spectrometry (MS/MS) is widely used for high-throughput analysis of complex proteomes. Peptides are fragmented in mass spectrometry, and MS/MS spectra are then analyzed with database searching or de novo sequencing algorithms to identify peptide sequences. Detailed understanding of the fragmentation process would assist in the development of peptide identification algorithms and also the design of transitions for multiple reaction monitoring (MRM) experiments.

Collision-induced dissociation (CID) in ion trap instruments is a low energy resonant-excitation process that activates/deactivates ions continuously at rates of 1–100 s−1 [1]. CID has been used successfully in recent decades; however, it has an intrinsic drawback in that fragment ions less than one-third of the precursor ion m/z are lost in the MS/MS spectra, a phenomenon called the one-third effect.

High energy collisional dissociation (HCD) was first implemented in the octopole collision cell of the LTQ Orbitrap XL in 2007 [2]. It is now widely used in state-of-the-art instruments such as the Q-Exactive and LTQ Orbitrap Fusion. HCD is a beam-type CID similar to the quadrupole CID performed in triple quadrupole and Q-ToF instruments. HCD has higher energy and shorter activation time (~0.1 ms) than linear ion trap CID (~30 ms) [1]. HCD overcomes the one-third effect [2] and thus generates abundant informative ions in the low-mass region, such as the characteristic a2 ion, the b2 ion pair, y1 and y2 ions, immonium ions and reporter ions for isobaric tags (e.g., iTRAQ and TMT) [3,4], facilitating the studies of post-translational modifications [2,5–8], protein quantification and de novo sequencing [9].

Currently, the mobile proton model has been the most widely accepted theory for the interpretation of the gas phase fragmentation process [10,11]. This model assumes that protons are initially localized to the most basic sites of protonated peptides.
After ion activation, the ionizing proton can be transferred to the amide carbonyl oxygens along the peptide backbone to cause fragmentation of peptide amide bonds. By considering the peptide ion charge state in relation to the number of basic amino acid residues, protons on peptide ions can be categorized as ‘mobile’, ‘partially-mobile’ and ‘non-mobile’ [12].

The mobile proton model is a general framework that describes the fundamental chemistry of gas phase peptide fragmentation. Different fragmentation methods show different features of this process. In a global comparison, Graaf et al. [13] measured the similarity of MS/MS spectra acquired from quadrupole CID, ion trap CID and HCD by calculating cross-correlation (Xcorr) scores. Their observations demonstrated that spectra acquired by quadrupole fragmentation were much more similar to spectra acquired by Orbitrap HCD than to ion-trap CID spectra.

Detailed rules for some fragmentation methods have been discovered with statistical analysis of relatively large datasets of MS/MS spectra. Among all of the commonly used fragmentation methods, ion trap CID has been the best studied. Sequence-dependent or residue-dependent fragmentation behaviors that include, but are not limited to, basic residues have been found to be common in the CID process [10,12,14,15]. For example, enhanced cleavage N-terminal to proline (P) and C-terminal to aspartic acid were considered to be the primary cleavage effects in the charge-directed and charge-remote pathways, respectively [12]. Enhanced cleavage C-terminal to histidine (H) was also found to be significant for the formation of b ions [14]. Zubarev and colleagues [16,17] also compared CID cleavage selectivity to ECD (electron capture dissociation) on an LTQ-FT instrument. They found that the yn-2 (n refers to peptide length) fragment had the highest intensity of all y ions. A similar study of quadrupole CID was performed on a Q-TOF [18] instrument and revealed common cleavage selectivity among various collision-activated fragmentation methods as well as several unique features for each method.

As for HCD, it provides fragment ions in the full mass range. Therefore, a typical a2–b2 ion pair as well as other small informative fragment ions can be detected in HCD spectra. y ions were observed to dominate HCD spectra, whereas b ions contributed little because they were less stable due to higher collision energy. For peptide PTM analysis, HCD provides characteristic immonium ions containing modified residues [2,5,19]. Although both HCD and quadrupole CID are beam-type CID, patterns of these two fragmentation methods are expected to be slightly different due to the difference in collision energy. Statistical investigation of large datasets from high-resolution MS/MS spectra is needed to reveal more information on HCD fragmentation patterns.

Recently Michalski et al. [19] systematically characterized ion type composition in HCD spectra. They found that HCD spectra were more complex than ion trap CID spectra. Regular a, b and y ions accounted for only 53% of total fragment ion intensity in HCD spectra, compared to 72% in ion trap CID spectra. The low mass regions in HCD spectra contained abundant internal fragments, side chain fragments and immonium ions.
Additionally, b ions accounted for only about 30% amino-acid coverage in HCD spectra, whereas the information content of b ions in ion trap CID spectra was increased greatly by prominent high-mass b ions.

In this study, we investigated b- and y-ion fragmentation patterns from HCD by comparing FT-FT-HCD to FT-FT-CID MS/MS spectra on an LTQ Orbitrap Velos. Features such as b and y ion charge state, identification rate and intensity distribution were compared. We also characterized statistically the influence of residues adjacent to the fragmentation sites for the formation of b and y ions in the two modes. Currently, peptide identification algorithms make only minimal use of fragment ion intensity information. The characterization of detailed fragmentation rules, especially residue-dependent fragmentation effects, would undoubtedly help to incorporate more spectral information into these algorithms and thus improve the confidence and efficiency of proteomic data analysis.

2. Materials and methods

2.1. Apparatus

The LTQ Orbitrap Velos mass spectrometer used in this study was from Thermo Fisher (Bremen, Germany). The ACQUITY UPLC system was from Waters (Milford, MA), and the Advance Captive Spray for Thermo Fisher and the C18 reverse phase capillary were from Michrom Bioresources (Auburn, CA).


2.2. Reagents

Deionized water from a Milli-Q RG ultrapure water system (Millipore, Bedford, MA) was used in all experiments. HPLC grade ACN and formic acid, ammonium bicarbonate, iodoacetamide, DTT, sequencing grade modified trypsin, and PMSF were from Sigma-Aldrich (St. Louis, MO).

2.3. Protein sample preparation

Human urine extracts were obtained from five healthy men by mixing first-void urine samples and precipitating proteins with −20 °C acetone. Briefly, five first-void urine samples were centrifuged at 5000 g for 30 min and the precipitate was removed. Equal volumes of the supernatants were mixed, and the mixed sample was precipitated with 50% −20 °C acetone for 2 h, followed by centrifugation at 12,000 g for 30 min. Pellets were re-suspended in lysis buffer (7 M urea, 2 M thiourea, 50 mM Tris, 50 mM DTE, 1 mM PMSF, 1 mM RNAse, and 1 mM DNAse).

The human liver carcinoma cell line Huh7.5.1 was grown in DMEM supplemented with dialyzed bovine serum, 2 mM L-glutamine, 100 U/mL penicillin, and 100 mg/mL streptomycin. Cells were harvested in 50 mL tubes by spinning for 5 min at 400 g, washed five times with phosphate-buffered saline, and mixed with lysis buffer in a homogenizer on ice. Urine and cell proteins were quantified using the Bradford method.

2.4. Protein digestion

Protein samples were digested using the filter-aided sample preparation method as previously described [20]. Briefly, protein samples were reduced with 20 mM DTT at 95 °C for 3–5 min and washed once with 8 M urea on a 10 kDa filter at 14,000 g for 40 min. Mixtures were alkylated with 55 mM iodoacetamide for 30 min in the dark and washed twice with 8 M urea. Protein samples were washed with 50 mM ammonium bicarbonate once and digested with trypsin (1 μg/50 μg protein) overnight at 37 °C. Peptide mixtures were desalted on a Waters Oasis C18 solid phase extraction column. All samples were lyophilized for mass spectrometry analysis.

2.5. LC–MS/MS

Lyophilized samples were re-dissolved in 0.1% formic acid (buffer A) and analyzed on a reverse phase C18 capillary LC column from Michrom Bioresources (100 μm × 150 mm, 3 μm, 0.5 μL/min). The elution gradient was 5–30% buffer B (0.1% formic acid, 99.9% ACN) over 100 min. Three LTQ Orbitrap Velos modes (FT-LTQ-CID, FT-FT-CID, and FT-FT-HCD) were used for analysis. Instrument settings for the three modes are listed in Table 1. Other settings included internal mass calibration (the 445.120025 ion as lock mass with a target lock mass abundance of 0%), charge state screening (excluding precursors with an unknown or +1 charge state), and dynamic exclusion (exclusion list size 500 and exclusion duration 90 s).

2.6. Data download

A large collection of HCD data produced by Michalski et al. [19] were used to validate HCD fragmentation patterns observed in our dataset. Raw files were downloaded at Tranche (www.proteomecommons.org) using the hash code: pI2oaLaSi7gPxUWNbesdXCgR17sWvMY6qVkHL þ MtWA0Q 5sqn= UxZVSjk3KpFTfrmDYpf3y=Iv6WfaAi6‐HaILdZL0YocAAAAAAAAT7Q ¼¼

2.7. Data processing

MS/MS spectra were used to search the International Protein Index human database (version 3.07) from the European Bioinformatics Institute website (www.ebi.ac.uk/IPI/) using Mascot software version 2.3.02 (Matrix Science, UK). Trypsin was chosen for cleavage specificity with two as the maximum number of allowed missed cleavages. Carbamidomethylation was a fixed modification. The searches were performed using a peptide tolerance of 5 ppm and a product ion tolerance of 0.5 Da for the FT-LTQ-CID data and 0.02 Da for the FT-FT-CID and FT-FT-HCD data. Target and decoy database searches were processed using the postprocessor Percolator [21] with data filtered at p < 0.05. The false discovery rate for all results was about 1% at the spectral level.

The three datasets from Michalski et al. [19] were processed using the same parameters as the FT-FT-HCD mode. The raw files were searched against the human, yeast and Escherichia coli Swiss-Prot databases from the UniProt website (www.uniprot.org).

2.8. Spectrum preprocessing

Spectra of doubly and triply charged peptides identified in all replicates in FT-LTQ-CID, FT-FT-CID and FT-FT-HCD modes were extracted from cell and urine samples. Since there were no technical replicates in Michalski's dataset, doubly or triply charged peptides with at least two spectra were selected for further processing.

Table 1 – LTQ Orbitrap Velos parameter settings for the three modes.

Parameter                            FT-LTQ-CID   FT-FT-CID   FT-FT-HCD
MS resolution                        60,000       30,000      30,000
MS/MS analyzer                       LTQ          Orbi        Orbi
MS/MS activation time (ms)           10           10          0.1
MS/MS resolution                     Normal       7500        7500
AGC for MS/MS                        5000         50,000      50,000
MS/MS maximal injection time (ms)    100          500         500
Data dependent scan number           20           10          10
Normalized collision energy          35           35          40
MS/MS trigger threshold              500          500         5000


Spectra were preprocessed by the following steps: (1) Peaks were de-isotoped as described by Chi et al. [9]. The charge states of precursor ions were determined according to the isotope peak distributions. (2) Peak intensities were normalized so that the most abundant peak had an intensity of 10,000. (3) b and y ions were assigned within a mass tolerance of 0.02 Da for spectra from the FT-FT-CID and FT-FT-HCD modes and 0.5 Da for those from the FT-LTQ-CID mode. Charge states of fragment ions could be +1 and +2 for doubly charged peptides, and +1 to +3 for triply charged peptides.
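Steps (2) and (3) can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names, the peak representation as (m/z, intensity) pairs and the rule of keeping the most intense peak within the tolerance window are assumptions made for illustration. Residue masses are standard monoisotopic values, with cysteine carrying the fixed carbamidomethyl modification; de-isotoping (step 1) is not covered.

```python
# Monoisotopic residue masses in Da. C includes carbamidomethylation (+57.02146).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 160.03065, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def normalize_peaks(peaks, top=10000.0):
    """Scale intensities so that the most abundant peak equals `top`."""
    max_intensity = max(intensity for _, intensity in peaks)
    return [(mz, intensity * top / max_intensity) for mz, intensity in peaks]


def theoretical_ions(peptide, max_charge):
    """Return {(series, index, charge): m/z} for all b and y ions."""
    masses = [RESIDUE_MASS[residue] for residue in peptide]
    ions = {}
    for k in range(1, len(peptide)):
        b_neutral = sum(masses[:k])           # N-terminal fragment
        y_neutral = sum(masses[-k:]) + WATER  # C-terminal fragment
        for z in range(1, max_charge + 1):
            ions[("b", k, z)] = (b_neutral + z * PROTON) / z
            ions[("y", k, z)] = (y_neutral + z * PROTON) / z
    return ions


def assign_ions(peaks, peptide, precursor_charge, tol=0.02):
    """Match peaks to b/y ions within `tol` Da.

    Fragment charges run from +1 up to the precursor charge (2 or 3).
    Assumption: when several peaks fall in the window, the most intense wins.
    """
    assigned = {}
    for key, mz_theory in theoretical_ions(peptide, precursor_charge).items():
        matches = [i for mz, i in peaks if abs(mz - mz_theory) <= tol]
        if matches:
            assigned[key] = max(matches)
    return assigned
```

For ion trap readout (FT-LTQ-CID), the same sketch would be called with `tol=0.5`.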

2.9. Creation of peptide consensus spectrum

Spectra for the same peptide ions were merged to create a consensus spectrum (consisting of only b and y ions) similar to the method of Lam et al. [22]. First, the similarity between two spectra for each peptide was measured by the cosine of the angle between them, as described by Tabb et al. [14]. The equation is

\[
\frac{\sum I_a I_b}{\sqrt{\sum I_a^{2} \sum I_b^{2}}} \tag{1}
\]

where Ia and Ib are the intensity of the same fragment ion in spectra a and b, respectively. As shown in Supplemental Fig. 1, nearly 90% of the spectrum pairs had a similarity score of greater than 0.95. Therefore 0.95 was used as a threshold to determine whether two spectra had a similar pattern.

Second, for each peptide ion in each mode, dissimilar spectra were eliminated from the spectrum pool. This step aimed to remove doubtful PSMs and ensure that the consensus spectrum represented the typical fragmentation pattern of the peptide ion. Spectra for the same peptide ions were clustered based on the following rules: (1) each spectrum had at least one similar spectrum (with a similarity score of no less than 0.95) in its cluster; (2) any two spectra from different clusters were dissimilar (with a similarity score below 0.95). We retained the largest cluster for further processing. Spectra in other clusters were removed.

Finally, for each peptide ion in each mode, intensities of fragment ions in the spectra from the largest cluster were merged. Peak intensities for each fragment ion were defined as the weighted average of corresponding peak intensities. Weight was defined according to the spectrum Mascot score, which is an index of overall spectral quality. Ions identified in less than half of the spectra were removed to avoid random matches. All scripts for spectra processing and statistical analysis were written in R (version 2.11.1).
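The three steps above (similarity scoring, clustering, merging) can be sketched in Python as follows. This is an illustrative reading rather than the original R scripts. Two points are assumptions: the clustering rules are interpreted as connected components of a graph that links spectra with similarity of at least 0.95, and an ion missing from a spectrum contributes zero intensity to the weighted average. Function names and the dict-based spectrum representation are hypothetical.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Eq. (1): spectra are dicts mapping a fragment ion label to its intensity."""
    ions = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(k, 0.0) * spec_b.get(k, 0.0) for k in ions)
    norm = math.sqrt(sum(v * v for v in spec_a.values())
                     * sum(v * v for v in spec_b.values()))
    return dot / norm if norm > 0 else 0.0


def largest_cluster(spectra, threshold=0.95):
    """Indices of the largest group of spectra linked by similarity >= threshold."""
    n = len(spectra)
    seen = [False] * n
    best = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, component = [start], []
        while stack:  # depth-first walk over similar spectra
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if not seen[j] and cosine_similarity(spectra[i], spectra[j]) >= threshold:
                    seen[j] = True
                    stack.append(j)
        if len(component) > len(best):
            best = component
    return sorted(best)


def consensus_spectrum(spectra, scores, threshold=0.95):
    """Score-weighted average intensities over the largest cluster.

    Ions seen in fewer than half of the cluster's spectra are dropped.
    Assumption: a missing ion counts as zero intensity in the average.
    """
    members = largest_cluster(spectra, threshold)
    total_weight = sum(scores[i] for i in members)
    merged = {}
    for ion in set().union(*(spectra[i] for i in members)):
        n_observed = sum(1 for i in members if ion in spectra[i])
        if n_observed * 2 < len(members):
            continue
        merged[ion] = sum(scores[i] * spectra[i].get(ion, 0.0)
                          for i in members) / total_weight
    return merged
```

Here `scores` would hold the Mascot score of each spectrum, in the same order as `spectra`.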

3. Results and discussion

3.1. Overall spectrum similarities between the FT-LTQ-CID, FT-FT-CID and FT-FT-HCD modes

Cell lysates and urine extracts were analyzed in the FT-LTQ-CID, FT-FT-CID and FT-FT-HCD modes, and each sample was analyzed twice in each mode. The total number of MS/MS scans, PSMs and unique peptides identified in the two runs at 1% FDR are presented in Supplemental Table 1 (detailed data in Supplemental Tables 2 and 3). To avoid the comparison bias from different peptide sequences, only peptides identified in both runs from all of the three modes were selected for comparison. To further eliminate randomness in fragment ion generation and detection, we created a consensus MS/MS spectrum library for each mode by merging fragment ion intensities in spectra from the same peptide ions. Peaks representing b and y ion series, which were the most abundant ion series in both CID and HCD spectra, were extracted. To make the peak intensities of different spectra comparable, the intensity of the most abundant peak in each spectrum was normalized to 10,000.

To evaluate spectrum similarities between the three modes, the distribution of similarity scores between any two spectra for the same peptide ion in the same mode was calculated as a reference (Supplemental Fig. 1). The median similarity scores were 0.98, 0.99 and 0.99 in the FT-LTQ-CID, FT-FT-CID and FT-FT-HCD modes, respectively, indicating that all of the three modes achieved very good reproducibility in the generation of b and y ion series.

Table 2 shows the median similarity score between the three modes of the total/b/y ion pattern. Spectra in the FT-LTQ-CID and FT-FT-CID modes showed very similar patterns, suggesting that the mass analyzer used for the MS/MS scan might not be an influencing factor for the spectral pattern. The total ion similarity between these two modes and the FT-FT-HCD mode ranged from 0.65 to 0.71, which indicated a considerable difference between the two fragmentation methods. Moreover, the similarity score for b ions between any two modes was significantly lower than for y ions, implying that greater variation might have occurred in the formation of b ions.

3.2. Comparison of CID and HCD fragmentation patterns

Consensus spectra for the same peptides identified in the FT-FT-CID and FT-FT-HCD modes were compared to reveal the differences between CID and HCD fragmentation patterns. These two modes employed the same instrument settings and different fragmentation methods, which guaranteed that the comparison would be fair. And, importantly, the high-resolution Orbitrap analyzer for the MS/MS readout helped to unambiguously annotate fragment ions and eliminate the influence of random noise. A total of 3789 and 1718 consensus spectra for doubly and triply charged peptides, respectively, were included in the comparison.

To validate HCD spectral patterns acquired from our dataset, raw mass spectrometric files from the study of Michalski et al. [19] were used. In their study, E. coli, yeast and HeLa cell proteomes were analyzed on an LTQ Orbitrap Velos in FT-FT-HCD mode with the same collision energy as for our HCD experiments. 49,905, 62,849 and 69,736 PSMs and 16,551, 26,491 and 39,479 peptides were identified in the three proteomic datasets at 1% FDR (Supplemental Tables 4–6). Consensus spectra were created for peptides with at least two spectra in each dataset. A total of 19,010 doubly charged peptides and 8958 triply charged peptides were included in that dataset.

Table 2 – Median similarity scores of the total, b and y ions between the three modes.

             FT-LTQ-CID vs FT-FT-CID    FT-LTQ-CID vs FT-FT-HCD    FT-FT-CID vs FT-FT-HCD
             Charge +2    Charge +3     Charge +2    Charge +3     Charge +2    Charge +3
Total ions   0.98         0.97          0.66         0.65          0.69         0.71
b ions       0.93         0.89          0.18         0.40          0.19         0.51
y ions       0.99         0.98          0.78         0.76          0.80         0.79

Fig. 1 – Fragment ion charge state versus relative mass. The relative mass range for fragment ions was divided into 10 bins. The percentages of fragment ions with different charge states were calculated for the relative mass bins. Left 10 bars in each panel, b ions; right 10 bars, y ions.


In the following part of this study, our CID and HCD spectrum library and Michalski's HCD spectrum library are called the “CID”, “HCD” and “HCD2” datasets, respectively.

3.2.1. Charge state of fragment ions

Fig. 1 displays the charge state versus relative mass distributions of b and y ions in the three datasets. Relative mass was calculated as fragment ion mass divided by precursor mass. The percentage of multiply charged fragment ions increased gradually with increasing relative mass in both modes. For doubly charged peptides, singly charged fragment ions dominated both CID and HCD spectra. In high relative mass regions, the proportion of singly charged y ions in the HCD spectra was significantly greater than in CID spectra. For triply charged peptides, most fragment ions in the high relative mass regions in both CID and HCD datasets had multiple charges. However, singly charged y ions dominated in all relative mass regions in the HCD2 dataset. There might be two possible explanations for the inconsistency of y ion charge state distribution in the two HCD datasets. One is the differences between instrument configurations and conditions between the two mass spectrometers. The other is the bias of peptide sequences in the two datasets. Nevertheless, HCD spectra in both datasets showed preference for singly charged y ions compared to CID spectra.

3.2.2. Identification rate of fragment ions

Fig. 2 compares the average fragment ion identification rates of spectra in the CID and HCD datasets. Compared to CID mode, HCD mode tended to identify fragment ions with lower relative mass. In Mann–Whitney tests, the HCD mode had higher identification rates than the CID mode in relative mass regions below 30% for b ions and 50% for y ions in spectra for doubly charged peptides, and below 50% for b ions and 70% for y ions in spectra for triply charged peptides (Supplemental Fig. 2). A large difference between the two modes was observed in regions with very low relative mass in all sub-figures.

Fig. 2 – Fragment ion identification rate versus relative mass. Identification rate was the number of b or y ions observed in an MS/MS spectrum divided by the theoretical number of b or y ions within the relative mass region. Fragment ions with different charge states were counted only once. A. Average identification rates among doubly charged peptides. B. Average identification rates among triply charged peptides.
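The relative mass binning and the identification rate defined in the caption of Fig. 2 can be sketched in Python as below. The function names and input layout are hypothetical, and assigning a relative mass of exactly 100% to the last bin is an assumption. Fragments are keyed by (series, index) without charge, so an ion seen in several charge states is counted once, as stated in the caption.

```python
def mass_bin(fragment_mass, precursor_mass, n_bins=10):
    """Index of the relative mass bin; bin 0 covers 0-10%, bin 9 covers 90-100%."""
    relative = fragment_mass / precursor_mass
    return min(int(relative * n_bins), n_bins - 1)


def identification_rate(theoretical, observed, precursor_mass, n_bins=10):
    """Observed / theoretical fragment counts per (series, bin).

    theoretical: dict mapping (series, index) to the neutral fragment mass
    observed:    set of (series, index) detected in any charge state
    """
    counts = {}
    for (series, index), mass in theoretical.items():
        key = (series, mass_bin(mass, precursor_mass, n_bins))
        total, hits = counts.get(key, (0, 0))
        hit = 1 if (series, index) in observed else 0
        counts[key] = (total + 1, hits + hit)
    return {key: hits / total for key, (total, hits) in counts.items()}
```

Averaging the returned rates over all consensus spectra of one charge state would give one bar per bin, which is how the panels of Fig. 2 are described.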


In addition, the overall identification rate for CID spectra for both b and y ions decreased significantly when spectra from triply charged peptides were compared to spectra from doubly charged peptides, while the identification rate of b ions with moderate relative masses (30–50%) increased in HCD mode.

3.2.3. Intensity distribution of fragment ions

Intensity distributions for fragment ions in the CID and HCD datasets are presented in Fig. 3. Spectra in the HCD2 dataset had very similar distributions to spectra in the HCD dataset (Supplemental Fig. 3). For b ions, peak intensities in the 10–20% relative mass bin were distinctly higher than intensities in other relative mass regions in HCD spectra for both doubly and triply charged peptides. After examining the intensity distribution of all possible b ions, we found that singly charged b2 ions contributed the most to the peak intensity in this bin (Table 3).

For y ions, the intensity distributions in both CID and HCD modes showed similar trends in the spectra from doubly charged peptides. In regions with a relative mass above 20%, y-ion intensity increased gradually with relative mass and reached a maximum in the 60–70% relative mass bin, indicating that fragmentation preferentially occurred in the center of the peptide but was slightly biased to the N-terminal side. In spectra from triply charged peptides, the intensity of y ions in HCD spectra reached a maximum in the 40–50% relative mass bin. However, a dip in y-ion intensity in the 60–70% relative mass bin of the CID spectra was found.

Intensity distributions of fragment ions were investigated previously using ion trap [14] and Q-TOF [18] instruments. Both instruments showed a maximum intensity of y ions at approximately 60% of the precursor mass of a doubly charged peptide. Among CID spectra generated by a LTQ-FT mass spectrometer, Zubarev et al. reported that yn-2 (n refers to peptide length) fragments had the highest intensity of all y ions [17]. The same phenomenon was also observed in our CID dataset. yn-2 ions had the highest abundance in 19.4% and 25.8% of spectra for doubly and triply charged peptides, respectively. This position-specific cleavage preference appears to be specific to the CID mode, as no clear position preference was found in either of the two HCD datasets.

The maximum intensity of b ions was observed in the 15–20% relative mass bin for Q-TOF and the 45% bin for ion trap. The intensity distribution of the HCD spectra observed in our study was similar to Q-TOF spectra. There may be a difference in the b-ion intensity distribution between the HCD and CID modes because b ions with large relative masses in HCD processes may undergo secondary fragmentation to produce smaller b ions, a ions or smaller species because of higher collision energy [23], while CID fragmentation was limited by the low-mass cutoff effect and did not provide enough energy for secondary peptide fragmentation to form these ions [19]. This explanation is supported by the fact that Michalski et al. observed a dramatic increase in the number of immonium ions and internal fragments in HCD spectra compared to CID spectra.

Table 3 – Median intensities of singly charged b1–b4 ions in HCD spectra.

       Charge +2              Charge +3
       HCD        HCD2        HCD        HCD2
b1     0.00       0.00        0.00       0.00
b2     2852.65    2247.05     1694.36    1173.91
b3     1151.50    598.70      631.53     443.91
b4     290.71     78.75       367.91     310.86

Fig. 3 – Distributions of fragment ion intensity in CID and HCD spectra. Bar, median peak intensity among ions for a relative mass bin; line above, 75th percentile of peak intensities; line below, 25th percentile. Peak intensities for the same fragment ion but with different charge states were summed. The intensity of missing ions was assigned a value of zero. For CID spectra: A. The intensity distribution for doubly charged peptides. B. The intensity distribution for triply charged peptides. For HCD spectra: C. The intensity distribution for doubly charged peptides. D. The intensity distribution for triply charged peptides.

3.2.4. The five highest intensity fragment ions

Fragment ions with high intensity are valuable for targeted proteomic analysis. We calculated the composition of the five fragment ions with the highest intensity in CID and HCD spectra (Table 4). For the same precursor ion, a median of three of the top five fragment ions were shared between CID and HCD spectra. Singly charged b2 ions were among the top five fragment ions in 59.5% and 33.7% of doubly and triply charged peptides, respectively, in the HCD dataset, and in 59.2% and 26.4% of doubly and triply charged peptides, respectively, in the HCD2 dataset. These results indicate a considerable level of difference between the CID and HCD spectra. Since spectra from quadrupole instruments are more similar to HCD spectra than to CID spectra, product ions from HCD fragmentation are more suitable for designing MRM transitions than are those from CID fragmentation. At the median, the top five fragment ions in HCD mode comprised one b ion and four y ions, most of them singly charged; singly charged y ions are therefore recommended for designing MRM transitions when experimental data are unavailable.
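The top-five statistics described above can be tabulated with a short Python sketch. The names and the spectrum layout, a dict from (series, index, charge) to intensity, are hypothetical. Whether the original analysis treated the same fragment in two charge states as one ion when counting co-identified ions is not stated; the sketch assumes that charge is part of the ion identity.

```python
from statistics import median


def top_five(ions):
    """The five most intense ion keys of one consensus spectrum."""
    return sorted(ions, key=ions.get, reverse=True)[:5]


def composition(ions):
    """Counts of b and y ions and of each charge state among the top five."""
    top = top_five(ions)
    n_b = sum(1 for series, _, _ in top if series == "b")
    charges = {}
    for _, _, z in top:
        charges[z] = charges.get(z, 0) + 1
    return {"b": n_b, "y": len(top) - n_b, "charges": charges}


def shared_top_ions(cid_ions, hcd_ions):
    """Number of ions present in the top five of both spectra."""
    return len(set(top_five(cid_ions)) & set(top_five(hcd_ions)))


def median_shared(pairs):
    """Median number of shared top-five ions over (CID, HCD) spectrum pairs."""
    return median(shared_top_ions(cid, hcd) for cid, hcd in pairs)
```

Applying `composition` to every HCD consensus spectrum and taking medians per field would yield one column of a table shaped like Table 4.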

Table 4 – Composition of the five fragment ions with the highest intensity in CID and HCD spectra. Data are presented as median values.

                              Charge +2    Charge +3
Number of co-identified ions  3            3
Ion type (b/y)-CID            1/4          1/4
Ion type (b/y)-HCD            1/4          1/4
Ion type (b/y)-HCD2           1/4          1/4
Charge (+1/+2/+3)-CID         5/0          2/3/0
Charge (+1/+2/+3)-HCD         5/0          4/1/0
Charge (+1/+2/+3)-HCD2        5/0          5/0/0

3.3. Residue-specific fragmentation behaviors in CID and HCD modes

After tryptic digestion, a majority of lysines (K) and arginines (R) are located at the C-terminus, and P is unlikely to be located at the N-terminus. As is shown in Fig. 3, the fragmentation efficiency for both CID and HCD modes varied in different relative mass regions. The distributions of the 20 amino acid residues at the different positions in the peptide were examined. In internal regions of peptides, none of the 20 residues were found to have a significant position bias in all of the three datasets (data not shown). Therefore, residues at positions 3 to n-2 were used to investigate the residue-specific effect. Note that at internal missed cleavage sites, K and R most frequently occurred as KP and RP in peptide sequences due to the selectivity of tryptic digestion. Since proline (P) has a strong cleavage preference on its N-terminal side (as discussed below) that strongly influences the cleavage selectivity of adjacent residues, the cleavage selectivity of K and R was not investigated in this study.

According to the mobile proton model, peptides with and without mobile protons may show significantly different fragmentation patterns. In brief, N-terminal side cleavage of P is the primary cleavage selectivity for peptides with mobile protons, whereas peptides without mobile protons are poorly fragmented and have primary cleavage selectivity C-terminal to Asp [11,12]. Therefore, fragmentation patterns of mobile and non-mobile peptides need to be studied separately. Since only a very small proportion of peptides in the datasets could be categorized as non-mobile peptides, we simply removed this category of peptides from this analysis. Therefore, a total of 3756 doubly and 1717 triply charged peptides in our dataset and 18,833 doubly and 8976 triply charged peptides in the HCD2 dataset remained in this analysis.

To facilitate comparison with previous studies, the residue-specific effect was measured as the “N-bias” index using the following formula as described by Tabb et al. [14]:

\[
\frac{I_N - I_C}{I_N + I_C} \tag{2}
\]

where $I_N$ and $I_C$ represent the intensities of the N- and C-terminal side cleavage products, respectively, of the residue. A positive N-bias indicates a preference for N-terminal side cleavage at a residue, while a negative N-bias indicates a preference for C-terminal side cleavage (also called C-bias). An N-bias value is undefined for a residue when neither its N-terminal nor its C-terminal cleavage product is identified in a spectrum. To ensure that the comparison was based on exactly the same set of amino acid residues, only residues with available N-bias values in both the CID and HCD spectra of the same peptide were used. As shown in Table 3, b2 is typically the strongest b ion in HCD spectra. Therefore, an amino acid at position 2 of a peptide is most likely to have a negative N-bias for b ions, and an amino acid at position 3 a positive N-bias. Amino acids at position 4 might also tend to have a positive N-bias, since b3 ions are generally more intense than b4 ions. To eliminate this positional bias, amino acid residues at positions 5 to n-2 were used for N-bias analysis in this study. Fig. 4 shows the median N-bias value for each residue in the CID and HCD datasets. The N-bias trends for HCD spectra in Fig. 4 were further confirmed with the HCD2 dataset (Supplemental Fig. 4), in which the same trends were observed for all of the residues. Pairwise fragmentation maps illustrating cleavage intensities at all amide bonds for the 18 × 18 residue combinations in the CID and HCD datasets are presented in Fig. 5 for doubly charged peptides and in Supplemental Fig. 5 for triply charged peptides, while maps for the HCD2 dataset are presented in Supplemental Fig. 6. In all datasets, figures for doubly and


Fig. 4 – Median N-bias value for each residue in the CID and HCD datasets. N-bias = (IN − IC) / (IN + IC), where IN and IC represent the intensities of the N- and C-terminal fragment peaks, respectively, of the residue. Intensities of the same fragment ion with different charge states were summed. Missing ions were assigned an intensity of zero. Residues were excluded if neither the N-terminal nor the C-terminal cleavage product was identified. A. b ions for doubly charged peptides. B. y ions for doubly charged peptides. C. b ions for triply charged peptides. D. y ions for triply charged peptides.

triply charged peptides showed similar patterns, indicating that residue-specific fragmentation behavior might not correlate with precursor charge state. Enhanced cleavage at the N-terminal side of P was the most significant residue-specific effect in both CID and HCD modes. This result is consistent with previous observations on instruments with gas-phase fragmentation, such as ion trap [12,14,15,24], MALDI-TOF-TOF [25], Q-TOF [18] and LTQ-FT [16]. P is the only amino acid whose side chain contributes a ring to the peptide backbone. This unusual structure prevents cleavage at the C-terminal side and enhances formation of N-terminal cleavage products for both ion types. Kapp et al. [12] considered this phenomenon to be a primary effect of the charge-directed fragmentation pathway. The N-terminal side cleavage selectivity of P was found to be weaker for b ions than for y ions in HCD spectra, especially for doubly charged peptides, while this difference was not observed in CID spectra (Fig. 5). This might be partially due to the low b-ion identification rate in HCD mode.
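As an illustration of how the N-bias index of Eq. (2) might be computed, the following minimal Python sketch applies the conventions stated in the Fig. 4 caption: intensities are assumed to be already summed over charge states, missing ions count as zero, and residues with neither cleavage product are skipped. The function names and the input layout (a dictionary mapping a 1-based bond index to an intensity, for one ion type) are hypothetical choices made for this example and are not taken from the study's software.

```python
from collections import defaultdict
from statistics import median


def n_bias(i_n, i_c):
    """Return (I_N - I_C) / (I_N + I_C), or None if neither product was seen."""
    total = i_n + i_c
    if total == 0:
        return None
    return (i_n - i_c) / total


def residue_n_bias(peptide, bond_intensity, first=5, last_offset=2):
    """Yield (residue, N-bias) for residues at positions first..n-last_offset.

    bond_intensity[k] is the intensity of the fragment ion (one ion type,
    charge states already summed) produced by cleaving the bond between
    residues k and k+1 (1-based). Missing ions are treated as zero.
    """
    n = len(peptide)
    for pos in range(first, n - last_offset + 1):
        i_n = bond_intensity.get(pos - 1, 0.0)  # bond on the N-terminal side
        i_c = bond_intensity.get(pos, 0.0)      # bond on the C-terminal side
        value = n_bias(i_n, i_c)
        if value is not None:
            yield peptide[pos - 1], value


def median_n_bias(spectra):
    """Median N-bias per residue over (peptide, bond_intensity) pairs."""
    per_residue = defaultdict(list)
    for peptide, bond_intensity in spectra:
        for residue, value in residue_n_bias(peptide, bond_intensity):
            per_residue[residue].append(value)
    return {residue: median(values) for residue, values in per_residue.items()}


# Toy spectrum (invented numbers): strong cleavage N-terminal to P.
example = [("AAAAPGAAK", {4: 90.0, 5: 10.0})]
print(median_n_bias(example))
```

In this toy input, the bond N-terminal to P carries most of the intensity, so P receives a positive N-bias, in line with the proline effect described above.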

Another detected cleavage selectivity was a C-bias of H for b ions in both modes, which has been documented in previous studies [12,14,15,18]. Typically, b ions are formed as protonated oxazolones. However, b ions terminating at H form a more stable 5–5 ring bicyclic structure, resulting in higher intensity [10,26]. Since H can serve as a charge-bearing residue in the fragmentation process, it is of interest to study the relationship between the cleavage site relative to H and the fragment ion charge state. To this end, we analyzed spectra, from all three datasets, of tryptic peptides whose internal sequence contained one H and no other basic residue. A Z-bias index was created to measure the preference for fragment-ion charge state. The formula for Z-bias is (I1 − I2) / (I1 + I2), where I1 and I2 represent the intensities of singly and doubly charged fragment ions, respectively. As shown in Table 5, a strong positive Z-bias (i.e., a preference for singly charged fragment ions) was observed for both b and y ions in both modes, with the exception that no significant Z-bias was observed for b ions from C-terminal side cleavages and a strong negative Z-bias


Fig. 5 – Pairwise intensity maps for bond cleavage at each X–Z residue combination for doubly charged peptides in the CID and HCD datasets. A. b ions, CID. B. y ions, CID. C. b ions, HCD. D. y ions, HCD. Vertical axis: single-letter codes of the residues N-terminal to the cleavage sites (residue X); horizontal axis: residues C-terminal to the cleavage sites (residue Z). Peak intensities of b and y ions were normalized to the summed peak intensities of b and y ions in each spectrum, respectively. The median peak intensity for cleavage at each residue combination is shown according to the color legend. Residue combinations that appeared fewer than 30 times in the dataset are shown as blanks.

was observed for y ions from N-terminal side cleavages in CID spectra of triply charged peptides. The statistical significance of the Z-bias values was verified by Mann–Whitney tests. These results support the observation in Fig. 1 that multiply charged fragment ions are more frequently identified in CID spectra of triply charged peptides. As has been discussed by Tabb et al. [17], protons in multiply charged precursor ions may repel each other and disperse to both b and y ions. Influenced by this repulsive effect, cleavage of a doubly charged peptide is more likely to yield a singly charged b/y ion pair than a doubly charged y ion. However, it is still hard to explain the case of HCD spectra of triply charged peptides, in which N-terminal side cleavage of H tended to retain only one proton on y ions that contain two basic residues. More studies are needed to understand this special process.
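The Z-bias index defined above, (I1 − I2) / (I1 + I2), can be sketched in the same way. The short Python example below groups values by the cleavage-product labels used in Table 5 (bn, bc, yn, yc) and takes the median per label. Skipping observations in which neither charge state was detected is an assumption made here by analogy with the N-bias convention, and the function names and toy intensities are hypothetical. The Mann–Whitney significance test mentioned in the text is not reproduced.

```python
from statistics import median


def z_bias(i1, i2):
    """Return (I1 - I2) / (I1 + I2) for singly (I1) and doubly (I2) charged
    fragment intensities, or None if neither ion was detected (assumption)."""
    total = i1 + i2
    if total == 0:
        return None
    return (i1 - i2) / total


def median_z_bias(observations):
    """Median Z-bias per label from (label, I1, I2) tuples.

    Labels follow Table 5: first letter = ion type (b or y),
    second letter = side of H on which cleavage occurred (n or c).
    """
    grouped = {}
    for label, i1, i2 in observations:
        value = z_bias(i1, i2)
        if value is not None:
            grouped.setdefault(label, []).append(value)
    return {label: median(values) for label, values in grouped.items()}


# Toy observations (invented numbers, not data from the study).
toy = [("yn", 10.0, 0.0), ("yn", 1.0, 3.0), ("yn", 0.0, 5.0),
       ("bc", 4.0, 4.0), ("bn", 0.0, 0.0)]
print(median_z_bias(toy))
```

A value near +1 means the singly charged form dominates, a value near −1 means the doubly charged form dominates, and a value near 0 means neither is preferred.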

Table 5 – Median Z-bias values of the H cleavage products bn, bc, yn and yc. The first letter of each label refers to the type of fragment ion, and the second letter refers to the side of H on which cleavage occurred (n, N-terminal; c, C-terminal).

        Charge +2                   Charge +3
        CID      HCD      HCD2      CID      HCD      HCD2
bn      1        1        1         1        1        1
bc      1        1        1         0.18     1        1
yn      1        0.48     1         −0.74    0.86     1
yc      1        1        1         1        1        1

Tabb et al. found that the three residues with the highest N-bias values for both b and y ions were P, glycine (G) and serine (S) [14]. We observed the same result in the CID dataset. In


both modes, cleavage at the C-terminal side of the three residues was suppressed. The same cleavage preferences for G and S in CID mode were observed by Huang et al. [15], especially for spectra from partially mobile peptides. The only significant difference in G and S cleavage between the CID and HCD modes was that the N-bias for G was not observed for b ions in either HCD dataset. The side chain of G is a single hydrogen atom, which has a much smaller van der Waals radius than the side chains of the other amino acids. Huang et al. suggested that the high flexibility in the Ramachandran phi and psi angles of G might contribute to the strong cleavage preference seen in CID mode [27]. The mechanism behind this difference between the two fragmentation methods needs to be further investigated. The most significant difference between the two modes was in the cleavage of hydrophobic residues. For b ions, HCD spectra showed a strong N-bias for isoleucine (I), leucine (L), valine (V), phenylalanine (F), tyrosine (Y) and tryptophan (W), while CID spectra showed a C-bias or no significant bias for these residues. Huang et al. reported that enhanced cleavage at the N-terminal side of I, L and V appeared in ion trap spectra from partially mobile peptides, whereas enhanced cleavage C-terminal to these residues appeared for fully mobile peptides [15]. Barton et al. found that, for spectra generated on a Q-TOF instrument, enhanced cleavage appeared both N- and C-terminal to L and only N-terminal to I and V [18]. Both of these observations were based on doubly charged peptides. In our data, HCD spectra showed a strong N-bias for I, L and V for b ions from both partially and fully mobile peptides. These observations suggest that, for these residues, the cleavage behavior in HCD mode might be more similar to quadrupole fragmentation than to ion trap fragmentation.
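The distinction between non-mobile, partially mobile and fully mobile peptides used above can be made concrete with a small sketch. The rule below is an assumption: it follows the charge-versus-basic-residue counting scheme commonly attributed to Kapp et al. [12] (non-mobile when the charge does not exceed the number of Arg residues; partially mobile when it exceeds the Arg count but not the total number of Arg, Lys and His residues; mobile otherwise). The present text does not spell out the exact rule that was applied, and the function name is hypothetical.

```python
def proton_mobility(peptide, charge):
    """Classify a peptide ion by proton mobility (assumed counting rule).

    non-mobile:       charge <= number of Arg
    partially mobile: number of Arg < charge <= number of Arg + Lys + His
    mobile:           charge > number of Arg + Lys + His
    """
    n_arg = peptide.count("R")
    n_basic = n_arg + peptide.count("K") + peptide.count("H")
    if charge <= n_arg:
        return "non-mobile"
    if charge <= n_basic:
        return "partially mobile"
    return "mobile"


# Hypothetical sequences chosen only to exercise each branch.
for sequence in ("AAAAK", "AAHAAK", "ARAAAR"):
    print(sequence, proton_mobility(sequence, charge=2))
```

Under this rule, a doubly charged tryptic peptide with a C-terminal K and no other basic residue is classed as mobile, while one with two Arg residues is classed as non-mobile and would have been excluded from the analysis.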
For y ions, a significant C-bias was observed for I, L and V in both modes, while the cleavage selectivity for F, Y and W was very weak. A C-bias of I, L and V for y ions is commonly observed in spectra generated on ion trap [14,15], LTQ-FT [16] and Q-TOF [18] instruments. In conclusion, enhanced cleavage at the N-terminal side of P and S for both b and y ions, and of G for y ions, as well as enhanced cleavage at the C-terminal side of H for b ions and of I, V and L for y ions, were common to both CID and HCD modes. The main differences in the residue-specific effects of the two modes were in the formation of b ions. An N-bias for G was observed in CID spectra, but not in HCD spectra. For the hydrophobic residues I, L, V, F, Y and W, a C-bias or no significant bias was observed in CID spectra, whereas an N-bias was observed in HCD spectra.
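The pairwise maps of Fig. 5, which complement the per-residue summary above, could be assembled along the following lines. This is a minimal sketch based only on the caption: intensities of one ion type are normalized to their sum within each spectrum, the median is taken for each X–Z residue combination, and combinations seen fewer than 30 times are left blank (returned as None here). Excluding pairs that involve K or R, to obtain the 18 × 18 layout, is an assumption, as are the function name and the input layout.

```python
from collections import defaultdict
from statistics import median


def pairwise_map(spectra, min_count=30, excluded="KR"):
    """Median normalized cleavage intensity for each (X, Z) residue pair.

    spectra: iterable of (peptide, bond_intensity) for one ion type, where
    bond_intensity[k] is the intensity for the bond between residues k and
    k+1 (1-based); missing ions count as zero.
    Pairs observed fewer than min_count times map to None (blank cell).
    """
    cells = defaultdict(list)
    for peptide, bond_intensity in spectra:
        total = sum(bond_intensity.values())
        if total == 0:
            continue  # no ions of this type identified in the spectrum
        for k in range(1, len(peptide)):
            x, z = peptide[k - 1], peptide[k]
            if x in excluded or z in excluded:
                continue  # assumed: K and R are left out of the map
            cells[(x, z)].append(bond_intensity.get(k, 0.0) / total)
    return {
        pair: (median(values) if len(values) >= min_count else None)
        for pair, values in cells.items()
    }


# Toy dataset (invented numbers): 30 identical spectra of one short peptide.
toy = [("APGK", {1: 3.0, 2: 1.0})] * 30
print(pairwise_map(toy))
```

Rows of the resulting map correspond to residue X (N-terminal to the cleavage site) and columns to residue Z (C-terminal to it), so an enhanced X–P column reflects the proline effect discussed earlier.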

4. Conclusion

This study performed a quantitative and statistical characterization of HCD spectral patterns in comparison with CID spectral patterns. The comparison was based on high-resolution MS/MS spectra acquired in an Orbitrap analyzer, which helped with unambiguous annotation of fragment ions. Fragmentation patterns were validated on HCD spectra created by Michalski et al. This library was approximately five times larger than our own.

Our analytical results confirmed some known patterns of HCD spectra and provided detailed descriptions of them. 1) HCD tended to generate smaller fragment ions. This phenomenon was observed not just in regions affected by the one-third effect. In spectra from triply charged peptides, both b and y ions with moderate relative mass had higher identification rates in HCD mode than in CID mode. 2) CID and HCD spectra differed significantly in b ion patterns, especially for doubly charged peptides. HCD spectra were characterized by a high-intensity peak of the singly charged b2 ion. This ion had a high probability of being one of the five fragment ions with the highest intensity in HCD spectra.

More importantly, novel statistical features of HCD spectra were discovered. 1) The HCD mode typically generated y ions with moderate relative mass. The intensity of y ions reached a maximum in the 60–70% and 40–50% relative mass bins of HCD spectra from doubly and triply charged peptides, respectively. 2) HCD mode showed a slight preference for generating y ions with lower charges than CID mode. Singly charged fragment ions dominated the five fragment ions with the highest intensity in HCD spectra (five for doubly charged peptides and four for triply charged peptides at the median). 3) The main difference in cleavage selectivity between the two modes was on hydrophobic residues (G, I, L, V, F, Y and W) for b ions, while residues showed similar cleavage preferences for y ions.

HCD is now widely used in newly released instruments such as the Q-Exactive and LTQ Orbitrap Fusion. This study provides valuable statistical data for researchers interpreting HCD data. Proper adoption of b/y ion patterns, such as charge state and intensity distributions as well as residue-specific fragmentation selectivity, would improve the scoring of PSMs in database search algorithms and the deduction of the most probable sequences in de novo sequencing software. Researchers who need to manually validate MS/MS spectra with specific mutations or modifications could also use the results of this study for guidance. In addition, quadrupole CID spectra have patterns similar to those of HCD spectra. Therefore, this study would also guide the selection of MRM transitions when no MS/MS spectra of candidate peptides are available. According to our observations, singly charged y ions with moderate relative mass are the most likely to be detected in HCD spectra, while b ions other than b2 are unlikely to be detected. Furthermore, the most probable cleavage bonds for a given peptide sequence can be predicted from the N-bias values and the pairwise bond cleavage maps.

Acknowledgments

We would like to thank Dr. Hao Chi from the Institute of Computing Technology, Chinese Academy of Sciences, for kindly offering the MS/MS peak deisotope algorithm, and A. Michalski for kindly offering the raw HCD data. This work was supported by the National Natural Science Foundation of China (No. 31200614 and No. 30970650), the Beijing Natural Science Foundation (No. 5132028) and the Foundation for the Author of National Excellent Doctoral Dissertation of P.R. China (No. 2007B64).


Transparency document

The transparency document associated with this article can be found in the online version.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.jprot.2014.06.012.

REFERENCES

[1] Xia Y, Liang X, McLuckey SA. Ion trap versus low-energy beam-type collision-induced dissociation of protonated ubiquitin ions. Anal Chem 2006;78(4):1218–27.
[2] Olsen JV, Macek B, Lange O, Makarov A, Horning S, Mann M. Higher-energy C-trap dissociation for peptide modification analysis. Nat Methods 2007;4(9):709–12.
[3] Bantscheff M, Boesche M, Eberhard D, Matthieson T, Sweetman G, Kuster B. Robust and sensitive iTRAQ quantification on an LTQ Orbitrap mass spectrometer. Mol Cell Proteomics 2008;7(9):1702–13.
[4] Pichler P, Kocher T, Holzmann J, Mohring T, Ammerer G, Mechtler K. Improved precision of iTRAQ and TMT quantification by an axial extraction field in an Orbitrap HCD cell. Anal Chem 2011;83(4):1469–74.
[5] Nagaraj N, D'Souza RC, Cox J, Olsen JV, Mann M. Feasibility of large-scale phosphoproteomics with higher energy collisional dissociation fragmentation. J Proteome Res 2010;9(12):6786–94.
[6] Nagaraj N, D'Souza RC, Cox J, Olsen JV, Mann M. Correction to Feasibility of Large-Scale Phosphoproteomics with Higher Energy Collisional Dissociation Fragmentation. J Proteome Res 2012;11(6):3506–8.
[7] Jedrychowski MP, Huttlin EL, Haas W, Sowa ME, Rad R, Gygi SP. Evaluation of HCD- and CID-type fragmentation within their respective detection platforms for murine phosphoproteomics. Mol Cell Proteomics 2011;10(12):M111.009910.
[8] Frese CK, Altelaar AF, Hennrich ML, Nolting D, Zeller M, Griep-Raming J, et al. Improved peptide identification by targeted fragmentation using CID, HCD and ETD on an LTQ-Orbitrap Velos. J Proteome Res 2011;10(5):2377–88.
[9] Chi H, Sun RX, Yang B, Song CQ, Wang LH, Liu C, et al. pNovo: de novo peptide sequencing and identification using HCD spectra. J Proteome Res 2010;9(5):2713–24.
[10] Wysocki VH, Tsaprailis G, Smith LL, Breci LA. Mobile and localized protons: a framework for understanding peptide dissociation. J Mass Spectrom 2000;35(12):1399–406.
[11] Boyd R, Somogyi A. The mobile proton hypothesis in fragmentation of protonated peptides: a perspective. J Am Soc Mass Spectrom 2010;21(8):1275–8.
[12] Kapp EA, Schutz F, Reid GE, Eddes JS, Moritz RL, O'Hair RA, et al. Mining a tandem mass spectrometry database to determine the trends and global factors influencing peptide fragmentation. Anal Chem 2003;75(22):6251–64.
[13] de Graaf EL, Altelaar AF, van Breukelen B, Mohammed S, Heck AJ. Improving SRM assay development: a global comparison between triple quadrupole, ion trap, and higher energy CID peptide fragmentation spectra. J Proteome Res 2011;10(9):4334–41.
[14] Tabb DL, Smith LL, Breci LA, Wysocki VH, Lin D, Yates 3rd JR. Statistical characterization of ion trap tandem mass spectra from doubly charged tryptic peptides. Anal Chem 2003;75(5):1155–63.
[15] Huang Y, Triscari JM, Tseng GC, Pasa-Tolic L, Lipton MS, Smith RD, et al. Statistical characterization of the charge state and residue dependence of low-energy CID peptide dissociation patterns. Anal Chem 2005;77(18):5800–13.
[16] Zubarev RA, Zubarev AR, Savitski MM. Electron capture/transfer versus collisionally activated/induced dissociations: solo or duet? J Am Soc Mass Spectrom 2008;19(6):753–61.
[17] Savitski MM, Kjeldsen F, Nielsen ML, Zubarev RA. Complementary sequence preferences of electron-capture dissociation and vibrational excitation in fragmentation of polypeptide polycations. Angew Chem Int Ed Engl 2006;45(32):5301–3.
[18] Barton SJ, Richardson S, Perkins DN, Bellahn I, Bryant TN, Whittaker JC. Using statistical models to identify factors that have a role in defining the abundance of ions produced by tandem MS. Anal Chem 2007;79(15):5601–7.
[19] Michalski A, Neuhauser N, Cox J, Mann M. A systematic investigation into the nature of tryptic HCD spectra. J Proteome Res 2012;11(11):5479–91.
[20] Wisniewski JR, Zougman A, Nagaraj N, Mann M. Universal sample preparation method for proteome analysis. Nat Methods 2009;6(5):359–62.
[21] Kall L, Canterbury JD, Weston J, Noble WS, MacCoss MJ. Semi-supervised learning for peptide identification from shotgun proteomics datasets. Nat Methods 2007;4(11):923–5.
[22] Schwartz JC, Senko MW, Syka JE. A two-dimensional quadrupole ion trap mass spectrometer. J Am Soc Mass Spectrom 2002;13(6):659–69.
[23] Paizs B, Suhai S. Fragmentation pathways of protonated peptides. Mass Spectrom Rev 2005;24(4):508–48.
[24] Breci LA, Tabb DL, Yates 3rd JR, Wysocki VH. Cleavage N-terminal to proline: analysis of a database of peptide tandem mass spectra. Anal Chem 2003;75(9):1963–71.
[25] Khatun J, Ramkissoon K, Giddings MC. Fragmentation characteristics of collision-induced dissociation in MALDI TOF/TOF mass spectrometry. Anal Chem 2007;79(8):3032–40.
[26] Farrugia JM, Taverner T, O'Hair RAJ. Side-chain involvement in the fragmentation reactions of the protonated methyl esters of histidine and its peptides. Int J Mass Spectrom 2001;209(2–3):99–112.
[27] Huang Y, Triscari JM, Pasa-Tolic L, Anderson GA, Lipton MS, Smith RD, et al. Dissociation behavior of doubly-charged tryptic peptides: correlation of gas-phase cleavage abundance with Ramachandran plots. J Am Chem Soc 2004;126(10):3034–5.
