www.proteomics-journal.com

Page 1

Proteomics

Recent advances and challenges in plant phosphoproteomics

Cecilia Silva-Sanchez1, Haiying Li2, Sixue Chen1,2,3*

1 Proteomics and Mass Spectrometry, Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL 32610, USA.

2 College of Life Sciences, Heilongjiang University, Harbin 150080, China.

3 Department of Biology, UF Genetics Institute, Plant Molecular and Cellular Biology Program, University of Florida, Gainesville, FL 32610, USA.

Running title: Plant phosphoproteomics

* Corresponding author:

Sixue Chen, Ph.D.
Department of Biology, University of Florida, Gainesville, FL 32610, USA
Tel: +1 (352) 273-8330
E-mail: [email protected]

Received: 25-Aug-2014; Revised: 29-Sep-2014; Accepted: 24-Nov-2014

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1002/pmic.201400410. This article is protected by copyright. All rights reserved.


Abstract

Plants are sessile organisms that need to respond to environmental changes quickly and efficiently. They can accomplish this by triggering specialized signaling pathways often mediated by protein phosphorylation and dephosphorylation. Phosphorylation is a fast response that can switch on or off a myriad of biological pathways and processes. Proteomics and mass spectrometry (MS) are the main tools employed in the study of protein phosphorylation. Advances in these technologies allow simultaneous identification and quantification of thousands of phosphopeptides and proteins, which is essential to understanding sophisticated biological systems and their regulation. In this review we summarize the advances in phosphopeptide enrichment and quantitation, MS for phosphorylation site mapping and new data acquisition methods, databases and informatics, interpretation of biological insights and crosstalk with other PTMs, as well as future directions and challenges in the field of phosphoproteomics.

Key words: Plant, phosphoproteomics, phosphopeptide enrichment, MS, PTM network

1. Introduction

All living organisms have a strict regulatory system to control biological processes in the cell through molecular interactions of thousands of biomolecules. Genes encode the basic biological functions of proteins, which directly impact the phenotype/output of cells. In general, a single gene can produce many protein species due to multiple levels of regulation, such as alternative splicing and posttranslational modification (PTM), that increase proteome diversity [1, 2]. Newly synthesized proteins may not be biologically functional until they undergo PTMs [3]. All the events that lead to a change in the properties of a protein by proteolytic cleavage, addition of modifying groups, or change to one or more amino acids are defined as PTMs [4]. Currently, there are more than 300 known PTM reactions and the number is still increasing [1, 5].
PTMs play a key role in cellular biological functions by altering the folding of given proteins, changing enzymatic activities and/or substrate specificities, regulating protein interaction with other molecules such as proteins, nucleic acids, lipids and cofactors, directing subcellular localization, or signaling for degradation [3]. PTMs are often important molecular switches that can lead to dramatic changes in the regulation of biological pathways and processes even when there are no changes in the total protein or transcript levels [2]. Protein phosphorylation represents 53.5% of all the PTMs based on the published experimental data [2]. Phosphorylation is a ubiquitous PTM present during the entire lifespan of all cells. Approximately one third of all the proteins in eukaryotic cells are phosphorylated at any given time [6]. It is a reversible and often transient modification that regulates essential molecular events in the cell cycle, DNA transcription, and energy metabolism, and important biological processes including seed germination, stomatal movement, innate immune response and defense, and stress tolerance [7-9]. Protein phosphorylation and dephosphorylation are dynamic processes catalyzed by groups of enzymes called kinases and phosphatases, respectively. It is estimated that protein kinases constitute 1.5 to 2.5% of the entire human genome [7]. The reference plant Arabidopsis thaliana has about 1003 different kinases and 200 phosphatases, representing almost 4% of the total proteins and highlighting the importance of protein phosphorylation and dephosphorylation in plants [3, 9]. Phosphorylation occurs mostly on serine (Ser), threonine (Thr), and tyrosine (Tyr) residues in eukaryotes, although histidine (His), aspartate (Asp), glutamate (Glu), lysine (Lys), arginine (Arg) and cysteine (Cys) have been reported to be phosphorylated in prokaryotes [10]. In general, phosphorylation on Ser and Thr is more frequent than on Tyr. A large-scale phosphoproteomics study in Arabidopsis estimated proportions of 85.0%, 10.7% and 4.4% for pSer, pThr and pTyr, respectively, surprisingly close to the proportions calculated from the human genome [9].

Phosphoproteomics is a fast-growing field that aims to accomplish a comprehensive analysis of protein phosphorylation by detecting phosphoproteins and their phosphorylated amino acid residues in a qualitative and quantitative fashion. The goal is to better understand the regulatory roles of protein phosphorylation and dephosphorylation in molecular networks and ultimately the global impact of this PTM on cellular processes and organismal output. In recent years, several technological advances have been made in different aspects of the study of protein phosphorylation. These include sample preparation and improved methods for phosphoprotein/peptide enrichment, MS improvements for better fragmentation and increased sensitivity to achieve high-coverage detection and quantification of phosphoproteins/peptides, and advances in phosphoproteome databases and bioinformatics that allow accurate phosphoprotein prediction and annotation (Figure 1). The goal of this review is to discuss the advances at these different forefronts, as well as the challenges and future directions in the field of plant phosphoproteomics.

2. Phosphoprotein and phosphopeptide separation and fractionation

Due to the dynamic nature of phosphorylation/dephosphorylation processes in the cells, phosphoproteomic studies face several challenges, such as the low stoichiometry of the phosphorylated species in a proteome and the heterogeneity of the phosphorylated species of a given protein. These challenges usually necessitate enrichment steps prior to MS analysis. Another challenge is the limited dynamic range of the techniques employed for phosphorylation analysis, which biases identification toward abundant phosphorylation sites and misses low-abundance phosphopeptides. In this section we review several analytical techniques employed in the analysis of phosphoproteins and phosphopeptides (Figure 1).

2.1 Sample preparation

Several factors have to be taken into account for sample preparation in plant phosphoproteomics. For instance, the cell wall constitutes a natural barrier to efficient protein extraction. In addition, plants are rich in metabolites such as polyphenols, lipids, polysaccharides and many secondary metabolites, which could interfere with subsequent analysis and need to be removed [1, 8]. Furthermore, highly abundant proteins such as ribulose bisphosphate carboxylase/oxygenase (Rubisco) in leaves and storage proteins in seeds can account for 30-60% and 10-40% of the total protein, respectively. They can mask the detection of phosphoproteins of much lower abundance [8, 11]. Moreover, the presence of proteases, phosphatases and kinases can introduce artifacts during sample preparation and complicate phosphoproteomic analysis. An ideal protein extraction protocol needs to ensure a reliable sampling of all the proteins present in planta. There are many protocols for efficient protein extraction from plant materials [1, 8, 11]. In general, they work well for phosphoproteomic studies after the addition of protease, kinase and/or phosphatase inhibitors. Urea-based denaturing buffers were found to prevent protein degradation and non-specific phosphorylation or dephosphorylation during sample preparation [1, 8-10, 12]. All the steps need to be done in ways that are compatible with downstream separation and/or enrichment methods.

2.2 Gel-based global phosphoprotein detection

Two-dimensional gel electrophoresis (2-DE) is a well-established technique that allows the fractionation of thousands of proteins in complex mixtures by isoelectric point (pI, via isoelectric focusing) and molecular weight (MW) [1, 13]. It provides a visual map of all the proteins, allowing direct observation of potential changes in protein abundance and PTMs that are not predictable from genomic information. With the development of fluorescent dyes that selectively detect phosphoproteins, 2-DE can be employed as a high-resolution technique that takes advantage of the pI and MW changes associated with PTMs. The use of Pro-Q Diamond phosphoprotein stain (Pro-Q DPS), which specifically binds to the phosphate moieties of phosphoproteins regardless of the phosphoamino acid, allows direct detection of phosphoproteins. In addition, once the phosphoproteome maps are generated, the resolved and differentially stained proteins can be excised for protein identification. A time course study of phosphorylation events in the seed filling process of Brassica napus revealed over 300 phosphoprotein spots, of which 253 had quantitative information over the five time points. After LC-MS/MS of 103 spots, 70 nonredundant phosphoproteins were identified, but only 16 phosphoproteins were verified to have at least one phosphopeptide based on the MS/MS spectra [14]. When a subsequent total protein staining map is generated, an accurate quantitation of the phosphoprotein changes (by normalizing against total protein changes) can be achieved [15]. In another study, maize leaf phosphoproteome changes in response to mechanical wounding were analyzed using 2-DE in combination with differential staining and protein identification by MS [16]. After rigorous time course experiments, it was possible to select 125 spots that were consistently detected for MS analysis. Only 21 proteins were reliably identified, and a comparison with the literature showed that most of them had previously been described as phosphoproteins.

Apart from all the technical difficulties associated with 2-DE [1, 13, 15], a big disadvantage is that the amount of protein collected in a spot is often inadequate for downstream enrichment methods (see next section), which makes it difficult to confirm the phosphorylation state of the protein and to assign the specific sites of phosphorylation [14, 17, 18]. 2-DE can also be combined with sensitive antibody detection of specific phosphoamino acids, generating a map that can be associated with a specific biological pathway [19]. By applying this approach, Ghelis et al. [20] identified 19 pTyr proteins involved in abscisic acid dependent processes.
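The normalization against total protein changes mentioned in this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure used in refs [14, 15]: the function name and all intensity values are hypothetical, and real gel data would also need background subtraction and replicate handling.

```python
import math

def normalized_phospho_change(phospho_treated, phospho_control,
                              total_treated, total_control):
    """Return log2 changes for one spot: phospho stain, total stain, normalized.

    Inputs are spot intensities (arbitrary units) from a phospho-specific
    stain and from a total-protein stain of the same gel spot.
    """
    values = (phospho_treated, phospho_control, total_treated, total_control)
    if any(v <= 0 for v in values):
        raise ValueError("intensities must be positive")
    log2_phospho = math.log2(phospho_treated / phospho_control)
    log2_total = math.log2(total_treated / total_control)
    # Subtracting log ratios is the same as dividing the phospho ratio
    # by the total-protein ratio.
    return log2_phospho, log2_total, log2_phospho - log2_total

# Hypothetical spot: the phospho signal doubles, but so does protein abundance.
raw, total, normalized = normalized_phospho_change(2000.0, 1000.0, 800.0, 400.0)
```

In this made-up case the apparent two-fold increase in phospho signal is fully explained by the two-fold increase in protein abundance, so the normalized change is zero.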

2.3 Phosphopeptide enrichment and fractionation

As mentioned before, phosphorylated proteins/peptides are often of high heterogeneity and low stoichiometry in a biological sample. In addition, MS has technical issues such as ion suppression and preferential acquisition of high-abundance species. It is therefore often imperative to enrich phosphorylated peptides or proteins prior to MS analysis. The enrichment methods include immunoaffinity enrichment, immobilized metal affinity chromatography (IMAC), metal oxide affinity chromatography (MOAC), Phos-Tag chromatography, prefractionation by ion exchange chromatography (SCX and SAX), hydrophilic interaction liquid chromatography (HILIC), electrostatic repulsion hydrophilic interaction chromatography (ERLIC), polymer-based metal ion affinity capture (PolyMAC), hydroxyapatite chromatography, enrichment by chemical modification, and phosphopeptide precipitation [21]. There are extensive reviews on these methods [1, 6, 10, 21]. Here we highlight the most popular methods and their advantages and disadvantages in phosphoproteomics.

2.3.1 Immunoaffinity enrichment

Traditionally, specific antibodies were used to enrich pTyr-containing peptides due to their underrepresentation in biological samples. The method has been shown to be reliable for the study of pTyr proteins [9, 20, 22]. The development of antibodies that recognize pSer [23] and pThr [24] and their surrounding residues represents an improvement in the immunoaffinity enrichment methods, but the specificity varies according to the neighboring residues of the pSer/pThr. It appears necessary to develop different antibodies to ensure the optimal enrichment of different pSer/pThr proteins [19]. Immunoaffinity enrichment of phosphoproteins can be conducted right after protein extraction, and the proteins can then be resolved by 2-DE. The phosphoprotein maps will provide important information about the experimental pIs and MWs of the proteins, and the gels can be further stained, probed with antibodies, or used for protein identification [9]. For complex samples, several rounds of different antibody-based enrichments are often required in order to achieve high resolution. Recently, some commercial kits have been developed to characterize specific pathways regulated by phosphorylation. By combining a series of antibodies designed to enrich Ser/Thr kinases, Akt/PI3K, and Tyr kinases, core proteins that are involved in many different regulatory pathways, such as Akt signaling, MAP kinase signaling and cell cycle regulation, were successfully detected [19]. Many of the commercial kits have been designed for human or mouse cell lines and have shown good efficiency and specificity [25, 26]. To the best of our knowledge, there are no kits with specific antibodies for phosphoproteins in plant pathways. This is clearly an area to be developed in the future.

2.3.2 Immobilized metal ion affinity based enrichment methods

The combination of metal oxide affinity phosphopeptide/phosphoprotein enrichment and LC-MS/MS has become a popular approach and has led to the identification of thousands of modified peptides/proteins. The immobilized metal ion affinity methods include IMAC, MOAC and Phos-Tag. They take advantage of the negatively charged phosphate groups on the phosphoamino acids, which can interact with positively charged metal ions and compounds [21, 27]. For IMAC, metal ions such as Ni2+, Fe3+, Ga3+, Zr4+ and Ti4+ are chelated to silica or agarose through nitrilotriacetic acid (NTA) or iminodiacetic acid (IDA). When employed in phosphopeptide enrichment, IDA was found to be the most efficient matrix when chelated with Ga3+ [10, 21]. MOAC uses oxides of metals such as titanium (Ti) or zirconium (Zr) as anchoring molecules that trap phosphopeptides through the formation of multi-dentate bonds [21, 28].

Although both metal and metal oxide affinity methods can enrich mono- and multi-phosphorylated peptides, MOAC binds multi-phosphorylated peptides so strongly that elution is difficult, as reflected in its high tendency to identify monophosphorylated peptides [28]. A disadvantage of these enrichment techniques is the nonspecific binding of acidic peptides. Trypsin digestion of phosphoproteins may generate peptides containing more than one acidic residue and hence very acidic peptides. In contrast, Glu-C cleaves after glutamic acid or aspartic acid, generating peptides with only one acidic amino acid residue, and thus can reduce nonspecific binding [10, 21].
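The argument about acidic residues per peptide can be illustrated with a small in silico digestion. The sketch below is a simplification under stated assumptions: the cleavage rules are reduced to regular expressions (trypsin after Lys/Arg unless followed by Pro, Glu-C after Asp/Glu), missed cleavages are ignored, and the sequence is a made-up example rather than a real protein. The names `digest` and `max_acidic_residues` are hypothetical helpers introduced only for this illustration.

```python
import re

# Simplified cleavage rules (assumptions for illustration only).
CLEAVAGE_PATTERNS = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, but not before P
    "glu-c": r"(?<=[DE])",         # after D or E
}

def digest(sequence, enzyme):
    """Split a sequence at every cleavage site (no missed cleavages)."""
    # Splitting on zero-width matches requires Python 3.7 or later.
    pieces = re.split(CLEAVAGE_PATTERNS[enzyme], sequence)
    return [p for p in pieces if p]

def max_acidic_residues(peptides):
    """Largest number of Asp/Glu residues found in any single peptide."""
    return max(sum(res in "DE" for res in pep) for pep in peptides)

# Hypothetical sequence, not taken from any real protein.
sequence = "MDEESLKAEDGSPRTEEK"
tryptic = digest(sequence, "trypsin")
gluc = digest(sequence, "glu-c")
```

For this toy sequence the tryptic peptides carry up to three acidic residues, whereas every Glu-C peptide carries at most one, which is the property the text links to reduced nonspecific binding.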


The pH and organic acids used in the binding buffer have a large effect on the selectivity of the enrichment. As a rule of thumb, the highest sensitivity is achieved at pH > 3, whereas maximal selectivity occurs at pH below 1-1.5. Hence the binding buffer has to be optimized as a compromise between sensitivity and selectivity. It was also observed that trifluoroacetic acid (TFA) was superior to hydrochloric acid, formic acid and acetic acid [10, 21]. Despite these technical difficulties, IMAC technology has been applied successfully in phosphoproteomic studies. A recent study of Selaginella moellendorffii applied IMAC along with LC-MS/MS for the characterization of 1593 unique phosphopeptides, of which 1104 contained nonredundant phosphorylation sites associated with 716 phosphoproteins [29].

Phos-tag works on a similar principle to IMAC, but it uses a 1,3-bis[bis(pyridin-2-ylmethyl)amino]propan-2-olato dizinc(II) complex as a selective phosphate-binding tag in aqueous solution at neutral pH. The neutral pH allows the complete deprotonation of phosphoproteins/phosphopeptides to improve the sensitivity. In the case of phosphoproteins, elution under nearly physiological conditions offers versatility for assaying protein activities and functions [21, 27, 30]. A variation in which the Phos-tag complex is copolymerized in polyacrylamide gels allows differential migration of phosphoproteins versus their non-phosphorylated counterparts. It has become an important tool for confirming the phosphorylation state of targeted proteins. For example, this technique helped reveal that phosphorylation of ubiquitin ligase ATL31 controls plant nutrient response by targeting 14-3-3 proteins for degradation [31].

Several new enrichment methods combine the properties of the above enrichment methods in a complementary fashion to improve the coverage of phosphoproteomes [32-34]. Sequential elution from IMAC (SIMAC) uses IMAC to enrich monophosphopeptides first by elution in acidic conditions (1% TFA). A subsequent elution with ammonia water yields multiply phosphorylated peptides. The flow-through/wash of IMAC and the acidic eluate containing monophosphopeptides are then enriched again with TiO2 media for monophosphopeptides. Following this strategy, the number of phosphorylated peptides was found to be three times higher than that obtained using optimized TiO2 enrichment [32]. Recently, a large-scale analysis of the phosphoproteomes in seed maturation of Arabidopsis, rapeseed and soybean employed a combined strategy of IMAC and TiO2 enrichment. A total of 2,001 phosphopeptides with 1,026 unambiguous phosphorylation sites from 956 different proteins were observed, among which 652 phosphoproteins had not been previously reported [35]. In an attempt to overcome the low ionization efficiency of phosphopeptides under ESI, an enrichment procedure on a MALDI plate combines the strengths of IMAC/MOAC and takes advantage of matrix combinations to enhance the ionization of phosphopeptides in positive mode [10]. Superparamagnetic materials are nonporous adsorbents that have a high surface/volume ratio with a high binding capacity, and are easily manipulated by an external magnetic field. They have been incorporated in several IMAC and MOAC methods, giving good selectivity and a high binding capacity of up to 60 µg phosphopeptides/g particles [36].

2.3.3 Ion exchange chromatography

Ion exchange chromatography is a widely used technique in protein and peptide separation/fractionation. In strong cation exchange (SCX), peptides are loaded in acidic conditions (e.g., pH 2.7) and are eluted using increasing concentrations of salt. Under this condition, tryptic peptides containing Lys or Arg at the C-terminus often carry a charge of +2. Phosphopeptides, which contain negatively charged phosphate groups, have a less positive net charge and thus tend to elute early during chromatography [37]. A two-dimensional SCX with acidic and ultra-acidic pHs was carried out to enrich phosphopeptides with basic residues. The first SCX chromatography was done at a pH separating peptides from phosphopeptides, and then discrete fractions were separated on SCX at pH 1.5, where the phosphate groups of the phosphopeptides with basic residues are protonated, and phosphopeptides were identified [38]. In strong anion exchange (SAX), phosphopeptides are strongly retained and can be separated according to their number of phosphate groups [39]. The results were comparable to IMAC enrichment in spite of the non-specific binding of non-phosphorylated peptides. Usually an IMAC or MOAC step needs to be performed after ion exchange chromatography. Although ion exchange chromatography can be performed on-line with RP LC-MS for automation and high throughput [1], off-line separation is often carried out for better results.
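The charge reasoning behind SCX fractionation can be made concrete with a crude counting rule. The sketch below is a rough approximation under stated assumptions, not a pKa-based calculation: at about pH 2.7 the N-terminus and each Lys, Arg and His are counted as +1, carboxyl groups are treated as fully protonated (neutral), and each phosphate group is counted as -1. The function name and the peptide sequence are hypothetical.

```python
def approx_scx_charge(peptide, n_phospho=0):
    """Rough integer net charge of a peptide at about pH 2.7.

    Simplifying assumptions: N-terminus and each K, R, H contribute +1;
    carboxyl groups are neutral; each phosphate group contributes -1.
    """
    if n_phospho < 0:
        raise ValueError("n_phospho must be non-negative")
    positive = 1 + sum(peptide.count(res) for res in "KRH")
    return positive - n_phospho

# Hypothetical tryptic peptide ending in Lys.
unmodified = approx_scx_charge("SLSPGAEK")
singly_phosphorylated = approx_scx_charge("SLSPGAEK", n_phospho=1)
```

Under these assumptions the unmodified peptide has a net charge of +2 and its singly phosphorylated form +1, consistent with the earlier elution of phosphopeptides described above.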


2.3.4 Hydrophilic interaction chromatography

In hydrophilic interaction chromatography (HILIC), separation is achieved by polar partitioning of the analytes between a water-rich stationary phase and an organic modifier-rich mobile phase. Phosphopeptides, with their highly polar phosphate groups, are strongly retained on the HILIC stationary phase, resulting in separation from the non-phosphorylated species. A recent phosphoproteomic study was conducted to determine wheat defense responses to the fungus Septoria tritici. Phosphopeptides were enriched with TiO2 and then fractionated using HILIC. A total of 2305 proteins and 968 phosphopeptides were identified [40]. ERLIC is a variation of HILIC that uses electrostatic repulsion from the stationary phase as an additional mechanism to adjust the selectivity of HILIC. Because it combines ion exchange and HILIC properties, the selectivity of ERLIC can be adjusted by changing the organic content and/or the pH of the solvents. Gan and coworkers [41] compared an IMAC-SCX approach with ERLIC for the enrichment of phosphopeptides. Using ERLIC, the number of phosphopeptides was tripled compared to IMAC-SCX. Only 12% of the identified phosphopeptides overlapped between the two experiments, showing the complementary nature of the different enrichment techniques.
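An overlap figure like the one above depends on the denominator chosen. The sketch below shows one way to compute it, taking the union of both identification lists as the denominator. This is an assumption made for illustration, since the text does not state how the 12% was calculated, and the peptide identifiers are hypothetical.

```python
def overlap_percent(ids_a, ids_b):
    """Percentage of all identified peptides (union) found by both methods."""
    a, b = set(ids_a), set(ids_b)
    union = a | b
    if not union:
        return 0.0
    return 100.0 * len(a & b) / len(union)

# Hypothetical identification lists from two enrichment workflows.
method_one = {"pep1", "pep2", "pep3", "pep4"}
method_two = {"pep4", "pep5"}
shared = overlap_percent(method_one, method_two)
```

Here one of five distinct peptides is shared, giving 20%. Using the smaller list as the denominator instead would give 50%, so the convention should be reported alongside the number.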

3. Quantitative phosphoproteomics

Global studies of phosphoproteomes aim to identify and quantify all possible phosphoproteins present in a given sample at a given time. Due to the dynamic nature of the phosphorylation events, it has become necessary to measure the quantitative changes of phosphorylation in the cells in order to fully understand the regulatory mechanisms mediated by protein kinases and phosphatases. In Table 1, we present an overview of quantitative phosphoproteomics research in plants.


3.1 Stable isotope labeling

Stable isotope labeling with amino acids in cell cultures (SILAC) is used for in vivo incorporation of "light" and "heavy" 13C- or 15N-labeled amino acids into proteins through cellular metabolism. Because there is one version of "light" and two versions of "heavy" amino acids, the technology allows combination of up to three samples into one experiment and reduces experimental error in quantitation because the isotopes are incorporated at an early stage of sample preparation [79]. The technology requires maximal incorporation of the isotope-labeled amino acids into the proteome, and the cells used for study must be auxotrophic for the amino acids used for labeling. Plants are autotrophic organisms, which represents a challenge for the application of SILAC. Interestingly, SILAC was applied in Arabidopsis cell cultures to study the regulation of glutathione S-transferase expression in response to abiotic stress induced by salicylic acid, showing a maximum of 80% [13C6]arginine incorporation [80]. Recently, the use of a medium-heavy (Lys-4) amino acid versus a heavy (Lys-8) amino acid in Arabidopsis cell cultures showed that quantification can be done using Lys-4/Lys-8 to overcome the influence of partial incorporation [81]. In addition to cell cultures, stable isotope labeling of Arabidopsis has shown utility when plants are grown in media containing 14N- and 15N-enriched ammonium nitrate and potassium nitrate. For example, phosphopeptide quantitation was used to characterize proteins in Arabidopsis whose degree of phosphorylation is rapidly altered by osmotic stress treatment. The results indicate that mitogen-activated protein kinases and proteins involved in 5' messenger RNA decapping and phosphatidylinositol 3,5-bisphosphate synthesis are involved in the osmotic stress response [72]. Recently, hydroponic isotope labelling of entire plants (HILEP) [82] was successfully used to study the phosphorylation events in early auxin signaling in lateral root induction [69]. In addition to solid and liquid culture, stable isotope labeling in planta (SILIP) was developed to efficiently label soil-grown plants with 14N/15N isotopes [83]. More examples of metabolic labeling can be found in Table 1.

Isobaric tagging reagents such as isobaric tag for relative and absolute quantification (iTRAQ) or tandem mass tags (TMT) are versatile because they can be used to label any protein sample and allow multiplexing of up to 8 and 10 samples, respectively. Each iTRAQ or TMT reagent contains a reactive group (an N-hydroxysuccinimide ester that reacts with peptidyl amines to form a covalent bond) and an isobaric tag. After labeling, the reagents do not affect the overall behavior of the peptides during sample processing, which minimizes experimental variation. Quantitation is achieved in MS/MS, where the tags are released, and the differences among the tag intensities reflect the original quantities of peptides and proteins in the different samples [1, 9]. In phosphoproteomics research, these technologies have been extensively used [40, 57-61, 70, 74, 76] (Table 1). However, isobaric tag labeled peptides showed identification efficiency reduced by as much as 50% in multistage activation (MSA) MS/MS due to the high charge state induced by the tag in electrospray ionization. This may account for the discrepancies observed in the number of identified phosphopeptides between SILAC or label-free studies and the iTRAQ and TMT work [84]. It should be noted that most of the quantitative phosphoproteomics studies obtained quantitative information based on phosphopeptides only. Normalization with total protein changes corrected up to one quarter of the phosphoprotein changes, showing a dependency of the phosphorylation events on the total protein changes. In addition, assessment of the correct stoichiometry using the phosphorylated and non-phosphorylated peptides/proteins directly impacts the interpretation and understanding of the roles of protein phosphorylation events in the cells [54, 85, 86].

As an alternative to the commercial iTRAQ and TMT, a chemical derivatization method based on stable isotope dimethyl labeling has been developed, in which all the peptides are chemically labeled at their α- and ε-amino groups, leading to a mass difference of 4 Da [87]. This approach has been applied successfully to plant phosphoproteomics, e.g., to investigate the phosphoproteome dynamics in response to changes of water status in maize leaves [56] and to study protein phosphorylation influenced by photosynthetic activity in Arabidopsis leaves [71]. Recently, an N,N-dimethylleucine (DiLeu) 4-plex isobaric tandem mass (MS2) tagging was reported to have high quantitation efficiency, resulting in data comparable to iTRAQ with high protein coverage (up to 43%) and quantitation accuracy (
