Mol Genet Genomics (2014) 289:1217–1223 DOI 10.1007/s00438-014-0881-x

ORIGINAL PAPER

Investigating co-evolution of functionally associated phosphosites in human

Zhi Liu · Guangyong Zheng · Xiao Dong · Zhen Wang · Beili Ying · Yang Zhong · Yixue Li

Received: 27 December 2013 / Accepted: 19 June 2014 / Published online: 9 July 2014 © Springer-Verlag Berlin Heidelberg 2014

Abstract  Phosphorylation is essential for protein function and signal transduction in eukaryotic cells. With the rapid development of mass spectrometry technology, a large number of phosphosites have been identified. However, high-throughput methods for the functional characterization of phosphosites are still scarce. In this study, we examined whether the co-evolution property can be used as an indicator to explore the function of phosphosites by investigating the co-evolutionary relationship between functionally associated phosphosites in human. In practice, the evolutionary attributes of phosphosites were represented with phylogenetic profiles, and then co-evolutionary correlations of functionally associated phosphosites were detected on three levels: (1) phosphosites within one protein; (2) phosphosites in different proteins participating in the same signal transduction pathways; and (3) general phosphosites. The results show that co-evolution is a general property of functionally associated phosphosites. This finding suggests to some degree that it is feasible to use the co-evolution property in exploring the function of phosphosites and investigating the functional association between them.

Keywords  Co-evolution · Functional association · Phosphorylation site · Phylogenetic profile · Posttranslational modification

Communicated by S. Hohmann.

Z. Liu and G. Zheng contributed equally to this work.

Electronic supplementary material  The online version of this article (doi:10.1007/s00438-014-0881-x) contains supplementary material, which is available to authorized users.

Z. Liu · X. Dong · Z. Wang · Y. Li (*)  Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd., Shanghai 200031, People’s Republic of China e-mail: [email protected]

Z. Liu  University of Chinese Academy of Sciences, 19 Yuquan Rd., Beijing 100049, People’s Republic of China

G. Zheng · B. Ying · Y. Li  Shanghai Centre for Bioinformation Technology, 2078 Keyuan Rd., Shanghai 201203, People’s Republic of China

B. Ying · Y. Zhong  School of Life Sciences, Fudan University, 220 Handan Rd., Shanghai 200433, People’s Republic of China

G. Zheng  CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yueyang Rd., Shanghai 200031, People’s Republic of China

Introduction

It is reported that phosphorylation is involved in most cellular events, and defects in phosphorylation have been linked to numerous developmental disorders and human diseases (Wang et al. 2014). Characterization of phosphorylation sites can help in understanding their critical function in cellular events and provide potential drug targets for disease treatment (Lopez and Cho 2012). With tens of thousands of sites identified in mass spectrometry (MS)-based phosphoproteomes, researchers are faced with the challenge of distinguishing the key, biologically important sites within these large-scale data and performing functional annotation for them. In previous studies, evolutionary conservation has been used to predict individual functional phosphosites (Niu et al. 2012), based on the concept that phosphosites of known function are dramatically more conserved than those with no characterized function (Landry et al. 2009). However, the functional associations among phosphosites have not been investigated before, as phosphorylation is often involved in signal transduction cascades and the relationships between phosphosites are complex and hard to define explicitly. In the last two decades, the co-evolution property has been used to survey functional associations in the biological community. In 1996, Fryxell reported that the phylogenetic trees of insulin and insulin receptors were more similar than would be expected from divergence across species under the standard molecular clock hypothesis (Fryxell 1996). Pellegrini and colleagues carried out the first genome-level application of functional association prediction with phylogenetic profile information in 1999, in E. coli (Pellegrini et al. 1999). They computed phylogenetic profiles for 4,290 E. coli proteins by aligning each protein sequence with the proteins from 16 other fully sequenced genomes. Their results illustrated that proteins with matching or similar profiles strongly tended to be functionally linked. More recently, many efforts have been made to investigate the co-evolution of various biological molecules. For instance, methods based on a co-evolution strategy were used to explore functional association and predict interactions in ligand and receptor systems (Goh et al. 2000) and between transcription factors and their DNA targets (Zheng et al. 2012).
Moreover, co-evolution studies at the amino acid level have successfully predicted the interacting surfaces (protein interfaces) of protein complexes as well as the interacting partners of a protein (Schug et al. 2009; Tress et al. 2005; Yeang and Haussler 2007). Since the co-evolution property has been applied successfully to explore the function of various molecules, we hypothesized that it may also be used to explore the function of phosphosites. In this study, we first represented the evolutionary attributes of phosphosites with phylogenetic profiles. At the same time, functionally associated phosphosites were collected manually from public databases and the literature. Then, we tested the correlation of phylogenetic profiles for functionally associated phosphosites on three levels: (1) phosphosites within the same protein; (2) phosphosites in different proteins participating in the same signal transduction pathways; and (3) general phosphosites. The test results showed that functionally associated phosphosites are highly correlated in their phylogenetic profiles.


Materials and methods

Phosphosites collection and pre-processing

Human phosphoproteins and phosphosites were retrieved from the SysPTM database version 2.0 (Li et al. 2014) and restricted to the proteins recorded in the NCBI RefSeq repository (Pruitt et al. 2012). These phosphoproteins were then mapped to homology groups retrieved from the NCBI HomoloGene database (build 66) (Sayers et al. 2012). Initially, we considered all 21 species recorded in the HomoloGene database, which cover a broad evolutionary time scale. While examining each homology group, we found that around 60 % of human phosphoproteins did not have orthologs in non-vertebrate species. Since we intended to construct phylogenetic profiles at the phosphosite level, the impact of missing orthologs had to be minimized as far as possible. Therefore, we chose seven reference species whose numbers of homologous proteins in the HomoloGene database are comparable to that of human (Table S1). These species were P. troglodytes, M. musculus, R. norvegicus, C. lupus, B. taurus, G. gallus, and D. rerio, which were also adopted in previous studies concerning the function and evolution of phosphorylation (Wang et al. 2011). Then, we kept only one homologous sequence per species in a given homology group. If several homologous sequences existed for a species in a group, we kept the most likely orthologous sequence according to the results of a bidirectional best hit test carried out with the BLAST software (Altschul et al. 1990). In addition, groups having fewer than two homologous sequences were discarded, resulting in 9,037 homology groups with non-redundant phosphoproteins.

Phosphosites phylogenetic profile and profile correlation

The ClustalW program (Chenna et al. 2003) was first used to align the sequences of each homology group, and then human phosphosites and the corresponding sites in the reference species were extracted from the alignment results (Fig. 1).
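The site-extraction step can be illustrated with a minimal Python sketch. It assumes the alignment is already available as a mapping from species name to gapped sequence; the function names, the species keys, and the toy alignment are hypothetical and are not taken from the authors' pipeline. The sketch maps a 1-based human residue position to its alignment column and reads the residue found in that column for every species.

```python
from typing import Dict


def alignment_column(aligned_seq: str, residue_pos: int, gap: str = "-") -> int:
    """Return the 0-based alignment column holding the residue at the
    1-based, ungapped position `residue_pos` of `aligned_seq`."""
    seen = 0
    for col, ch in enumerate(aligned_seq):
        if ch != gap:
            seen += 1
            if seen == residue_pos:
                return col
    raise ValueError("residue position exceeds sequence length")


def sites_at_position(alignment: Dict[str, str], human_key: str,
                      residue_pos: int) -> Dict[str, str]:
    """Read, for every species, the character aligned to the human site."""
    col = alignment_column(alignment[human_key], residue_pos)
    return {species: seq[col] for species, seq in alignment.items()}


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {
    "human": "MK-SPTA",
    "mouse": "MKASPTA",
    "zebrafish": "MK-APT-",
}
aligned_sites = sites_at_position(toy_alignment, "human", 3)
```

In this toy case the third human residue (a serine) sits in a column where mouse also carries a serine and zebrafish carries an alanine.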
Next, each human phosphosite was depicted with a phylogenetic profile, which was a binary vector with eight elements representing the evolutionary status of the phosphosite in eight species. An element was set to 1 when the site in a given species was identical to the human phosphosite, and to 0 otherwise (Fig. 1). The profile correlation of two phosphosites was defined as:

$$\mathrm{Cor} = \frac{n - \sum_{i=1}^{n} x_i \oplus y_i}{n} \qquad (1)$$

where x and y are the two vectors being compared, i indexes the ith element of the vectors, and n is the number of elements in a vector. The symbol ⊕ denotes a logical operation whose value is 1 if and only if its two input elements differ (0 ⊕ 0 = 0, 1 ⊕ 0 = 1, 0 ⊕ 1 = 1, 1 ⊕ 1 = 0).

Fig. 1  Diagram of phosphosite phylogenetic profile generation and profile correlation calculation. First, multiple sequence alignment is carried out for each homology group. Then the phylogenetic profile of each human phosphosite is generated from the alignment results. The correlation of a profile pair is defined as the number of identical entries divided by the total number of entries in the two profiles

Annotation of phosphosites in KEGG pathway

First, we downloaded human signal transduction pathways involving phosphorylation events from the KEGG database (Release 65.0) (Kanehisa et al. 2004). Then, we manually collected information on the phosphosites participating in these pathways from published papers (Table S2). Pathways having fewer than two phosphosites supported by published papers were excluded from our analysis.

Datasets of functionally associated sites

The functionally associated sites (FAS) were defined in three scenarios:

1) FAS within a protein (F1 dataset, Table S3). In the annotated pathways mentioned above, if a protein was phosphorylated on multiple sites that were annotated with the same or a related function in a certain pathway, these sites were regarded as FAS;

2) FAS within signal transduction pathways (F2 dataset, Table S4). When phosphorylation of site a in protein A could directly activate or inhibit phosphorylation of site b in protein B, sites a and b were considered as FAS;

3) general FAS: phosphosite pairs within interacting proteins and catalyzed by kinases within the same family. First, experimentally verified interacting proteins and a kinase-substrate dataset were collected from the STRING database (Franceschini et al. 2013) and the PhosphoSitePlus database (Hornbeck et al. 2012), respectively. Then, we extracted all the phosphosite pairs within interacting proteins and kept the pairs in which both sites were catalyzed by kinases within the same family, resulting in a dataset of around 900 phosphosite pairs (F3 dataset, Table S5).
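The profile construction and the profile correlation of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the species ordering, the function names, and the toy residues are assumptions, as is the choice to score a species with no aligned residue as 0.

```python
from typing import Dict, List, Sequence

# Assumed ordering: human plus the seven reference species named in the text.
SPECIES: List[str] = [
    "H. sapiens", "P. troglodytes", "M. musculus", "R. norvegicus",
    "C. lupus", "B. taurus", "G. gallus", "D. rerio",
]


def phylogenetic_profile(human_residue: str,
                         aligned_sites: Dict[str, str]) -> List[int]:
    """Binary profile: 1 where the aligned residue equals the human residue.
    A species missing from `aligned_sites` is scored 0 (an assumption)."""
    return [1 if aligned_sites.get(sp) == human_residue else 0
            for sp in SPECIES]


def profile_correlation(x: Sequence[int], y: Sequence[int]) -> float:
    """Eq. (1): (n - sum_i x_i XOR y_i) / n, i.e. the fraction of matching entries."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    mismatches = sum(xi ^ yi for xi, yi in zip(x, y))
    return (n - mismatches) / n


# Toy residues for two hypothetical human phosphosites.
site_a = phylogenetic_profile("S", dict(zip(SPECIES, "SSSSSSTA")))
site_b = phylogenetic_profile("T", dict(zip(SPECIES, "TTTTTAT-")))
cor_ab = profile_correlation(site_a, site_b)
```

Here the two toy profiles are [1, 1, 1, 1, 1, 1, 0, 0] and [1, 1, 1, 1, 1, 0, 1, 0]; they differ in two of eight entries, so the correlation is 6/8 = 0.75.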

Results

High phylogenetic profile correlation of functionally associated sites within a protein

It is well known that many intracellular pathways involve protein phosphorylation/dephosphorylation reactions (Krebs and Krebs 1999). Thus, it was important to investigate the co-evolution of phosphosites in the context of signaling pathways. Within a given pathway, multi-site phosphorylation of a protein can determine signal extension and duration (Cohen 2000b). For example, in the MAPK pathway, phosphorylation can occur in the protein MAPKAP-K2 at positions Thr222, Ser272 and Thr334. The pathway is activated only when two of the three sites are phosphorylated (Cohen 2000a); these sites are therefore defined as functionally associated. In the F1 dataset, 41 proteins were annotated with more than one phosphosite in a given pathway (Table S3). We noticed that, for most multi-phosphosite proteins, the phylogenetic profiles of functionally associated sites within a protein were correlated. To test whether this correlation was significant in


[Figure 2: bar chart. x-axis: Multiple phosphorylated protein; y-axis: Profile correlation (0.0 to 1.2); legend: Observed, Average]

Fig. 2  Phylogenetic profile correlation of sites within multi-phosphorylated proteins. Correlation values comparison between functionally associated phosphosite-pairs and all phosphosite-pairs within a certain protein. The red bar presents correlation values of functionally associated site-pairs within multiple phosphorylated proteins, and the blue bar presents the average correlation values of site-pairs within the donor proteins. (p value
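The two quantities compared in Fig. 2 can be sketched as follows: an "observed" value computed over the functionally associated site pairs of one protein, and an "average" value computed over all site pairs of that protein. The site labels, the toy profiles, and the function names are hypothetical; the sketch covers only these two values and makes no assumption about how significance was assessed.

```python
from itertools import combinations
from statistics import mean
from typing import Dict, Iterable, List, Optional, Tuple


def profile_correlation(x: List[int], y: List[int]) -> float:
    """Fraction of matching entries between two equal-length binary profiles."""
    n = len(x)
    return (n - sum(a ^ b for a, b in zip(x, y))) / n


def mean_pair_correlation(profiles: Dict[str, List[int]],
                          pairs: Optional[Iterable[Tuple[str, str]]] = None) -> float:
    """Mean correlation over `pairs`, or over all site pairs when none are given."""
    if pairs is None:
        pairs = combinations(sorted(profiles), 2)
    return mean(profile_correlation(profiles[a], profiles[b]) for a, b in pairs)


# Toy profiles for four sites of one hypothetical protein.
profiles = {
    "S10": [1, 1, 1, 1, 1, 1, 1, 0],
    "S25": [1, 1, 1, 1, 1, 1, 1, 0],
    "T40": [1, 1, 0, 0, 1, 1, 0, 0],
    "Y77": [1, 1, 1, 1, 0, 0, 0, 0],
}
fas_pairs = [("S10", "S25")]  # hypothetical functionally associated pair

observed = mean_pair_correlation(profiles, fas_pairs)  # value for the FAS pairs
average = mean_pair_correlation(profiles)              # value for all site pairs
```

With these toy profiles the functionally associated pair has identical profiles, while the mean over all six site pairs is lower, which is the kind of contrast the figure summarizes per protein.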
