Mol Genet Genomics DOI 10.1007/s00438-015-1002-1
ORIGINAL PAPER
Genome‑wide analysis and expression of the calcium‑dependent protein kinase gene family in cucumber Xuewen Xu · Min Liu · Lu Lu · Min He · Wenqin Qu · Qiang Xu · Xiaohua Qi · Xuehao Chen
Received: 23 July 2014 / Accepted: 23 January 2015 © Springer-Verlag Berlin Heidelberg 2015
Abstract Calcium-dependent protein kinases (CDPKs) are multi-functional proteins that combine calcium-binding and signaling capabilities within a single gene product. Current studies have shown that the CDPKs regulate numerous growth and developmental processes and biotic and abiotic stress responses. Nonetheless, knowledge concerning the specific expression patterns and evolutionary history of the CDPK family in cucumber (Cucumis sativus L.) remains very limited. We, therefore, investigated the phylogenetic relationships and expression profiles of the 19 CDPK genes identified in the cucumber genome sequence, resolving them into four subfamilies based on a phylogenetic tree and gene structures. Tissue-specific expression profiles suggest that cucumber CDPK genes are involved in cucumber tissue development. An expression analysis based on qRT-PCR indicated that cucumber CDPK genes are extensively involved in abscisic acid, salt, cold, drought, heat, and waterlogging responses, possibly by different mechanisms. The fates of two paralogs after divergence were also investigated, suggesting subfunctionalization and neofunctionalization during evolution. These observations lay an important foundation for functional and
evolutionary analyses of the CDPK gene family in Cucurbitaceae species.
Keywords Cucumber · Calcium-dependent protein kinase (CDPK) · Gene family · Phylogenetic analysis · Gene expression
Abbreviations
CsCDPK: Cucumis sativus calcium-dependent protein kinase
BLAST: Basic local alignment search tool
SMART: Simple modular architecture research tool
HMM: Hidden Markov model
NJ: Neighbor-joining
MEGA: Molecular evolutionary genetics analysis
GSDS: Gene structure display server
pI: Isoelectric point
M.W.: Molecular weight
qRT-PCR: Quantitative real-time PCR
Communicated by L. Xiong. X. Xu and M. Liu have contributed equally to this work.
Electronic supplementary material The online version of this article (doi:10.1007/s00438-015-1002-1) contains supplementary material, which is available to authorized users.
X. Xu · M. Liu · L. Lu · M. He · W. Qu · Q. Xu · X. Qi · X. Chen (*) School of Horticulture and Plant Protection, Yangzhou University, Yangzhou, Jiangsu 225009, China e-mail: [email protected]
Introduction
Ca2+ is a ubiquitous second messenger that plays an important role in diverse physiological processes and in responses to environmental stimuli (Bush 1995; Reddy et al. 2011). Intracellular changes in Ca2+ are transduced by four kinds of calcium sensors: calmodulins, calmodulin-like proteins, calcineurin B-like proteins, and calcium-dependent protein kinases (CDPKs) (Reddy et al. 2011; Trewavas et al. 2002; Sanders et al. 2002). Among these sensors, CDPKs constitute a large calcium-sensing subfamily, which has been identified throughout the plant kingdom and in some protozoans, but is absent in animals (Ludwig et al. 2004; Harper and Harmon 2005; Ishino et al. 2006).
CDPKs, also named CPKs, are Ser/Thr protein kinases that typically possess a variable N-terminal domain and several functional domains, including a protein kinase catalytic domain, an autoinhibitory junction domain, and a calmodulin-like domain in their C-terminal variable region (Hrabak et al. 2003; Cheng et al. 2004). The N-terminal domain is highly variable and often bears palmitoylation or myristoylation sites, which are associated with subcellular localization, and inhibit the kinase activity of the catalytic domain in the absence of Ca2+ (Ishino et al. 2006; Harper et al. 1994). The specific functions of CDPKs are determined by these variable domains (Hrabak et al. 2003). The protein kinase domain is the catalytic domain containing an ATP binding site (Harmon et al. 1994). The calmodulin-like domain usually harbors one to four EF-hand motifs for Ca2+ binding (Cheng et al. 2004), and CDPKs are activated by the binding of Ca2+ to their calmodulin-like domain (Snedden and Fromm 2001). Several studies have demonstrated the presence of 31, 34, 35, 20, 30, and 25 CDPK genes in the genomes of Oryza, Arabidopsis, maize, wheat, poplar, and canola, respectively (Ray et al. 2007; Hrabak et al. 2003; Ma et al. 2013; Li et al. 2008; Zuo et al. 2013; Zhang et al. 2014). There are also a large number of reports on the expression of CDPKs in horticultural plants, such as grapevine, chili pepper, and peanut (Chung et al. 2004; Li et al. 2008; Chen et al. 2013; Dubrovinaa et al. 2013). Analyses of CDPK gene expression levels have shown both positive and negative responses to abiotic and biotic stresses. Wheat CPK1, 2, 3, 4, 7, 10, 12, 15, and 19 respond to powdery mildew (Li et al. 2008), grapevine VaCPK9, 21, and 26 were induced under salt stress (Dubrovinaa et al. 2013), and poplar PtCDPK18, 19, 21, 22, 25, and 26 were induced by mechanical wounding (Zuo et al. 2013).
The over-expression of OsCDPK7 or OsCDPK13 enhanced the tolerance to cold, salt, and drought in transgenic Oryza (Komatsu et al. 2007). The over-expression of ZmCPK4 enhanced abscisic acid (ABA) sensitivity in seed germination, seedling growth, drought stress tolerance, and stomatal movement in transgenic Arabidopsis (Jiang et al. 2013). There is also some evidence that some CDPKs act as negative regulators. For example, AtCDPK21 loss-of-function seedling mutants are more susceptible to hyperosmotic stress and show increased stress-responsive marker gene expression and metabolite accumulation (Franz et al. 2011). In addition, some CDPKs respond to abiotic stress by modulating ABA signaling and reducing reactive oxygen species accumulation (Das and Pandey 2010; Asano et al. 2012). Despite extensive studies of CDPKs in many other species, little is known about this gene family in cucumber (Cucumis sativus L.). Cucumber is an agriculturally and economically important vegetable crop worldwide, ranking 4th in quantity after tomato, cabbage, and onion
(FAO STAT 2012, http://faostat3.fao.org). Cucumber was the first vegetable crop to be sequenced; therefore, the genome provides an important reference for breeding in the Cucurbitaceae family and for molecular and biological studies (Wan et al. 2013). The availability of the complete cucumber genome sequence has enabled us to perform a genome-wide analysis to functionally characterize the genes belonging to a multigene family (Huang et al. 2009). Because of the importance of CDPK genes in abiotic stress responses, we initiated a project to isolate CDPKs from cucumber. In the present study, the genome-wide identification of CsCDPKs was performed by database searches, and the genes were classified according to a phylogenetic analysis. Furthermore, CsCDPK expression levels in different tissues at distinct developmental stages, as well as in response to various abiotic stresses, were investigated. Our results provide a perspective on the evolutionary history and general biological roles of the cucumber CDPK family.
Materials and methods
Identification of cucumber CDPK genes
The cucumber protein, cDNA and genomic DNA databases were obtained from ICUGI (http://www.icugi.org, version 1). In total, 34 Arabidopsis CDPK proteins were accessed from NCBI (http://www.ncbi.nlm.nih.gov/) and used as query sequences in BLASTP searches against the predicted cucumber proteins. In addition, the hidden Markov model (Eddy 1998) program was used to identify CDPKs in cucumber. The HMMER2.1.1 software package (http://www.hmmer.wust.edu) was also used to make gene predictions in the kinase and EF-hand domains from Arabidopsis CDPK proteins. To further verify the reliability of these candidate sequences, motif scanning of the CDPKs was performed using two online tools, ScanProsite (http://prosite.expasy.org/scanprosite/) and the SMART software (http://smart.embl-heidelberg.de/smart/), to confirm each candidate CsCDPK protein as a member of the CsCDPK family.
Genomic distribution and phylogenetic analysis
Genes were mapped on chromosomes by identifying their detailed chromosomal positions provided in the Cucumber Genome Database. The distribution of CsCDPK family members was drawn manually. Full-length protein sequences of 34 Arabidopsis (Harper and Harmon 2005) and 19 cucumber CDPK proteins were aligned using ClustalW 1.8.1 (Thompson et al. 1994), and the alignments
were then adjusted manually before a phylogenetic tree was constructed by the neighbor-joining method using the MEGA6 program (Tamura et al. 2007).
Gene structure and conserved motif predictions, promoter region analyses, and the calculation of Ks (synonymous substitution rate) and Ka (non-synonymous substitution rate) values
The DNA and cDNA sequences corresponding to each predicted gene were downloaded from the cucumber genome database, and the intron distribution pattern and splicing sites were analyzed using the web-based bioinformatics tool GSDS (http://gsds.cbi.pku.edu.cn/) (Guo et al. 2007). To search for conserved motifs within the cucumber CDPK proteins, the online MEME tool (http://meme.nbcr.net/meme/cgi-bin/meme.cgi) (Machanick and Bailey 2011) was used to find similar sequences shared by these members. The 1.5-kb DNA sequences upstream of the start codon (ATG) of each CsCDPK gene were downloaded from the cucumber genome database, and the putative cis-elements were analyzed using the online tool PlantCARE (Lescot et al. 2002). Ks and Ka values were calculated using the DnaSP v5.0 software (DNA polymorphism analysis) (Librado and Rozas 2009). Finally, the Ka/Ks ratio was analyzed to assess the selection pressure for each gene pair.
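The last step above, interpreting a Ka/Ks ratio, can be illustrated with a minimal sketch. It is not the DnaSP procedure used in the study: the function name, the tolerance parameter, and the example Ka and Ks values are hypothetical and chosen only to show the conventional reading of the ratio (below 1 purifying, about 1 neutral, above 1 positive selection).

```python
from typing import Tuple


def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> Tuple[float, str]:
    """Return the Ka/Ks ratio and its conventional interpretation.

    ka  -- non-synonymous substitution rate (hypothetical input)
    ks  -- synonymous substitution rate (hypothetical input)
    tol -- tolerance used when testing for a ratio equal to 1
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous change is observed.
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Illustrative values only; they are not results from this paper.
print(classify_selection(0.05, 0.5))
print(classify_selection(0.6, 0.3))
```

In practice the Ka and Ks inputs would come from a codon-aware alignment of each paralog pair; the classification itself is only the final bookkeeping step.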
RNA isolation and qRT-PCR
Total RNA was isolated using RNAiso Plus (Takara, China). Dried RNA samples were dissolved in DEPC-treated water, and the concentration was adjusted to 1,000 µg/mL as measured with a Biophotometer Plus (Expander, Germany). RNA was reverse-transcribed using a Takara PrimeScript® RT reagent kit with gDNA eraser according to the manufacturer’s specifications. qRT-PCR was performed using a RealMasterMix (SYBR Green) kit (TIANGEN, China) according to the manufacturer’s specifications. SYBR Green PCR cycling was performed on an iQ™ 5 multicolor real-time PCR detection system (Bio-RAD, USA) using 20-µL samples. The PCR primers were designed using Primer Premier 5.0 (Premier Biosoft International, USA) to avoid the conserved region. Detailed primer sequences are shown in Online resource 1. Three replicates of the stress and control treatments were used for real-time PCR. The relative mRNA expression data were analyzed using the ΔCt method. Each expression profile was independently verified in three replicate experiments performed under identical conditions. The heat map was produced from centered and normalized ΔCt values, using Cluster 3.0 software and Java Treeview to visualize the dendrogram.
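The ΔCt calculation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes 100 % amplification efficiency (a doubling per cycle), uses a single reference gene as internal control, and all Ct values and function names are made up for the example.

```python
from statistics import mean


def delta_ct(ct_target, ct_reference):
    """Delta-Ct: mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(ct_target) - mean(ct_reference)


def relative_expression(ct_target, ct_reference):
    """Relative expression 2^(-dCt), assuming perfect doubling per cycle."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def log2_relative_expression(ct_target, ct_reference):
    """log2 of the relative expression, which is simply -dCt."""
    return -delta_ct(ct_target, ct_reference)


def center(values):
    """Subtract the mean so that a gene's profile is centered on zero."""
    m = mean(values)
    return [v - m for v in values]


# Hypothetical technical triplicates for one gene and the reference gene.
target_ct = [24.0, 24.2, 23.8]
reference_ct = [18.0, 18.1, 17.9]
print(relative_expression(target_ct, reference_ct))

# Hypothetical log2 expression values of one gene across three samples.
print(center([-6.0, -4.0, -5.0]))
```

Centering each gene's log2 profile before clustering makes the heat map colors reflect relative changes across samples rather than absolute transcript abundance.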
Plant materials and stress treatments
The field site was at the experimental farm of the Department of Horticulture, Yangzhou University. Cucumber seeds of the variety ‘Zaoer-N’ were grown in 25-cm diameter pots containing peat, vermiculite, and perlite (3:1:1, v/v) in a greenhouse kept at a constant day temperature of 28 °C and a night temperature of 18 °C, with a relative humidity of 70–85 % for the entire experimental period. The roots, stems, true leaves, tendrils, male flowers, and fruits of mature plants were sampled separately from the field using forceps and placed into liquid nitrogen for the tissue-specific expression analysis. The seedlings at the three-leaf stage were subjected to different abiotic stresses. ABA stress was imposed by spraying 100 mM ABA on the foliage of selected uniform seedlings. For salt and drought stresses, seedlings were removed from the soil and subjected to 200 mM NaCl or 10 % polyethylene glycol (PEG) 6,000, respectively. Cucumber plants were moved to growth chambers and incubated at 4 and 42 °C for cold and heat stresses, respectively. For the waterlogging stress, seedlings were waterlogged to the base of the first true leaves (2 cm up from the base of the hypocotyls). Aboveground tissues were harvested at 0, 3, 6, 12, and 24 h after their respective treatments, frozen quickly in liquid nitrogen, and stored at −80 °C for further analysis.
Results
Identification and genome distribution of CsCDPKs in cucumber
To identify CDPK family members in cucumber, bioinformatics methods were used to gather extensive information on this family. Here, a genome-wide analysis of the CsCDPK family was performed with the assistance of information from the Cucurbit Genomics Database (http://www.icugi.org). The annotation search function of the cucumber genome and the online ScanProsite tool were used to identify probable CsCDPK candidates. The search identified 20 candidate genes, of which one gene, Csa025386, was eliminated because of its similarity to another gene, Csa014084. The remaining 19 genes were designated as CsCDPK1 to CsCDPK19, according to the proposed CDPK gene nomenclature (Hrabak et al. 1996). Among these identified genes, the names of CsCDPK1 (Csa017282), CsCDPK2 (Csa002575), CsCDPK3 (Csa019908), CsCDPK4 (Csa014084), and CsCDPK5 (Csa016123) were maintained as previously reported (Rajesh and Jayabaskaran 2002; Kumar et al. 2004). The CsCDPKs identified in our study ranged in molecular weight from 56.3 to 75.8 kDa, which is comparable with CDPKs from other plant species. The CDPKs identified in this study each contained four EF-hand motifs,
Table 1 Summary information on the calcium-dependent protein kinase (CDPK) gene family in cucumber (Cucumis sativus L.)
Gene name | Gene identifier | Gene locus | EF hands | pI | M.W. (kDa) | Myristoylation motif | Palmitoylation prediction | N-terminal acylation
CsCDPK1 | Csa017282 | Chr2:13,413,251..13,416,483 | 4 | 5.07 | 56.3 | N | Y | N
CsCDPK2 | Csa002575 | Chr3:26,379,049..26,382,276 | 4 | 4.99 | 74.3 | N | Y | N
CsCDPK3 | Csa019908 | Scaffold000143:136,312..147,152 | 4 | 5.37 | 56.3 | N | Y | N
CsCDPK4 | Csa014084 | Chr7:6,028,294..6,033,396 | 4 | 5.59 | 63.3 | Y | Y | N
CsCDPK5 | Csa016123 | Chr4:5,476,283..5,480,430 | 4 | 5.65 | 59.2 | Y | Y | N
CsCDPK6 | Csa000481 | Chr6:23,496,059..23,500,408 | 4 | 8.99 | 64.6 | Y | Y | N
CsCDPK7 | Csa001056 | Chr5:23,965,632..23,968,627 | 4 | 5.38 | 63.9 | Y | Y | N
CsCDPK8 | Csa002018 | Chr3:31,372,914..31,377,011 | 4 | 5.78 | 60.2 | Y | Y | N
CsCDPK9 | Csa002499 | Chr3:26,446,394..26,451,015 | 4 | 5.35 | 65.2 | Y | Y | N
CsCDPK10 | Csa002986 | Chr4:579,654..584,797 | 4 | 5.34 | 60.0 | Y | Y | N
CsCDPK11 | Csa003756 | Chr1:4,988,834..4,993,347 | 4 | 6.05 | 62.1 | Y | Y | N
CsCDPK12 | Csa004928 | Chr3:23,864,680..23,867,789 | 4 | 5.86 | 59.4 | Y | Y | N
CsCDPK13 | Csa006484 | Chr1:20,814,389..20,817,934 | 4 | 6.28 | 59.7 | Y | Y | N
CsCDPK14 | Csa007804 | Chr6:18,200,679..18,203,381 | 4 | 5.67 | 59.8 | Y | Y | N
CsCDPK15 | Csa008536 | Chr6:5,813,020..5,822,106 | 4 | 8.88 | 75.8 | Y | Y | N
CsCDPK16 | Csa011913 | Chr6:18,461,292..18,468,490 | 3 | 6.98 | 59.5 | Y | Y | N
CsCDPK17 | Csa017005 | Chr3:3,559,906..3,563,799 | 4 | 6.34 | 58.6 | N | Y | N
CsCDPK18 | Csa018149 | Chr2:13,070,036..13,073,942 | 4 | 9.11 | 69.4 | Y | Y | N
CsCDPK19 | Csa021911 | Scaffold000257:15,571..18,857 | 4 | 6.46 | 60.4 | Y | Y | N
except for CsCDPK16, which had three EF-hand motifs (Table 1). To understand the genomic distribution of the identified CsCDPKs, the DNA sequence of each CsCDPK was used to search the cucumber genome database using BLASTN (nucleotide against nucleotide). In total, 17 genes could be mapped on chromosomes 1–7, while two genes, CsCDPK3 and CsCDPK19, did not map to any position on the cucumber genome (Fig. 1). CsCDPK distributions were uneven among the seven cucumber chromosomes, but CsCDPKs were found on all chromosomes. The largest numbers of CsCDPKs were found on chromosome 3 (five genes), followed by chromosomes 5 and 6 (three genes each).
Phylogenetic analysis of the cucumber CDPK gene family
To detect the evolutionary relationships of CDPK genes, an unrooted neighbor-joining tree was generated from alignments of the full-length sequences of cucumber and Arabidopsis CDPK genes (Harper and Harmon 2005). As shown in Fig. 2, the 19 CsCDPKs were assigned to four major subfamilies according to the tree topology (Groups A, B, C, and D), which matched the classification of the A. thaliana CDPKs, and could be further divided into 10 sub-subfamilies (A1, A2, A3, B1, B2, B3, C1, C2, C3, and D). Thirteen (13) Arabidopsis and six (6) CsCDPKs (CsCDPK1, 2, 3, 4, 7, and 9) belonged to group A. Ten (10)
Fig. 1 Genomic distribution of cucumber (Cucumis sativus L.) calcium-dependent protein kinase (CDPK) genes on chromosomes. Chromosomal positions of the CsCDPK genes are indicated by CsCDPK number (assigned in Table 1). The scale is in megabases (Mb). The black bars in the middle of the seven chromosomes represent the rough position of the centromeres. Two genes could not be localized on any chromosome
Arabidopsis and four (4) CsCDPKs (CsCDPK5, 8, 14, and 17) belonged to group B. Eight (8) Arabidopsis and six (6) CsCDPKs (CsCDPK10, 11, 12, 13, 16, and 19) belonged to group C. Three (3) Arabidopsis and three (3) CsCDPKs (CsCDPK6, 15 and 18) belonged to group D. The results of the phylogenetic analysis of the predicted CsCDPK protein sequences revealed that cucumber and Arabidopsis proteins were not equally represented in the four subgroups.
Fig. 2 Phylogenetic relationships of cucumber (Cucumis sativus L.) and Arabidopsis calcium-dependent protein kinase (CDPK) genes. A neighbor-joining tree was created by the MEGA6 program with 1,000 bootstrap replicates using the full protein sequences of 19 cucumber and 34 Arabidopsis CDPK proteins. The four groups were labeled as a–d (color figure online)
Structural divergence among the CsCDPKs
Because the intron/exon organizations and intron types and numbers can indicate evolutionary history within some gene families (Boudet et al. 2001; Kudla et al. 2010), the gene structures of all 19 CsCDPK members were examined to obtain further insight. Interestingly, all of the members in subfamily A contained six introns, while those in subfamily B contained seven introns. However, sub-subfamilies C1 and C3 contained seven introns, C2 contained six introns, and subfamily D contained 10 to 12 introns (Fig. 3). In addition, 10 conserved motifs within the cucumber CDPK genes were identified using the online MEME tools (Fig. 4), which can help to predict their function (Yang et al. 2006). Motif 10 was found in subfamilies A and B, whereas subfamily C contained motif 8. CsCDPK1 and CsCDPK3 appear to possess degenerated N-terminal domains and did not contain motif 9, which may contribute to their functional specificity. The N-terminal domain often contains myristoylation or palmitoylation acceptor sites that bind CDPKs to membranes (Hrabak et al. 2003). The variable PEST motifs within the N-terminal domain help to determine the half-life of the enzyme (Rechsteiner and Rogers 1996) (Online resource 2). The conserved gene structures within each group support their close evolutionary relationship.
Identification of putative cis-elements in the promoters of CsCDPKs
To understand the transcriptional control of the CsCDPKs, sequences of the 1.5-kb upstream regions from the translation initiation site (ATG) were extracted for the 19 genes and analyzed on the PlantCARE server. The results suggested that various cis-acting elements related to plant growth, development, and stresses are present in different CsCDPK members. The Skn-1 motif that is responsible for endosperm-specific expression was present in all of the CsCDPK promoters except for those of CsCDPK4, 15 and 19. The ARE motif that is responsible for anaerobic induction was present in 17 of the CsCDPK promoters, suggesting important roles for these genes in anaerobic stress responses. The abscisic acid-responsive element (ABRE), heat-shock element (HSE), and gibberellin-responsive element (GARE) are present in more than 10 of the CsCDPK promoters. The fungal-responsive element Box-W1 was present in nine CsCDPKs, whereas the wound-responsive element WUN-motif was present in four CsCDPKs (Table 2). These results suggest that CDPK proteins link Ca2+ signaling and developmental and stress responses. Surprisingly, among the 19 CsCDPKs, putative cis-elements involved in light responsiveness, ACEs and
Fig. 3 Intron–exon structures of cucumber (Cucumis sativus L.) calcium-dependent protein kinase (CDPK) family genes. The intron–exon structures of all 19 genes are illustrated in the diagram. Exons and introns are indicated by purple wedges and hat shapes, respectively (http://gsds.cbi.pku.edu.cn/) (color figure online)
Fig. 4 Organization of the different motifs in 19 cucumber (Cucumis sativus L.) calcium-dependent protein kinase (CDPK) family genes. The conserved motifs were detected using the MEME online tools (http://meme.sdsc.edu/meme/intro.html) (color figure online)
AE-boxes, were observed in all of the CsCDPK promoter regions. A possible reason is the strong crosstalk between light signals and downstream light-regulated physiological responses, such as chloroplast development and hypocotyl elongation.
Expression profiles of cucumber CDPK genes in different tissues and under different abiotic conditions
We investigated the link between evolutionary and functional divergence by determining the expression profiles of the 19 CsCDPKs in the roots, stems, leaves, male flowers, fruits, and tendrils (Fig. 5). The predominant expression
patterns of the 19 CsCDPKs can be seen in Online resource 3. Some CsCDPKs, such as CsCDPK4, 8, and 19, were expressed at nearly the same levels in all tissues, suggesting that they perform constitutive housekeeping functions. Three CsCDPKs (12, 14, and 18) were strongly expressed in the roots and were assigned to subfamilies C2, B1, and D, respectively (Fig. 2). Interestingly, other CsCDPKs, such as CsCDPK1 and 3, were highly expressed in many organs but at a clearly lower level in roots. Because cucumber has an underdeveloped and shallow root system, the root-specific CDPKs may be targets for breeding cultivars with vigorous roots and waterlogging tolerance (Xu et al. 2014).
Table 2 cis-Acting regulatory elements present in the promoters of calcium-dependent protein kinase (CDPK) genes in cucumber (Cucumis sativus L.)
Gene name | Motifs related to growth and development | Motifs related to stress response
CsCDPK1 | CAT-box, GCN4_motif, O2-site, Skn-1_motif, circadian | ACE, ARE, Box-W1, GARE-motif, HSE, MBS, TCA-element
CsCDPK2 | MBSI, MSA-like, Skn-1_motif, as-2-box, circadian | ACE, ARE, ERE, GARE-motif, LTR, P-box, TCA-element
CsCDPK3 | CCGTCC-box, GCN4_motif, Skn-1_motif | ABRE, ACE, ARE, HSE, TC-rich repeats
CsCDPK4 | as-2-box, circadian | ACE, ARE, GARE-motif, GC-motif, LTR, TATC-box, TC-rich repeats, TCA-element
CsCDPK5 | O2-site, Skn-1_motif, circadian | ACE, ABRE, ARE, CGTCA-motif, HSE, TGA-element, TGACG-motif
CsCDPK6 | O2-site, Skn-1_motif, circadian | ABRE, AE-box, Box-W1, CGTCA-motif, HSE, P-box, TC-rich repeats, TCA-element, TGA-element, TGACG-motif
CsCDPK7 | GCN4_motif, O2-site, Skn-1_motif, circadian | ACE, ARE, AT-rich element, CGTCA-motif, ERE, GARE-motif, GC-motif, HSE, MBS, TC-rich repeats, TCA-element, TGACG-motif
CsCDPK8 | Skn-1_motif, circadian, as-2-box | ACE, ARE, Box-W1, ERE, GARE-motif, HSE, LTR, P-box, TATC-box, TC-rich repeats, TCA-element, TGA-element, WUN-motif
CsCDPK9 | GCN4_motif, O2-site, Skn-1_motif, as-2-box, circadian | ABRE, ACE, ARE, Box-W1, CGTCA-motif, MBS, TCA-element, TGACG-motif
CsCDPK10 | GCN4_motif, MBSI, O2-site, Skn-1_motif | ABRE, AE-box, CGTCA-motif, GARE-motif, HSE, MBS, TATC-box, TC-rich repeats, TCA-element, TGACG-motif
CsCDPK11 | Skn-1_motif | ACE, ARE, GARE-motif, HSE, P-box, TATC-box, TC-rich repeats, TGA-element, WUN-motif
CsCDPK12 | MSA-like, Skn-1_motif | ACE, ARE, Box-W1, CGTCA-motif, MBS, TC-rich repeats, TGA-element, TCA-element, TGACG-motif
CsCDPK13 | MBSII, Skn-1_motif, circadian | ACE, ARE, Box-W1, ERE, GARE-motif, LTR, TC-rich repeats, TCA-element
CsCDPK14 | GCN4_motif, O2-site, Skn-1_motif, circadian | ARE, AE-box, CGTCA-motif, ERE, GARE-motif, HSE, LTR, MBS, P-box, TC-rich repeats, TCA-element, TGACG-motif
CsCDPK15 | MBSII, circadian | ABRE, ARE, AE-box, GCN4_motif, HSE, MBS, Skn-1_motif, TC-rich repeats
CsCDPK16 | CAT-box, GCN4_motif, O2-site, Skn-1_motif, circadian | ABRE, ACE, ARE, Box-W1, GARE-motif, MBS, TATC-box, TC-rich repeats, TCA-element
CsCDPK17 | HD-Zip1, HD-Zip2, Skn-1_motif, as-2-box, circadian | ACE, ARE, Box-W1, GARE-motif, HSE, TC-rich repeats, WUN-motif
CsCDPK18 | MBSI, Skn-1_motif, as1 | ABRE, ACE, ARE, CGTCA-motif, ERE, GARE-motif, HSE, LTR, TC-rich repeats, TCA-element, TGACG-motif, WUN-motif
CsCDPK19 | - | ACE, ARE, ERE, HSE, TC-rich repeats, TCA-element
Plants are frequently challenged by abiotic stresses, such as drought and heat. CDPK proteins may be widely involved in signaling and responding to abiotic/biotic stimuli (Kudla et al. 2010), but limited information was available on CDPK involvement in stress responses in cucumber. In this research, the relative expression levels of the 19 CsCDPKs were recorded at the three-leaf stage under salt (200 mM NaCl), low temperature (4 °C), heat (42 °C), waterlogging, drought (10 % PEG 6,000), and ABA (100 mM) treatments (Fig. 6). The results indicated that 16 (84.2 %) genes responded to at least one treatment, which included seven (36.8 %) genes responding to salt treatment, two (10.5 %) genes to cold, seven (36.8 %) genes to heat, nine (47.4 %) genes to waterlogging, six (31.6 %) genes to drought, and nine (47.4 %) genes to ABA (Online resource 4). This suggests that these CsCDPK genes were involved in responses to high salinity, low temperature, heat, waterlogging, drought, and ABA signaling. Interestingly, some genes that were up-regulated by one abiotic stress were down-regulated during another stress, as seen in the expression levels of CsCDPK2 under salt and waterlogging stresses. Although numerous CDPK proteins respond to various abiotic stresses, the expression patterns of CDPK genes under waterlogging stress were newly revealed in our studies. The results showed that nine CsCDPKs were differentially up- or down-regulated under the waterlogging treatment (Fig. 6), suggesting that several CsCDPKs might be involved in the response to waterlogging stress. One plausible explanation is that the CDPKs are downstream messengers in signaling pathways triggered by auxins and nitric oxide to promote adventitious root formation (Lanteri et al. 2006), which is an important adaptive mechanism allowing these new roots to functionally replace the original waterlogged root system (Sairam et al. 2008).
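The response percentages quoted in this section follow directly from the gene counts over the 19 CsCDPKs. The short check below recomputes them; the dictionary layout and variable names are illustrative choices, while the counts themselves are those stated in the text.

```python
TOTAL_GENES = 19

# Number of responsive CsCDPK genes per treatment, as reported in the text.
responsive_counts = {
    "salt": 7,
    "cold": 2,
    "heat": 7,
    "waterlogging": 9,
    "drought": 6,
    "ABA": 9,
}


def percent_of_family(count, total=TOTAL_GENES):
    """Share of the gene family, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


for treatment, count in responsive_counts.items():
    print(f"{treatment}: {count}/{TOTAL_GENES} = {percent_of_family(count)} %")

# Genes responding to at least one treatment.
print(f"any treatment: 16/{TOTAL_GENES} = {percent_of_family(16)} %")
```

The per-treatment counts sum to more than 16 because a single gene can respond to several treatments.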
Discussion CDPK gene family in cucumber
Fig. 5 Heat map showing the expression of cucumber (Cucumis sativus L.) calcium-dependent protein kinase (CDPK) in different tissues. Quantitative RT-PCR was used to assess CsCDPK transcript levels in total RNA samples extracted from seedling, roots (R), stem (S), leaf (L), male flowers (M), fruit (F), and tendrils (T), and the relative expression was log2 transformed. The cucumber β-actin gene (GenBank AB010922) was used as an internal control. Each value denotes the mean relative level of expression of three replicates. Genes highly or weakly expressed in the tissues are colored red and green, respectively. The heat map was generated using cluster 3.0 software (color figure online)
To survey the co-expression relationships between CsCDPKs, Pearson’s correlation coefficients (PCC values) were calculated based on the qRT-PCR data from six samples (Online resource 5). Significant correlations existed between three pairs of CsCDPKs at the 0.01 level (2-tailed; PCC value ≥0.917). Among them, CsCDPK1 and 3 (Group A) and CsCDPK11 and 13 (Group C) belonged to the same group, indicating that they probably contribute to some redundant function or are involved in the same regulatory network in a biological process. Furthermore, the PCC values among pairs of CsCDPKs based on qRT-PCR under ABA, cold, drought, heat, salt, and waterlogging stress conditions (Online resource 6) revealed 50 gene pairs with significant correlations at the 0.01 level (2-tailed), including 43 (86 %) pairs with significant positive correlations (PCC value ≥0.959) and 7 (14 %) pairs with significant negative correlations (PCC value ≤−0.959). These 43 positively and 7 negatively correlated pairs suggest that some CsCDPKs may also be involved in common pathways when responding to these abiotic stresses.
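The two PCC thresholds can be related to the sample size with a short sketch. It computes Pearson's r and converts a two-tailed critical t value into the corresponding critical r via r = t / sqrt(t^2 + df), with df = n − 2. The function names and the example expression vectors are hypothetical; the t values are standard t-table entries for α = 0.01. The reading that 0.959 corresponds to five time points (0, 3, 6, 12, and 24 h) is an inference from the threshold, not something the paper states.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def critical_r(t_crit, n):
    """Smallest |r| that is significant for n samples, given the critical t."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)


# Two-tailed critical t values at alpha = 0.01, keyed by degrees of freedom.
T_CRIT_001_TWO_TAILED = {3: 5.841, 4: 4.604}

print(round(critical_r(T_CRIT_001_TWO_TAILED[4], n=6), 3))  # six tissues
print(round(critical_r(T_CRIT_001_TWO_TAILED[3], n=5), 3))  # five time points

# Hypothetical log2 expression of two genes across six tissues.
gene_a = [1.2, 0.4, 2.1, 3.3, 0.9, 1.8]
gene_b = [1.0, 0.5, 2.4, 3.0, 1.1, 1.7]
r = pearson_r(gene_a, gene_b)
print(r, abs(r) >= critical_r(T_CRIT_001_TWO_TAILED[4], n=6))
```

With so few samples, only very strong correlations pass the 0.01 threshold, which is why the reported cut-offs are above 0.9.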
The CDPK gene family has a long evolutionary history, which can be traced back to the oldest land plants, such as pteridophyte (Selaginella moellendorffii) and bryophyte (Physcomitrella patens), which contain at least 35 CDPKs in their genomes (Hamel et al. 2014). Long thought to be plant-specific, CDPK homologs have also been found in ciliates and apicomplexan parasites (Zhang et al. 2001; Billker et al. 2009). The emergence of the first CDPK gene occurred before the basal split between green plants and alveolate protists (Zhang et al. 2001). The genome of cucumber encodes 19 CsCDPKs compared to 34 AtCDPKs in Arabidopsis (Harper and Harmon 2005), 30 PtCDPKs in poplar (Ma et al. 2013), and 31 OsCDPKs in Oryza (Harmon et al. 1994). The lower number of CDPK genes in cucumber may be due to the absence of whole genome duplication (WGD) events after the gamma WGD, as shown in Arabidopsis, or the WGD after the split between eudicots and monocots, as shown in rice (Jiao et al. 2011). Another plausible reason could be the low number of segmental duplications and tandem gene duplications observed in the cucumber genome (Asano et al. 2012). However, Arabidopsis, poplar, and Oryza have undergone duplication events recently, which might have led to the increased number of CDPK family members in their genomes (Remington et al. 2004). Conversely, it can be hypothesized that 19 CsCDPKs are sufficient for cucumber to mediate Ca2+ signals and, therefore, some CsCDPKs might have been lost during the evolutionary process because of functional redundancy.
Structural divergence among CsCDPKs
CDPK proteins in groups A, B, and C tend to be slightly acidic, with pIs ranging between five and six. However, group D did not follow this trend and has a basic pI of eight or greater (Table 1).
A basic pI might correlate with a specific subcellular localization or function of Group D CDPKs, which accordingly correspond to the most divergent homologs within this protein kinase family (Hamel et al. 2014). CDPKs with fewer than four EF-hands have been reported in other plants, but not previously in cucumber. In our studies, most CsCDPKs have four EF-hands, except for CsCDPK16 with three EF-hands. The EF-hand motifs differed in their Ca2+-binding affinities and subsequently in their contribution to Ca2+-regulated kinase activities, among which the N-terminal EF1 and EF2 motifs with lower Ca2+-binding affinities played important roles in
Fig. 6 Heat map showing the expression patterns of cucumber (Cucumis sativus L.) calcium-dependent protein kinase (CDPK) under abscisic acid (100 mM), cold (4 °C), drought (10 % PEG 6,000), heat (42 °C), salt (200 mM NaCl), and waterlogging (2 cm up from the base of hypocotyls) stresses. Quantitative RT-PCR was used to assess CsCDPK transcript abundance in total RNA samples extracted from three-true-leaf seedlings under the six abiotic stresses, and the relative expression was log2 transformed. The cucumber β-actin gene (GenBank AB010922) was used as an internal control. Each value denotes the mean relative level of the expression of three replicates. Genes highly or weakly expressed in the tissues are colored red and green, respectively. The heat map was generated using Cluster 3.0 software (color figure online)
activating the kinases (Ma et al. 2013). It would be of interest to explore the differences in the biological functions of CsCDPK16 (three EF-hands) compared with the other CsCDPKs (four EF-hands). CDPKs, such as AtCDPK25, OsCDPK30, and PtCDPK15, may also possess degenerated calmodulin domains (Harper and Harmon 2005; Ma et al. 2013), even though none of them require Ca2+ activation (Boudsocq and Sheen 2013). Interestingly, unlike other identified CDPK protein families, 16 of the N-termini of the CsCDPK proteins contained myristoylation motifs (except for CsCDPK1, 2 and 3), which may promote protein-membrane and protein-protein interactions (Johnson et al. 1994) (Online resource 2). Studies in wheat and Arabidopsis revealed that proteins possessing myristoylation motifs tend to localize in the plasma membrane (Mehlmer et al. 2010). However, proteins lacking a myristoylation site can also localize to the plasma membrane, because three canola CDPKs (BnaCDPK5, 11, and 13) displayed fluorescence at the plasma membrane and nuclei (Zhang et al. 2014). Hence, further studies are needed to understand the subcellular localization of cucumber CDPKs. Exon/intron numbers in CsCDPKs were highly conserved within subfamilies A and B. In subfamily A, two introns within the EF-hand were found between codons in CsCDPK1, 2, 3, and 7, whereas no introns within the
EF-hand were found between codons in CsCDPK9. In subfamily B, three introns within the EF-hand were found between codons in CsCDPK5, 8, and 17, whereas no introns within the EF-hand were found between codons in CsCDPK14. Interestingly, both CsCDPK9 and CsCDPK14 have two introns in the N-terminal domain. Likewise, CsCDPK11 and CsCDPK16 in subfamily C have phase-0 introns in the N-terminal domain (Fig. 3). These intron insertions can be used as molecular markers to identify different members in the same families and subfamilies. Furthermore, in studying the patterns of CDPK genes in rice, poplar, and Arabidopsis, we found that although there are some differences in the splicing phase and interrupted amino acids, the positions of the intron insertions are conserved. This suggests that the insertional positions of the introns in the conserved domains are important for the evolution and the divergence of function in the gene family.
The fate of CsCDPK after divergence: subfunctionalization and neofunctionalization
Gene divergence is an important process in the evolution of novel functions (Hughes 1994). To investigate the functional diversification of the CsCDPKs after divergence, we focused on two pairs of paralogs, CsCDPK1 and CsCDPK3
(subfamily A3), and CsCDPK8 and CsCDPK17 (subfamily B3). The Ka/Ks substitution rate ratio is an indicator of the selection history on genes or gene regions. Generally, Ka/Ks values <1 indicate purifying selection, Ka/Ks values >1 indicate accelerated evolution with positive selection, and Ka/Ks = 1 suggests neutral selection (Yang et al. 2006). In this study, the Ka/Ks values for the two paralogs were small (