Transcriptomic and proteomic analyses of embryogenic tissues in Picea balfouriana treated with 6-benzylaminopurine

Qingfen Li, Shougong Zhang and Junhui Wang*

State Key Laboratory of Forest Genetics and Tree Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, China

*Corresponding author, e-mail: [email protected]

The cytokinin 6-benzylaminopurine (6-BAP) influences the embryogenic capacity of the tissues of Picea balfouriana during long subculture (after three months). Tissues that proliferate in 3.6 μM and 5 μM 6-BAP exhibit the highest and lowest embryogenic capacity, respectively, generating 113±6 and 23±3 mature embryos per 100 mg of tissue. In this study, a comparative transcriptomic and proteomic approach was applied to characterize the genes and proteins that are differentially expressed among tissues under the influence of different levels of 6-BAP. A total of 51 375 unigenes and 2617 proteins were obtained after quality filtering. There were 2770 transcripts for proteins found among these unigenes. Gene Ontology (GO) analysis of the differentially expressed unigenes and proteins showed that they were involved in cell and binding activity and were enriched in ribosome and glutathione metabolism pathways. Ribosomal proteins, glutathione S-transferase proteins, germin-like proteins and calmodulin-independent protein kinases were up-expressed in the embryogenic tissues with the highest embryogenic ability (treated with 3.6 μM 6-BAP), which was validated via qRT-PCR analysis, and these proteins might serve as molecular markers of embryogenic ability. Data are available via SRA and ProteomeXchange with identifier SRP042246 and PXD001022, respectively.

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record. Please cite this article as doi: 10.1111/ppl.12276 This article is protected by copyright. All rights reserved

Abbreviations – 6-BAP, 6-benzylaminopurine; GO, Gene Ontology; GST, glutathione S-transferase; iTRAQ, isobaric tags for relative and absolute quantification; KEGG, Kyoto Encyclopedia of Genes and Genomes; qRT-PCR, quantitative real-time polymerase chain reaction; SE, somatic embryogenesis.

Introduction Somatic embryogenesis (SE) is the process whereby somatic cells differentiate into embryos and ultimately into plants through a series of characteristic morphological stages, particularly the later stages, which resemble the zygotic stages of development (Zimmerman 1993, Schmidt et al. 1997). During SE somatic cells undergo developmental restructuring towards the embryogenic pathway and forms the basis of cellular totipotency in higher plants (Nolan et al. 2003, Imin et al. 2004). The procedure for the production of SE-derived seedlings has been established through the efforts of many different laboratories, but numerous restrictions still exist in its application, such as lower initiation and germination rates (Stasolla and Yeung 2003, Mazri 2014). One of the primary problems is that many of the embryogenic tissues of conifers lose their embryogenic ability during long subculture, which was the main reason for cryopreservation (Kong and von Aderkas 2011, Latutrie and Aronen 2012). This disadvantage limits conifer somatic embryogenesis to be used in breeding programs (Liao and Juan 2014). As the molecular basis of SE is not well understood, even in model plant species such as rice and Arabidopsis, molecular studies on SE in important crop and woody plants must be developed for biotechnological applications. Factors influencing the somatic embryogenesis of Arabidopsis thaliana have been well studied by Gaj (2001, 2004) and Gaj et al. (2006). However, the underlying physiological and molecular mechanisms were not well clarified due to their complexity. For example, among the LEAFY COTYLEDON (LEC) genes, LEC2 has been shown to be sufficient to induce embryogenic competence in Arabidopsis (Stone et al. 2001), and the mutation of the gene can strongly impair the embryogenic response (Gaj et al.2005). Although LEC2 has been found to interact with the abscisic acid (ABA) and gibberellins (GA) (Braybrook et al. 2008, Gaj et al. 2005, Ledwoń and Gaj 2009), the molecular mechanisms underlying the interactions between the LEC genes and these hormones have not yet been elucidated. Numerous genes have been found to be involved in SE This article is protected by copyright. All rights reserved

(Parket et al. 2011, Rose and Nolan 2006, Zeng et al. 2007). However, their precise functions and the extent of the similarities in the actions of these genes in the two processes remain to be determined. Transcriptomic and proteomic techniques have been successfully applied to catalogue genes and proteins that are expressed during somatic embryogenesis in economically important conifers, such as Picea abies, Picea glauca and Pinus taeda (Cairney et al. 2006, Che et al. 2006, Imin et al. 2005, Joosen et al. 2007, Legrand et al. 2007, Lippert et al. 2005, Sharma et al. 2008, Singla et al. 2007, Stasolla et al. 2003, 2004, Su et al. 2007,Thibaud-Nissen et al. 2003, van Zyl et al. 2003, Zeng et al. 2006). The levels of 2 , 4-dichlorophenoxyacetic acid (2,4-D) and 6-BAP are decreased in the proliferation stage of somatic embryogenesis in most conifers compared with the initiation stage. In our laboratory, the whole SE system had been established in Picea likiangensis (Franch) Pritz var. balfouriana (Rehd. et Wils.) Hillier ex Slavin (Picea balfouriana), which is one of the most important forestation species in China. Embryogenic tissues could been initiated from both of immature and mature zygotic embryos, and the germination rate could up to about 50%. However, many cell lines of Picea balfouriana lose their embryogenic ability if the levels of both plant growth regulators or 2,4-D alone are decreased during long subculture. In a previous study by our group, when these cell lines were cultured in media with different levels of 6-BAP, their embryogenic capacity was changed, with some lines losing their embryogenic capacity, while that of others was maintained. Cytokinins are mostly known for their role in the control of cell division in plants, and the associated signaling pathway has been studied recently (Argueso et al. 2010, Gupta and Rashotte 2012). However, the mechanisms involved in cytokinin signaling in plant somatic embryogenesis are still unclear. Numerous genes and proteins have been identified as being specifically expressed during somatic embryogenesis (Mordhorst and Toonen 1997, Chugh and Khurana 2002, Fehér et al. 2003). These genes include hormone-responsive genes such as auxin-inducible genes (Walker and Key 1982), late embryo abundant genes (LEA) (Agnieszka 2009), calcium/calmodulin-binding proteins (Reddy et al. 2002), calmodulin-like protein kinases (Davletova et al. 2001), SERK genes (Schmidt et al. 1997, Nolan et al. 2003, Thomas et al. 2004), homeobox genes(Hiwatashi and Fukuda 2000), chitinases (De Jong et al. 1992), arabinogalactans (Katja et al. 2000, Baldwin et al. 2001), WUSCHEL (Zuo et al. 2002), BABY BOOM (Boutilier et al. 2002) and LEC genes (Reidt 2000, Stone et al. 2001), to name a

This article is protected by copyright. All rights reserved

few. Little is currently known about the processes underlying the induction and maintenance of the genes involved in somatic embryogenesis. Therefore, we hypothesized that 6-BAP may affect the embryogenic capacity by influencing the expression of genes and proteins, such as those listed above. The main purposes of the present work are to identify a robust set of transcripts and proteins expressed by 6-BAP as well as to define genes and proteins that are induced in somatic embryogenesis. Thus, the study presented here describes the use of RNA-Sequence (RNA-Seq) and isobaric tags as relative and absolute quantitation (iTRAQ) methods for obtaining a better understanding of the mechanisms underlying the influence of 6-BAP on the embryogenic ability of Picea balfouriana.

Materials and methods Plant material An embryogenic line was generated from immature embryos of genotype 4 of Picea balfouriana in half-strength LM basal medium (Litvay et al. 1981) solidified with 0.2% (w/v) gellan gum (Gelrite, Sigma). The basal medium was supplemented with 1000 mg·L–1 casein hydrolysate, 1% (w/v) sucrose, 500 mg·L–1 glutamine, 10 μM 2,4-D and 5 μM 6-BAP. The pH of the medium was adjusted to 5.8, followed by autoclaving at 121°C for 20 min. Then, sterilized L-glutamine was added to the partially cooled (50–60°C) autoclaved medium, which was subsequently sterilized via filtration for decomposition at 121°C. The cultures were maintained in darkness at 24 ± 1°C and sub-cultured every 2 weeks. Finally, the embryogenic tissues were transferred to media with the same composition as the initiation medium, but containing three different levels of 6-BAP (2.5 μM, 3.6 μM, 5 μM) for three months. The samples (Fig. 1A, B, C) were collected after subcultured 7 d in fresh proliferation media. For each treatment, 1 g of fresh tissue were frozen with liquid nitrogen for IAA and ZR extraction (three replicates for each treatments), 300 mg of fresh tissue were frozen with liquid nitrogen for RNA isolation (no biological replicates for RNA-Seq), and 2 g of fresh tissue was frozen with liquid nitrogen for protein isolation. Two biological replicates and three technological replicates for each biological replicate were performed in the iTRAQ analyses. Early embryo differentiation from tissues was stimulated by transferring the tissues to half-strength LM medium lacking plant growth regulators for 1 week. The promotion of late embryo development and maturation was achieved by transferring cultures to half-strength LM medium This article is protected by copyright. All rights reserved

supplemented with 61 μM ABA and 0.4% active charcoal, 6% sucrose, 500 mg·L–1 glutamine, 1 g·L–1 casein hydrolysate and 4% gelrite, followed by culturing in the dark at 24 ± 1°C. Ten replicates were performed for each treatment, and the number of somatic embryos (at least 3 mm long) that exhibited 3–5 cotyledons and could germinate generated per 100 mg of embryogenic tissue was counted after eight weeks (Fig. 1 D,E,F). Thirty somatic embryos from each of the three treatments were transferred to one quarter-strength LM medium containing 2% sucrose, 0.5% active charcoal, 500 mg·L–1 casein hydrolysate, 500 mg·L–1 glutamine and 3% gelrite and incubated at 24 ± 1°C in the light (30 µE m–2 s– 1

, 16 h photoperiod) for germination. Ten replicates were performed for each treatment. After six

weeks, the germinated somatic embryos showing elongated roots and hypocotyls were counted. To compare the influence of different levels 6-BAP on the maturation of tissues and the germination of the generated somatic embryos, mature somatic embryos and the recorded germination rates were subjected to analysis of variance in SPSS20. The level of significance was P < 0.05.

IAA and ZR extraction, purification, and quantification The methods for extraction, purification and determination of IAA and ZR in three types of tissues were performed as described were modified from those described by Yang et al. (2001). Samples of 200–300 mg were extracted in cold 10 ml 80% (v/v) methanol containing 1 mM butylated hydroxytoluene as an antioxidant. The extract was incubated overnight at 4°C and centrifuged at 10 000 g for 20 min at the same temperature. The supernatants were passed through Chromosep C18 columns (C18 Sep-Park Cartridge; Waters Corp., Millford, MA, USA), prewashed with 10 ml of 100% methanol, 5 ml of 100% aether and 5 ml of 100% methanol, respectively. The hormone fractions were dried under N2 , and dissolved in 2 ml of phosphate-buffered saline (PBS, 0.01 mol·L–1, pH 7.4) containing 0.1% (v/v) Tween 20 and 0.1% (w/v) gelatin for analysis by enzyme-linked immunosorbent assay (ELISA). The absorbance was recorded at 490 nm. Calculations of the enzyme-immunoassay data were performed as described by Weiler et al. (1981). The results were analyzed for variance using SPSS20, the level of significance was P < 0.05.

Protein preparation and digestion Tissues were suspended in 500 μL of lysis buffer (8 M urea, 4% CHAPS, 40 mM Tris-HCl), with 1 mM PMSF and 2 mM ethylenediaminetetraacetic acid (EDTA) (final concentration). After 5 min of This article is protected by copyright. All rights reserved

vigorous vortexing, dithiothreitol (DTT) was added to a final concentration of 10 mM. Following mixing, the samples were homogenized via sonication cracking at a low temperature for 15 min and then centrifuged for 20 min at 25 000 g, after which the supernatant was homogenized well with 10 mM DTT for 1 h at 56°C, followed by 55 mM IAM at 45°C in a darkroom. Then, the supernatant was mixed well with ice-cold acetone (1:4, v/v), allowed to rest for 2 h at –20°C and centrifuged for 20 min at 25 000 g. For digestion, the protein pellets from the previous step were resuspended in digestion buffer (200 μL of 0.5 M triethylammonium bicarbonate, TEAB) and broken up via sonication cracking at a low temperature for 15 min. Finally, equal aliquots (500 μL) from each lysate were centrifuged for 20 min, and the supernatant was stored for use at –80°C.

iTRAQ labeling The iTRAQ labeling of peptide samples derived from control and ethanol-treated conditions was performed using the iTRAQ Reagent 8-plex kit (Applied Biosystems, Foster City, CA) according to the manufacturer’s protocol. For each treatment, four samples (two biological replicates and two technical replicates) were subjected to iTRAQ labeling. Two biological replicates from the 2.5 μM treatment were labeled with reagents 115 and 121, while iTRAQ tags 117 and 119 were used to label two biological replicates from the 3.6 μM treatment, and two biological replicates from the 5 μM treatment were labeled with reagents 116 and 118. The peptides labeled with the respective isobaric tags were reconstituted in 4 mL of Buffer A (25 mM NaH2PO4, 25% ACN, pH 2.7). The iTRAQ-labeled peptides were fractionated using an Ultremex SCX column (4.6 x 200 mm) with the LC-20AB system (Shimadzu, Japan) at a flow rate of 1.0 mL min–1. The 22-min HPLC gradient consisted of 100% buffer A for 10 min; 5–35% buffer B (25 mM NaH2PO4, 1 MKCL in 25% ACN, pH 2.7) for 11 min; and finally, 35–80% buffer B for 1 min. The resultant chromatograms were recorded at 214 nm. The twenty-one collected fractions were lyophilized and desalted with Strata X cartridges and concentrated to dryness using a vacuum centrifuge. The samples were stored at –80°C.

LC-MS/MS proteomic analysis Mass spectroscopy analysis was performed using an AB SCIEX TripleTOF™ 5600 mass spectrometer (AB SCIEX, Framingham, MA, USA), coupled with an online microflow HPLC system (Shimadzu, JAPAN), as described elsewhere (Unwin et al. 2010). The peptides were separated using a nanobored This article is protected by copyright. All rights reserved

C18 column with a picofrit nanospray tip (75 Km ID x 15 cm, 5 μM particles) (New Objectives, Woburn, MA). Separation was performed at a constant flow rate of 20 μL min–1, using a splitter to achieve an effective flow rate of 0.2 μL min–1. Mass spectrometry data were acquired in positive ion mode, with a selected mass range of 300–2000 m/z. Peptides showing charge states of +2 to +4 were selected for MS/MS analysis. The three most abundantly charged peptides above a count threshold were selected for MS/MS and dynamically excluded for 30 s at a ±30 mDa mass tolerance. Smart information-dependent acquisition (IDA) was activated with an automatic collision energy and automatic MS/MS accumulation. The fragment intensity multiplier was set to 20, and the maximum accumulation time was 2 s. The peak areas of the iTRAQ reporter ions reflect the relative abundance of the proteins in the samples. For peptide identification, a Triple TOF 5600 mass spectrometer was used in this study, which shows a high mass accuracy (less than 2 ppm). Other applied identification parameters included the following: fragment mass tolerance: ± 0.1 Da; mass values: monoisotopic; variable modifications: Gln->pyro-Glu (N-term Q), Oxidation (M), iTRAQ8plex (Y); peptide mass tolerance: 0.05 Da; max missed cleavages: 1; fixed modifications: Carbamidomethyl (C), iTRAQ-plex (N-term), iTRAQ-plex (K); other parameters: default. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) via the PRIDE partner repository (Vizcaíno et al. 2014) with the dataset identifier PXD001022.

Proteomic data analysis The MS data were processed using Proteome Discoverer software (Version 1.2.0.208) (Thermo Scientific) to generate a peak list. The default parameters of the Proteome Discoverer software (Version 1.2.0.208) were used. In detail, these parameters were as follows. For precursor selection, we used an MS1 precursor. For spectrum property filtering, we used the following values: Lower RT Limit, 0; Upper RT Limit, 0; Lowest Charge State, 0; Highest Charge State, 0; Min. Precursor Mass, 0; Max. Precursor Mass, 300 Da; Total Intensity Threshold, 10 000 Da; Minimum Peak Count, 1. For scan event filters: Mass Analyzer, FTMS; Ms Order, MS2; Activation Type, HCD; Scan Type, Full; Ionization Source, Nanospray; Polarity Mode, +. For peak filters: S/N Threshold (FT-only), 0. For replacement of unrecognized properties, the following parameter setting were applied: Unrecognized Charge, Automatic; Unrecognized Mass Analysis, FTMS; Unrecognized MS Order, MS2; Unrecognized Activation, HCD; Unrecognized Polarity Re, +. data acquisition was performed with This article is protected by copyright. All rights reserved

Analyst QS 2.0 software (Applied Biosystems/MDS SCIEX). Protein identification and quantification were performed using Mascot 2.3.02 (Matrix Science, London, United Kingdom) (Stancu et al.2011). For iTRAQ quantification, the peptide for quantification was automatically selected by the algorithm to calculate the reporter peak area, error factor (EF) and p-value (using the default parameters in the Mascot Software package). The resulting dataset was auto bias-corrected to remove any variations imparted due to unequal mixing when combining different labeled samples. Proteins showing a fold change of 1.5 or more between the ethanol-treated and control samples and a p-value of less than 0.05 in the statistical evaluation were considered differentially expressed proteins when the two treatments were compared, and the sample with the lower embryogenic ability was used as the control in the three comparisons (2.5 μM vs. 5 μM, 3.6 μM vs. 2.5 μM, 3.6 μM vs. 5 μM). Quantization was performed at the peptide level by following the procedures described at http://www.matrixscience.com/help/quant_statistics_help.html. Student’s t-test was performed using Mascot 2.3.02 software (Matrix Science, London, United Kingdom). To understand the functions of the differentially expressed unigenes, we mapped all of the genes to terms in the KEGG database and compared these results with the whole transcriptome background, with the aim of searching for genes involved in metabolic or signal transduction pathways that were significantly enriched. Functional annotations based on GO terms (GO; http://www. geneontology.org) were analyzed with Blast2GO software. KEGG pathway annotation was performed using Blastall software against the Kyoto Encyclopedia of Genes and Genomes. The enrichment of differentially expressed proteins among the GO

terms

(KEGG

Pathway

Database)

was

assessed

using

the

following

formula:

where N is the total number of proteins with GO annotation information (KEGG pathway annotation information); n is the number of differentially expressed proteins with GO annotation information (KEGG pathway annotation information); M is the number of proteins with a given GO term annotation (KEGG pathway annotation); and m is the number of the differentially expressed proteins with a GO term annotation (a given KEGG pathway annotation). The GO terms (KEGG pathways) showing p-values of less than 0.05 were considered to be GO terms (KEGG pathways) that were enriched for 6-BAP responsive proteins.

This article is protected by copyright. All rights reserved

RNA isolation and library preparation for transcriptome analysis Total RNA was isolated using the SV Total RNA Isolation System (Promega) according to the manufacturer’s protocol. RNA integrity was confirmed using a 2100 Bioanalyzer (Agilent Technologies) with a minimum RNA integrity value of 8. Briefly, mRNA was purified from 6 μg of total RNA using oligo (dT) magnetic beads. Following purification, the mRNA was fragmented into small pieces using divalent cations under an elevated temperature, and the cleaved RNA fragments were employed for first-strand cDNA synthesis using reverse transcriptase and random primers. This step was followed by second-strand cDNA synthesis using DNA polymerase I and RNaseH. These cDNA fragments were subsequently subjected to an end repair process and ligation of adapters. These products were purified and enriched through PCR to generate the final cDNA library.

Analysis of RNA-Seq results The cDNA libraries were sequenced on the Illumina sequencing platform (GAII). The size of the fragments was approximately 200 bp, and both ends of the libraries were sequenced. Image deconvolution and quality value calculations were performed using the Illumina GA pipeline. The raw reads were cleaned by removing adaptor sequences, empty reads and low-quality sequences (reads with unknown sequences 'N'). Various k-mer were optimized for the best result too. Only one k-mer length (25-mer) was chosen in Trinity, using the follow parameters: path_reinforcement_distance=70, group_pairs_distance=250, min_ kmer_cov=2, min_glue=2 and other default parameters. Using Trinity, we combined reads with certain overlap length to form longer fragments called contigs. Reads were mapped back to contigs. With paired-end reads, it was possible to detect contigs from the same transcript and the distances between the contigs. Finally, Trinity connected the contigs to obtain sequences that cannot be extended on either end. Such sequences are defined as unigenes. Distinct sequences were subjected to BLAST searches and annotation against the NCBI nr database using an E-value cut-off of 10–5. The procedures for GO analysis and KEGG enrichment were the same as for the protein analysis. The data has been submitted to NCBI, and the accession number is SRP042246.

Differential gene expression analyses Statistical analysis of the frequency of each raw read in the various cDNA libraries was performed to This article is protected by copyright. All rights reserved

compare gene expression among the different 6-BAP treatments (2.5 μM vs. 5 μM, 3.6 μM vs. 2.5 μM, 3.6 μM vs. 5 μM). Statistical comparisons were performed as described by Audic and Claverie (1997). P values were used to identify genes expressed differentially between each pair samples following the formula below: i= y

i= y

i= y

i =0

i =0

i =0

2∑ p (i | x )(if ∑ p (i | x ) ≤ 0.5) or 2 × (1 − ∑ p (i | x )〉 0.5)

where N1 and N2 represented the total numbers of unique-match reads in sample 1 and sample 2, respectively, and gene A contained x and y unique-match reads mapped to sample 1 and sample 2, respectively. The false discovery rate (FDR) was used to determine the threshold P value in multiple tests and analyses (Benjamini and Hochberg 1995). We employed an FDR < 0.001 as the threshold to judge the significance of the differences in gene expression. Briefly, assuming that R differentially expressed genes had been selected, S genes of those were genuinely differentially expressed, whereas the other V genes were false positives. If the error ratio (Q = V/R) was required to remain below a cutoff (e.g. 0.001), the FDR ≤ 0.001 was used, which was calculated according to the Benjamini and Hochberg algorithm: FDR=E(Q)=E{V/= V+S} =E(V/R). The procedures for the GO and pathway enrichment analyses of all differentially expressed genes were the same as in the protein section.

Quantitative real-time PCR (qRT-PCR) validation Total RNA was extracted as described in the RNA isolation and library preparation procedures for transcriptome analysis. The ABI Prism 7500 Detection System (Applied Biosystems, Foster City, CA, USA) was employed, with SYBR Green as the fluorescent dye, according to the manufacturer’s protocol (Takara). First-strand cDNA was synthesized from 2 μg of total RNA as described above and used as a template for real-time PCR with specific primers (Additional file 1, Table S1). Real-time PCR was performed in a total volume of 20 μL, and the cycling conditions were as follows: 95°C for 2 min, followed by 40 cycles of 94°C for 10 s, 59°C for 10 s, and 72°C for 40 s. All reactions were performed as biological triplicates, and the results were expressed relative to the expression levels of WS0109_C05 (At1g29260, peroxisomal targeting signal receptor) in each sample using the 2–ΔΔCT method (Livak and Schmittgen 2001). Each sample was first normalized for the amount of template added based on comparison with the abundance of WS0109_C05 mRNA (Yu et al. 2010).

This article is protected by copyright. All rights reserved

Results Development of Picea balfouriana somatic embryos We found that different 6-BAP concentrations significantly influenced the embryogenic ability of the established cultures. The moderate level of 6-BAP (3.6 μM) resulted in the generation of the greatest number of somatic embryos, 113±6 per 100 mg of tissue (Fig. 2), which exhibited a higher germination rate (48.47%±0.03). In contrast, medium containing 5 μM 6-BAP induced the lowest number of somatic embryos (23±2 per 100 mg tissue), with the lowest germination rate (9.42%±0.02).

Contents of IAA and ZR in tissues treated by three levels of 6-BAP The contents of IAA in tissues treated by 2.5 μM and 3.6 μM 6-BAP were significantly higher than it in tissue treated by 5 μM 6-BAP (Fig. 3). The content of ZR in the 3.6 μM 6-BAP treatment was significantly lower than the other two treatments. Although there were no significant difference in the contents of ZR between the latter two treatments, the content of ZR in tissue treated by 2.5 μM 6-BAP were lower than the other.

Overview of the RNA-Seq analysis To better understand the molecular mechanisms underlying the effect of 6-BAP on somatic embryogenesis, we constructed three RNA-Seq libraries, in which 33 156, 32 047 and 42 114 high-quality unigenes were obtained using Trinity software, with maximum unigene lengths of 11 681 bp, 13 684 bp and 11 687 bp, respectively. The length distribution for all unigenes are presented in Fig. S1, which showed that there were differences among unigenes length of three treatments in 6-BAP. GO analysis of these genes was performed using Blast2GO (Conesa et al. 2005, Götz et al. 2008). Among the 51 375 unigenes, Blast2GO showed functional annotations for 14 290 genes. The GO functional annotation analysis is summarized in Table S2. Sequences with GO terms corresponding to the “molecular function” category fell into 11 subcategories, while those in the “cellular component” group fell into 4 subcategories, and those in the “biological process” category fell into 4 subcategories. The largest subcategories found in the “molecular function” group were “binding activity” and “enzyme activity”, which comprised 46.41% and 41.40% of the genes in this subcategory. In the “cellular component” and “biological process” categories, “cell” and “physiological processes” were the most abundant GO terms, constituting 98.62% and 88.87% of each subcategory, respectively. This article is protected by copyright. All rights reserved

A total of 29 758 unigenes from the Picea balfouriana transcriptome were mapped to 38 statistically remarkable categories in KEGG (P value < 0.05). The ribosome, chromosome, chaperone and folding catalyst, RNA degradation and ubiquitin system categories were identified as statistically significant (Table S3). Other major pathways, including glutathione metabolism and plant hormone signal transduction, were also statistically significant.

Global changes in gene expression under different levels of 6-BAP To characterize the response of embryogenic tissues to 6-BAP, three RNA-Seq libraries were constructed using mRNA from three types of embryogenic tissues in genotype 4 of Picea balfouriana treated with three levels of 6-BAP. A total of 190 (2.5 μM vs. 5 μM), 36 (3.6 μM vs. 2.5 μM) and 234 (3.6 μM vs. 5 μM) differentially expressed genes (P value < 0.05) were found, including 119, 28 and 170 genes that were up-expressed by at least 1.5-fold, while 71, 8 and 64 expressed genes were down-expressed by at least 1.5 fold. Enrichment analysis for GO terms indicated that the “physiological process”, “cell” and “binding activity” categories were active in the samples with a higher embryogenic ability (Table S2). Many of the genes expressed at higher levels in samples with a higher embryogenic ability were related to positive regulation of cell proliferation, cell part, intracellular and RNA binding transcription factor activity.

Functional annotation of differentially expressed unigenes Among the genes showing KEGG pathway annotation, 125 expressed genes were identified in the three libraries (Table 1). Notably, specific enrichment of genes was observed for pathways related to ribosomes, glutathione metabolism and plant hormone signal transduction. Glutathione metabolism, glutathione S-transferase (GST), glutathione reductase and glutathione peroxidase were significantly up-expressed in tissues treated with 2.5 μM 6-BAP and 3.6 μM 6-BAP compared with 5 μM 6-BAP. Interestingly, more than thirty putative NBS-LRR proteins were up-expressed. Moreover, two putative wuschel homeobox proteins (WOX8/9) were up-expressed.

Overview of quantitative proteomics analysis A total of 640 361 spectra were obtained from the iTRAQ – LC-MS/MS proteomic analysis. Peptides without labeling were excluded from the datasets. After data filtering to eliminate low-scoring spectra, This article is protected by copyright. All rights reserved

a total of 35 502 unique spectra were matched to 3243 proteomes. In terms of the protein mass distribution, good coverage (averages of 30% of the total proteins) was obtained for a wide range of molecular weights for proteins larger than 100 kDa (Fig. 4). In addition, most of the proteins were identified with good peptide coverage, among which 30% of the proteins showed more than 10% sequence coverage (Fig. 5).

Functional classification of the proteins Based on the number of unique proteins identified in each functional category, the most frequently detected functional categories in the molecular function group were binding (44.63%) and catalytic activity (41.26%), while “cellular processes” and “metabolic processes” accounted for 17.98% and 17.58%, respectively, of the proteins identified in biological process categories. High identification coverage of the “cell” (20.21%) and “cell part” (20.21%) groups was observed in the cellular component categories (Fig. 6). Most represented unigenes active in RNA binding and extracellular space, associated with enzyme regulator activity and/or participated in growth and cellular defense response. Furthermore, most of the identified proteins (Table S3) were involved in metabolic pathways (ko01100, 30.05%), biosynthesis of secondary metabolites (ko01110, 17.07%), protein processing in the endoplasmic reticulum (ko04141, 4.69%) and the ribosomes (ko03010, 4.69%). To demonstrate analytical reproducibility, we labeled and mixed two biological replicates each from the samples treated with 2.5 μM, 3.6 μM and 5 μM 6-BAP directly for proteomic analysis, and the differences were plotted versus the percentages of proteins identified. The results showed that approximately 40%, 20% and 20% of the proteins showed differences of less than a delta error of 0.1, and more than 95%, 71% and 68% of the proteins presented differences of less than a delta error of 0.5 for each of the three tested concentrations of 6-BAP, respectively.

Global changes in protein expression under different levels of 6-BAP Using a cutoff of a 1.5-fold change and a p-value less than 0.05, we determined that 260, 277 and 93 unique proteins were expressed between the 2.5 μM vs. 5 μM, 3.6 μM vs. 2.5 μM and 3.6 μM vs. 5 μM treatments, respectively (Fig. 7), among which slightly more proteins were up-expressed than were down-expressed. It is worth noting that several key GO terms, such as “cellular process” and “metabolic process” (biological process), “cell” and “cytoplasm” (cellular component) and “catalytic This article is protected by copyright. All rights reserved

activity” and “binding” (molecular function), were affected by 6-BAP treatment (Table S4). As expected, we identified several proteins involved in cytokinin binding and cytokinin biosynthetic process, intracellular auxin transport, a number of proteins associated with cytoplasmic part and ribosome, some others had histone deacetylase regulator activity.

Pathway enrichment analysis of the 6-BAP-responsive proteins A series of KEGG pathways affected by the 6-BAP treatments was identified using a p-value of less than 0.05 as a cut-off (Table 2). The results showed that most of these expressed proteins among the three treatments were involved in oxidative phosphorylation (ko00190), the ribosomes (ko03010), ascorbate and aldarate metabolism (ko00053), phenylpropanoid biosynthesis (ko00940) and glutathione metabolism (ko00480), which appeared in at least two comparisons. Many of the proteins that were significantly expressed by the treatments were ribosomal proteins (L10e and S27e) as well as other proteins involved in translation pathways (K02866 and K02973) that were up-expressed. Notably, there were other ribosomal proteins that were down-expressed (S10e and L32e). In addition, both of monodehydroascorbate reductase (K08232), probable mannitol dehydrogenase (K00681) and gamma-glutamyl transferase (K00083) were up-expressed in tissues with higher embryogenic ability, which involved in ascorbate metabolism, phenylpropanoid biosynthesis and glutathione metabolism.

Association analysis of the proteome and transcriptome We further found that there were 2235 proteins (85.41%) that corresponded to unigenes. Different factors may contribute to explaining why approximately 15% of the proteins did not overlap with unigenes. First, all of the unigenes were assembled using Trinity software, and some of them might therefore not actually exist and should be validated via qRT-PCR. Second, a change in unigene abundance may or may not be translated into changes in protein levels. Posttranscriptional regulatory processes such as RNA processing, alternative splicing and other processes may affect the efficiency of translation. Third, several proteins with a high abundance were differentially expressed by 6-BAP treatment, but these proteins show important housekeeping functions, and their transcript levels do not meet the 1.5-fold change criterion applied in transcriptome studies. For example, there were four proteins involved in peroxisome function that were up-expressed, but their transcripts were not. Finally, proteins with a low abundance might not be reliably detected and therefore did not meet the This article is protected by copyright. All rights reserved

criteria defining them as differentially accumulated, while the unigene levels of the corresponding genes were significantly altered. A generally low overlap of proteomic and transcriptional profiles has been reported previously (Gallardo et al. 2007, Cosette et al. 2012). It was notably to note that, some proteins and their transcripts, such as calcium-dependent protein kinase (CDPK) that has protein kinase activity and involved in phosphorylation, ribosomal protein that involves in translation and influence gene expression, germin-like protein (GLP) is a metal-binding protein with oxalate oxidase and has superoxide dismutases activity, were significantly up-expressed (P

Transcriptomic and proteomic analyses of embryogenic tissues in Picea balfouriana treated with 6-benzylaminopurine.

The cytokinin 6-benzylaminopurine (6-BAP) influences the embryogenic capacity of the tissues of Picea balfouriana during long subculture (after 3 mont...
1MB Sizes 1 Downloads 2 Views