Gene 540 (2014) 96–103


Transcriptome and gene expression analysis during flower blooming in Rosa chinensis ‘Pallida’

Huijun Yan a,b, Hao Zhang a,b,1, Min Chen a,b, Hongying Jian a,b, Sylvie Baudino c, Jean-Claude Caissard c, Mohammed Bendahmane d, Shubin Li a,b, Ting Zhang a,b, Ningning Zhou a,b, Xianqin Qiu a,b, Qigang Wang a,b, Kaixue Tang a,b,⁎,1

a Flower Research Institute of Yunnan Academy of Agricultural Sciences, Kunming, Yunnan 650205, PR China
b Yunnan Flower Breeding Key Lab., Kunming Yunnan 650205, PR China
c Laboratoire BVpam, EA3061, Université de Saint-Etienne, Université de Lyon, 23 rue du Dr Michelon, Saint-Etienne F-42023, France
d Laboratoire Reproduction et Développement des Plantes, UMR INRA-CNRS, Lyon1-ENSL, Ecole Normale Supérieure, 46 allée d'Italie, 69364 Lyon Cedex 07, France

Article info

Article history: Accepted 8 February 2014; Available online 12 February 2014

Keywords: Rosa chinensis ‘Pallida’; Transcriptome; Digital gene expression; Scent-related genes

Abstract

Rosa chinensis ‘Pallida’ (Rosa L.) is one of the most important ancient rose cultivars originating from China. It contributed the ‘tea scent’ trait to modern roses. However, little information is available on the gene regulatory networks involved in scent biosynthesis and metabolism in Rosa. In this study, the transcriptome of R. chinensis ‘Pallida’ petals at different developmental stages, from flower buds to senescent flowers, was investigated using Illumina sequencing technology. De novo assembly generated 89,614 clusters with an average length of 428 bp. Based on sequence similarity search with known proteins, 62.9% of total clusters were annotated. Out of these annotated transcripts, 25,705 and 37,159 sequences were assigned to gene ontology and clusters of orthologous groups, respectively. The dataset provides information on transcripts putatively associated with known scent metabolic pathways. Digital gene expression (DGE) was obtained using RNA samples from flower bud, open flower and senescent flower stages. Comparative DGE and quantitative real time PCR permitted the identification of five transcripts encoding proteins putatively associated with scent biosynthesis in roses. The study provides a foundation for scent-related gene discovery in roses.

© 2014 Elsevier B.V. All rights reserved.

Abbreviations: GO, gene ontology; COG, clusters of orthologous groups; KEGG, Kyoto Encyclopedia of Genes and Genomes; NR, non-redundant databases; qRT-PCR, quantitative real-time polymerase chain reaction; OOMT, orcinol O-methyltransferases; EGS, eugenol synthase; LIS, linalool synthase; AAT, alcohol acetyltransferase; TF, transcription factor.
⁎ Corresponding author at: Flower Research Institute of Yunnan Academy of Agricultural Sciences, Kunming, Yunnan 650205, PR China. E-mail address: [email protected] (K. Tang).
1 Equal contribution to this work.
0378-1119/$ – see front matter © 2014 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2014.02.008

1. Introduction

Roses are cultivated worldwide and are mainly used as cut flowers, as garden ornamentals and in the perfume industry (Zhang and Zhu, 2006). Ancient Chinese cultivars, including Rosa chinensis ‘Pallida’, R. chinensis ‘Semperflorens’, Rosa odorata ‘Park's yellow tea-scented China’, and R. odorata ‘Hume Blush tea-scented’, were hybridized with European roses to breed modern rose cultivars (Chen, 2001). R. chinensis ‘Pallida’ is closely related to R. chinensis ‘Old Blush’. These two cultivars belong to the ‘China roses’, a group of natural and cultivated hybrids that have evolved over more than a thousand years in Chinese gardens under different ecological and environmental conditions. Among ornamental plants, the rose is of interest as a model species because of traits such as scent production, recurrent blooming and double flower characters (Bendahmane et al., 2013).

Several groups have recently initiated molecular approaches aimed at providing new genetic and transcriptomic tools for rose. In the past few years, efforts have been made to identify genes expressed in Rosa sp. (Bendahmane et al., 2013; Channelière et al., 2002; Dubois et al., 2011, 2012; Foucher et al., 2008; Guterman et al., 2002; Kim et al., 2012; Pei et al., 2013). Very recently, the transcriptome of most organs of R. chinensis ‘Old Blush’ was investigated using a combination of 454 and Illumina sequencing technologies (Dubois et al., 2012). That study yielded a valuable sequence dataset representing about 20,000 expressed Rosa sp. genes, as well as digital gene expression (DGE) data for the identified rose transcripts (Dubois et al., 2012). However, this is still far from covering all expressed Rosa sp. genes, estimated at about 30,000. For example, some known genes associated with scent production and emission are missing from this dataset.

In the past few years, there has been an increasing demand from consumers worldwide for scented rose flowers (Bergougnoux et al., 2007). However, modern rose breeding has mainly focused on cold tolerance, disease resistance, flower form and recurrent blooming (Yan et al., 2011). Fragrance seems to have been largely lost during the breeding process (Channelière et al., 2002). At present, only limited transcriptomic and genomic data are available on scent-related gene pathways in Rosa sp.

H. Yan et al. / Gene 540 (2014) 96–103

With the objective of identifying genes associated with rose scent, RNAseq was used to compare the transcriptome at three flower developmental stages of R. chinensis ‘Pallida’, namely the flower bud, open flower and senescing flower stages. The EST datasets and DGE data will serve as a valuable resource for the discovery of candidate genes associated with scent in Rosa sp.

2. Results and discussion

2.1. Illumina sequencing and de novo assembly

mRNAs were purified from three rose flower developmental stages exhibiting contrasting scent emission: open flowers, at which the peak of scent emission is observed, and floral buds and senescing flowers, which emit much less scent (Fig. 1). A cDNA library was generated from an equal mixture of RNA isolated from these three stages and then used for Illumina sequencing. Using 90 bp paired-end Illumina sequencing, 66,523,228 reads were obtained and then assembled using Trinity Software (Grabherr et al., 2011). The longest assembled sequences containing blocks of unknown bases (Ns) were called contigs (Li et al., 2010). A total of 155,708 contigs ranging in length from 75 to 5236 bp were assembled, with an average length of 380 bp (Table 1). The RNAseq data in this study have been deposited in the Gene Expression Omnibus (GEO) database (GSE54486).

The de novo assembly yielded 89,641 unisequences with a total length of 38,355,533 bp and an average length of 428 bp (Table 1), similar to the average length previously published for R. chinensis ‘Old Blush’ (444 bp) (Dubois et al., 2012). However, the assembly in this study extended the range of sequence lengths, which ran from 200 to 7326 bp. Sequence length ranged from 200 to 500 nucleotides for 64,907 unisequences (72.4%), from 501 to 1000 nucleotides for 19,711 unisequences (22.0%) and exceeded 1000 nucleotides for 5023 unisequences (5.6%) (Fig. S1).

Further, the assembled sequences were compared with the 80,714 transcript clusters of R. chinensis ‘Old Blush’ using custom PERL scripts, and plots were drawn in R with ggplot2 (Ito and Murphy, 2013). The percentage of sequences longer than 760 bp was slightly higher than in the previous sequence assembly based on 454 sequencing (Dubois et al., 2012). This is likely due to the 90 bp paired-end Illumina sequencing approach used here and to the depth of sequencing (Fig. 2). Therefore, the data presented here complement the previously published work and provide novel information on genes expressed in Rosa sp.
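The length classes reported in Section 2.1 can be recomputed from any list of sequence lengths. The following is a minimal Python sketch and not the authors' custom scripts; the function names `bin_lengths` and `percentages` are illustrative assumptions, and the only data taken from the text are the three reported class counts.

```python
def bin_lengths(lengths):
    """Count sequences in the three length classes used in the text."""
    counts = {"200-500": 0, "501-1000": 0, ">1000": 0}
    for n in lengths:
        if n <= 500:
            counts["200-500"] += 1
        elif n <= 1000:
            counts["501-1000"] += 1
        else:
            counts[">1000"] += 1
    return counts


def percentages(counts):
    """Return the total count and each class as a percentage (one decimal)."""
    total = sum(counts.values())
    if total == 0:
        return total, {label: 0.0 for label in counts}
    return total, {label: round(100.0 * c / total, 1) for label, c in counts.items()}


# Class counts as reported in the text (Fig. S1).
reported = {"200-500": 64907, "501-1000": 19711, ">1000": 5023}
total, pct = percentages(reported)
print(total, pct)
```

With real data, `bin_lengths` would be fed the lengths parsed from the assembled FASTA file, and its result passed to `percentages` in the same way.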


Table 1 Summary of data generated for R. chinensis ‘Pallida’ transcriptome.

Total number of reads | 66,523,228
Total clean nucleotides (bp) | 4,822,934,030
Number of contigs | 155,708
Average length of all contigs (bp) | 380
Range of contig length (bp) | 75–5236
Number of unisequences | 89,614
Length of all unigenes (bp) | 38,355,533
Range of unigene length (bp) | 200–7326
Average unigene length (bp) | 428

2.2. Annotation of predicted proteins

Sequence similarity searches were conducted against the non-redundant database (Nr), UniProtKB/Swiss-Prot (SwissProt), Gene Ontology (GO), Clusters of Orthologous Groups (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG). We found a total of 55,442 (61.9% of all clusters) unisequences providing significant BLAST hits, while 38,256 unisequences had similarity to proteins in the Swiss-Prot database. Altogether, 56,378 unisequences were successfully annotated in the Nr or Swiss-Prot databases.

GO assignments were used to classify the predicted functions of R. chinensis ‘Pallida’ genes. Based on sequence homology, 25,705 unisequences were categorized into 43 functional groups (Fig. 3). In each of the three main categories (biological process, cellular component, and molecular function) of the GO classification, the metabolic process, various cellular activities and catalytic activity terms were dominant, respectively. A high percentage of genes were assigned to the categories of cellular process, cell components and DNA binding (Fig. 3). Only a few genes were assigned to other categories such as antioxidant activity (Fig. 3). Flavones, anthocyanin, coumarin lignans and catechins contribute the majority of antioxidant activity and are usually present at high levels in medicinal plants (Škrovánková et al., 2012).

In addition, all clusters were subjected to a search against the Cluster of Orthologous Groups (COG) database for functional prediction and classification. In total, 37,159 of 56,378 sequences showing Nr hits were assigned to COG classifications (Fig. S2). Among the 25 COG categories, ‘General function prediction’ represented the largest group (5219; 14.0%), followed by ‘Transcription’ (3728; 10.0%) and ‘Replication, recombination and repair’ (2636; 7.1%). Extracellular structures (4; 0.01%) and nuclear structures (7; 0.02%) were the smallest groups (Fig. S2).

In order to identify the active biological pathways in R. chinensis ‘Pallida’ flower buds, blooming flowers and senescing flowers, 22,605 annotated transcripts were mapped to 121 KEGG pathways (Table S1). Transcripts representing metabolic pathways (5812 members), biosynthesis of secondary metabolites (2931 members), plant–pathogen interaction (1512 members), starch and sucrose metabolism (622 members) and phenylpropanoid biosynthesis (450 members) were represented in our dataset. These annotations provide a valuable resource for studying specific processes and pathways in Rosa research.

2.3. R. chinensis ‘Pallida’ dataset as a resource for scent-related gene identification

In the annotated R. chinensis ‘Pallida’ transcriptome dataset, we identified multiple transcripts encoding almost all known enzymes mainly involved in the phenylpropanoid biosynthesis pathway (Table 2), as well as orcinol O-methyltransferases (OOMTs) involved in the biosynthesis of phenolic methyl ethers (PMEs) (Lavid et al., 2002; Scalliet et al., 2006). Orcinol O-methyltransferases OOMT1 and OOMT2 catalyze the two final methylation reactions of 3,5-dimethoxytoluene (DMT) biosynthesis; DMT is now known to be a major scent compound in R. chinensis. OOMT genes are present in more than one copy in this dataset. It has been shown that OOMT genes underwent duplication events during the course of Rosa sp. evolution (Scalliet et al., 2008); thus our findings agree with previously reported data.

Volatile compounds such as terpenes and ester derivatives such as geranyl acetate are important contributors to the aroma of roses and many other species (Cherri-Martin et al., 2007; Guterman et al., 2002). Transcripts corresponding to genes coding for a putative S-linalool synthase (LIS), an enzyme that uses geranyl pyrophosphate as a substrate and catalyzes the formation of linalool (Cseke et al., 1998); germacrene D synthase, which uses farnesyl pyrophosphate as a substrate and catalyzes the formation of germacrene; and alcohol acetyltransferase (AAT), which catalyzes the formation of geranyl acetate (Guterman et al., 2002), were identified in our dataset (Table 2). The results suggest that the transcriptome of R. chinensis ‘Pallida’ is a reliable source of candidate genes associated with floral scent.

Eugenol and isoeugenol belong to the phenylpropene compounds, which are derived from phenylalanine. Previously, we reported the cloning of the rose eugenol synthase gene (RcEGS1) from R. chinensis ‘Pallida’ (Wang et al., 2012). Eugenol and isoeugenol synthases use coniferyl acetate and NADPH as substrates to catalyze the formation of eugenol and isoeugenol. In our transcriptome, we identified different transcripts that share sequence homology with genes associated with eugenol and isoeugenol metabolism (Table 2). One of these transcripts (corresponding to cluster 6248) has a complete reading frame and shares 81.6% homology with RcEGS1 (Wang et al., 2012). Finding several different EGS transcripts is not surprising. In Ocimum basilicum and Petunia hybrida, multiple eugenol and isoeugenol synthase genes with different catalytic efficiencies have been reported (Koeduka et al., 2006, 2008). It will be interesting to determine whether the different RcEGS transcripts identified in our transcriptome encode enzymes with different activities.

Fig. 1. Different developmental stages of R. chinensis ‘Pallida’. A. Floral bud. B. Open flower. C. Senescent flower.

Fig. 2. Length of assembled sequences in R. chinensis ‘Pallida’ and R. chinensis cv. Old Blush (Dubois et al., 2012). The x axis represents the length of assembled sequences on a log 2 scale. The y axis represents the percentage of assembled sequences. ‘Old Blush’ and ‘Pallida’ are indicated with different colors.

Table 2 Unisequences putatively involved in metabolite biosynthesis in R. chinensis ‘Pallida’.

Query | Description | Score | E-value | Length (bp)
Uniseq10083 | Acetyl-CoA acetyltransferase | 79 | 2.00E−14 | 334
Uniseq10530 | S-adenosylmethionine transporter | 516 | 1.00E−145 | 1351
Uniseq10623 | NADPH: cytochrome P450 reductase | 139 | 1.00E−32 | 270
Uniseq10897 | S-linalool synthase | 164 | 9.00E−40 | 806
Uniseq11168 | Caffeic acid O-methyltransferase | 150 | 6.00E−36 | 259
Uniseq11910 | Phenylalanine ammonia-lyase | 137 | 4.00E−32 | 231
Uniseq18337 | Shikimate kinase | 94.4 | 9.00E−19 | 601
Uniseq1867 | Alcohol acyltransferase | 136 | 6.00E−32 | 432
Uniseq18860 | NADPH-dependent thioredoxin reductase | 123 | 8.00E−28 | 198
Uniseq19465 | Geranyl transferase type I | 108 | 2.00E−23 | 255
Uniseq11692 | Prenyltransferase | 183 | 5.00E−89 | 1920
Uniseq110 | Pyruvate kinase | 107 | 2.00E−22 | 982
Uniseq43094 | Eugenol O-methyltransferase | 128 | 2.00E−29 | 227
Uniseq6248 | Eugenol synthase 1 | 346 | 2.00E−94 | 862
Uniseq55025 | Eugenol synthase 1 | 80.1 | 7.00E−15 | 165
Uniseq297 | Orcinol O-methyltransferase | 117 | 4.00E−26 | 291
Uniseq610 | Germacrene D synthase | 111 | 8.00E−24 | 666
Uniseq618 | Geranyl transferase component A | 110 | 7.00E−24 | 493
Uniseq1162 | Cinnamoyl-CoA reductase | 338 | 5.00E−92 | 845
Uniseq1617 | Alcohol acyl-transferase | 109 | 1.00E−23 | 385
Uniseq2709 | Cytochrome P450 enzyme | 110 | 1.00E−23 | 532
Uniseq2750 | Allyl alcohol dehydrogenase | 176 | 1.00E−43 | 556
Uniseq2864 | Citrate synthase | 243 | 2.00E−63 | 600
Uniseq3625 | Geranylgeranyl reductase | 175 | 4.00E−43 | 611
Uniseq4148 | Caffeic acid O-methyltransferase | 140 | 6.00E−33 | 339
Uniseq6243 | Cinnamyl alcohol dehydrogenase | 51.2 | 4.00E−06 | 382
Uniseq6564 | Chalcone protein | 225 | 3.00E−58 | 679
Uniseq6658 | O-acetyltransferase-related | 216 | 5.00E−56 | 338
Uniseq6721 | Flavone synthase II | 120 | 4.00E−27 | 314
Uniseq12783 | Amyrin synthase | 81.6 | 2.00E−15 | 227

Fig. 3. Gene ontology classification of the assembled unisequences. The results are summarized in three main categories: biological process, cellular component and molecular function. In total, 25,705 unisequences with BLAST matches to known proteins were assigned to gene ontology groups.

2.4. DGE library sequencing and analysis

RNA samples from flower buds, open flowers and senescent flowers of R. chinensis ‘Pallida’ were used to construct three DGE libraries. The DGE sequencing quality and alignment statistics are shown in Table 3.
Table 3 DGE sequencing quality evaluation and alignment statistics.

Summary | Measure | Bud stage | Open stage | Senescence stage
Raw data | Total | 5,879,688 | 5,861,000 | 6,100,439
Raw data | Distinct tag | 493,916 | 773,488 | 651,247
Clean tag | Total number | 5,638,092 | 5,645,837 | 5,831,749
Clean tag | Total % of raw data | 95.89% | 96.33% | 95.60%
Clean tag | Distinct tag number | 252,849 | 558,535 | 383,084
All tag mapping to gene | Total number | 4,385,372 | 2,269,864 | 4,084,867
All tag mapping to gene | Total % of clean tag | 77.78% | 40.20% | 70.05%
All tag mapping to gene | Distinct tag number | 87,297 | 93,656 | 94,268
All tag mapping to gene | Distinct tag % of clean tag | 34.53% | 16.77% | 24.61%
Unambiguous tag mapping to gene | Total number | 3,159,758 | 1,816,116 | 3,066,758
Unambiguous tag mapping to gene | Total % of clean tag | 56.04% | 32.17% | 52.59%
Unambiguous tag mapping to gene | Distinct tag number | 75,429 | 82,607 | 82,295
Unambiguous tag mapping to gene | Distinct tag % of clean tag | 29.83% | 14.79% | 21.48%
All tag-mapped genes | Number | 34,594 | 40,443 | 37,413
All tag-mapped genes | % of ref genes | 38.12% | 44.56% | 41.22%
Unambiguous tag-mapped genes | Number | 28,432 | 33,899 | 31,056
Unambiguous tag-mapped genes | % of ref genes | 31.33% | 37.35% | 34.22%
Unknown tag | Total number | 1,252,720 | 3,375,973 | 1,746,882
Unknown tag | Total % of clean tag | 22.22% | 59.80% | 29.95%
Unknown tag | Distinct tag number | 165,552 | 464,879 | 288,816
Unknown tag | Distinct tag % of clean tag | 65.47% | 83.23% | 75.39%

Fig. 4. Differentially expressed genes (DEGs) between the different flower developmental stages of R. chinensis ‘Pallida’. This figure shows the number of upregulated (red) and downregulated (green) genes in each pairwise comparison of flower bud, open flower and senescent flower stages.

At least 95.6% of the raw tags in each library were clean tags. After removing the low-quality tags, the total number of clean tags in the bud, open flower and senescent flower libraries was 5.64, 5.65 and 5.83 million, respectively. The number of unambiguous clean tags for each gene was calculated and normalized to tags per million (TPM). We compared pairs of DGE profiles of the three libraries (bud versus open flower, bud versus senescent flower, and senescent versus open flower). Thousands of differentially expressed genes (DEGs) were identified, demonstrating substantial changes across the three developmental stages. A total of 33,099 transcripts exhibited a significant expression change among the three libraries.

Between the bud and open flower libraries, a total of 7145 differentially expressed transcripts were detected, with 2578 transcripts up-regulated and 4567 transcripts down-regulated (Fig. 4). In addition, 361 transcripts showed specific expression in bud and 661 transcripts showed specific expression at the open stage. Between the bud and senescence libraries, a total of 3616 differentially expressed transcripts were found, with 1444 up-regulated and 2172 down-regulated transcripts (Fig. 4). Here, 361 transcripts showed specific expression in the bud sample and 348 transcripts showed specific expression during flower senescence. In the open flower stage compared with the senescent flower stage, 3711 DEGs were up-regulated and 1751 DEGs were down-regulated (Fig. 4); 799 transcripts showed specific expression in open flowers and 517 transcripts showed specific expression at the senescent flower stage. Thus, of the three pairwise comparisons, the number of differentially expressed transcripts is largest between the bud and open flower stages and smallest between the bud and senescent flower stages.

DEGs were mapped to KEGG databases and compared with our transcriptome data containing 89,641 unisequences. Genes involved in metabolic pathways, biosynthesis of secondary metabolites and plant hormone signal transduction pathways were significantly enriched at the open stage. Notably, specific enrichment of genes was observed for pathways involved in phenylpropanoid and terpenoid backbone biosynthesis at the open stage. This suggests that the metabolic rate at the open stage is probably higher than that at the bud or senescent stages.

We selected genes that have previously been reported to be putatively or likely associated with scent production, and genes that have been shown to exhibit maximum expression levels at the open flower developmental stage, when scent emission is at its peak (Channelière et al., 2002) (Table 4). Similar expression patterns of the selected genes were identified in our DGE analyses. For example, unisequence58561, unisequence45098 and unisequence6248, sharing homologies with Acetyl-coA carboxylase, NADH dehydrogenase, and isoeugenol synthase, respectively, showed higher expression levels in open flowers than in buds and senescent flowers. Unisequence23563, unisequence24456 and unisequence79900 showed specific expression at the open flower stage. These share sequence homologies with Naphthoate synthase, Rhythmically-expressed protein and Geranyl pyrophosphate synthase, respectively (Table 4).

To validate DGE profiling, RcOOMT (Unigene11168), RcEGS (Unigene6248), RcAAT (Unigene79900), RcP450 (Unigene35909, cytochrome P450), and RcR2R3 MYB (Unigene28110, MYB transcription factor) were selected and their expression was examined using quantitative PCR (qPCR). The results were consistent with those obtained by DGE analyses. Moreover, relative transcript levels of the five unisequences in open flower petals and leaves were further compared. RcOOMT and RcEGS were specifically expressed in petals, but not in leaves. RcAAT, RcP450 and the RcR2R3 MYB transcription factor were expressed at much higher levels in petals than in leaves (Fig. 5). Therefore, the temporal and spatial expression patterns of RcEGS, RcP450, and RcR2R3 MYB were similar to those of other rose scent-related genes (Koeduka et al., 2008; Lavid et al., 2002; Yan et al., 2011), suggesting that they might be involved in the biosynthesis of rose scent.
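The tags-per-million normalization used in Section 2.4 can be illustrated with a short Python sketch. It is a sketch under stated assumptions, not the authors' pipeline: the function names, the raw tag counts (8 and 571) and the small floor used to avoid taking the logarithm of zero are all illustrative. Only the clean-tag totals come from Table 3.

```python
import math

# Total clean tags per library, taken from Table 3.
CLEAN_TAGS = {"bud": 5_638_092, "open": 5_645_837, "senescent": 5_831_749}


def tags_per_million(tag_count, total_clean_tags):
    """Normalize a raw tag count to tags per million (TPM)."""
    return tag_count / total_clean_tags * 1_000_000


def log2_ratio(tpm_a, tpm_b, floor=0.01):
    """log2(tpm_b / tpm_a); zero values are replaced by `floor`.

    The floor value is an assumption made for this sketch only.
    """
    return math.log2(max(tpm_b, floor) / max(tpm_a, floor))


# Hypothetical raw counts for one transcript in two libraries.
bud_tpm = tags_per_million(8, CLEAN_TAGS["bud"])
open_tpm = tags_per_million(571, CLEAN_TAGS["open"])
print(round(bud_tpm, 2), round(open_tpm, 2), round(log2_ratio(bud_tpm, open_tpm), 2))
```

Because each library is scaled by its own clean-tag total, TPM values can be compared across libraries of slightly different depth.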

2.5. Identification of transcription factors

Transcription factors (TFs) are key regulators of transcription in biological processes (Riechmann et al., 2000). To identify putative transcription factors expressed in rose flowers, a BLASTx search against the Plant Transcription Factor Database (PlantTFDB) was performed for the 89,641 unisequences of R. chinensis ‘Pallida’. A total of 4533 unisequences matched entries in PlantTFDB, falling into 55 families and representing 5.1% of R. chinensis ‘Pallida’ transcripts (Fig. 6). The most abundant TF family was basic helix-loop-helix (bHLH), which represented 9.7% of the total TFs identified in rose, followed by MYB (8.1%), ERF (6.7%), NAC (6.7%), bZIP (5.1%) and WRKY (4.5%). As with the TFs found in ESTs from chickpea and sabaigrass (Garg et al., 2011; Zou et al., 2013), a high abundance of WRKY, bHLH, and MYB was also observed in R. chinensis ‘Pallida’.

In this study, 159 unisequences matched to MYB TFs. Recent studies have shown that MYB transcriptional activators are important regulators of the phenylpropanoid pathway (Wei et al., 2007). AmMYBROSEA from Antirrhinum majus belongs to the R2R3-MYB subfamily of proteins regulating anthocyanin biosynthesis (Mehrtens et al., 2005). ODORANT1 is involved in regulating genes involved in floral scent biosynthesis and down-regulation of ODORANT1 in transgenic P. × hybrida Mitchell plants strongly reduced volatile benzenoid levels (Verdonk et al., 2005). RhMYB1 cloned from R. × hybrida was specifically expressed in wild-type scent rose, but not in non-scented mutant rose, suggesting that RhMYB1 might be involved in the biosynthesis of rose scent (Yan et al., 2011). Many MYB TFs are present in our dataset, some of which are likely important for regulation of scent biosynthesis.

3. Conclusion

The combination of transcriptome and DGE analysis based on Illumina sequencing technology provided comprehensive information on gene expression during rose flower blooming. The analysis allowed the identification of almost all known genes of several major scent metabolic pathways, thus confirming the reliability of our dataset. Our dataset can also be used to identify other candidate genes putatively involved in rose scent as well as in rose flower development. The dataset will be helpful for gene discovery, functional studies, mapping and genomics in Rosa sp.

Table 4 Differentially expressed genes putatively related to secondary metabolism in R. chinensis ‘Pallida’. Expression levels are given in TPM.

Gene ID | Flower bud stage | Open flower stage | Senescent flower stage | Description
Uniseq58561 | 1.42 | 101.14 | 0 | Acetyl-coA carboxylase
Uniseq45098 | 0.53 | 19.48 | 1.89 | NADH dehydrogenase
Uniseq11168 | 0 | 0.53 | 0 | Caffeic acid O-methyltransferase
Uniseq23563 | 0 | 7.08 | 0 | Naphthoate synthase
Uniseq24456 | 0 | 6.38 | 0 | Rhythmically-expressed protein
Uniseq7064 | 0 | 5.67 | 0 | MYB22
Uniseq44601 | 0 | 3.72 | 0 | Gibberellin oxidase
Uniseq35909 | 0.35 | 3.54 | 0 | Cytochrome P450
Uniseq67901 | 0 | 3.01 | 0 | Ubiquitin–protein ligase
Uniseq83119 | 0 | 2.66 | 0 | Serine–threonine protein kinase
Uniseq6248 | 0.53 | 1.98 | 0 | Eugenol synthase
Uniseq28110 | 0 | 0.53 | 0 | R2R3 MYB
Uniseq79900 | 0 | 0.35 | 0 | Geranyl pyrophosphate synthase
Uniseq14068 | 0 | 0.53 | 0 | Flowering promoting factor
Uniseq49761 | 0 | 0.35 | 0 | Flowering locus D
Uniseq2367 | 0 | 4.07 | 0 | Methionine aminopeptidase
Uniseq72098 | 0 | 6.19 | 0.51 | Conserved hypothetical protein
Uniseq164 | 1.42 | 10.98 | 4.8 | Unknown protein
Uniseq5758 | 2.13 | 11.34 | 3.77 | Unknown protein
Uniseq3359 | 1.42 | 9.03 | 3.09 | Unknown protein


Fig. 5. Validation of unisequences involved in putative scented-related metabolic pathways in R. chinensis ‘Pallida’ transcriptome by qRT-PCR. A. Expression of five unisequences at flower bud, open flower and senescent flower stages. B. The expression of five unisequences in petals and leaves.

Fig. 6. Distribution of R. chinensis ‘Pallida’ transcripts in 50 transcription factor families.

4. Materials and methods

4.1. Sample collection and preparation

R. chinensis ‘Pallida’ was grown at the rose germplasm garden of the Flower Research Institute, Yunnan Academy of Agricultural Sciences, Kunming, China. Flowers at three different developmental stages (flower buds, open flowers and senescent flowers) and young leaves were collected (Fig. 1), frozen immediately in liquid nitrogen, and stored at −80 °C until use.

4.2. RNA extraction and library preparation for transcriptome analysis

Total RNA was isolated using the CTAB reagent method (Invitrogen) according to the manufacturer's instructions. Equal volumes of RNA from flower buds, open flowers and senescent flowers were pooled. RNA quality and quantity were verified using a NanoDrop 1000 spectrophotometer and an Agilent 2100 Bioanalyzer prior to further processing. Total RNA was treated with DNase I prior to library construction, and poly-(A) mRNA was purified with Magnetic Oligo (dT) Beads. Double-stranded cDNA was further subjected to end-repair using T4 DNA polymerase, the Klenow fragment, and T4 polynucleotide kinase, followed by a single A dNTP base addition using Klenow 3′ to 5′ exo-polymerase, and then ligated with an adapter or index adapter using T4 DNA ligase. Adaptor-ligated fragments were separated by size on a 1.0% agarose gel, and the desired range of cDNA fragments (200 ± 25 bp) was excised from the gel. PCR was performed to selectively enrich and amplify the cDNA fragments. After validation with an Agilent 2100 Bioanalyzer, the cDNA library was subjected to Solexa sequencing using an Illumina HiSeq2000 sequencing platform at the Beijing Genomics Institute (BGI, Shenzhen, China).

4.3. De novo assembly, assessment and annotation

Transcriptome de novo assembly was carried out with Trinity Software (Grabherr et al., 2011). In order to reduce redundancy and chimeras in the Trinity pipeline, we used CAP3 to merge and combine highly similar assembled sequences into unisequences (Huang and Madan, 1999). The assembled sequences were compared with the 80,714 transcript clusters of R. chinensis ‘Old Blush’ (Dubois et al., 2012) using custom PERL scripts, and plots were drawn in R with ggplot2 (Ito and Murphy, 2013). Sequences were annotated using a set of sequential BLAST searches designed to find the most descriptive annotation for each sequence (Altschul et al., 1997). The assembled unique transcripts were compared with sequences in Nr using the BLAST algorithm, the GI accessions of best hits were retrieved, and the GO accessions were mapped to GO terms according to molecular function, biological process, and cellular component ontologies (http://www.geneontology.org/). The remaining sequences that putatively encoded proteins were searched against the SwissProt protein database (http://www.expasy.ch/sprot), the KEGG pathway database (Kanehisa et al., 2008), and the COG database (http://www.ncbi.nlm.nih.gov/COG), applying a typical E-value threshold of less than 10^−5.
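The E-value filtering step described in Section 4.3 can be sketched in Python as follows. This is an illustration only: it assumes the BLAST results are available as tab-delimited text in the standard 12-column tabular layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E-value, bit score), which the paper does not state. The function name `best_hits` and all query and subject identifiers in the example are hypothetical.

```python
def best_hits(lines, max_evalue=1e-5):
    """Keep, for each query, the highest-scoring hit with E-value < max_evalue.

    `lines` is an iterable of tab-delimited BLAST tabular records
    (assumed 12-column layout; E-value in column 11, bit score in column 12).
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical records; identifiers and values are made up for illustration.
example = [
    "seqA\tsubject1\t85.0\t280\t42\t0\t1\t840\t5\t284\t2e-94\t346",
    "seqA\tsubject2\t60.0\t200\t80\t2\t1\t600\t9\t208\t7e-15\t80.1",
    "seqB\tsubject3\t35.0\t90\t58\t1\t4\t270\t30\t119\t1e-03\t38.5",
]
hits = best_hits(example)
annotated_fraction = len(hits) / 2  # two distinct queries in the example
print(hits, annotated_fraction)
```

Dividing the number of queries with a retained hit by the total number of unisequences gives the kind of annotation percentage reported in Section 2.2.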

4.4. DGE library preparation, sequencing and screening of differentially expressed genes (DEGs)

Total RNA was extracted from flower buds, open flowers, and senescent flowers using Column Plant RNAout2.0. mRNA was enriched using oligo(dT) magnetic beads. After addition of the fragmentation buffer, the mRNA was broken into short fragments (about 200 bp). First-strand cDNA was then synthesized with random hexamer primers using the mRNA fragments as templates. The double-stranded cDNA was purified with a QiaQuick PCR extraction kit and sequencing adaptors were ligated to the fragments. The required fragments were purified by agarose gel electrophoresis and enriched by PCR amplification. The library products were then sequenced on an Illumina HiSeq™ 2000.

To map the DGE tags, the sequenced raw data were filtered to remove low-quality tags (tags with an unknown nucleotide “N”), empty tags (no tag sequence between the adaptors) and tags with a copy number of one. For the annotation of tags, clean tags containing CATG and 21-bp tag sequences were mapped to our transcriptome reference database using SOAPaligner/soap2 (Li et al., 2009). To compare gene expression at the different developmental stages, the tag frequency in the different DGE libraries was statistically analyzed according to the method described by Audic and Claverie (1997). The false discovery rate (FDR) was used to determine the threshold P-value in multiple tests. We used FDR < 0.001 and an absolute value of the log2 ratio > 1 as the threshold for a significant difference in gene expression. The DEGs were used for KEGG enrichment analyses according to a method similar to that described by Xue et al. (2010). KEGG pathways with a Q-value ≤ 0.05 were considered significantly enriched in DEGs.
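The screening step in Section 4.4 can be sketched in Python. The probability function below follows the formula commonly attributed to Audic and Claverie (1997), p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)], where x and y are the tag counts of one gene in two libraries of sizes N1 and N2. This is a hedged illustration rather than the authors' implementation: the function names are assumptions, and how the P-values and FDR were actually computed from this distribution is not described in the text.

```python
import math


def audic_claverie_prob(x, y, n1, n2):
    """Probability of observing y tags in library 2 given x tags in library 1.

    n1 and n2 are the total clean-tag counts of the two libraries.
    Computed in log space with lgamma to avoid overflow for large counts.
    """
    r = n2 / n1
    log_p = (
        y * math.log(r)
        + math.lgamma(x + y + 1)
        - math.lgamma(x + 1)
        - math.lgamma(y + 1)
        - (x + y + 1) * math.log(1 + r)
    )
    return math.exp(log_p)


def is_deg(fdr, log2_ratio, max_fdr=0.001, min_abs_log2=1.0):
    """Apply the thresholds stated in the text: FDR < 0.001 and |log2 ratio| > 1."""
    return fdr < max_fdr and abs(log2_ratio) > min_abs_log2


# With equal library sizes and x = 0, p(0 | 0) is 0.5.
print(audic_claverie_prob(0, 0, 100, 100))
print(is_deg(0.0005, 6.15), is_deg(0.0005, 0.8))
```

For a fixed x, the probabilities over all y sum to one, so tail sums of this function can serve as the basis of a P-value.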

4.7. Gene validation and expression analyses

Thirteen selected unisequences with potential roles in rose scent metabolism were chosen for validation using real-time qRT-PCR with gene-specific primers designed with the Primer Premier software (version 5.0). All primers used in this study are listed in Table S2. Total RNAs were extracted from flower buds, open flowers, senescent flowers and young leaves of R. chinensis ‘Pallida’ using the TRIzol Reagent (TaKaRa, China) and purified with the RNA purification kit (TaKaRa, China). The standard curve for each gene was obtained by real-time PCR with five dilutions of cDNA. The PCR reactions were run in a Bio-Rad Sequence Detection System. Each reaction was performed in triplicate, and the GAPDH gene was chosen as an internal control. The relative expression of the genes in different organs and at different stages was quantified using the delta–delta Ct method as described by Livak and Schmittgen (2001). All data were expressed as the mean ± standard deviation (SD) after normalization.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2014.02.008.
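The delta–delta Ct calculation referred to in Section 4.7 can be written out as a short Python sketch. Relative expression is 2^(−ΔΔCt), where ΔCt is the target Ct minus the reference (here GAPDH) Ct, and ΔΔCt is the ΔCt of the sample minus the ΔCt of a calibrator. The function name and all Ct values below are hypothetical, and the choice of the bud stage as calibrator is an assumption made for illustration.

```python
import statistics


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression by the 2^(-delta-delta Ct) method."""
    delta_sample = ct_target - ct_reference              # delta Ct of the sample
    delta_calibrator = ct_target_cal - ct_reference_cal  # delta Ct of the calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)


# Hypothetical triplicate Ct values: (target Ct, GAPDH Ct) in open flowers.
open_flower_cts = [(22.0, 18.0), (22.1, 18.0), (21.9, 18.0)]
# Hypothetical calibrator (bud stage): target Ct and GAPDH Ct.
bud_target_ct, bud_gapdh_ct = 26.0, 18.0

values = [
    relative_expression(t, g, bud_target_ct, bud_gapdh_ct) for t, g in open_flower_cts
]
mean_expr = statistics.mean(values)
sd_expr = statistics.stdev(values)
print(round(mean_expr, 2), round(sd_expr, 2))
```

In this example the first replicate has a ΔΔCt of −4, giving a 16-fold higher level than the calibrator; the mean and SD over replicates correspond to the "mean ± SD" reporting described in the text.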

Conflict of interest

None.

Acknowledgments

We thank Dr. Lian-feng Gu, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, for the data analysis and proofreading of the manuscript. This study was supported by the Natural Science Foundation of China (grant nos. 31360492 and 31160355) and the Provincial Natural Science Foundation of Yunnan Province, People's Republic of China (grant nos. 2013FB093 and YAAS2013JC004).

References

Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25, 3389–3402.
Audic, S., Claverie, J.M., 1997. The significance of digital gene expression profiles. Genome Research 7, 986–995.
Bendahmane, M., Dubois, A., Raymond, O., Bris, M.L., 2013. Genetics and genomics of flower initiation and development in roses. Journal of Experimental Botany 64, 847–857.
Bergougnoux, V., et al., 2007. Both the adaxial and abaxial epidermal layers of the rose petal emit volatile scent compounds. Planta 226, 853–866.
Channelière, S., et al., 2002. Analysis of gene expression in rose petals using expressed sequence tags. FEBS Letters 515, 35–38.
Chen, J.Y., 2001. Chinese Taxology of Flower Cultivars. China Forestry Publishing House, Beijing, China 63–65.
Cherri-Martin, M., Jullien, F., Heizmann, P., Baudino, S., 2007. Fragrance heritability in hybrid tea roses. Scientia Horticulturae 113, 177–181.
Cseke, L., Dudareva, N., Pichersky, E., 1998. Structure and evolution of linalool synthase. Molecular Biology and Evolution 15, 1491–1498.
Dubois, A., et al., 2011. Genomic approach to study floral development genes in Rosa sp. PLoS One 6, e28455.
Dubois, A., et al., 2012. Transcriptome database resource and gene expression atlas for the rose. BMC Genomics 13, 638.
Foucher, F., Chevalier, M., Corre, C., Soufflet-Freslon, V., Legeai, F., Hibrand-Saint Oyant, L., 2008. New resources for studying the rose flowering process. Genome 51, 827–837.
Garg, R., Patel, R.K., Tyagi, A.K., Jain, M., 2011. De novo assembly of Chickpea transcriptome using short reads for gene discovery and marker identification. DNA Research 18, 53–63.
Grabherr, M.G., et al., 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology 29, 644–652.
Guterman, I., et al., 2002. Rose scent: genomics approach to discovering novel floral fragrance-related genes. Plant Cell 14, 2325–2338.
Huang, X., Madan, A., 1999. CAP3: a DNA sequence assembly program. Genome Research 9, 868–877.
Ito, K., Murphy, D., 2013. Application of ggplot2 to Pharmacometric Graphics. CPT: Pharmacometrics and Systems Pharmacology 2, e79.
Kanehisa, M., et al., 2008. KEGG for linking genomes to life and the environment. Nucleic Acids Research 36, D480–D484.
Kim, J., et al., 2012. Small RNA and transcriptome deep sequencing proffers insight into floral gene regulation in Rosa cultivars. BMC Genomics 13, 657.
Koeduka, T., et al., 2006. Eugenol and isoeugenol, characteristic aromatic constituents of spices, are biosynthesized via reduction of a coniferyl alcohol ester. Proceedings of the National Academy of Sciences of the United States of America 103, 10128–10133.
Koeduka, T., et al., 2008. The multiple phenylpropene synthases in both Clarkia breweri and Petunia hybrida represent two distinct protein lineages. Plant Journal 54, 362–374.
Lavid, N., et al., 2002. O-methyltransferases involved in the biosynthesis of volatile phenolic derivatives in rose petals. Plant Physiology 129, 1899–1907.
Li, R., et al., 2009. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics 25, 1966–1967.
Li, R., et al., 2010. De novo assembly of human genomes with massively parallel short read sequencing. Genome Research 20, 265–272.
Livak, K.J., Schmittgen, T.D., 2001. Analysis of relative gene expression data using real-time quantitative PCR and the 2(−Delta Delta C(T)) method. Methods 25, 402–408.
Mehrtens, F., Kranz, H., Bednarek, P., Weisshaar, B., 2005. The Arabidopsis transcription factor MYB12 is a flavonol-specific regulator of phenylpropanoid biosynthesis. Plant Physiology 138, 1083–1096.
Pei, H.X., et al., 2013. Integrative analysis of miRNA and mRNA profiles in response to ethylene in rose petals during flower opening. PLoS One 8, e64290.
Riechmann, J.L., et al., 2000. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science 290, 2105–2110.
Scalliet, G., et al., 2006. Role of petal-specific orcinol O-methyltransferases in the evolution of rose scent. Plant Physiology 140, 18–29.
Scalliet, G., et al., 2008. Scent evolution in Chinese roses. Proceedings of the National Academy of Sciences of the United States of America 105, 5927–5932.
Škrovánková, S., Mišurcová, L., Machů, L., 2012. Antioxidant activity and protecting health effects of common medicinal plants. Advances in Food and Nutrition Research 67, 75–139.
Verdonk, J.C., Haring, M.A., Tunen, A.J.V., Schuurink, R.C., 2005. ODORANT1 regulates fragrance biosynthesis in petunia flowers. Plant Cell 17, 1612–1624.
Wang, H.P., et al., 2012. Cloning and expression analysis of eugenol synthase gene RcEGS1 in Rosa chinensis ‘Pallida’. Acta Horticulturae Sinica 39, 1387–1394.
Wei, Y.L., Li, J.N., Lu, J., Tang, Z.L., Pu, D.C., Chai, Y.R., 2007. Molecular cloning of Brassica napus TRANSPARENT TESTA 2 gene family encoding potential MYB regulatory proteins of proanthocyanidin biosynthesis. Molecular Biology Reports 34, 105–120.
Xue, J., et al., 2010. Transcriptome analysis of the brown planthopper Nilaparvata lugens. PLoS One 5, e14233.
Yan, H.J., Zhang, H., Wang, Q.G., Jian, H.Y., Qiu, X.Q., Tang, K.X., 2011. Isolation and identification of a putative scent-related gene RhMYB1 from rose. Molecular Biology Reports 38, 4475–4482.
Zhang, Z.S., Zhu, X.Z., 2006. China Rose. China Forestry Publishing House, Beijing, China 170–172.
Zou, D., Chen, X.B., Zou, D.S., 2013. Sequencing, de novo assembly, annotation and SSR and SNP detection of sabaigrass (Eulaliopsis binata) transcriptome. Genomics 102, 57–62.
