Molecular Plant Advance Access published September 18, 2014

Spotlight

It Is Easy to Get Huge Candidate Gene Lists for Plant Metabolism Now, but How to Get Beyond?

Alain Goossens a,b,1

a Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Gent, Belgium

b Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052 Gent, Belgium

1 To whom correspondence should be addressed. E-mail [email protected], tel. +32 93313851, fax +32 93313809.

Running title: Gene discovery in plant metabolism

Downloaded from http://mplant.oxfordjournals.org/ at University of California, San Francisco on December 11, 2014

Living organisms are constantly threatened by other organisms, leading to a never-ending chemical arms race as both attacker and prey evolve. Microbes and plants developed a gigantic arsenal of specialized metabolites to deter their attackers. Humans in turn exploit this molecular diversity to defeat diseases or increase crop yield. These applications typically depend on sustainable and high metabolite production levels, which frequently cannot be guaranteed by the natural source. Synthetic biology programmes may resolve this shortage, but to enable those, the biogenesis of the natural product needs to be understood.

Over 100,000 different specialized metabolites can be found in the plant kingdom. The evolution of the respective, often species- and tissue-specific, biosynthetic pathways required the evolution of new genes, encoding enzymes, transporters, and regulators, which mainly occurred through gene duplications with successive diversifications. Over the past decades, substantial knowledge of the biochemistry and genetics of plant metabolic pathways has been generated, but nonetheless, much of the catalytic diversity of plant pathways still remains unexplored.

TRANSCRIPTOMES AND GENOMES, THE BLUEPRINTS OF METABOLISM

Thanks to the advent of next-generation sequencing (NGS) technologies, the pace at which plant genome sequences become available today is astonishing. The genomes of over 20 crop species have been sequenced and annotated (Bolger et al., 2014). Most of the specialized metabolites of interest, however, are produced in non-model plants for which genomic sequence information is not yet available. Fortunately, NGS technologies can also be implemented to sequence plant transcriptomes, as exemplified by Canadian (www.phytometasyn.ca) and US (http://medicinalplantgenomics.msu.edu) research consortia, which generated RNA-Seq data from tens of medicinal plants, comprising hundreds of thousands of annotated transcripts. This number is expected to increase spectacularly soon, e.g. when the BGI 1KP Project finishes the transcriptome sequencing of 1,000 plant species (http://onekp.com/project.html). The massive gene lists that are being generated thus represent a seemingly unlimited catalogue of enzymes, transporters, or regulators to be used in synthetic biology programmes. This sequence wealth has shifted the paradigm in gene discovery quests, which no longer focus on the isolation of candidate plant gene sequences, but rather on how to find the right one in the jungle of duplicated and neofunctionalised genes. In the following sections, recent examples from distinct screening strategies to unravel plant specialized metabolism are provided (Figure 1).

IN SILICO PLANT METABOLIC NETWORKS

If sequence assembly is complemented with extensive in silico analysis, including e.g. Enzyme Commission (EC) annotations (as in Xiao et al., 2013), a (tentative) functional annotation can be assigned. Annotated sequences can then be used to reconstruct metabolic pathway databases (MPDs), which can help to discover missing pathway enzymes. To date, MPDs have been built mainly for model plant species with sequenced genomes and gathered in ‘The Plant Metabolic Network’ (PMN) (http://www.plantcyc.org/). The PMN currently houses one multi-species reference MPD called PlantCyc that provides access to shared and unique metabolic pathways from over 350 plant species as well as to curated single-species MPDs. Recently, the construction of an MPD from Catharanthus roseus RNA-Seq data was reported (http://www.cathacyc.org/) (Van Moerkercke et al., 2013), demonstrating that transcriptomes are equally suitable for building MPDs. Besides gene sequences, MPD construction will depend heavily on metabolomics for validation. Plants are challenging in this regard, as a single plant (cell) can harbour thousands of metabolites, many of which are of as yet unknown structure. Plant metabolomics will require both targeted and non-targeted analysis, within which mass spectrometry-based techniques coupled to nuclear magnetic resonance spectroscopy will prove most suitable because of the remaining need for the annotation of unknown compounds (Kueger et al., 2012). Approaches such as the systematic metabolite profiling of 50 Arabidopsis mutants (Fukushima et al., 2014) will greatly help in improving gene annotation.

CO-EXPRESSION ANALYSIS, THE GOLDEN STANDARD

By far, co-expression analysis has proven to be the best guarantee for successful gene discovery in plant specialized metabolism. The key to this high success rate is the peculiar organisation of plant specialized metabolism, which is usually tightly controlled at the transcriptional level to respond effectively to distinct environmental cues. This has led to the recognition of so-called ‘transcriptional regulons’: groups of genes involved in one particular metabolic pathway with a marked concerted expression pattern. Importantly, not only enzyme-encoding genes feature in these regulons, but also genes encoding transporters,
transcription factors, and other regulators, as illustrated by the tobacco nicotine regulon (Morita et al., 2009). One of the most powerful elicitors of such regulons is the phytohormone jasmonate (JA), which can trigger an extensive transcriptional reprogramming of metabolism across the plant kingdom (De Geyter et al., 2012). Despite a conserved perception and signalling machinery, JA can elicit species-specific pathways, such as those of artemisinin in Artemisia annua, vinblastine in C. roseus, nicotine in Nicotiana tabacum, or paclitaxel in Taxus species. Besides JA elicitation, tissue-specificity emerges as another dominant regulon-defining criterion, which has recently been successfully exploited to fully map the secoiridoid branch of the monoterpenoid indole alkaloid biosynthesis pathway in C. roseus and to allow its reconstitution in a heterologous host (Miettinen et al., 2014).

GENE CLUSTERING, ALSO IN PLANTS

What transcriptional regulons signify for gene discovery in plant metabolism, gene clusters mean for microbial metabolism. Gene clusters are physical clusters of genes of one specialized metabolic pathway that predominate in bacterial and fungal genomes and usually comprise genes encoding the pathway enzymes, transporters, and regulators. The steadily growing number of sequenced plant genomes has been accompanied by a steadily growing number of discoveries of gene clusters in plants, suggesting that this principle applies to plants far more than originally assumed (Nützmann and Osbourn, 2014), although for most known plant pathways random gene scattering in the genome remains the rule. A nice plant cluster example is that of the steroidal glycoalkaloid (SGA) pathway from solanaceous food plants, such as tomato and potato, which involves a regulon of co-expressed enzyme-encoding genes of which several cluster on one chromosome (Itkin et al., 2013). Hence, intensified plant genome mining efforts might lead to the discovery of new pathways and chemistries, as they do in the microbial field.

PROTEIN-PROTEIN INTERACTOMES

Ultimately, proteins may also physically ‘cluster’ in multiprotein complexes that manage the regulation of series of enzymatic reactions or catalyse them. Formation of transcription factor (TF) complexes that steer the expression of biosynthetic genes has been well documented (De Geyter et al., 2012) and screens for interactions between TFs might thus reveal new regulators of plant metabolism. Besides regulators, the enzymes themselves can also associate. This may occur in so-called ‘metabolons’, consisting of enzymes that directly connect; alternatively, interaction of enzymes of one pathway may require partner proteins that serve as ‘docking stations’ or enhance the interaction. Metabolons and their partner proteins may be uncovered by technologies such as tandem affinity purification, which recently revealed interaction between e.g. reticulon proteins and P450s involved in Arabidopsis phenylpropanoid biosynthesis (Bassard et al., 2012).
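As a rough computational illustration of two of the screening strategies discussed above, co-expression with a known pathway gene and physical clustering in the genome, the following minimal Python sketch ranks candidate genes by their Pearson correlation with a "bait" gene and flags candidates that lie near the bait on the same chromosome. Everything in it is an assumption made for illustration: the gene names, expression values, and coordinates are invented toy data, the 100 kb window is arbitrary, and Pearson correlation is only one of several measures that could be used; the text does not prescribe any particular metric or threshold.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def rank_candidates(expression, positions, bait, window=100_000):
    """Rank genes by co-expression with `bait` and flag physical neighbours.

    `window` (in base pairs) is a hypothetical cut-off for calling two
    genes on the same chromosome 'clustered'.
    """
    bait_chrom, bait_pos = positions[bait]
    ranked = []
    for gene, profile in expression.items():
        if gene == bait:
            continue
        r = pearson(expression[bait], profile)
        chrom, pos = positions[gene]
        clustered = chrom == bait_chrom and abs(pos - bait_pos) <= window
        ranked.append((gene, round(r, 3), clustered))
    # Highest correlation first
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical toy data: expression in four conditions
# (e.g. mock vs. elicited tissue samples).
expression = {
    "bait_enzyme": [1.0, 2.0, 8.0, 9.0],
    "cand_P450": [0.5, 1.0, 4.0, 4.5],  # follows the bait profile
    "cand_TF": [9.0, 8.0, 2.0, 1.0],    # mirrors the bait profile
    "cand_flat": [3.0, 3.0, 3.0, 3.0],  # not regulated at all
}
# Hypothetical genome coordinates: (chromosome, start position in bp)
positions = {
    "bait_enzyme": ("chr7", 1_200_000),
    "cand_P450": ("chr7", 1_250_000),
    "cand_TF": ("chr2", 400_000),
    "cand_flat": ("chr7", 5_000_000),
}

ranking = rank_candidates(expression, positions, "bait_enzyme")
for gene, r, clustered in ranking:
    print(f"{gene}\tr={r}\tclustered={clustered}")
```

In this toy data set, the candidate that both tracks the bait profile and sits within the window ends up at the top of the list. Real data sets would typically call for many more conditions, several bait genes, and some correction for multiple testing before any candidate is taken forward to functional analysis.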

ECOLOGICAL INSPIRATION

Because plants evolved their arsenal of chemical structures to guarantee protection against attackers, monitoring bioactivity in planta can be another powerful tool for gene discovery. This is illustrated by two studies in which severe morphological phenotypes were triggered by the accumulation of toxic triterpene intermediates, which resulted from silencing of a regulator and an enzyme-encoding gene in Medicago truncatula and tomato, respectively (Pollier et al., 2013; Itkin et al., 2011), and which revealed molecular mechanisms for self-protection. Besides these apparently ‘serendipitous’ observations of bioactivity, gene discovery programmes based on e.g. quantitative trait loci analysis for interactions between plants and their predators will certainly allow expanding our metabolic network knowledge in the future.

FUTURE DIRECTIONS

The increasing power of bioinformatics along with the booming of functional genomics technologies will offer unprecedented ways to list all the possible players involved in the synthesis of plant metabolites. For future functional screens, crucial parameters will be the coverage and the level of resolution that these technologies can ultimately achieve. In terms of coverage, genomics and transcriptomics approximate excellence, but proteomics and metabolomics do not comply yet, a limitation inherent to the physical diversity of the molecules they need to profile. Similarly, interactomics needs to develop further. Several new technologies have emerged to study protein-protein or protein-DNA interactions under near-physiological conditions at large scale, and methods for investigation of endogenous protein-metabolite interactomes are appearing. The promise of an increased spatiotemporal resolution of e.g. transcriptome profiling has been illustrated repeatedly, but ultimately, a single-cell resolution should be achieved. For many ‘omics’ technologies this is already within reach. For others, such as metabolomics, this still remains a tough challenge, but marked technical improvements will allow taking it up (Oikawa and Saito, 2012).

ACKNOWLEDGMENTS

The author gratefully acknowledges the support of the European Cooperation in Science and Technology Action FA1006-PlantEngine and the European Union Seventh Framework Programme FP7/2007-2013 under grant agreement number 613692-TriForC. No conflict of interest declared.

FIGURE LEGENDS

Figure 1. Multidisciplinary research platforms for the mapping of plant metabolism.

Starting from gene compendia (A) established by genome or RNA sequencing, e.g. from C. roseus (http://bioinformatics.psb.ugent.be/orcae/overview/Catro), metabolic pathway databases (B) can be built based on metabolomics (C) and bioinformatics (D). Gaps in biosynthetic pathways can be filled by (non-)targeted gene discovery programmes, exploiting co-expression analysis (E), gene cluster identification (F), interactomics (G) or ecological and physiological aspects (H).

REFERENCES

Bassard, J.-E., Richert, L., Geerinck, J., Renault, H., Duval, F., Ullmann, P., Schmitt, M., Meyer, E., Mutterer, J., Boerjan, W., et al. (2012). Protein-protein and protein-membrane associations in the lignin pathway. Plant Cell. 24, 4465-4482.

Bolger, M.E., Weisshaar, B., Scholz, U., Stein, N., Usadel, B., and Mayer, K.F.X. (2014). Plant genome sequencing - applications for crop improvement. Curr. Opin. Biotechnol. 26, 31-37.

De Geyter, N., Gholami, A., Goormachtig, S., and Goossens, A. (2012). Transcriptional machineries in jasmonate-elicited plant secondary metabolism. Trends Plant Sci. 17, 349-359.

Fukushima, A., Kusano, M., Mejia, R.F., Iwasa, M., Kobayashi, M., Hayashi, N., Watanabe-Takahashi, A., Narisawa, T., Tohge, T., Hur, M., Wurtele, E.S., Nikolau, B.J., and Saito, K. (2014). Metabolomic characterization of knock-out mutants in Arabidopsis - Development of a metabolite profiling database for knockout mutants in Arabidopsis (MeKO). Plant Physiol., doi: pp.114.240986.

Itkin, M., Heinig, U., Tzfadia, O., Bhide, A.J., Shinde, B., Cardenas, P.D., Bocobza, S.E., Unger, T., Malitsky, S., Finkers, R., et al. (2013). Biosynthesis of antinutritional alkaloids in solanaceous crops is mediated by clustered genes. Science. 341, 175-179.

Itkin, M., Rogachev, I., Alkan, N., Rosenberg, T., Malitsky, S., Masini, L., Meir, S., Iijima, Y., Aoki, K., de Vos, R., et al. (2011). GLYCOALKALOID METABOLISM1 is required for steroidal alkaloid glycosylation and prevention of phytotoxicity in tomato. Plant Cell. 23, 4507-4525.

Kueger, S., Steinhauser, D., Willmitzer, L., and Giavalisco, P. (2012). High-resolution plant metabolomics: from mass spectral features to metabolites and from whole-cell analysis to subcellular metabolite distributions. Plant J. 70, 39-50.

Miettinen, K., Dong, L., Navrot, N., Schneider, T., Burlat, V., Pollier, J., Woittiez, L., van der Krol, S., Lugan, R., Ilc, T., et al. (2014). The seco-iridoid pathway from Catharanthus roseus. Nat. Commun. 5, 3606.

Morita, M., Shitan, N., Sawada, K., Van Montagu, M.C.E., Inzé, D., Rischer, H., Goossens, A., Oksman-Caldentey, K.-M., Moriyama, Y., and Yazaki, K. (2009). Vacuolar transport of nicotine is mediated by a multidrug and toxic compound extrusion (MATE) transporter in Nicotiana tabacum. Proc. Natl. Acad. Sci. U S A. 106, 2447-2452.

Nützmann, H.-W., and Osbourn, A. (2014). Gene clustering in plant specialized metabolism. Curr. Opin. Biotechnol. 26, 91-99.

Oikawa, A., and Saito, K. (2012). Metabolite analyses of single cells. Plant J. 70, 30-38.

Pollier, J., Moses, T., González-Guzmán, M., De Geyter, N., Lippens, S., Vanden Bossche, R., Marhavý, P., Kremer, A., Morreel, K., Guérin, C.J., et al. (2013). The protein quality control system manages plant defense compound synthesis. Nature. 504, 148-152.

Van Moerkercke, A., Fabris, M., Pollier, J., Baart, G.J.E., Rombauts, S., Hasnain, G., Rischer, H., Memelink, J., Oksman-Caldentey, K.-M., and Goossens, A. (2013). CathaCyc, a metabolic pathway database built from Catharanthus roseus RNA-Seq data. Plant Cell Physiol. 54, 673-685.

Xiao, M., Zhang, Y., Chen, X., Lee, E.-J., Barber, C.J.S., Chakrabarty, R., Desgagné-Penix, I., Haslam, T.M., Kim, Y.-B., Liu, E., et al. (2013). Transcriptome analysis based on next-generation sequencing of non-model plants producing specialized metabolites of biotechnological interest. J. Biotechnol. 166, 122-134.
