Available online at www.sciencedirect.com

Editorial overview: Genomics: The era of genomically-enabled microbiology
Neil Hall and Jay Hinton

Current Opinion in Microbiology 2015, 23:ix–x
For a complete overview see the Issue
Available online 28th January 2015
http://dx.doi.org/10.1016/j.mib.2014.12.001
1369-5274/Published by Elsevier Ltd.

Neil Hall
Centre for Genomic Research, Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK

Neil Hall is professor of genomics and director of the Centre for Genomic Research at the University of Liverpool. His research is primarily focused on the comparative and population genomics of human parasites. He uses genomic approaches to understand how parasites adapt in response to their hosts and to identify genetic determinants of virulence.

Jay Hinton
Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, UK

Jay Hinton is professor of microbial pathogenesis at the University of Liverpool. He is interested in the way that bacterial pathogens such as Salmonella and E. coli cause disease in humans. The underlying theme of his current research is to understand the intricate interplay of gene expression that leads to bacterial infection, and how transcription is modulated in emerging pathogens.

You must have spent the last 10 years on a desert island to have missed the noise of the genomics community describing how sequencing technology is transforming biomedical research. While the headlines may have been dominated by the promise of $1000 human genomes, the microbial world is enjoying its own quiet revolution, powered by the technological tidal wave of ‘next generation sequencing’. As we have been swept along, there has been little time to stop and take stock of how far we have come, and to assess current possibilities; hence we commissioned this special issue.

Our contributors have reviewed state-of-the-art methodologies in genomics, transcriptomics and bioinformatics, and describe new concepts and methodologies that have arisen with the advent of cheap, high-throughput DNA sequencing. These contributions from both the prokaryotic and eukaryotic microbial worlds show how important it is for microbiologists to learn from both sides of the phylogenetic tree. As large datasets are becoming the norm for biological research, we all need to become ‘data scientists’. Many of the reviews in this issue provide details of the tools required for handling genomic-scale data.

The two contributions by Creecy and Conway, and by Koren and Phillippy, highlight the remarkable insights that have been generated by new technologies. Koren and Phillippy point out that the latest technologies have not only made sequencing cheaper, but are now delivering much longer sequence reads. These breakthroughs enable ‘finished’ microbial genomes to be generated in a matter of days, without the need for manual intervention in the assembly process. While Pacific Biosciences has led the way in delivering long reads, newer portable nanopore technologies may soon provide a cheap alternative that will drive future studies.
The review discusses the relative merits of PacBio, Oxford Nanopore and Illumina/Moleculo sequencing technologies, and also summarizes the available software tools for assembling sequence reads into complete genomes.

Creecy and Conway describe the process of RNA-seq analysis, and explain how transcriptional data add invaluable information to a genome. This ‘primary transcriptome’ can show the transcriptional start and termination sites that dictate operon structures. The pinpoint accuracy with which transcriptional features are resolved gives us the ability to define microbial gene regulation both at the genomic scale and at the level of individual nucleotides. At the same time, quantitative transcriptomics provides the most powerful way to compare levels of gene expression.

In 1995, we began to generate complete microbial genomes that were manually annotated at great cost, in terms of both sequencing consumables and scientific salaries. Since 2006, we have been using cheap, high-throughput automated genome analysis to produce low-quality draft genomes. Now, the scientific community is rediscovering the value of high-quality, high information-content genomics for some applications. Koren and Phillippy explain that the current technology enables the assembly of a ‘complete’ high-quality bacterial genome. Whilst this is many times more expensive than short-read sequencing technology, it is becoming increasingly important to unambiguously determine every base in the genome. In fact, the reviews by Haft, and by Goodhead and Darby, show improved annotation to be a critical step that fuels the understanding of biological systems and requires an accurate genome sequence.

Vernikos et al. and Hirt et al. describe a paradigm shift that occurred around 2004. As the genomes of multiple strains of particular bacterial species emerged from sequencing centres worldwide, it became clear that many species of bacteria existed as ‘pan-genomes’. In these cases the sharing of genetic information was so widespread that the individual bacterial strains could simply be viewed as vessels inhabited by mobile DNA elements. Some pan-genomes appeared to be ‘open’ or inexhaustible. In the same period, genomes of lower eukaryotes were revealing similar signals of horizontal gene transfer (HGT) that originated from the bacterial world. Hirt et al. explain that although HGT is far less extensive in lower eukaryotes, it can affect almost every cellular pathway. Evidence is presented to show that not only are mucosal parasites recipients of genes from bacteria, but these parasites also share genes with each other and, in turn, can transfer their genes to bacteria. In the case of bacterial genomes, Vernikos et al. explain how the core genome of some bacterial species makes up the minority of genes, blurring the definition of a bacterial species and interfering with standard strain typing systems.
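The distinction between a pan-genome and a core genome can be made concrete with a small sketch. The Python below is only an illustration under simplifying assumptions: each strain is represented as a set of gene-family names, the pan-genome is taken as the union of those sets and the core genome as their intersection. The strain and gene names are hypothetical toy data, not taken from any of the reviews, and real analyses must first cluster genes into families, which is the hard part.

```python
from typing import Dict, Set, Tuple


def pan_and_core(strains: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Return (pan_genome, core_genome) for a mapping of strain -> gene families."""
    gene_sets = list(strains.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)        # every gene family seen in any strain
    core = set.intersection(*gene_sets)  # gene families shared by all strains
    return pan, core


# Hypothetical toy data: three strains sharing a small conserved backbone.
strains = {
    "strain_A": {"gyrA", "recA", "rpoB", "toxin_1"},
    "strain_B": {"gyrA", "recA", "rpoB", "phage_integrase"},
    "strain_C": {"gyrA", "recA", "rpoB", "toxin_1", "resistance_cassette"},
}

pan, core = pan_and_core(strains)
accessory = pan - core                   # genes present in only some strains
core_fraction = len(core) / len(pan)

print(f"pan-genome: {len(pan)} families, core: {len(core)}, "
      f"accessory: {len(accessory)}, core fraction: {core_fraction:.2f}")
```

In an ‘open’ pan-genome the size of the union keeps growing as further strains are added, while the intersection shrinks or plateaus, so the core fraction falls.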
The pan-genome view of bacterial variation and evolution is redefining how genomic information is used to develop therapeutics, as the very genes that are the target of drugs and vaccines are those influenced by HGT and rapid selection.

Of course, new genes come and old genes may die. Goodhead and Darby highlight the neglected significance of pseudogenes. In fact, understanding the genes that have recently ‘pseudogenised’ can tell us what a bacterium no longer requires as it adapts from one niche to another. This is exemplified by Salmonella serovars: the broad host-range pathogen S. enterica serovar Typhimurium has far fewer pseudogenes than S. Typhi, a serovar that only infects humans.

The dynamic nature of the genome and the vast number of genes that lack experimental data have brought a new challenge: how do we achieve accurate annotation and interpretation of genome-based information in the face of the exponential increase in genome deposition in the public data archives? Is this an insurmountable summit that may never be conquered, or a jigsaw puzzle that will eventually be completed? Haft may have provided the solution by explaining that the intelligent bioinformatician can look for signals in the noise, and use increased genomic data to generate more biological knowledge, without laboratory experiments. Rather than relying on pipettes and phenomics to discover gene function, Haft explains that nature and natural selection have already performed the most important experiments. Techniques such as phylogenetic profiling that were proposed decades ago have finally come to fruition as the datasets have grown in size and scale. This approach is based on the fact that genes working in the same biological process co-occur in nature. Haft takes us on a ‘bioinformatic journey’, using deep mining of genomic data to generate hypotheses about the function of a ‘hypothetical’ gene, based on where the gene occurs in nature.

Our final two reviews focus on the leaps that genomics has brought to the field of epidemiology. In bacterial systems, Croucher and Didelot focus on disease transmission, and explain how genome sequencing has enabled the tracking of bacterial pathogens between humans. This approach has identified the source of local outbreaks of infection and the impact of clinical factors such as antibiotic or vaccine-based therapy upon the worldwide transmission of bacteria. Moving to parasitic disease, Hupalo et al. give thought-provoking examples of the impact of genomics upon the population biology of parasitic protists.
These eukaryotic organisms can have either clonal or recombining populations, and genome-wide studies have been used to obtain a detailed understanding of population structures. In the case of Plasmodium, this strategy has successfully identified loci associated with drug resistance.

First we had genomes, now we have the tools to work with them effectively. We are now in the era of genomically enabled microbiology, but the impact of these new data is not yet clear as we are in the middle of a revolutionary process. We are making unexpected discoveries in areas that were completely opaque a decade ago. The last 10 years have seen paradigm shifts that include the concept of the pan-genome and the use of genomic epidemiology to improve human health. Who knows what the next 10 years may bring?
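
As a concrete illustration of the phylogenetic profiling idea mentioned above in connection with Haft's review, the following Python sketch scores pairs of genes by how similar their presence/absence patterns are across genomes. It is a minimal sketch under stated assumptions, not the method used in the review: each gene's profile is simply the set of genomes in which it occurs, similarity is measured with the Jaccard index (one of several possible choices), and all gene and genome names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two presence profiles (sets of genome names)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_profile_pairs(profiles: Dict[str, Set[str]]) -> List[Tuple[float, str, str]]:
    """Score every gene pair by profile similarity, most similar first."""
    scored = [
        (jaccard(profiles[g1], profiles[g2]), g1, g2)
        for g1, g2 in combinations(sorted(profiles), 2)
    ]
    # Sort by descending score, then by gene names for a stable order.
    return sorted(scored, key=lambda t: (-t[0], t[1], t[2]))


# Hypothetical toy data: gene -> genomes in which that gene is present.
profiles = {
    "hypothetical_1": {"genome1", "genome3", "genome4"},
    "biosynthesis_A": {"genome1", "genome3", "genome4"},
    "biosynthesis_B": {"genome1", "genome3", "genome4", "genome5"},
    "housekeeping": {"genome1", "genome2", "genome3", "genome4", "genome5"},
}

ranked = rank_profile_pairs(profiles)
for score, g1, g2 in ranked:
    print(f"{g1:15s} {g2:15s} {score:.2f}")
```

Here the uncharacterised gene shares an identical profile with `biosynthesis_A`, which suggests, but does not prove, a role in the same process. Genes present in nearly every genome carry little signal, and a practical analysis would also need to correct for the shared ancestry of the genomes being compared.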
