Draft Genome Sequence of the Lactobacillus agilis Strain Marseille Fatima Drissi, Noémie Labas, Vicky Merhej, Didier Raoult Aix Marseille Université, CNRS 7278, Inserm 1095, URMITE, UM63, IRD 198, Marseille, France
We report the draft genome sequence of Lactobacillus agilis strain Marseille, isolated from stool samples of a child suffering from kwashiorkor. This strain can use two metabolic pathways allowing the assimilation of glucose and xylose. Here, we present the first draft genome of the Lactobacillus agilis species. Received 19 June 2015 Accepted 24 June 2015 Published 30 July 2015 Citation Drissi F, Labas N, Merhej V, Raoult D. 2015. Draft genome sequence of the Lactobacillus agilis strain Marseille. Genome Announc 3(4):e00840-15. doi:10.1128/ genomeA.00840-15. Copyright © 2015 Drissi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported license. Address correspondence to Didier Raoult, [email protected]
actobacillus agilis strain Marseille is a lactic acid bacterium isolated from fecal samples of a child suffering from a severe form of malnutrition with insufficient protein consumption, called kwashiorkor. Glucose clearance rates are affected in children with kwashiorkor (1), and gut microbiome was found to be one of the causal factors in this disease (2). As it has the ability to ferment hexoses through the glycolysis pathway or use the pentose phosphate pathway for pentose assimilation (3), Lactobacillus agilis is a facultatively heterofermentative bacteria. The genome of Lactobacillus agilis Marseille was sequenced with the mate pair strategy on the MiSeq technology (Illumina Inc., San Diego, CA, USA). The reads where assembled through Velvet software (4), and the contigs obtained were combined together by SSPACE (5), Opera software version 1.2 (6), and GapFiller version 1.10 (7). Noncoding genes and miscellaneous features were predicted using RNAmmer (8), ARAGORN (9), Rfam (10), Pfam (11), and Infernal (12), while coding DNA sequences (CDSs) were predicted using Prodigal (13). Functional annotation was achieved using the Rapid Annotation using Subsystem Technology (RAST) server and BLAST⫹ (14) against the KEGG and COG databases. The draft genome is composed of 12 scaffolds and includes 2,369,669 bases (G⫹C content of 46.18%). It comprises 2,282 predicted genes, including 12 rRNAs (7 genes are 5S rRNA, 3 genes are 16S rRNA, and 2 genes are 23S rRNA) and 84 tRNAs genes. A total of 1,535 genes (69.80%) were assigned a putative function, 62 genes were identified as ORFans (2.82%), and the 481 remaining genes were annotated as hypothetical proteins (21.87%). The 16S rRNA analysis showed strong homology to other Lactobacillus species with completed genomes, including Lactobacillus salivarius JCM1046 (CP007646.1), with 95% similarity, and Lactobacillus ruminis ATCC 27782 (CP003032.1), with 94% similarity. The draft genome sequence of strain Marseille is larger than those of L. ruminis and L. salivarius (2.07 Mb and 2.31 Mb, respectively), and its G⫹C content is larger than those of L. ruminis and L. salivarius (43.50% and 32.97%, respectively). The genome of strain Marseille encodes for two fructose-1,6diphosphate aldolases (EC 22.214.171.124) involved in the glycolysis pathway and one phosphoketolase (EC 126.96.36.199) involved in the
July/August 2015 Volume 3 Issue 4 e00840-15
pentose phosphate pathway. According to the carbon source, strain Marseille can use both glycolysis and pentose phosphate pathways, and thus can produce either acetic acid, formic acid, and ethanol from sugars, or lactic acid under glucose limitation. Moreover, the genome of strain Marseille encodes for several enzymes that participate in carbohydrate transport and metabolism and two enzymes (acetyl-CoA acetyltransferase and carboxylesterase type B) involved in lipid transport and metabolism that are not present in L. ruminis ATCC 27782 and L. salivarius JCM1046. These specific features of L. agilis may result from a strategy of the bacterium to adapt to the particular environment of severely malnourished children and/or may have a causal effect in the development of the disease. Nucleotide sequence accession number. This whole-genome shotgun project has been deposited in ENA under the accession number CVQY00000000. ACKNOWLEDGMENTS This work was funded by the “IHU Méditerranée Infection” program. V.M. was supported by a Chairs of Excellence program from the CNRS (Centre National de Recherche Scientifique). We thank Xegen Company (http://www.xegen.fr) for automating the genomic annotation process.
REFERENCES 1. Spoelstra MN, Mari A, Mendel M, Senga E, van Rheenen P, van Dijk TH, Reijngoud DJ, Zegers RG, Heikens GT, Bandsma RH. 2012. Kwashiorkor and marasmus are both associated with impaired glucose clearance related to pancreatic ␤-cell dysfunction. Metabolism 61:1224 –1230. http://dx.doi.org/10.1016/j.metabol.2012.01.019. 2. Smith MI, Yatsunenko T, Manary MJ, Trehan I, Mkakosya R, Cheng J, Kau AL, Rich SS, Concannon P, Mychaleckyj JC, Liu J, Houpt E, Li JV, Holmes E, Nicholson J, Knights D, Ursell LK, Knight R, Gordon JI. 2013. Gut microbiomes of Malawian twin pairs discordant for kwashiorkor. Science 339:548 –554. http://dx.doi.org/10.1126/science.1229000. 3. Rohr LM, Teuber M, Meile L. 2002. Phosphoketolase, a neglected enzyme of microbial carbohydrate metabolism. Chimia Int J Chem 56: 270 –273. 4. Zerbino DR. 2010. Using the Velvet de novo assembler for short-read sequencing technologies. Curr Protoc Bioinformatics Chapter 11:Unit 11.5. http://dx.doi.org/10.1002/0471250953.bi1105s31. 5. Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. 2011. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27: 578 –579. http://dx.doi.org/10.1093/bioinformatics/btq683.
Drissi et al.
6. Gao S, Sung WK, Nagarajan N. 2011. Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences. J Comput Biol 18:1681–1691. http://dx.doi.org/10.1089/cmb.2011.0170. 7. Boetzer M, Pirovano W. 2012. Toward almost closed genomes with GapFiller. Genome Biol 13:R56. http://dx.doi.org/10.1186/gb-2012-13-6-r56. 8. Lagesen K, Hallin P, Rødland EA, Staerfeldt HH, Rognes T, Ussery DW. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35:3100 –3108. http://dx.doi.org/10.1093/ nar/gkm160. 9. Laslett D, Canback B. 2004. ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Res 32:11–16. http://dx.doi.org/10.1093/nar/gkh152. 10. Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. 2003. Rfam: an RNA family database. Nucleic Acids Res 31:439 – 441. http:// dx.doi.org/10.1093/nar/gkg006.
11. Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, Pang N, Forslund K, Ceric G, Clements J, Heger A, Holm L, Sonnhammer EL, Eddy SR, Bateman A, Finn RD. 2012. The Pfam protein families database. Nucleic Acids Res 40:D290 –D301. http://dx.doi.org/10.1093/ nar/gkr1065. 12. Nawrocki EP, Kolbe DL, Eddy SR. 2009. Infernal 1.0: inference of RNA alignments. Bioinformatics 25:1335–1337. http://dx.doi.org/10.1093/ bioinformatics/btp157. 13. Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119. http://dx.doi.org/10.1186/ 1471-2105-11-119. 14. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. 2009. BLAST⫹: architecture and applications. BMC Bioinformatics 10:421. http://dx.doi.org/10.1186/1471-2105-10-421.
July/August 2015 Volume 3 Issue 4 e00840-15