MARGEN-00332; No of Pages 3 Marine Genomics xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Marine Genomics journal homepage: www.elsevier.com/locate/margen

Genomics/Technical resources

Draft genome sequence of Strain ATCC 17802T, the type strain of Vibrio parahaemolyticus

Ning Yang a,b,1, Ming Liu a,1, Xuesong Luo a,⁎, Jicheng Pan b

a State Key Laboratory of Agricultural Microbiology & College of Resources and Environment, Huazhong Agricultural University, Wuhan 430070, China
b College of Life Sciences, Hubei Normal University, Huangshi, Hubei 435002, China

Article info

Article history:
Received 11 April 2015
Received in revised form 14 May 2015
Accepted 14 May 2015
Available online xxxx

Abstract

We report the draft genome of Vibrio parahaemolyticus ATCC 17802T, containing 5067729 bp. The G + C content of the genome is 45.24%. This strain possesses genes encoding a Type III secretion system 1, a Type III secretion system 2 and a Tdh-related hemolysin (TRH). Its taxonomically important phenotypes were also experimentally characterized.

© 2015 Elsevier B.V. All rights reserved.

Keywords: Vibrio parahaemolyticus ATCC 17802T; Genome; Phenotype

⁎ Corresponding author at: State Key Laboratory of Agricultural Microbiology & College of Resources and Environment, Huazhong Agricultural University, Wuhan 430070, China. Tel: +86 27 87671033; fax: +86 27 87280670. E-mail address: [email protected] (X. Luo).
1 These authors contributed equally to this work.

http://dx.doi.org/10.1016/j.margen.2015.05.010 1874-7787/© 2015 Elsevier B.V. All rights reserved.

Please cite this article as: Yang, N., et al., Draft genome sequence of Strain ATCC 17802T, the type strain of Vibrio parahaemolyticus, Mar. Genomics (2015), http://dx.doi.org/10.1016/j.margen.2015.05.010

1. Introduction

The bacterial strain ATCC 17802T, first isolated in 1950, caused an outbreak of acute gastroenteritis in Japan (Fujino et al., 1951). It was initially identified as Pasteurella parahaemolytica. The strain was later transferred to the genus Vibrio and renamed Vibrio parahaemolyticus; ATCC 17802T is the type strain of the species (Skerman et al., 1980). This species is a marine bacterium capable of causing seafood-related gastroenteritis, wound infections, and septicemia worldwide (Blake et al., 1980; Baffone et al., 2006). Because the mechanisms of its pathogenesis and evolution are unclear, hundreds of clinical and environmental strains of this species have been genome-sequenced within a single year (Loyola et al., 2015; Haendiges et al., 2015; Cui et al., 2015). Type III secretion system 2 (T3SS2) has been shown to play important roles in the pathogenesis of this species (Letchumanan et al., 2014). The evolution of the T3SS2-linked pathogenicity islands remains unclear, and such information is important for studies of both pathology and evolution. The inclusion of type strains is also of central importance in prokaryotic taxonomy and evolutionary microbiology (Tindall et al., 2010). However, the genomic sequence of the pre-pandemic strain ATCC 17802T remains unavailable. In this study, the genome of the type strain ATCC 17802T of V. parahaemolyticus was sequenced and the strain was phenotypically re-characterized.

DNA was isolated from 1 g of cells using the standard phenol-chloroform extraction method. Paired-end (2 × 300 bp) sequencing was performed on the Illumina MiSeq PE300 platform. Filtered data were assembled with SOAPdenovo (http://soap.genomics.org.cn/soapdenovo.html) to generate scaffolds. Transfer RNA (tRNA) genes were predicted with tRNAscan-SE (Lowe and Eddy, 1997). Ribosomal RNA (rRNA) genes were predicted with RNAmmer (Lagesen et al., 2007), and sRNAs were predicted by BLAST against the Rfam database (http://rfam.xfam.org). Repetitive sequences were predicted using RepeatMasker (http://www.repeatmasker.org/). Tandem repeats were analyzed using Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html). Gene prediction was performed with GeneMarkS (http://topaz.gatech.edu/). A whole-genome BLAST search (E-value ≤ 1e−5, minimal alignment length percentage ≥ 40%) was performed against four databases: KEGG (Kyoto Encyclopedia of Genes and Genomes) (Kanehisa et al., 2004), COG (Clusters of Orthologous Groups) (Galperin et al., 2015), NR (Non-Redundant Protein Database), and GO (Gene Ontology) (Ashburner et al., 2000).

The original sequence assembly contained no complete 16S rRNA gene copy. Three different fragments of the strain's 16S rRNA genes were previously deposited in GenBank under the accession numbers X56580, X74720 and AF388387. We therefore also assembled the genome with SPAdes (http://bioinf.spbau.ru/en/spades) for comparison, and one copy of the 16S rRNA gene was obtained. The data from the SPAdes analysis are provided as supplementary information.

Table 1 General description of the bacterial genome.

Sequence data information
                     Scaffold    Contig
Total number         51          53
Total length (bp)    5067729     5067706
N50 (bp)             366570      260767
N90 (bp)             80385       80385
Max length (bp)      697590      697590
Min length (bp)      500         500
G + C content (%)    45.24       45.24

Gene numbers
tRNA                   101
5S rRNA                5
16S rRNA               0
23S rRNA               0
sRNA                   15
Protein coding genes   4639

Interspersed repetitive sequences
Type      No.   Total length (bp)   % in genome   Average length (bp)
LTR       217   13117               0.259         61
DNA       56    3343                0.066         60
LINE      57    3374                0.067         63
SINE      21    1262                0.025         63
RC        6     299                 0.006         50
scRNA     0     0                   0             0
Unknown   2     135                 0.003         68

Tandem repetitive sequences
Type             No.   Repeat size (bp)   Total length (bp)   % in genome
TR               93    5–1137             25018               0.494
Minisatellite    46    10–30              1983                0.039
Microsatellite   8     5–6                690                 0.014

The original assembly of the draft genome consisted of 51 scaffolds totaling 5067729 bp, with a G + C content of 45.24% (Table 1). A total of 4639 genes were predicted: 121 RNA genes and 4518 protein-coding genes. The different types of repetitive sequences are listed in Table 1. Putative functions were predicted for the majority of the protein-coding genes, and the distribution of the genes across COG functional categories is shown in Fig. 1. The type strain has both T3SS1 and T3SS2, and a Tdh-related hemolysin (TRH). Thermostable direct hemolysin (Tdh) was not detected in the currently available data. The presence of nutrient uptake genes upstream of the T3SS2-containing pathogenicity island was shown to be a typical feature of pre-pandemic strains (Chen et al., 2011). However, the pre-pandemic strain ATCC 17802T does not possess this region. Because the genetic structure of the T3SS2-containing pathogenicity islands is similar, the key difference between the pre-pandemic strain and the pandemic strains remains unclear. The evolution of this marine species is likely to be influenced largely by its inhabiting both the complex marine environment and diverse hosts.

We further validated the phenotypes of the strain using a commercially available bacterial identification system. Morphological analysis showed that the bacterium is a curved, rod-shaped, non-spore-forming, Gram-negative cell that is motile by a single polar flagellum. Enzyme activities and biochemical features of the type strain were determined with API kits (API 20NE, API 20E, API 50CH and API ZYM) according to the manufacturer's instructions with the necessary modifications: a 3% NaCl solution was used to prepare the cell suspensions, and the AUX and 50CHB media were supplemented with 3% NaCl.
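The assembly summary values reported in Table 1 (N50, N90, G + C content and the share of the genome occupied by repeats) follow from simple definitions. The sketch below is only an illustration of those definitions and is not the pipeline used in this study; the function names and the toy scaffold lengths are hypothetical, and only the repeat lengths and the genome size in the last lines are taken from Table 1.

```python
from typing import Iterable


def nxx(lengths: Iterable[int], percent: int) -> int:
    """Return the Nxx length: the largest L such that sequences of
    length >= L together cover at least `percent` % of the assembly."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        raise ValueError("need at least one sequence of non-zero length")
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length
    return ordered[-1]  # reached only when percent > 100


def gc_percent(sequence: str) -> float:
    """G + C content in percent, counting only unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def percent_of_genome(feature_bp: int, genome_bp: int) -> float:
    """Share of the genome (in percent) covered by a feature class."""
    return 100.0 * feature_bp / genome_bp


# Hypothetical scaffold lengths, used only to show the calculation.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(nxx(toy_lengths, 50), nxx(toy_lengths, 90))

# Values from Table 1: LTR and tandem-repeat (TR) lengths over 5067729 bp.
print(round(percent_of_genome(13117, 5067729), 3))
print(round(percent_of_genome(25018, 5067729), 3))
```

With the Table 1 inputs, the last two lines reproduce the reported 0.259% (LTR) and 0.494% (TR). The scaffold N50 and N90 themselves cannot be recomputed here because the individual scaffold lengths are not listed in the text.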

Fig. 1. Functional classification of the protein coding genes (COG classification).
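The functional assignments summarized in Fig. 1 rest on BLAST hits that pass the thresholds given in the text (E-value ≤ 1e−5 and alignment length percentage ≥ 40%). The following sketch shows one way such a filter and a per-category tally could be written. It rests on several assumptions that are not stated in the article: the alignment length percentage is taken relative to the query length, the input is a hypothetical five-column tab-separated table (query, subject, alignment length, query length, E-value), one best hit per gene is kept, and the subject-to-category mapping is supplied by the user.

```python
from collections import Counter
from typing import Dict, Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str       # predicted gene/protein identifier
    subject: str     # database entry identifier
    align_len: int   # alignment length
    query_len: int   # full length of the query
    evalue: float


def parse_hits(lines: Iterable[str]) -> List[Hit]:
    """Parse rows of: query, subject, align_len, query_len, evalue (tab-separated)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        query, subject, align_len, query_len, evalue = line.split("\t")
        hits.append(Hit(query, subject, int(align_len), int(query_len), float(evalue)))
    return hits


def passes(hit: Hit, max_evalue: float = 1e-5, min_fraction: float = 0.40) -> bool:
    """Apply the two thresholds; coverage is measured against the query (assumption)."""
    return hit.evalue <= max_evalue and hit.align_len / hit.query_len >= min_fraction


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for each query, the passing hit with the lowest E-value."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if passes(hit) and (hit.query not in best or hit.evalue < best[hit.query].evalue):
            best[hit.query] = hit
    return best


def tally_categories(best: Dict[str, Hit], subject_to_category: Dict[str, str]) -> Counter:
    """Count genes per functional category; unmapped subjects are 'unassigned'."""
    return Counter(subject_to_category.get(h.subject, "unassigned") for h in best.values())


# Hypothetical rows, for illustration only.
rows = [
    "g1\tCOG0001\t120\t200\t1e-30",
    "g1\tCOG0002\t150\t200\t1e-10",
    "g2\tCOG0003\t50\t200\t1e-20",   # fails: only 25% of the query aligned
    "g3\tCOG0004\t180\t200\t1e-3",   # fails: E-value above 1e-5
]
kept = best_hits(parse_hits(rows))
print(tally_categories(kept, {"COG0001": "H"}))
```

Keeping a single best hit per gene is a design choice made here for simplicity; a gene assigned to several categories would need a different tally.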

The bacterium is oxidase and catalase positive. Other detected phenotypes are listed in Table 2. Our data provide both genomic and phenotypic information for the type strain of V. parahaemolyticus, a species of marine origin that is also clinically important.

2. Nucleotide sequence accession number

The genome sequences of V. parahaemolyticus ATCC 17802T have been deposited at DDBJ/EMBL/GenBank under the project accession LATW00000000. The present version has the accession number LATW01000000, and consists of sequences LATW01000001–LATW01000051.
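The accession range above can be expanded into the individual sequence identifiers, which is convenient when retrieving the records one by one. The helper below is a hypothetical sketch that assumes each accession is an uppercase letter prefix followed by a fixed-width number, as in the range quoted in the text; it does not query any database.

```python
import re
from typing import List


def expand_accession_range(first: str, last: str) -> List[str]:
    """Expand e.g. LATW01000001..LATW01000051 into every accession in between."""
    pattern = re.compile(r"([A-Z]+)(\d+)")
    m_first, m_last = pattern.fullmatch(first), pattern.fullmatch(last)
    if m_first is None or m_last is None:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1) or len(m_first.group(2)) != len(m_last.group(2)):
        raise ValueError("accessions must share prefix and number width")
    prefix, width = m_first.group(1), len(m_first.group(2))
    start, stop = int(m_first.group(2)), int(m_last.group(2))
    if start > stop:
        raise ValueError("first accession must not follow the last")
    # Zero-pad each number back to the original width.
    return [f"{prefix}{number:0{width}d}" for number in range(start, stop + 1)]


accessions = expand_accession_range("LATW01000001", "LATW01000051")
print(len(accessions), accessions[0], accessions[-1])
```

The expansion yields 51 identifiers, one per scaffold of the original assembly reported in Table 1.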

Acknowledgement

This work was supported by the Fundamental Funds for the Central Universities (2662014QC022 and 2662013BQ018). We thank Yao Mu for technical support in the research. We also thank Dr. Jinshui Zheng for the SPAdes analysis.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.margen.2015.05.010.


Table 2 Phenotypes of Vibrio parahaemolyticus ATCC 17802T.

Characteristics                  Characteristics
Shape                    Rod     N-Acetyl-glucosamine    +
Single polar flagellum   +       Gluconate               +
Oxidase                  +       Capric acid             −
Catalase                 +       Adipic acid             −
Gram-staining            −       Malic acid              +
Nitrate reduction        +       Trisodium citrate       +
H2S production           −       Phenylacetic acid       −
Glucose fermentation     +       L-Rhamnose              −

Citrate utilization



D-Ribose

+

Indole production Acetoin production Enzymatic activities

+ −

Inositol

− − +

Alkaline phosphatase               +
Esterase (C4)                      +
Esterase lipase (C8)               +
Lipase (C14)                       w
Leucine arylamidase                +
Valine arylamidase                 w
Cystine arylamidase                w
Trypsin                            +
α-Chymotrypsin                     w
Acid phosphatase                   +
Naphthol-AS-BI-phosphohydrolase    +
α-Galactosidase                    −

D-Saccharose D-Maltose

Itaconic acid Suberic acid Malonate Acetate Lactic acid L-Alanine

5-Ketogluconate Glycogen 3-Hydroxybenzoic acid L-Serine

Salicin D-Melibiose

− − − + w − − + − − − −

β-Galactosidase

w

L-Fucose



β-Glucuronidase               −
α-Glucosidase                 w
β-Glucosidase                 −
N-Acetyl-β-glucosaminidase    +
α-Mannosidase                 −
α-Fucosidase                  −
Arginine dihydrolase          −
Lysine decarboxylase          +

D-Sorbitol

− − − w − − − w

Ornithine decarboxylase    +
L-Histidine                w
Urease                     +
D-Glucose                  +
Tryptophan deaminase       +
L-Arabinose                +
Gelatinase                 +
D-Mannose                  +

Carbon source assimilation

D-Mannitol

+

Acid produced from Glycerin Erythritol

− − −

Aescinate Salicin Cellobiose Maltose

− − − +

w − −

Lactose Melibiose Sucrose

− − −

Adon alcohol β-Methyl-D-xyloside Galactose Glucose Fructose Mannose Sorbitol Rhamnose

− − − + + + + − −

Trehalose Inulin Melezitose Raffinose Starch Glycogen Xylitol Gentiobiose D-Turanose

+ − − − + + − − −

Dulcitol



D-Lyxose



Inositol



D-Tagatose



Mannitol

+

D-Fucose



Sorbitol



L-Fucose



α-Methyl-D-mannoside



D-Arabinitol



α-Methyl-D-glucopyranoside N-acetyl-glucosamine Amygdalin Arbutin

− + − −

L-Arabinitol

− + − −

D-arabinose L-Arabinose

Ribose D-Xylose L-Xylose

Propionic acid Valeric acid L-Histidine

2-Ketogluconate 3-Hydroxybutyric acid 4-Hydroxybenzoic acid L-Proline

Gluconate 2-Keto-D-gluconate 5-Keto-D-gluconate

References

Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., et al., 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25 (1), 25–29.
Baffone, W., Tarsi, R., Pane, L., Campana, R., et al., 2006. Detection of free-living and plankton-bound vibrios in coastal waters of the Adriatic Sea (Italy) and study of their pathogenicity-associated properties. Environ. Microbiol. 8, 1299–1305.
Blake, P.A., Weaver, R.E., Hollis, D.G., 1980. Diseases of humans (other than cholera) caused by vibrios. Annu. Rev. Microbiol. 34, 341–367.
Chen, Y., Stine, O.C., Badger, J.H., Gil, A.I., et al., 2011. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence. BMC Genomics 12, 294.
Cui, Y., Yang, X., Didelot, X., Guo, C., et al., 2015. Epidemic clones, oceanic gene pools, and eco-LD in the free living marine pathogen Vibrio parahaemolyticus. Mol. Biol. Evol. http://dx.doi.org/10.1093/molbev/msv009.
Fujino, L., Okuno, Y., Nakada, D., Aoyama, A., et al., 1951. On the bacteriological examination of shirasu food poisoning. J. Jpn. Infect. Dis. 25, 11.
Galperin, M.Y., Makarova, K.S., Wolf, Y.I., Koonin, E.V., 2015. Expanded microbial genome coverage and improved protein family annotation in the COG database. Nucleic Acids Res. 43 (Database issue), D261–D269.
Haendiges, J., Timme, R., Allard, M.W., Myers, R.A., et al., 2015. Characterization of Vibrio parahaemolyticus clinical strains from Maryland (2012–2013) and comparisons to a locally and globally diverse V. parahaemolyticus strains by whole-genome sequence analysis. Front. Microbiol. 6, 125.
Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., et al., 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32 (Database issue), D277–D280.
Lagesen, K., Hallin, P., Rodland, E.A., Staerfeldt, H.H., et al., 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 35 (9), 3100–3108.
Letchumanan, V., Chan, K.G., Lee, L.H., 2014. Vibrio parahaemolyticus: a review on the pathogenesis, prevalence, and advance molecular identification techniques. Front. Microbiol. 5, 705.
Lowe, T.M., Eddy, S.R., 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964.
Loyola, D.E., Navarro, C., Uribe, P., Garcia, K., et al., 2015. Genome diversification within a clonal population of pandemic Vibrio parahaemolyticus seems to depend on the life circumstances of each individual bacteria. BMC Genomics 16, 1385.
Skerman, V.B.D., McGowan, V., Sneath, P.H.A., 1980. Approved lists of bacterial names. Int. J. Syst. Bacteriol. 30, 225–420.
Tindall, B.J., Rossello-Mora, R., Busse, H.J., Ludwig, W., et al., 2010. Notes on the characterization of prokaryote strains for taxonomic purposes. Int. J. Syst. Evol. Microbiol. 60, 249–266.
