IJSEM Papers in Press. Published March 3, 2015 as doi:10.1099/ijs.0.000161
International Journal of Systematic and Evolutionary Microbiology Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species --Manuscript Draft-Manuscript Number:
IJS-D-14-00485
Full Title:
Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species
Short Title:
Variations of 16S rRNA identity among bacterial species
Article Type:
Note
Section/Category:
Methods
Keywords:
16S rRNA; bacteria; pairwise identity; species; taxonomy
Corresponding Author:
Pierre Edouard Fournier, M.D., Ph.D. Universite de la Mediterranee Marseille, FRANCE
First Author:
Morgane Rossi-Tamisier
Order of Authors:
Morgane Rossi-Tamisier Samiar Benamar Didier Raoult Pierre Edouard Fournier, M.D., Ph.D.
Manuscript Region of Origin:
FRANCE
Abstract:
Modern bacterial taxonomy is based on a polyphasic approach that combines phenotypic and genotypic characteristics, including 16S rRNA sequence identity. However, the 95% (for genus) and 98.7% (for species) identity thresholds that are currently recommended to classify bacterial isolates were defined by comparison of a limited number of bacterial species, and may not apply to many genera that contain human-associated species. For each of 158 bacterial genera containing humanassociated species, we computed pairwise sequence identities between all species with standing in nomenclature and then analyzed the results, considering as abnormal any identity value lower than 95% or greater than 98.7%. Many of the currently validly published bacterial species do not respect the 95 and 98.7% thresholds, with 57.1% of species exhibiting 16S rRNA identity rates > 98.7%, and 60.1% of genera containing species exhibiting 16S rRNA identity rate < 95%. In only 17 of the 158 studied genera (10.8%), all species respected the 95 and 98.7% thresholds. As we need powerful and reliable taxonomical tools, and as potential new tools such as pangenomics have not yet been fully evaluated for taxonomic purposes, we propose to use as thresholds, genus by genus, the minimum and maximum identity values observed among species.
Powered by Editorial Manager® and ProduXion Manager® from Aries Systems Corporation
Manuscript with cc Click here to download Manuscript Including References (Word document): Article with cc.docx
1 2 3
Cautionary tale of using 16S rRNA gene sequence similarity values in identification of human-associated bacterial species
4 5 6 Morgane Rossi-Tamisier1, Samia Benamar1, Didier Raoult1 and Pierre-Edouard Fournier1*
7 8 9 10 11 12 13 14
1
Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UM 63, CNRS 7278, IRD 198, Inserm 1095, Institut Hospitalo-Universitaire Méditerranée-Infection, Faculté de médecine, Aix-Marseille Université
15
27 Boulevard Jean Moulin, 13385 Marseille cedex 05, France
16
Tel: + 33 491 385 517, Fax: +33 491 387 772
17 18 19 20
*Corresponding author: Pierre-Edouard Fournier (
[email protected])
21
Contents category: Methods
22
Running head: Variations of 16S rRNA identity among bacterial species
23 24
Key words: 16S rRNA, identity, bacteria, species, taxonomy
25 26
Word count: Abstract words: 200
1
27 28
Abstract Modern bacterial taxonomy is based on a polyphasic approach that combines phenotypic
29
and genotypic characteristics, including 16S rRNA sequence identity. However, the 95% (for genus)
30
and 98.7% (for species) identity thresholds that are currently recommended to classify bacterial
31
isolates were defined by comparison of a limited number of bacterial species, and may not apply to
32
many genera that contain human-associated species. For each of 158 bacterial genera containing
33
human-associated species, we computed pairwise sequence identities between all species with
34
standing in nomenclature and then analyzed the results, considering as abnormal any identity value
35
lower than 95% or greater than 98.7%. Many of the currently validly published bacterial species do
36
not respect the 95 and 98.7% thresholds, with 57.1% of species exhibiting 16S rRNA identity rates >
37
98.7%, and 60.1% of genera containing species exhibiting 16S rRNA identity rate < 95%. In only
38
17 of the 158 studied genera (10.8%), all species respected the 95 and 98.7% thresholds. As we
39
need powerful and reliable taxonomical tools, and as potential new tools such as pangenomics have
40
not yet been fully evaluated for taxonomic purposes, we propose to use as thresholds, genus by
41
genus, the minimum and maximum identity values observed among species.
2
42
Background
43
Taxonomy enables researchers to classify bacterial isolates into existing or potentially new
44
taxa. Bacterial taxonomy is a dynamic discipline and the criteria used have evolved, following the
45
scientific and technical progresses. Currently, the classification of a bacterial isolate is based on a
46
polyphasic approach that combines phenotypic and genotypic characteristics, although the tools
47
used varied over time. In 1996, Vandamme et al. suggested that polyphasic taxonomy should
48
include G+C content difference,16S rRNA identity (Fox et al., 1977; Fox et al., 1992; Hugenholtz
49
et al., 1998; Ludwig & Klenk, 2001), DNA-DNA hybridization (DDH) (Brenner et al., 1969;
50
Johnson, 1991), and chemotaxonomic criteria, in particular fatty acid analysis (Vandamme et al.,
51
1996). In 2002, Stackebrandt et al. proposed to add to the previously cited criteria whole genome
52
profiling (AFLP, RAPD, Rep-PCR, PFGE), multi-locus sequence typing, MALDI-TOF mass
53
spectrometry, Fourier-transformed infrared spectroscopy and/or pyrolysis-mass spectrometry
54
(Stackebrandt et al., 2002). In 2010, Tindall et al. reevaluated the various available methods and
55
propose a combination of 16S rRNA sequence identity and phylogeny, genomic G+C content, DDH,
56
cell morphology and Gram staining properties as well as chemotaxonomic criteria such as
57
peptidoglycan structure and composition in respiratory lipoquinones, hydrophobic side chains of
58
lipids or fatty acids (Tindall et al., 2010). However, the species rank is especially difficult to define,
59
as many bacterial species with standing in nomenclature have been validated using arbitrary criteria
60
(Rosselló-Mora, 2003).
61
The 16S rRNA gene is a highly conserved gene that is made of 9 hypervariable domains
62
separated by more preserved fragments that can be used to design universal primers. More than
63
three million 16S rRNA sequences are currently available in databases (Quast et al., 2013). The use
64
of 16S rRNA sequences for taxonomic purposes began in the 80's, when the development of
65
molecular tools, notably the development of automatic sequencing (Sentausa & Fournier, 2013),
66
enabled the development of large sequence databases such as GenBank (Benson et al., 2013),
67
Greengenes (DeSantis et al., 2006), SILVA (Quast et al., 2013) and EzTaxon (Kim et al., 2012) to
3
68
which a sequence obtained from an isolate may be compared using BLAST (Altschul et al., 1990;
69
Morgulis et al., 2008). Specific rRNA analysis tools have been developed as well, such as ARB
70
(Ludwig et al., 2004).
71
The massive use of 16S rRNA sequences notably had a very significant impact on the
72
number of validly published bacterial species, which grew from 1,800 in 1980 to nearly 12,500 in
73
2013 (Parte, 2014).
74
Many preexisting taxa were re-classified and a greater number created. In order to facilitate
75
the use of 16S rRNA as a genotyping tool, cutoff values have been defined. As early as 1994, two
76
strains were considered as belonging to distinct species if they shared 16S rRNA identities lower
77
than 97% (Stackebrandt & Goebel, 1994), and to discriminate two genera if this value was lower
78
than 95%. In 2006, the cutoff value at the species level was reevaluated at 98.7% (Stackebrandt &
79
Ebers, 2006). However, several authors have demonstrated that these cutoffs, initially designed to
80
standardize the use of 16S rRNA sequences in taxonomy, do not apply to several genera. Extreme
81
examples include Edwardsiella species (99.3 to 99.8% 16S rRNA identity, Janda & Abbott, 2007)
82
and the Streptomyces and Chlorobium genera in which the lowest interspecies identities are 78 and
83
86.1%, respectively (Alexander et al., 2002). Among genera containing human-associated species,
84
several inter-species 16S rRNA identity values 98.7% were observed in the Clostridium
85
genus. As an example, C. botulinum and C. sporogenes exhibit a 99.7% identity. However, the
86
species C. sporogenes was conserved as it groups non-toxigenic strains genetically related to C.
87
botulinum (Olsen et al., 1995). Another example may be found in the Rickettsia genus in which 16S
88
rRNA identity values > 99% are found among all 26 species with standing in nomenclature despite
89
species-specific epidemio-clinical characteristics (Fournier & Raoult, 2009). Thus, in some cases,
90
phenotypic and/or chemotypic specificities justify a distinct classification of bacterial strains into
91
species or genera from that suggested by their genetic relatedness. These examples question the use
92
of strict criteria for the interpretation the results of 16S rRNA-based metagenomic studies in which
93
phenotypic properties are not taken into account (Huse et al., 2008). Herein, as no broad analysis of
4
94
interspecies 16S rRNA identity among bacterial genera has been performed to date, we studied the
95
variability of this parameter among bacterial genera containing species of medical interest (i. e.,
96
species found in humans, pathogenic or not) and with standing in nomenclature.
97
Material and methods
98
Selection of studied bacterial genera and of 16S rRNA sequences
99
Within the « List of Prokariotic Names with Standing in Nomenclature » (LPSN) website
100
(Parte, 2014), we selected bacterial genera that contained at least one species associated to humans,
101
pathogenic or not, and obtained the 16S rRNA accession numbers from their type strains.
102
Collection of 16S rRNA sequences
103
For every studied genus, we created a FASTA format file containing the 16S rRNA sequences
104
of the type strain from each species. Due to the wide heterogeneity in length and quality of the 16S
105
rRNA sequences of type strains, we did not use sequences shorter than 1,200 nucleotides. We also
106
removed sequences shorter than 1,250 nucleotides with more than 6 degenerate nucleotides, and
107
sequences larger than 1,250 nucleotides with more than 10 degenerate nucleotides.
108
16S rRNA sequence analysis
109
Calculation of pairwise 16S rRNA identities
110
Sequences were aligned using Clustal Omega with default settings (Chenna et al., 2003; Goujon et
111
al., 2010). In each studied genus, pairwise 16S rRNA sequence identities between all possible
112
species pairs were first estimated using the Mega Phylogeny Software v5 (Tamura et al., 2011).
113
Then, the highest and the lowest values computed by this software were more accurately
114
determined using pairwise BLASTN. We defined as expected values (EV) interspecies 16S rRNA
115
identity percentages that were comprised between 95 and 98.7% (Stackebrandt & Ebers, 2006), and
116
as abnormal values (AV) those that were > 98.7 or < 95%.
117
Results
118 119
Fifty-six species, for which no 16S rRNA accession number was described in the LPSN website, were not included in the study. Another 74 species, including 7 that were shorter than
5
120
1,200 nucleotides and 67 that were of insufficient quality, were also excluded from the analysis.
121
Overall, 4,289 species with standing in nomenclature, classified within 158 genera, 81 families, 39
122
orders and 12 phyla were included in our study (Supplementary Table).
123
Among the 4,289 studied species, the lowest and highest interspecies pairwise 16S rRNA
124
identities were 68.7% and 100%, respectively. A total of 2,448 of the 4,289 studied species (57.1%)
125
exhibited at least one 16S rRNA identity rate with another species in the same genus > 98.7%. In 95
126
of 158 studied genera (60.1%) at least one species exhibited at least one 16S rRNA identity rate
127
with another species in the same genus < 95%. In only 17 of the 158 studied genera (10.8%), all
128
species exhibited EVs. These included genera of major medical interest such as Morganella and
129
Chlamydia (Supplementary Table).
130
Sixty-nine of the 158 studied genera (52.5%) had 50 to 100% EVs, including Kingella
131
(50%), Corynebacterium (53.1%), Bordetella (53.6%), Bartonella (56.9%), Acinetobacter (61.3%),
132
Neisseria (66.7%), Borrelia (70.0%), Yersinia (72.1%), Enterobacter (78.4%), Escherichia (80.0%),
133
Burkholderia (82.1%), Vibrio (82.7%), Streptomyces (83.5%), Mycobacterium (91.1%),
134
Staphylococcus (92.8%) and Nocardia (93.9%) (Figure 1). It sfhould be noted that bacterial genera
135
from the phyla Actinobacteria (24 of the 33 studied genera [72.7%]), Alphaproteobacteria (11/20,
136
55%), Betaproteobacteria (9/15, 60%) and Gammaproteobacteria (24/37, 64.8%) were more likely
137
to have > 50% EVs. In addition, within the Enterobacteriaceae family, 15/19 genera (78.9%)
138
showed >50% EVs (Figure 2, Supplementary Table).
139
Forty-three of the 158 studied genera (25.9%) exhibited 20 to 50% EVs, including
140
Actinomyces (22.0%), Campylobacter (22.1%), Legionella (29.9%), Streptococcus (33.2%),
141
Rickettsia (35.7%), Bacillus (39.4%), Listeria (46.4%), Bifidobacterium (44.7%), Pseudomonas
142
(46.5%) and Helicobacter (49.0%) (Figure 1, Supplementary Table).
143
Twenty-nine of the 158 studied genera (21.5%) had less than 20% EVs, including Brucella (0.0%),
144
Treponema (2.1%), Clostridium (6.4%), Mycoplasma (8.4%), Bacteroides (9.6%), Shigella (16.7%)
145
(Figure 1). The Clostridiales and Tenericutes phyla contained the genera with the lowest rates of
6
146
EVs, with 6 of 8 genera (75%) and 2 of 3 genera (66.6%), respectively, having less than 20% EVs.
147
None of the studied genera of the Tenericutes phylum had more than 50% EVs (Figure 2,
148
Supplementary Table). Noteworthy, 2 of 5 genera in the Spirochaetes phylum had very few EVs:
149
Spirochaeta (1.5%) and Treponema (2.1%) (Supplementary Table).
150
Discussion
151
Among the genetic tools used to classify bacterial strains, 16S rRNA had the most
152
significant impact and enabled the re-classification or creation of a large number of taxa. However,
153
the cutoff values of 95 and 98.7% used to classify bacterial isolates at the genus and species levels,
154
respectively, were established under the assumption that the level of interspecies16S rRNA
155
variation was homogenous among genera. However, it was suggested that the speed of evolution of
156
rRNA genes may vary according to the phylum (Clarridge, 2004), and the inadequacy of the current
157
cutoffs to many genera has been reported. It was notably demonstrated that its discriminatory power
158
could be insufficient at the species level (Bosshard et al., 2006; Mignard & Flandrois, 2006), as
159
observed for Streptococcus pneumoniae and S. mitis that exhibit as few as 3 bp differences.
160
Conversely, intraspecies 16S rRNA differences may be important, as is the case for Enterobacter
161
agglomerans for which 2 strains may exhibit as many as 27 bp differences, which does not validate
162
the 98.7% cutoff, and thus may justify their classification in distinct species. Such a genetic
163
difference was also observed at the genus level, with Clostridium tetani and C. innocuum exhibiting
164
104 bp differences, which does not validate the 95% threshold and thus may justify the
165
classification of both species in distinct genera. Another drawback is the sequence variability
166
among the multiple copies of 16S rRNA present in a bacterial chromosome. In 1997, Wang et al.
167
reported a degree of sequence similarity among 16S rRNA copies of Thermobispora bispora of
168
only 92% (Wang et al., 1997). In 2010, Pei et al. identified sequence diversity > 1.3% among 16S
169
rRNA genes in a genome in 11 bacterial species (Pei et al., 2010). Among these, Borrelia afzelii, an
170
agent of Lyme disease in humans, exhibits a similarity of only 79.62% between its two 16S rRNA
171
copies (Pei et al., 2010). Thus, a strict application of the 98.7% threshold would classify these
7
172
bacteria in different species depending on the 16S rRNA copy analyzed (Acinas et al., 2004; Pei et
173
al., 2010).
174
However, although variations of interspecies 16S rRNA identity values among genera had
175
been reported, we could not find any article studying this phenomenon at a large scale. In the
176
present study, using a systematic analysis of interspecies 16S rRNA identity of genera containing
177
bacteria of medical interest, we demonstrated that the current thresholds are applicable to only 42.9%
178
and 39.9% of studied species and genera, respectively. In addition, when both cutoff values were
179
taken into account, only 10.8% of studied genera had all their species adequately classified. We also
180
observed important discrepancies among bacterial genera, some of which contain species of major
181
medical importance. The proportion of species exhibiting EVs ranged from 0 (Brucella, Tissierella)
182
to 100% (Kytococcus, Chlorobium), and there were discrepancies regarding these rates within most
183
of the studied phyla (Figure 1, Supplementary Table).
184
The phylum Chlorobi was the only phylum within which the 95 and 98.7% thresholds were
185
respected for all studied taxa, but only one genus, Chlorobium, was studied. Other phyla, including
186
the Chlamydia, Actinobacteria, Bacilli, Alphaproteobacteria and Gammaproteobacteria (and more
187
specifically the order Enterobacteriales) contained many genera for which both thresholds were
188
respected. Additionally, the cutoffs were globally well adapted for some genera such as Bordetella
189
(53.6%), Bartonella (56.9%), Acinetobacter (61.3%), Enterococcus (63.2%), Neisseria (66.7%),
190
Borrelia (70%), Yersinia (72.1%), Enterobacter (78,4%), Escherichia (80.0%), Burkholderia
191
(82.1%), Vibrio (82.7%), Streptomyces (83.5%), Mycobacterium (91.2%), Staphylococcus (92,3%),
192
Klebsiella (93.3%), Nocardia (93.9%), Chlamydia (100%). In contrast, the current cutoffs were
193
poorly applicable to the phyla Bacteroidetes, Clostridiales and Spirochaetes. In the Bacteroidetes,
194
Clostridiales, Tenericutes and Spirochaetes phyla, within which no genus exhibited 100% EVs
195
(Figure 2). We acknowledge the fact that the limited number of studied genera of the Tenericutes
196
and Spirochaetes phyla may in part explain these results. Additionally, at the genus level, some
197
genera had few EVs, including Brucella (0%), Treponema (2,1%), Clostridium (6,4%),
8
198
Mycoplasma (8,4%), Pasteurella (15,4%), Shigella (16,7%), Actinomyces (22%), Campylobacter
199
(22,1%), Haemophilus (26,9%), Legionella (29,9%), Streptococcus (33,2%), Rickettsia (35,7%),
200
Bacillus (39,4%), Leptospira (43,3%), Listeria (46,4%), and Pseudomonas (46,5%) (Figure 1,
201
Supplementary Table). For each studied genus, the ranges of 16S rRNA identities were made
202
available in a free-access webpage (http://www.mediterranee-
203
infection.com/article.php?larub=152&titre=16s-yourself).
204
Therefore, on the basis of these results and observed variations, we confirm that the 95 and
205
98.7% 16S rRNA interspecies identity thresholds may only be used as indicators, and not as a
206
definite tool for the classification of bacterial strains, as was already the case for several bacterial
207
species exhibiting specific phenotypic characteristics but inter-species 16S rRNA identity values
208
greater than 98.7%, such as C. botulinum and C. sporogenes (Olsen et al., 1995), Rickettsia
209
prowazekii and R. rickettsii (Fournier & Raoult, 2009), or Nocardia paucivorans and N. brevicatena
210
(Roth et al., 2003). We recently proposed to include genomic data in the taxonomic classification of
211
new bacterial isolates (Ramasamy et al., 2014). In this strategy, we systematically compare the 16S
212
rRNA identity values obtained between the new isolate and its closest phylogenetic neighbor to
213
those observed among members of the same genus and/or closely related genera. Our findings in the
214
present study support our strategy of computing 16S rRNA extreme identity values, to better assign
215
a strain within a species or a genus, and suggest that applying strictly thresholds developed by
216
studying a limited number of taxa may lead to misclassification and contribute to confusing
217
situations.
218
In conclusion, the degree of conservation of 16S rRNA sequence identity may vary greatly
219
among bacterial genera. Therefore, rather than just referring to two cutoff values, we propose that
220
when classifying a new strain, in addition to its phenotypic characteristics, its 16S rRNA identity to
221
its phylogenetically-closest species should be compared, in the studied genus, to the range of
222
identity values observed among species with standing in nomenclature.
223
9
224
Bibliography
225
Acinas, S. G., Marcelino, L. A., Klepac-Ceraj, V. & Polz, M. F. (2004). Divergence and
226
redundancy of 16S rRNA sequences in genomes with multiple rrn operons. J Bacteriol 186, 2629–
227
2635.
228
Alexander, B., Andersen, J. H., Cox, R. P. & Imhoff, J. F. (2002). Phylogeny of green sulfur
229
bacteria on the basis of gene sequences of 16S rRNA and of the Fenna-Matthews-Olson protein.
230
Arch Microbiol, 178, 131–140.
231
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990). Basic local alignment
232
search tool. J Mol Biol 215, 403-410.
233
Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Lipman, D. J., Ostell, J. &
234
Sayers, E. W. (2013). GenBank. Nucleic Acids Res 41, D36-42.
235
Bosshard, P. P., Zbinden, R., Abels, S., Böddinghaus, B., Altwegg, M. & Böttger, E. C. (2006).
236
16S rRNA gene sequencing versus the API 20 NE system and the VITEK 2 ID-GNB card for
237
identification of nonfermenting Gram-negative bacteria in the clinical laboratory. J Clin Microbiol
238
44, 1359-1366.
239
Brenner, D. J., Fanning, G. R., Rake, A. V. & Johnson, K. E. (1969). Batch procedure for
240
thermal elution of DNA from hydroxyapatite. Anal Biochem 28, 447–459.
241
Chenna, R., Sugawara H., Koike T., Lopez R., Gibson T. J., Higgins D. G. & Thompson J. D.
242
(2003). Multiple sequence alignment with the Clustal series of programs. Nucleic acids research
243
31:3497-3500.
244
Clarridge, J. E. 3rd. (2004). Impact of 16S rRNA gene sequence analysis for identification of
245
bacteria on clinical microbiology and infectious diseases. Clin Microbiol Rev 17, 840-862.
246
DeSantis, T. Z., Hugenholtz, P., Larsen, N., Rojas, M., Brodie, E. L., Keller, K., Huber, T.,
247
Dalevi, D., Hu, P. & Andersen, G. L. (2006). Greengenes, a chimera-checked 16S rRNA gene
248
database and workbench compatible with ARB. Appl Environ Microbiol 72, 5069-5072.
249
Fournier, P. E & Raoult, D. (2009). Current knowledge on phylogeny and taxonomy of Rickettsia
10
250
spp. Ann N Y Acad Sci 1166,1-11.
251
Fox, G. E., Magrum, L. J., Balch, W. E., Wolfe, R. S. & Woese C. R. (1977). Classification of
252
methanogenic bacteria by 16S ribosomal RNA characterization. Proc Natl Acad Sci U S A 74,
253
4537-4541.
254
Fox, G. E., Wisotzkey, J. D. & Jurtshuk, P. Jr. (1992). How close is close: 16S rRNA sequence
255
identity may not be sufficient to guarantee species identity. Int J Syst Bacteriol 42, 166–170.
256
Goujon, M., McWilliam, H., Li, W., Valentin, F., Squizzato, S., Paern, J., Lopez, R. (2010). A
257
new bioinformatics analysis tools framework at EMBL-EBI. Nucleic acids research 38: W695-699.
258
Hugenholtz, P., Goebel, B. M. & Pace N. R. (1998). Impact of culture-independent studies on the
259
emerging phylogenetic view of bacterial diversity. J Bacteriol 180, 4765–4774.
260
Huse, S. M., Dethlefsen, L., Huber, J. A., Mark Welch, D., Relman, D. A. & Sogin, M .L.
261
(2008). Exploring microbial diversity and taxonomy using SSU rRNA hypervariable tag sequencing.
262
PLoS Genet 4:e1000255.
263
Janda, J. M. & Abbott, S. L. (2007). 16S rRNA gene sequencing for bacterial identification in the
264
diagnostic laboratory: pluses, perils, and pitfalls. J Clin Microbiol 45, 2761-2764.
265
Johnson, J. L. (1991). DNA reassociation experiments. In Nucleic Acid Techniques in Bacterial
266
Systematics, pp. 21–44. Edited by E. Stackebrandt & M. Goodfellow. Chichester: Wiley.
267
Kim, O.S., Cho, Y.J., Lee, K., Yoon, S.H., Kim, M., Na, H., Park, S.C., Jeon, Y.S., Lee, J.H.,
268
Yi, H., Won, S. & Chun, J. (2012). Introducing EzTaxon: a prokaryotic 16S rRNA Gene sequence
269
database with phylotypes that represent uncultured species. Int J Syst Evol Microbiol 62, 716–721.
270
Ludwig, W. & Klenk, H. P. (2001). Overview: a phylogenetic backbone and taxonomic
271
framework for prokaryotic systematics. In Bergey’s Manual of Systematic Bacteriology, 2nd edn,
272
vol. 1, pp. 49–65. Edited by G.M. Garrity. New York, USA: Springer
273
Ludwig, W., Strunk, O., Westram, R., Richter, L., Meier, H., Yadhukumar, Buchner, A., Lai,
274
T., Steppi, S., Jobb, G., Förster, W., Brettske, I., Gerber, S., Ginhart, A. W., Gross, O.,
275
Grumann, S., Hermann, S., Jost, R., König, A., Liss, T., Lüssmann, R., May, M., Nonhoff, B.,
11
276
Reichel, B., Strehlow, R., Stamatakis, A., Stuckmann, N., Vilbig, A., Lenke, M., Ludwig, T,.
277
Bode, A. & Schleifer, K. H. (2004). ARB: a software environment for sequence data. Nucleic
278
Acids Res 25, 1363-1371.
279
Mignard, S. & Flandrois, J. P. (2006). 16S rRNA sequencing in routine bacterial identification: a
280
30-month experiment. J Microbiol Meth 67, 574-581.
281
Morgulis, A., Coulouris, G., Raytselis, Y., Madden, T. L., Agarwala, R. & Schäffer A. A.
282
(2008). Database indexing for production MegaBLAST searches. Bioinformatics 15, 1757-1764.
283
Olsen, I., Johnson, J. L., Moore, L. V. H. & Moore, W. E. C. (1995). Rejection of Clostridium
284
putrificum and conservation of Clostridium botulinum and Clostridium sporogenes: Request for an
285
Opinion. Int J Syst Bacteriol 45, 414.
286
Parte, A. C. (2014). LPSN—list of prokaryotic names with standing in nomenclature. Nucleic
287
Acids Res 42, 613-616.
288
Pei, A. Y., Oberdorf, W. E., Nossa, C. W., Agarwal, A., Chokshi, P., Gerz, E. A., Jin, Z., Lee,
289
P., Yang, L., Poles, M., Brown, S. M., Sotero, S., Desantis, T., Brodie, E., Nelson, K. & Pei, Z.
290
(2010). Diversity of 16S rRNA genes within individual prokaryotic genomes. Appl Environ
291
Microbiol 76, 3886-3897.
292
Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., Peplies, J. & Glöckner,
293
F. O. (2013). The SILVA ribosomal RNA gene database project: improved data processing and
294
web-based tools. Nucleic Acids Res 41, 590-596.
295
Rosselló-Mora R. (2003). Opinion: the species problem, can we achieve a universal concept? Syst
296
Appl Microbiol 26, 323-326.
297
Roth, A., Andrees, S., Kroppenstedt, R. M., Harmsen, D. & Mauch H.(2003). Phylogeny of the
298
genus Nocardia based on reassessed 16S rRNA gene sequences reveals underspeciation and
299
division of strains classified as Nocardia asteroides into three established species and two unnamed
300
taxons. J Clin Microbiol 41:851-6.
301
Sentausa, E. & Fournier, P. E. (2013). Advantages and limitations of genomics in prokaryotic
12
302
taxonomy. Clin Microbiol Infect 19, 790-795.
303
Stackebrandt, E. & Goebel, B. M. (1994). A place for DNA–DNA reassociation and 16S rRNA
304
sequence analysis in the present species definition in bacteriology. Int J Syst Bacteriol 44, 846–849.
305
Stackebrandt, E., Frederiksen, W., Garrity, G. M., Grimont, P. A., Kämpfer, P., Maiden, M.
306
C., Nesme, X., Rosselló-Mora, R., Swings, J., Trüper, H. G., Vauterin, L., Ward, A. C. &
307
Whitman, W. B. (2002). Report of the ad hoc committee for the re-evaluation of the species
308
definition in bacteriology. Int J Syst Evol Microbiol 52, 1043-1047.
309
Stackebrandt, E. & Ebers, J. (2006). Taxonomic parameters revisited: tarnished gold standards.
310
Microbiol Today 33, 152-155.
311
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M. & Kumar, S. (2011). MEGA5:
312
Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and
313
Maximum Parsimony Methods. Molecular Biology and Evolution 28, 2731-2739.
314
Tindall, B. J., Rosselló-Móra, R., Busse, H. J., Ludwig, W. & Kämpfer, P. (2010). Notes on the
315
characterization of prokaryote strains for taxonomic purposes. Int J Syst Evol Microbiol 60, 249–
316
266.
317
Vandamme, P., Pot, B., Gillis, M., de Vos, P., Kersters, K. & Swings, J. (1996). Polyphasic
318
taxonomy, a consensus approach to bacterial systematics. Microbiol Rev 60, 407-438.
319
Wang, Y., Zhang, Z. & Ramanan, N. (1997). The actinomycete Thermobispora bispora contains
320
two distinct types of transcriptionally active 16S rRNA genes. J Bacteriol 179, 3270-3276.
13
321
Figure legends
322
Figure 1: Range of 16S rRNA identity rates between species of genera of major medical interest.
323
A: genera that respect similarity rates (>50% of interspecies identity rates between the two
324
thresholds); B: genera that poorly respect similarity rates (20-50% of interspecies identity rates
325
between the two thresholds); C: genera that do not respect similarity rates (50% of interspecies identity rates
331
between the two thresholds and 12% of genera have 50%
333
of interspecies identity rates between the two thresholds and 63% of genera have