Hindawi Publishing Corporation International Journal of Genomics Volume 2015, Article ID 198560, 15 pages http://dx.doi.org/10.1155/2015/198560

Research Article Phylogenomic and Molecular Demarcation of the Core Members of the Polyphyletic Pasteurellaceae Genera Actinobacillus, Haemophilus, and Pasteurella Sohail Naushad, Mobolaji Adeolu, Nisha Goel, Bijendra Khadka, Aqeel Al-Dahwi, and Radhey S. Gupta Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON, Canada L8N 3Z5 Correspondence should be addressed to Radhey S. Gupta; [email protected] Received 5 November 2014; Revised 19 January 2015; Accepted 26 January 2015 Academic Editor: John Parkinson Copyright © 2015 Sohail Naushad et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The genera Actinobacillus, Haemophilus, and Pasteurella exhibit extensive polyphyletic branching in phylogenetic trees and do not represent coherent clusters of species. In this study, we have utilized molecular signatures identified through comparative genomic analyses in conjunction with genome based and multilocus sequence based phylogenetic analyses to clarify the phylogenetic and taxonomic boundary of these genera. We have identified large clusters of Actinobacillus, Haemophilus, and Pasteurella species which represent the “sensu stricto” members of these genera. We have identified 3, 7, and 6 conserved signature indels (CSIs), which are specifically shared by sensu stricto members of Actinobacillus, Haemophilus, and Pasteurella, respectively. We have also identified two different sets of CSIs that are unique characteristics of the pathogen containing genera Aggregatibacter and Mannheimia, respectively. It is now possible to demarcate the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of discrete molecular signatures. The other members of the genera Actinobacillus, Haemophilus, and Pasteurella that do not fall within the “sensu stricto” clades and do not contain these molecular signatures should be reclassified as other genera. The CSIs identified here also provide useful diagnostic targets for the identification of current and novel members of the indicated genera.

1. Introduction The family Pasteurellaceae, the single constituent family of the order Pasteurellales, represents a diverse group of commensal and pathogenic bacteria within the class Gammaproteobacteria. The family currently contains 19 genera, some of which are particularly important human and animal pathogens [1, 2]. The genera Haemophilus contains species responsible for human bacteremia, pneumonia, acute bacterial meningitis, and the sexually transmitted disease chancroid [3–5]; Aggregatibacter species have been implicated in juvenile periodontitis [6]; members of the genera Mannheimia, Pasteurella, and Actinobacillus have been implicated in the causation of shipping fever in cattle, fowl cholera, and pleuropneumonia in pigs, respectively [7–9]. The family Pasteurellaceae was originally proposed as a higher level taxonomic grouping of the related pathogenic genera Actinobacillus, Haemophilus, and Pasteurella [10].

Classification of organisms into these three genera was primarily based on DNA G-C content and a handful of phenotypic traits [11]. The phenotypic traits were later found not to be characteristic of any single genus [12]. Consequently, the genera Actinobacillus, Haemophilus, and Pasteurella each exhibit extensive polyphyly in subsequent 16S rRNA based phylogenies [12, 13]. Additional studies based on individual or concatenated gene sets and DNA-DNA/rRNA-DNA hybridization also support the presence of extensive polyphyly within the genera Actinobacillus, Haemophilus, and Pasteurella [14–20]. Extensive work has been undertaken to amend the classification of the genera Actinobacillus, Haemophilus, and Pasteurella [1, 5, 9, 18]. New genera have been created to house phylogenetically coherent clusters of Actinobacillus, Haemophilus, and Pasteurella. The species [Actinobacillus] actinomycetemcomitans, [Haemophilus] aphrophilus, [Haemophilus] paraphrophilus, and [Haemophilus] segnis have

2 been transferred to the genus Aggregatibacter [21]; the species [Haemophilus] paragallinarum, [Pasteurella] gallinarum, [Pasteurella] avium, and [Pasteurella] volantium have been transferred to the genus Avibacterium [22]; the species [Haemophilus] somnus and [Haemophilus] agni have been transferred to the genus Histophilus [23]; and the species [Pasteurella] haemolytica and [Pasteurella] granulomatis have been transferred to the genus Mannheimia [7]. Additionally, some individual species within the genera Actinobacillus, Haemophilus, and Pasteurella that do not cluster with other members of their genus in phylogenetic trees have been moved or proposed to be moved to novel or neighbouring genera (namely, the transfer of the species [Haemophilus] pleuropneumoniae to the genus Actinobacillus [24], the transfer of the species [Pasteurella] anatis to the genus Gallibacterium [25], the transfer of the species [Pasteurella] trehalosi to the genus Bibersteinia [26], the transfer of the species [Pasteurella] ureae to the genus Actinobacillus [27], and the proposed transfer of the species [Haemophilus] ducreyi to a novel genus [28]). However, despite these changes, the classification of the genera Actinobacillus, Haemophilus, and Pasteurella is still problematic and each genus continues to contain members which exhibit polyphyletic branching [5, 17–20]. Multiple studies have attempted to define a core group of species which cluster around the nomenclatural type species of Actinobacillus, Haemophilus, or Pasteurella as the only true members of these genera (i.e., sensu stricto) [13, 15–17, 29–31], but the taxonomy and phylogeny of these bacteria continue to remain inconclusive [20, 32, 33]. Several methods have been employed for the demarcation of these genera; however, no simple method or criterion is available that can clearly delimit these genera. It has been suggested that genome based studies may provide reliable means of clarifying the evolutionary relationships of these bacteria [33]. Since the availability of the first complete genome sequence of the Haemophilus influenzae [34], a large number of genomes for the members of the family Pasteurellaceae have become available in public databases [35, 36]. The availability of these genomes provides us with an opportunity to complete comprehensive genome scale phylogenetic analyses of the family Pasteurellaceae. These genome sequences have also been utilized to carry out comparative genomic analyses to identify molecular signatures (namely, conserved signature indels (CSIs) in various proteins), commonly shared by all or closely related subsets of species within the family Pasteurellaceae. On the basis of the molecular signatures identified from comparative analyses of Pasteurellaceae genomes in conjunction with core genome based and multilocus sequence based phylogenetic analyses, we have identified sensu stricto clades of Actinobacillus, Haemophilus, and Pasteurella that are supported by 3, 7, and 6 unique molecular signatures, respectively. We also report sets of molecular signatures that are unique characteristics of the pathogen containing genera Aggregatibacter and Mannheimia.

2. Methods 2.1. Multilocus Sequence Analysis. Multilocus sequence analysis was completed for members of the family Pasteurellaceae

International Journal of Genomics using widely available nucleotide sequences of the 16S rDNA, infB (translation initiation factor IF-2), recN (DNA repair protein), and rpoB (DNA-directed RNA polymerase subunit beta) genes which have been used, individually or as part of a set, in a number of previous phylogenetic analyses of the family Pasteurellaceae [15–17, 29, 30]. Gene sequences for these four genes were obtained for 52 Pasteurellaceae strains, representing a large majority of the known Pasteurellaceae species, and 2 members of Vibrio cholerae from the NCBI nucleotide database [37]. Species which were missing one of these four genes or which did not have a gene sequence that was at least 50% of the length of the full gene were excluded from the analysis. The four genes were individually aligned using MUSCLE [38] and manually concatenated to create a combined dataset that contained 10 183 nucleotide long alignments. A maximum-likelihood tree based on 100 bootstrap replicates of this alignment was constructed using MEGA 6.0 [39] while employing maximum composite likelihood substitution model. 2.2. Pasteurellaceae Core Genome Phylogenetic Tree. A phylogenetic tree of 76 Pasteurellaceae strains, rooted using 7 members of the family Vibrionaceae, based on the core genome of the family Pasteurellaceae was created for this study. The core set of Pasteurellaceae proteins were identified using the UCLUST algorithm [55] to identify widely distributed protein families with at least 30% sequence identity and 50% sequence length. Proteins families which were present in less than 50% of the input genomes were excluded from further analysis. Potentially paralogous sequences (additional proteins from the same organism in a single protein family) within the remaining protein families were also excluded from further analysis. Each protein family was individually aligned using MAFFT 7 [56]. Aligned amino acid positions which contained gaps in more than 50% of organisms were excluded from further analysis. The remaining amino acid positions were concatenated to create a combined dataset that contained 128 080 amino acid long alignments. An approximately maximum-likelihood tree based on this alignment was constructed using FastTree 2 [57] while employing the Whelan and Goldman substitution model [58]. 2.3. Identification of Molecular Signatures (CSIs) for Different Genera of the Family Pasteurellaceae. The detailed outline of the process of identifying CSIs has been recently published [59]. In brief, Blastp searches were performed on all proteins from the genome of Haemophilus influenzae F3047 [47]. Ten to fifteen high scoring homologues that were present in Haemophilus, other Pasteurellaceae, and Gammaproteobacteria species were retrieved, and their multiple sequence alignments were constructed using Clustal X 1.83 [60]. The alignments were visually inspected to identify any conserved inserts or deletions (indels) that are restricted to the particular clades of the family Pasteurellaceae, which are flanked on each side by at least 5-6 identical/conserved residues in the neighbouring 30–40 amino acids. The selected sequences containing the indels and their flanking conserved regions were further evaluated by detailed Blastp searches

International Journal of Genomics to determine species distribution and group specificity. The results of these Blast searches were processed using Sig Create and Seq Style to construct signature files [59]. Due to space constraints, the sequence alignment files presented here contain sequence information for a limited number of species within the order Pasteurellaceae and a representative selection of outgroup species. However, in each case, all members of the order and outgroups exhibited similar sequence characteristics to the representatives.

3. Results and Discussion 3.1. Phylogenetic Analysis of the Pasteurellaceae. Elucidating an accurate phylogeny of the members of the family Pasteurellaceae has been a long standing challenge in Pasteurellaceae research [10–12, 18, 19]. Early 16S rRNA based studies revealed that the established taxonomy of the family Pasteurellaceae was not consistent with their genetically inferred phylogeny [12, 14]. This has led to a long series of taxonomic revisions within the family Pasteurellaceae, a process which is still taking place today [7, 18, 22, 28]. However, it was subsequently discovered that phylogenetic trees of Pasteurellaceae species based on different genes did not completely agree with each other [15, 16, 31]. In particular, phylogenetic trees based on the 16S rRNA gene, often considered the gold standard in bacterial taxonomy and phylogeny [61, 62], disagreed with highly robust multilocus sequence and concatenated protein sequence based phylogenetic trees [9, 17, 19, 20, 53, 63]. Phylogenetic trees based on concatenated sequences for a large number of unlinked and conserved loci are more reliable and robust than phylogenetic trees based on any single gene or protein [64, 65]. Due to a rapid increase in the availability of genomic sequence data, we are now able to complete genome scale phylogenetic analyses of the family Pasteurellaceae which cover a vast majority of the diversity within the family. In this work we have produced a phylogenetic tree for 74 genome sequenced members of the family Pasteurellaceae based on 128 080 aligned amino acid positions (Figure 1(a)). The branching patterns of the core genome phylogenetic tree produced in this work largely agree with a previous genome based phylogenetic tree produced for a limited number of Pasteurellaceae species [19] and a concatenated protein based phylogenetic tree of the family Pasteurellaceae produced by our lab in a previous study [20]. Additionally, we have also produced a multilocus sequence based phylogenetic tree using the 16S rDNA, infB, recN, and rpoB genes which are commonly used in the phylogenetic analysis of the family Pasteurellaceae (Figure 1(b)) [15–17, 29, 30]. This tree also showed broadly similar branching patterns to past multilocus sequence based phylogenetic trees [17, 18] and to our core genome based phylogenetic tree. Both our core genome based and multilocus sequence based phylogenetic trees provide evidence for a division of the Pasteurellaceae into at least two higher taxonomic groups (families) which are broadly similar to the two clades of Pasteurellales identified in our previous work [20]. A similar division of the family Pasteurellaceae into two or more large groups is seen in many other robust multilocus or concatenated protein based phylogenetic trees [17, 19, 53, 63];

3 however, this division is not readily apparent in phylogenies based on the 16S rRNA gene [9, 66]. A majority of the known genera within the family Pasteurellaceae form well-defined and coherent clusters in phylogenetic trees (Figure 1) [9, 17, 19, 20, 66]. The genera Actinobacillus, Haemophilus, and Pasteurella, which were described before the advent of genetic characterization, exhibit polyphyletic branching in all gene and protein based phylogenetic trees, including the core genome based and multilocus sequence based phylogenetic trees created in this work (Figure 1). However, there are large clusters of Actinobacillus, Haemophilus, and Pasteurella species identifiable in the phylogenetic trees which represent the core or “sensu stricto” members of each genera. The clusters of species that represent Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto are indicated in Figure 1. Members of each genera which fall outside of the sensu stricto clusters, indicated in our phylogenetic trees by the presence of square brackets around their genus name (e.g., [Pasteurella] pneumotropica), are only distantly related to the sensu stricto members of their genus and will require reclassification in order to make their taxonomy and phylogeny concordant. 3.2. The Usefulness of Conserved Signature Indels as Phylogenetic and Taxonomic Markers. Whole genome sequences are a rich resource for the discovery of molecular signatures which are unique to a group of organisms [67–69]. One useful class of shared molecular signatures are conserved signature indels (CSIs), which are insertions/deletions uniquely present in protein sequences from a group of evolutionarily related organisms [59, 70, 71]. The unique, shared presence of multiple CSIs by a group of related species is most parsimoniously explained by the occurrence of the genetic changes that resulted in these CSIs in a common ancestor of the group, followed by vertical transmission of these CSIs to various descendant species [59, 71–73]. Hence, these CSIs represent molecular synapomorphies (markers of common evolutionary decent) which can be used to identify and demarcate specific bacterial groups in molecular terms and for understanding their interrelationships independently of phylogenetic trees [59, 70–72]. CSIs have recently been used to propose important taxonomic changes for a number of bacterial groups (namely, Aquificae, Spirochaetes, Thermotogae, Xanthomonadales, and Borrelia) at different taxonomic ranks [69, 74–77]. In the present work, we have completed comprehensive comparative analysis of Pasteurellaceae genomes (Table 1) in order to identify CSIs that are primarily restricted to the different genera within the family Pasteurellaceae. We have identified 3, 7, and 6 unique molecular signatures which are shared by Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto, respectively. Information regarding these CSIs and their evolutionary significances is discussed below. 3.3. Molecular Signatures Specific for Actinobacillus sensu stricto. The genus Actinobacillus was originally defined as a group of growth factor independent host-associated rods which shared phenotypic or biochemical similarity with

4

International Journal of Genomics

Haemophilus influenzae 224-21 Haemophilus influenzae 7P49H1 Haemophilus influenzae 86-028NP Haemophilus influenzae 6P18H1 Haemophilus influenzae PittHH Haemophilus influenzae 10810 Haemophilus aegyptius ATCC 11116 Haemophilus influenzae F3031 Haemophilus influenzae RdAW Haemophilus influenzae 221-21 Haemophilus sensu Haemophilus influenzae PittAA Haemophilus influenzae KR494 stricto Haemophilus haemolyticus M19501 Haemophilus haemolyticus M19107 Haemophilus haemolyticus HK386 Haemophilus sp.oral taxon 851 str. F0397 [Pasteurella] pneumotropica DSM 21403 Haemophilus parainfluenzae HK2019 Haemophilus parainfluenzae ATCC 33392 Haemophilus pittmaniae HK 85 Aggregatibacter actinomycetemcomitans D11S-1 Aggregatibacter actinomycetemcomitans ANH9381 Aggregatibacter actinomycetemcomitans I23C Aggregatibacter actinomycetemcomitans RhAA1 Aggregatibacter actinomycetemcomitans SC1083 Aggregatibacter Aggregatibacter aphrophilus ATCC 33389 Aggregatibacter aphrophilus F0387 Aggregatibacter segnis ATCC 33393 Aggregatibacter sp. oral taxon 458 str. W10330 Avibacterium paragallinarum 72 Avibacterium paragallinarum 221 Histophilus somni 2336 Histophilus somni 129PT Pasteurella dagmatis ATCC 43325 Pasteurella multocida 1500E Pasteurella multocida P1059 Pasteurella sensu stricto Pasteurella multocida 1500C Pasteurella multocida 671 90 Pasteurella multocida 3480 [Actinobacillus] succinogenes 130Z Basfia succiniciproducens MBEL55E [Pasteurella] bettyae CCUG 2042 [Haemophilus] parasuis 29755 [Haemophilus] parasuis 84-15995 [Haemophilus] parasuis SH0165 [Haemophilus] parasuis gx033 [Haemophilus] parasuis ZJ0906 [Haemophilus] parasuis MN-H [Haemophilus] parasuis 12939 [Haemophilus] parasuis SW114 [Haemophilus] parasuis D74 Bibersteinia trehalosi USDA-ARS-USMARC-190 Bibersteinia trehalosi USDA-ARS-USMARC-188 [Haemophilus] paraphrohaemolyticus HK411 [Haemophilus] paraphrohaemolyticus HK385 [Haemophilus] sp. HK 2154 [Haemophilus] parainfluenzae CCUG 13788 [Actinobacillus] minor NM305 [Actinobacillus] minor 202 Actinobacillus ureae ATCC 25976 Actinobacillus suis H91-0380 Actinobacillus capsulatus DSM 19761 Actinobacillus sensu Actinobacillus pleuropneumoniae 1096 stricto Actinobacillus pleuropneumoniae L20 Actinobacillus pleuropneumoniae 56153 Actinobacillus pleuropneumoniae S8 [Haemophilus] ducreyi 35000HP Mannheimia varigena USDA-ARS-USMARC-1388 Mannheimia varigena USDA-ARS-USMARC-1296 Mannheimia varigena USDA-ARS-USMARC-1312 Mannheimia varigena USDA-ARS-USMARC-1261 Mannheimia Mannheimia haemolytica D35 Mannheimia haemolytica D153 Mannheimia haemolytica PKL10 Gallibacterium anatis 1265612 Gallibacterium anatis F 149

Haemophilus influenzae Rd KW20 Haemophilus influenzae R2866 Haemophilus influenzae 10810 Haemophilus aegyptius CCUG 25716 Haemophilus sensu Haemophilus haemolyticus M21639 stricto Haemophilus pittmaniae HK 85 Haemophilus parainfluenzae T3T1 [Pasteurella] pneumotropica DSM 21403 [Actinobacillus] porcinus NM319 Lonepinella koalarum ATCC 700131 Avibacterium paragallinarum JF4211 Avibacterium gallinarum NCTC 11188 [Haemophilus] felis ATCC 49733 Pasteurella dagmatis ATCC 43325 Pasteurella multocida ATCC 43137 Pasteurella sensu Pasteurella multocida HN06 stricto Pasteurella multocida 36950 Aggregatibacter actinomycetemcomitans D11S-1 Aggregatibacter Aggregatibacter aphrophilus NJ8700 [Actinobacillus] succinogenes 130Z Necropsobacter rosorum CCUG 28028 Basfia succiniciproducens JF4016 Basfia succiniciproducens MBEL55E [Pasteurella] bettyae CCUG 2042 Bisgaardia hudsonensis M327/99/2 [Haemophilus] sputorum HK 2154 Volucribacter psittacicida CCUG 47536 [Actinobacillus] indolicus 46KC2 [Haemophilus] parasuis KL0318 Nicoletella semolina JF2465 Bibersteinia trehalosi USDA-ARS-USMARC-192 [Actinobacillus] muris NCTC 12432 Otariodibacter oris Baika1 Vespertiliibacter pulmonis E127/08 [Haemophilus] paraphrohaemolyticus HK385 [Haemophilus] paraphrohaemolyticus HK411 Histophilus somni 2336 [Actinobacillus] minor NM 305 Actinobacillus lignieresii NCTC 4189 Actinobacillus pleuropneumoniae L20 Actinobacillus arthritidis CCUG 24862 Actinobacillus sensu Actinobacillus capsulatus DSM 19761 stricto Actinobacillus suis ATCC 33415 Actinobacillus ureae CCUG 2139 Mannheimia haemolytica D171 Mannheimia varigena USDA-ARS-USMARC-1261 Mannheimia Mannheimia granulomatis DSM 19156 [Haemophilus] ducreyi 3500HP Gallibacterium anatis F 149 Gallibacterium melopsittaci F450 Chelonobacter oris 1662 [Pasteurella] testudinis CCUG 19802

0.05

0.1

(a)

(b)

Figure 1: (a) A maximum-likelihood core genome phylogenetic tree of sequenced members of the family Pasteurellaceae. (b) A maximumlikelihood phylogenetic tree based on concatenated nucleotide sequence alignments of the 16S rDNA, infB, recN, and rpoB genes. Both trees are rooted using members of the Vibrionaceae (not shown). Nodes with >80% bootstrap support are indicated by diamond shaped symbols at the node. Clusters of species representing Actinobacillus sensu stricto, Haemophilus sensu stricto, Pasteurella sensu stricto, Aggregatibacter, and Mannheimia are indicated by brackets. Members of the genera Actinobacillus, Haemophilus, and Pasteurella which do not fall into their respective “sensu stricto” clades are indicated by the presence of square brackets around their generic name (ex. [Pasteurella] pneumotropica).

International Journal of Genomics

5

Table 1: Genome characteristics of the sequenced Pasteurellaceae included in our analyses. Organism name Actinobacillus pleuropneumoniae L20 Actinobacillus pleuropneumoniae serovar 3 str. JL03 Actinobacillus pleuropneumoniae serovar7 str. AP76 Actinobacillus ureae ATCC 25976a Actinobacillus minor 202a Actinobacillus minor NM305a Actinobacillus succinogenes 130Z Aggregatibacter aphrophilus NJ8700 Aggregatibacter actinomycetemcomitans D11S-1 Aggregatibacter actinomycetemcomitans D7S-1 Aggregatibacter segnis ATCC 33393a Gallibacterium anatis UMN179 Haemophilus aegyptius ATCC 11116a Haemophilus ducreyi 35000HP Haemophilus haemolyticus M21621a Haemophilus influenzae 10810 Haemophilus influenzae F3031 Haemophilus influenzae F3047 Haemophilus influenzae 22.1-21a Haemophilus influenzae 3655 Haemophilus influenzae 6P18H1a Haemophilus influenzae 7P49H1a Haemophilus influenzae NT127a Haemophilus influenzae PittAAa Haemophilus influenzae PittIIa Haemophilus influenzae PittHHa Haemophilus influenzae R3021a Haemophilus influenzae RdAWa Haemophilus influenzae 86-028NP Haemophilus influenzae PittEE Haemophilus influenzae PittGG Haemophilus influenzae Rd KW20 Haemophilus influenzae R2846 Haemophilus influenzae R2866 Haemophilus parainfluenzae ATCC 33392 a Haemophilus parainfluenzae T3T1 Haemophilus parasuis 29755a Haemophilus parasuis SH0165 Haemophilus pittmaniae HK 85a Haemophilus sputorum CCUG13788a Haemophilus parahaemolyticus HK385a Haemophilus paraphrohaemolyticus HK411a Haemophilus sp. oral taxon 851 str.F0397a Histophilus somni 2336 Histophilus somni 129PT Mannheimia succiniciproducens MBEL55E Mannheimia haemolytica PHL213a Pasteurella multocida subsp. multocida str. Pm70 Pasteurella dagmatis ATCC 43325a a

BioProject CP000569 CP000687 CP001091 AEVG0 ACFT0 ACQL0 CP000746 CP001607 CP001733 CP003496 AEPS0 CP002667 AFBC0 AE017143 AFQQ0 FQ312006 FQ670178 FQ670204 AAZD0 AAZF0 ABWW0 ABWV0 ACSL0 AAZG0 AAZI0 AAZH0 AAZJ0 ACSM0 CP000057 CP000671 CP000672 L42023 CP002276 CP002277 AEWU0 FQ312002 ABKM0 CP001321 AFUV0 AFNK0 AJSW0 AJMU0 AGRK0 CP000947 CP000436 AE016827 AASA0 AE004439 ACZR0

Size (Mb) 2.27 2.24 2.35 2.30 2.13 2.43 2.32 2.31 2.20 2.31 1.99 2.69 1.92 1.70 2.09 1.98 1.99 2.01 1.89 1.88 1.91 1.83 1.87 1.88 1.95 1.84 1.88 1.80 1.91 1.81 1.89 1.83 1.82 1.93 2.11 2.09 2.22 2.27 2.18 2.14 1.81 2.02 1.84 2.26 2.01 2.31 2.57 2.26 2.25

The genomes of these species/strains are currently under scaffolds/contigs status. Stiftung Tieraerztliche Hochschule Hannover (STHH). c The Broad Institute Genome Sequencing Platform (BIGSP). d University of Washington; Seattle Biomedical Research Institute (UW-SBRI). e Center for Genomic Sciences, Allegheny-Singer Research Institute (CGS, ASRI). f Genome Sequencing Center (GSC) at Washington University (WashU) School of Medicine. g Baylor College of Medicine (BCM). h Wellcome Trust Sanger Institute (WTSI). b

Proteins 2013 2036 2142 2475 2050 2411 2079 2219 2280 2250 1956 2500 2020 1717 1894 1903 1770 1786 2224 1929 1893 1752 1809 1981 2028 1977 2307 1718 1792 1613 1661 1657 1636 1795 2010 1975 2244 2021 2390 2073 1764 2025 1809 1980 1798 2370 2695 2012 2053

G-C (%) 41.3 41.2 41.2 — 39.3 39.3 44.9 42.2 44.3 44.3 — 39.9 — 38.2 — 38.1 38.2 38.2 38.0 38.0 38.2 37.9 38.0 38.1 38.0 38.0 37.9 38.0 38.2 38.0 38.0 38.2 38.0 38.1 — 39.6 39.8 40.0 — — — — — 37.4 37.2 42.5 41.1 40.4 37.4

References [40] [41] STHHb BCMg McGill University McGill University Joint Genome Institute [19, 42] [43] [44] BCMg [45] BCMg Ohio State University [46] WTSIh [47] [47] [48] [48] CGS, ASRIe CGS, ASRIe BIGSPc [48] [48] [48] [48] BIGSPc [49] [48] [48] [34] UW-SBRId UW-SBRId BCMg WTSIh Iowa State University [50] J. Craig Venter Institute Aarhus University Hospital J. Craig Venter Institute J. Craig Venter Institute GCG-WUf Joint Genome Institute [51] [52] [53] [54] BCMg

6

International Journal of Genomics

Act. pleuropneumonia ser. 5b str. L20 Act. pleuropneumoniae serovar 6 str. Femo Act. pleuropneumoniae serovar 1 str. 4074 Act. pleuropneumoniae serovar 9 str. CVJ13261 Act. pleuropneumoniae serovar 11 str. 56153 Act. pleuropneumoniae serovar 12 str. 1096 Act. pleuropneumonia ser. 3 str. JL03 Act. pleuropneumonia ser. 7 str. AP76 Actinobacillus Act. pleuropneumoniae serovar 13 str. N273 sensu stricto Act. pleuropneumoniae serovar 2 str. S1536 Act. pleuropneumoniae serovar 2 str. 4226 Act. pleuropneumoniae serovar 4 str. M62 Act. pleuropneumoniae serovar 10 str. D13039 Actinobacillus capsulatus Actinobacillus suis H91-0380 Actinobacillus suis ATCC 33415 Actinobacillus ureae Actinobacillus minor Haemophilus parasuis Haemophilus paraphrohaemolyticus Haemophilus parahaemolyticus Bibersteinia trehalosi USDA-AR Mannheimia haemolytica USDA-AR Mannheimia haemolytica M42548 Basfia succiniciproducens Actinobacillus succinogenes 13 Other Pasteurellaceae Pasteurella bettyae Pasteurella dagmatis Pasteurella multocida HN06 Pasteurella multocida 36950 Aggregatibacter actinomycetemcomitans Aggregatibacter aphrophilus NJ Aggregatibacter segnis Haemophilus haemolyticus Haemophilus influenzae Plesiomonas shigelloides Yersinia bercovieri Serratia odorifera Enterobacter cloacae SCF1 Dickeya zeae Ech1591 Other Gammaproteobacteria Citrobacter freundii Shigella flexneri 0/250 Cronobacter turicensis z3032 Vibrio vulnificus YJ016 Klebsiella variicola At-22 Pantoea ananatis LMG 5342 Escherichia coli

126208128 306860341 306853561 306862597 306866939 306869187 165976058 190149956 306871346 306855886 302647429 306858147 306864716 517480365 407692352 672592002 491832514 492353747 75992966 491992285 491987878 470167188 472333619 482886678 52424119 152977811 492137838 492150287 383310907 378774943 491743308 251792866 493770251 491849990 491951884 499151755 491414840 491094352 311281241 251788420 489927089 491253659 260599446 37678593 288937493 378765502 446511916

367 QDDPTIQIVNQAQKAYVEN -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------V-----------------V------------------------------K ------------------K ----------A-----------------------------------------------------------T-----------R-----A-K ----------------A------V----------A------V-------R--A------V-------R--A-------------------------------------K ------------------K ------------------K ------------------K -----V----N-------H -----V----N--R--T-H -----V----N-------H -----V-V--M-------H -----V----N-------H -----V-V--N-------H -----V-V--N-------H -----V-V--N-------H -----V----L---D---R -----V-V--M-------H -----V----N--R----H -----V-V--N-------H

V -

402 VVKGLPELTGLPVLSA -----------------------------------------------------------------------------------A--------------A--------------A--------------A--------------A--------------A--------------A--------------A--------------A--------------A---------S----A------I-----Q-A---I--I-----Q-A---I---APSVA-M----I---ASSVA-MA---I--A--A--D-A---I----------A---I----------A---I-----N----A------IM-N--K-AK-----I--N----A---I-----N----A---------N----A---------N----A-------APSV-AMA---I---APSV-AMA---I---APSV-AMA---I---ASSVA-MA---I---APS-AAMA---I--F-Q-D-D-D-I----FIQ-D-D-A------YIQ-D-D-AD-----FIQ-D-D-AN-----FIQ-D-D-AE-----FIQ-D-D-AK-----YIQ-D-D-AK-----FIQ-D-D-AT-----FIQ-D-D-A------FIQ-D-D-AK-----FIQ-D-D-AT-----YIQ-D-D-AK------

Figure 2: A partial sequence alignment of a 3󸀠 -nucleotidase showing a 1-amino-acid insertion identified in all members of Actinobacillus sensu stricto. This insertion was not found in the homologues from any member of the genus Actinobacillus that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for a representative subset of the family Pasteurellaceae and the class Gammaproteobacteria is shown, but unless otherwise indicated, similar CSIs were detected in all members of the indicated group and not detected in any other bacterial species in the top 250 BLAST hits. The dashes (-) in the alignments indicate identity with the residue in the top sequence. GenBank identification (GI) numbers for each sequence are indicated in the second column. Sequence information for other CSIs specific to Actinobacillus sensu stricto are presented in Supplemental Figures 1-2 and their characteristics are summarized in Table 2(A).

Actinobacillus lignieresii, the type species of the genus [24, 78]. However, the original classification scheme for the genus Actinobacillus led to the inclusion of a highly heterogeneous and polyphyletic grouping of species within the genus [12– 14]. An assemblage of Actinobacillus species closely related to Actinobacillus lignieresii has been recognized as Actinobacillus sensu stricto (i.e., the core members of the genus Actinobacillus) in both our phylogenetic analysis (Figure 1) and past phylogenetic analyses [12–14, 17]. Differentiation of Actinobacillus sensu stricto from other Actinobacillus species and the modern criteria for placing novel species within the genus Actinobacillus sensu stricto is heavily reliant on genetic and genomic criteria, namely, DNA-DNA hybridization

values, 16S rRNA sequence similarity, and other single gene sequence comparisons [13, 18]. There are currently no known discrete characteristics which are unique to Actinobacillus that define the genus. In this work, we have completed a comprehensive comparative analysis of Pasteurellaceae genomes in order to identify unique, defining molecular signatures for different genera within the family Pasteurellaceae. We have identified 3 CSIs which are unique, defining molecular signatures for the sequenced members of Actinobacillus sensu stricto (namely, Actinobacillus capsulatus, A. pleuropneumoniae, A. suis, and A. ureae). An example of a CSI specific for Actinobacillus sensu stricto is shown in Figure 2. The CSI consists of a 1-amino-acid insertion

International Journal of Genomics

7

Table 2: Conserved signature indels specific for genera within the family Pasteurellaceae. GenBank Figure number identifier (A) CSIs specific for Actinobacillus sensu stricto surE Figure 2 3󸀠 -nucleotidase 126208128 GTP pyrophosphokinase relA 126207889 Sup. Figure 1 Anaerobic glycerol-3-phosphate dehydrogenase subunit glpA 491834528 Sup. Figure 2 (B) CSIs specific for Haemophilus sensu stricto Biotin-protein ligase birA Figure 3 144979005 Aspartate ammonia-lyase aspA Sup. Figure 3 145630289 NAD(P) transhydrogenase subunit alpha pntA 145631394 Sup. Figure 4 Fumarate reductase subunit C frdC 301169552 Sup. Figure 5 Hypothetical tRNA/rRNA methyltransferase — 145636352 Sup. Figure 6 Gamma-glutamyl kinase proB 145629980 Sup. Figure 7 ACP phosphodiesterase acpD Sup. Figure 8 68250119 (C) CSIs specific for Pasteurella sensu stricto Menaquinone-specific isochorismate synthase menF Figure 4 386834899 tRNA s(4)U8 sulfurtransferase thiI Sup. Figure 9 15602400 FKBP-type peptidyl-prolyl cis-trans isomerase slyD 378775595 Sup. Figure 10 Aspartate-semialdehyde dehydrogenase asd 383311492 Sup. Figure 11 Lactate permease family transporter lldP 492154065 Sup. Figure 12 Cell division protein ftsA ftsA 492155843 Sup. Figure 13 (D) CSIs specific for Aggregatibacter nhaC family sodium:proton antiporter nhaC Figure 5(a) 493769836 Outer membrane protein omp 261866907 Sup. Figure 14 Multidrug transporter murJ murJ 365966332 Sup. Figure 15 NADH dehydrogenase nuoE 387120244 Sup. Figure 16 (E) CSIs specific for Mannheimia Methyl-galactoside ABC transporter substrate-binding protein — Figure 5(b) 472335016 UDP-N-acetylmuramoylalanyl-D-glutamate–2,6-diaminopimelate ligase murE Sup. Figure 17 472333011 Glutathione-regulated potassium-efflux protein kefC 472333189 Sup. Figure 18 Glycerol-3-phosphate acyltransferase plsB 472334521 Sup. Figure 19 Protein name

in a conserved region of a 3󸀠 -nucleotidase which is present in all sequenced members of Actinobacillus sensu stricto and absent in all other sequenced Gammaproteobacteria. Sequence information for 2 other CSIs which are also unique characteristics of the Actinobacillus sensu stricto clade is presented in Supplemental Figures 1-2 available online at http://dx.doi.org/10.1155/2015/198560 and their characteristics are briefly summarized in Table 2(A). 3.4. Molecular Signatures Specific for Haemophilus sensu stricto. The classification of novel species into the genus Haemophilus was initially based on phenotypic and biochemical properties, most importantly, the dependence of growth on the presence of factor V and factor X in blood [13, 78, 79]. As with Actinobacillus, the classification of Haemophilus on the basis of phenotypic and biochemical properties has led to the genus containing an extremely heterogeneous group of species [12–14, 32]. Species from the genus Haemophilus have undergone a number of transfers

Gene name

Indel size

Indel positiona

1 aa ins 1 aa ins 1 aa ins

367–402 368–412 359–400

6 aa del 1 aa ins 1 aa del 3 aa ins 1 aa del 1 aa ins 2 aa del

138–178 34–75 352–378 31–89 17–58 197–253 119–159

4 aa ins 2 aa del 2 aa del 1 aa del 2 aa ins 1 aa ins

29–86 412–446 151–188 173–245 390–427 357–387

3 aa ins 4 aa del 1 aa del 1 aa ins

396–437 25–64 190–220 372–412

1 aa del 2 aa del 1 aa ins 2 aa del

33–73 418–473 504–531 214–252

and reclassifications [21–24, 28]. However, the genus remains highly polyphyletic (Figure 1) [17, 19, 28]. The core members of the genus Haemophilus (namely, Haemophilus sensu stricto) consist of Haemophilus influenzae, H. aegyptius, and H. haemolyticus based on 16S rRNA sequence analysis [12– 14, 32]. However, phylogenetic analysis based on DNA-DNA hybridization and multilocus sequence analysis suggests that H. parainfluenzae and H. pittmaniae are also members of Haemophilus sensu stricto [30, 80]. Phylogenetic analysis of rpoB, infB, and concatenated gene sets also suggest that [Pasteurella] pneumotropica and related isolates are closely related to Haemophilus sensu stricto [15, 16]. Our comparative analysis of Pasteurellaceae genomes has led to the identification of 7 CSIs that are unique characteristics of Haemophilus sensu stricto which consists of Haemophilus influenzae, H. aegyptius, H. haemolyticus, H. parainfluenzae, H. pittmaniae, and [Pasteurella] pneumotropica (Figure 1). One example of a CSI specific for the members of Haemophilus sensu stricto, shown in Figure 3, consists of a 4-amino-acid deletion in a biotin-protein ligase which is

8

International Journal of Genomics

Haemophilus sensu stricto

Other Haemophilus

Other Pasteurellaceae

Other Gammaproteobacteria 0/250

Haemophilus influenzae 22.1-21 Haemophilus influenzae PittHH Haemophilus influenzae R3021 Haemophilus influenzae R2866 Haemophilus influenzae PittAA Haemophilus influenzae PittII Haemophilus influenzae R2846 Haemophilus influenzae 10810 Haemophilus influenzae 7P49H1 Haemophilus influenzae PittEE Haemophilus influenzae 86-028NP Haemophilus influenzae F3031 Haemophilus influenzae CGSHiCZ412602 Haemophilus influenzae 7P49H1 Haemophilus influenzae 3655 Haemophilus influenzae 6P18H1 Haemophilus influenzae NT127 Haemophilus influenzae KR494 Haemophilus influenzae Rd KW20 Haemophilus influenzae PittGG Haemophilus haemolyticus M21621 Haemophilus haemolyticus HK386 Haemophilus haemolyticus M19107 Haemophilus haemolyticus M21127 Haemophilus haemolyticus M19501 Pasteurella pneumotropica Haemophilus sp. oral taxon 851 Haemophilus parainfluenzae ATCC 33392 Haemophilus parainfluenzae T3T1 Haemophilus parainfluenzae HK262 Haemophilus pittmaniae Haemophilus sputorum Haemophilus paraphrohaemolyticus Haemophilus parahaemolyticus Haemophilus ducreyi Haemophilus parasuis Histophilus somni Aggregatibacter segnis Aggregatibacter actinomycetemcomitans Aggregatibacter aphrophilus Actinobacillus pleuropneumoniae Actinobacillus ureae Actinobacillus minor Actinobacillus succinogenes Basfia succiniciproducens Mannheimia haemolytica Pasteurella multocida Pasteurella dagmatis Gallibacterium anatis Cronobacter sakazakii Edwardsiella ictaluri Enterobacter cloacae Erwinia tasmaniensis Escherichia coli Klebsiella pneumoniae Pantoea vagans Photorhabdus luminescens Serratia odorifera

144979005 145269679 144983393 386263583 145267548 145271085 386265397 378696350 229810402 148715656 68057028 317432060 646229376 229810402 144986658 229812060 260094107 540365110 16272182 501001793 341954888 386907988 341948169 341948545 341948213 517167265 696223133 325159690 301156028 385192842 343517642 359299234 386390324 387773709 33151348 219871466 113460565 315634470 261867631 251792305 190151209 322515677 223041563 152978568 161510992 254361412 15602161 260913039 332289774 156935825 238918134 311281472 188532305 218702608 152972835 308188898 37528549 270265458

138 LSLVIGLAIAEVL ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------V------AF -----------------------A-----------A-----------A---A-------TF ----VA-VL--SF ----VS-I---A----VS-I---A----VA-I---T----VSVL---TF --S-L---T-----T------VQA--S------VQA--S------VQS-----S-I---S-----S-I---S----VA-I---S---TV-M--HRAI ------M---DAI ----VA-I---S----V-M----T----V-M---DT---AV-M-V-QA------IVM---------IVM---------IVM---------IV---A------IVM---------IV----------IVM--T----V-IV----------IVM----

TELGIS QAQNVQ QAQNVQ QAQQVE QTLNVP NEMGA VELDMY AELDMY VELDMY QAQQVE QAQQVE QAQGVK RKLGSD KSAGGK TVQNVK KQAGAL RRTGVR TELNLH HELGAA QALGAA QRLGAE QQQGAP RKLGAD QQLGAE RALGAD HRFGAG QRLGAE

173 NVQVKWPNDILFDERKLGGILVE -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------G---------------------G---------------------G--------K------------G-----------------M--EG--------G------------G-------I-------------G----------K--------LSG----------K--------LSG---------AK------V-LSG--------GA-------V-YQDA-MA--SLD DI-IK----VYYQGK------IDI-I------YYQGK-M----IDI-I-----VYYQGK-M----IHI-I------YYQGK-M----IHI-I------YYQGK-M----IE-KL-----L-LFG---A----GF---------VN----A----GF---------VND---A----GF---------VND---A-----I-I------YYCSK-M----I-I-I------Y-CGK-M----IDI-I-----VY-QGK-M----IQTKL-----L-LHG---A---IEINL-----L-LNG---A---IDI-I------YYQGK-M----L-IGL-----V-LHG---A------KL-----V-LNG---A----S--L------WLNGK------IQ-R------LYLHD---A----G-K------LYLND---A----G-R------LYLQD---A----DIR-------YLND---A----K-R------LYLQD---A----Q-R-------YLQD---S----D-R-------YLND---A----RIR------LYL-DK--A----E-R------LYLND---A-----

Figure 3: A partial sequence alignment of 1,4-dihydroxy-2-naphthoate octaprenyltransferase showing a 2-amino-acid insertion identified in all members of Haemophilus sensu stricto. This insertion was not found in the homologues from any member of the genus Haemophilus that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for other CSIs specific to Haemophilus sensu stricto is presented in Supplemental Figures 3–8 and their characteristics are summarized in Table 2(B).

uniquely found in homologs from Haemophilus sensu stricto and absent in all other sequenced Gammaproteobacteria. Sequence information for 6 additional CSIs which are also unique characteristics of Haemophilus sensu stricto is presented in Supplemental Figures 3–8 and their characteristics are briefly summarized in Table 2(B). These CSIs and our phylogenetic trees (Figure 1) suggest that Haemophilus

influenzae, H. aegyptius, H. haemolyticus, H. parainfluenzae, H. pittmaniae, and [Pasteurella] pneumotropica share a close evolutionary relationship and should all be considered members of Haemophilus sensu stricto. Additionally, these results also suggest that [Pasteurella] pneumotropica is incorrectly classified as a member of the genus Pasteurella and should be reclassified as “Haemophilus pneumotropica.”

International Journal of Genomics

Pasteurella multocida 3480 Pasteurella multocida HN06 Pasteurella multocida Pm70 Pasteurella multocida Anand1 Pasteurella multocida X73 Pasteurella multocida 2000 Pasteurella multocida 93002 Pasteurella multocida 671/90 Pasteurella Pasteurella multocida HB03 sensu stricto Pasteurella multocida P1933 Pasteurella multocida P52VAC Pasteurella multocida Pasteurella multocida RIIF Pasteurella multocida 1500C Pasteurella multocida PMTB Pasteurella multocida 36950 Pasteurella dagmatis Pasteurella bettyae Basfia succiniciproducens Actinobacillus succinogenes 13 Aggregatibacter actinomycetemcomitans Aggregatibacter segnis Gallibacterium anatis UMN179 Histophilus somni 129PT Haemophilus parainfluenzae T3T Haemophilus pittmaniae Other Haemophilus influenzae Rd KW20 Pasteurellaceae Haemophilus sputorum Haemophilus parasuis SH0165 Haemophilus ducreyi 35000HP Actinobacillus minor Actinobacillus ureae Actinobacillus suis H91-0380 Actinobacillus pleuropneumonia Mannheimia haemolytica USDA-AR Bibersteinia trehalosi USDA-AR Moritella sp. PE36 Grimontia sp. AK16 Plesiomonas shigelloides Photobacterium damselae Escherichia sp. TW09231 Other Aliivibrio salmonicida LFI1238 Gammaproteobacteria Vibrio fischeri ES114 0/250 Yokenella regensburgei Yersinia bercovieri Aliivibrio fischeri Photobacterium sp. AK15 Klebsiella oxytoca

9

386834899 383310853 15601918 338217984 404383748 512753744 512754797 512760432 512755642 512755642 401690557 404384736 512761090 512763080 544580815 378774883 492154802 492145910 52425850 152978460 387121592 493768552 332289482 113460763 345428676 494450864 16272240 494790952 219871425 33151806 492367157 491835912 407691822 190150367 472333297 470166611 492903543 488492100 499151062 358410570 446418461 209694656 59712279 493874499 491418549 491562394 494734248 490215845

29 WYAGTLGVMGPAYADFCVTIRSAFIE -----------T----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------F-SKMQ-------------------F-SQVKSE----L----V-------F-NR-R-E----L----V---------TKEHSE-----------------FFNR-R-E-----------------FFNQQQ-E---A-----V----AI--ISH-F-E---GL---KLT ----A--I-TETESE-------S-----------SQNLSE-----------------L-SREQ-E-------------------SDVCSE---A----------A--FVS-ERSE----L---QVH ----A--YFT-EQ-E---ML---L-Q -------YFHTDH-E-T--L---K-D -------IL-EDE-E----L---Q-K -------YLQ-DE-E---AL---Q------F-YLQ-DE-E---AL---Q--------YLQ-DE-E---AL---Q-----------NEE-E----L---L-S -------YFS--Q-E----L---LVR --S-A--YV-QQKSE---A----R-L --S-SV-YLS-EQSE---A----L-V ----SV-HISRER-E-T-A----L-Q --S-AV-YLSRQHSE---A----L-A ----SA-YLSLQQSE---SL---K-S --S-AV-FLSQQRSE---A----LVM --S-AV-FLSQQRSE---A----LVM ----SA-YLSRQQSE---AL---MVS ----SA-YLSRQQSE-S--L---WL--S-AV-FLSQQRSE---A----LVM --S-AV-FLSRQRSE---A----L-M ----SA-YLSL-QSE---SL---KVQ

QAEN ----------------------------------------------MD-

80 DSQLCVFAGAGIVEGSIPLLEW ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Q-K------------------QNRIRI-------A-------QNRIR--------A--V----SNKIR--------A--V----ADKIR-----------V----ADKIH-----------V----HQ--HL-------KE-QADE-QDYIRI---------------EN-VR-----------Q-VE-QQ-IR--------A--D--A-GHRIR--------A--Q--E-GNK-I-Y------A--E-QA-AN-ITFY-------K-D-QS-HN--TLY------AE-QADS-QN-VTLY------QD-E--S-QNCITLY-------E-E-QS-QNCITLY-------E-E-QS-RNRITLY-------E-E-QS-QNSITLY------S--D-QS-AKEML-Y------AE-E-ES-ENE-QL-------P--D-MS--NKVHL-------P--VAES-QN-VHL-------P--D-EA-GEE-HL-------P--D-SS--NVVRLY------R--D-EQ-GNK-HL-------P--E-SS-GNK-HL-------P--E-SS-GET-RLY------R--D-E---QWVNLY------A--D-E--GNK-HL-------P--E-SS-GEE-HL-------P--TADS-QHT-RLY------S--D-EQ--

Figure 4: A partial sequence alignment of Menaquinone-specific isochorismate synthase showing a 4-amino-acid insertion identified in all members of Pasteurella sensu stricto. This insertion was not found in the homologues from any member of the genus Pasteurella that was not part of the “sensu stricto” clade or any other member of the Gammaproteobacteria. Sequence information for other CSIs specific to Pasteurella sensu stricto is presented in Supplemental Figures 9–13 and their characteristics are summarized in Table 2(C).

3.5. Molecular Signatures Specific for Pasteurella sensu stricto. The genus Pasteurella is highly heterogeneous and polyphyletic (Figure 1) [13]. Similar to the members of Actinobacillus, bacterial isolates were originally classified as members of the genus Pasteurella based on growth factor independent growth and phenotypic or biochemical similarity to Pasteurella multocida, the type species of the genus [78, 81]. The monophyletic clusters of Pasteurella species that branch with Pasteurella multocida are considered the core members of the genus (namely, Pasteurella sensu stricto) [9, 13, 16, 17]. Our comparative analysis of Pasteurellaceae genomes has led to the identification of 6 CSIs which are unique characteristics for the sequenced members of Pasteurella sensu stricto (namely, Pasteurella multocida and P. dagmatis). An example of a CSI uniquely

found in the sequenced members of Pasteurella sensu stricto, consisting of a 4-amino-acid insertion in a conserved region of Menaquinone-specific isochorismate synthase, is shown in Figure 4. This CSI is only found in the sequenced members of Pasteurella sensu stricto and is absent from all other sequenced Gammaproteobacteria. Partial sequence alignments for 5 additional CSIs which are also unique characteristics of Pasteurella sensu stricto are presented in Supplemental Figures 9–13 and their characteristics are briefly summarized in Table 2(C). 3.6. Molecular Signatures Specific for the Genera Aggregatibacter or Mannheimia. The genus Aggregatibacter was proposed as a novel taxonomic classification for a monophyletic cluster of Actinobacillus and Haemophilus species which branched

10

International Journal of Genomics

Aggregatibacter segnis ATCC 33393 Aggregatibacter aphrophilus NJ Aggregatibacter aphrophilus F0387 Agg. actinomycetemcomitans D7S-1 Agg. actinomycetemcomitans serotype d str. I63B Agg. actinomycetemcomitans serotype b str. SCC1398 Agg. actinomycetemcomitans serotype c str. SCC2302 Agg. actinomycetemcomitans serotype c str. AAS4A Agg. actinomycetemcomitans serotype b str. I23C Aggregatibacter Agg. actinomycetemcomitans serotype b str. SCC4092 Agg. actinomycetemcomitans serotype str. D18P1 Agg. actinomycetemcomitans serotype c str. D17P-2 Agg. actinomycetemcomitans serotype e str. SC1083 Agg. actinomycetemcomitans serotype e str. SCC393 Agg. actinomycetemcomitans RhAA1 Agg. actinomycetemcomitans D11S-1 Agg. actinomycetemcomitans ANH9381 Agg. actinomycetemcomitans D17P-3 Agg. actinomycetemcomitans Y4 Avibacterium paragallinarum Gallibacterium anatis UMN179 Pasteurella pneumotropica Pasteurella dagmatis Actinobacillus succinogenes 13 Actinobacillus capsulatus Actinobacillus minor Other Pasteurella multocida 36950 Pasteurellaceae Pasteurella dagmatis Pasteurella bettyae Actinobacillus ureae Bibersteinia trehalosi USDA-AR Haemophilus parasuis SH0165 Haemophilus sputorum Succinivibrionaceae bacterium Psychromonas sp. CNPT3 Other Shewanella denitrificans OS217 Gammaproteobacteria Grimontia hollisae Vibrio harveyi 0/250 Aeromonas molluscorum Vibrio sp. Ex25

315476949 251792630 353346368 387119969 347995553 348005459 348008621 443542862 348010852 443544948 348000709 348656398 347993209 348000829 359757150 261868126 365967906 348654113 429151349 523674050 332288121 517167962 492147403 152978710 517482329 492354008 378774582 492147403 492137490 491832364 470166919 219871308 494789693 498141043 470479953 91793888 491648712 13655509 492636195 262394622

396 TSWGTFGIMLPIAAAIASHAMP -----------------A--AA -----------------A--AA ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H ---------------------H -----------------AN-A---------------M-AN-D--------------SM-TN-A-----------------AN-A-----------------TN-A--------------GM-INVDT ------------G-SM-VNSD-----------------ANTA-----------------AN-A-----------------VN-A--------------GM-INVDT ---------------M-AN-E---------------M-AN-E------A-----G-SM-M-SE--------------T--IN-NS -------------GDM-GATDV --------------DM-MGSDS -------------GDL-GATDI -------------GDM-GATDV -----------L-GDM-AASEI -------------GDM-GATDL

472335016 525657229 261308718 528822364 526467663 452088105 481064088 523435054 523448251 528824940 528874222 528877290 528878523 575444678 575442543 575442169 575448350 652755709 470166988 219870684 491990832 497813944 492352703 407693344 52424698 332289959 491699858 345429694 68249417 490274190 490205456 493874411 237809379 283785958 487492822

33 YKYDDNFMALMRKEIEKEGKTQK -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------QA--------------------QA--------------------QA--------------------QA--------------------AN--N------S------VQ-AARM--------S---Q-----V-VVG ------------------AAQH--------S------N--A-ALN ------------------A-NL---------------D--ANQL---------------D--ATNL--------S------N-DAEKVE --------S------Q-NAEQLQ --------S------D--A-ALG --------S------D--A-VVG --------SMV--D----A-NSP --------SVV--A---D--SAP --------SVV--A---DA-ASP --------SAV--A-----DQYP --------SVV--A--ADA-AAP --------SVV--A--QDA-AAP

GSV -------------------------------------

437 EFMLPCLSAVMAGAVCG -L---------------L-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------L--------------AL---------------L--A------------L---------------LL-------------GLLI--M---------NLII-------------L---------------L---------------LL-------------GLLI--M---------ALL--S----------ALL-------------ALII---A---S-----LLI------------A----M----L--S-FTM---M----L----FAL---M----L----FAL---M----L----FSM---M----L----FAL---M----L----F-

(a)

Mannheimia haemolytica USDA-ARS-USMARC-183 Mannheimia haemolytica D171 Mannheimia haemolytica serotype A2 str. OVINE Mannheimia haemolytica D35 Mannheimia haemolytica USMARC_2286 Mannheimia haemolytica serotype 6 str. H23 Mannheimia haemolytica M42548 Mannheimia haemolytica D174 Mannheimia Mannheimia haemolytica D153 Mannheimia haemolytica D38 Mannheimia haemolytica MhBrain2012 Mannheimia haemolytica MhSwine2000 Mannheimia haemolytica D193 Mannheimia varigena USDA-ARS-USMARC-1312 Mannheimia varigena USDA-ARS-USMARC-1296 Mannheimia varigena USDA-ARS-USMARC-1261 Mannheimia varigena USDA-ARS-USMARC-1388 Mannheimia granulomatis Bibersteinia trehalosi USDA-AR Haemophilus parasuis SH0165 Haemophilus paraphrohaemolyticus Haemophilus sputorum Actinobacillus minor Other Actinobacillus suis H91-0380 Pasteurellaceae Mannheimia succiniciproducens Gallibacterium anatis UMN179 Aggregatibacter actinomycetemcomitans Haemophilus parainfluenzae T3T Haemophilus influenzae 86-028N Erwinia amylovora Klebsiella oxytoca Other regensburgei Gammaproteobacteria Yokenella Tolumonas auensis DSM 9187 0/250 Citrobacter rodentium ICC168 Escherichia coli

E G D D D D D G N G G G D D D D D

73 VELLMNDSQNTQSIQNDQ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Q----------------Q----------------Q----------------Q----------------N--------A-------N--L-----A-----N-D--------A---------------------------------A------------------------K----------------Q--------A------IK--------A-------K--------A------I---------A------IK--------A-------Q--------S--T----Q--------D--K---IQ--------D--K---IK--------D--K----Q--------D--K----Q--------D--K----

(b)

Figure 5: A partial sequence alignment of (a) a nhaC family sodium:proton antiporter containing a 3-amino-acid insertion specific for all sequenced species of the genus Aggregatibacter (b) a methyl-galactoside ABC transporter substrate-binding protein containing a 1amino-acid deletion specific for all sequenced species of the genus Mannheimia. In each case, the identified CSIs were only found in the sequenced members of the genera Aggregatibacter or Mannheimia and were absent from all other sequenced Gammaproteobacteria. Sequence information for other CSIs specific to Aggregatibacter or Mannheimia is presented in Supplemental Figures 14–19 and their characteristics are summarized in Tables 2(D) and 2(E).

International Journal of Genomics distinctly from the “sensu stricto” members of their respective clades [21]. Similarly, the genus Mannheimia was proposed as a novel classification for the Pasteurella Haemolytica complex which did not branch with Pasteurella sensu stricto in phylogenetic trees [7]. Currently other than branching in phylogenetic trees or relatedness in DNA-DNA hybridization studies, the members of the genera Aggregatibacter or Mannheimia do not share any single unique or defining biochemical or molecular characteristic that can differentiate them from all other bacteria [5, 82]. In this study we have identified 4 CSIs that are unique molecular characteristics shared by all sequenced species of the genus Aggregatibacter and another 4 CSIs which are uniquely found in all sequenced members of the genus Mannheimia. Examples of CSIs specific to the sequenced members of Aggregatibacter and Mannheimia are shown in Figure 5. A partial sequence alignment of a nhaC family sodium:proton antiporter containing a 3-amino-acid insertion specific for all sequenced species of the genus Aggregatibacter is shown in Figure 5(a) and a partial sequence alignment of a methyl-galactoside ABC transporter substratebinding protein containing a 1-amino-acid deletion specific for all sequenced species of the genus Mannheimia is shown in Figure 5(b). In each case, the identified CSIs were only found in the sequenced members of the genera Aggregatibacter or Mannheimia and were absent from all other sequenced Gammaproteobacteria. Partial sequence alignments additional CSIs specific for the genera Aggregatibacter or Mannheimia are provided in Supplemental Figures 14–19 and their characteristics are summarized in Tables 2(D)2 (E). These CSIs are the first discrete molecular characteristics which are unique for the genera Aggregatibacter and Mannheimia and support their observed monophyly in phylogenetic trees. Additionally, these CSIs could be useful targets for the development of PCR based diagnostic assays for the genera Aggregatibacter and Mannheimia which amplify the CSI containing DNA segment using the conserved flanking regions of the CSIs [83, 84].

4. Conclusion The genera Actinobacillus, Haemophilus, and Pasteurella, within the family Pasteurellaceae, are known to exhibit extensive polyphyletic branching. We have utilized molecular signatures and phylogenetic analyses to clarify the taxonomic boundary of these genera. We have been able to identify large clusters of Actinobacillus, Haemophilus, and Pasteurella species which represent the “sensu stricto” members of these genera. We have identified 3, 7, and 6 unique molecular signatures which are specifically shared by the members of the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto, respectively. The group specificity of the molecular signatures we have identified in this work is summarized in Figure 6 and their characteristics are briefly summarized in Table 2. Our comparative genomic analyses have not come across any CSIs that were unique characteristics of all sequenced members of the genera Actinobacillus, Haemophilus, or Pasteurella as currently defined, suggesting that the members of these genera that do not

11 7 CSIs specific for Haemophilus sensu stricto (Table 2(B))

Haemophilus sensu stricto

4 CSIs specific for Aggregatibacter (Table 2(D)) Aggregatibacter

6 CSIs specific for Pasteurella sensu stricto (Table 2(C))

3 CSIs specific for Actinobacillus sensu stricto (Table 2(A))

Pasteurella sensu stricto

Actinobacillus sensu stricto

Mannheimia 4 CSIs specific for Mannheimia (Table 2(E))

Figure 6: A summary diagram depicting the distribution of identified CSIs for genera within the family Pasteurellaceae.

fall into the “sensu stricto” clusters should not be considered members of their respective genus. Examinations of phenotypic and biochemical characteristics do not provide a reliable means of assigning a novel isolate to the genera Actinobacillus, Haemophilus, and Pasteurella [18]. However, based upon the CSIs described in this work, it is now possible to demarcate the genera Actinobacillus sensu stricto, Haemophilus sensu stricto, and Pasteurella sensu stricto on the basis of the presence or absence of unique molecular signatures. It is important to note that the current analysis of CSIs is limited to the currently available genomic sequence data and may show slight variance as additional bacterial genomes are sequenced. However, earlier work on CSIs for other groups of bacteria provides evidence that the identified CSIs have strong predictive value and will likely be found in other members of these groups as more species are sequenced and novel species are isolated [74, 77, 85, 86]. The conserved nature of the sequence regions that contain these CSIs, in conjunction with their strong predictive value, makes CSIs promising targets for the development of highly specific diagnostic assays for Actinobacillus sensu stricto, Haemophilus sensu stricto, Pasteurella sensu stricto, Aggregatibacter, and Mannheimia [83, 84]. Additionally, further analysis of these genus specific CSIs should lead to the discovery of their functional role in their respective organisms and may provide important insights into novel distinguishing features of these groups of organisms.

12

Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper.

Authors’ Contribution Sohail Naushad and Mobolaji Adeolu contributed equally to this work.

References [1] A. C. Parte, “LPSN—list of prokaryotic names with standing in nomenclature,” Nucleic Acids Research, vol. 42, no. 1, pp. D613– D616, 2014. [2] K. Muehldorfer, S. Speck, and G. Wibbelt, “Proposal of Vespertiliibacter pulmonis gen. nov., sp. nov. and two genomospecies as new members of the family Pasteurellaceae isolated from European bats,” International Journal of Systematic and Evolutionary Microbiology, vol. 64, no. 7, pp. 2424–2430, 2014. [3] S. M. Spinola, M. E. Bauer, and R. S. Munson Jr., “Immunopathogenesis of Haemophilus ducreyi infection (chancroid),” Infection and Immunity, vol. 70, no. 4, pp. 1667–1676, 2002. [4] H. Christensen and M. Bisgaard, “Molecular classification and its impact on diagnostics and understanding the phylogeny and epidemiology of selected members of Pasteurellaceoe of veterinary importance,” Berliner und M¨unchener Tier¨arztliche Wochenschrift, vol. 123, no. 1-2, pp. 20–30, 2010. [5] N. Nørskov-Lauritsen, “Classification, identification, and clinical significance of Haemophilus and Aggregatibacter species with host specificity for humans,” Clinical Microbiology Reviews, vol. 27, no. 2, pp. 214–240, 2014. [6] B. Henderson, J. M. Ward, and D. Ready, “Aggregatibacter (Actinobacillus) actinomycetemcomitans: a triple A∗ periodontopathogen?” Periodontology 2000, vol. 54, no. 1, pp. 78–105, 2010. [7] Ø. Angen, R. Mutters, D. A. Caugant, J. E. Olsen, and M. Bisgaard, “Taxonomic relationships of the [Pasteurella] haemolytica complex as evaluated by DNA-DNA hybridizations and 16s rRNA sequencing with proposal of Mannheimia haemolytica gen. nov., comb. nov., Mannheimia granulomatis comb. nov., Mannheimia glucosida sp. nov., Mannheimia ruminalis sp. nov. and Mannheimia varigena sp. nov.,” International Journal of Systematic Bacteriology, vol. 49, no. 1, pp. 67–86, 1999. [8] J. T. Boss´e, H. Janson, B. J. Sheehan et al., “Actinobacillus pleuropneumoniae: pathobiology and pathogenesis of infection,” Microbes and Infection, vol. 4, no. 2, pp. 225–235, 2002. [9] B. A. Wilson and M. Ho, “Pasteurella multocida: from zoonosis to cellular microbiology,” Clinical Microbiology Reviews, vol. 26, no. 3, pp. 631–655, 2013. [10] S. Pohl, Reklassifizierung der gattung Actinobacillus Brumpt 1910, Haemophilus Winslow et al. 1971 und Pasteurella Trevisan 1887 anhand ph¨anotypischer und molekularer daten, insbesondere der DNS-verwandtschaften bei DNS: DNS-hybridisierung in vitro und vorschlag einer neuen familie, Pasteurellaceae [Inaug. Diss.], Philipps-Universit¨at, Marburg, Germany, 1979. [11] W. Mannheim, S. Pohl, and R. Holl¨ander, “On the taxonomy of Actinobacillus, Haemophilus, and Pasteurella: DNA base composition, respiratory quinones, and biochemical reactions of representative collection cultures (author’s transl),” Zentralblatt f¨ur Bakteriologie A, vol. 246, no. 4, pp. 512–540, 1979.

International Journal of Genomics [12] F. E. Dewhirst, B. J. Paster, I. Olsen, and G. J. Fraser, “Phylogeny of 54 representative strains of species in the family Pasteurellaceae as determined by comparison of 16S rRNA sequences,” Journal of Bacteriology, vol. 174, no. 6, pp. 2002–2013, 1992. [13] I. Olsen, F. E. Dewhirst, B. J. Paster, and H. J. Busse, “Family I. Pasteurellaceae,” in Bergey’s Manual of Systematic Bacteriology, D. J. Brenner, N. R. Krieg, G. M. Garrity, and J. T. Staley, Eds., vol. 2, pp. 851–856, Springer, New York, NY, USA, 2nd edition, 2005. [14] F. E. Dewhirst, B. J. Paster, I. Olsen, and G. J. Fraser, “Phylogeny of the Pasteurellaceae as determined by comparison of 16S ribosomal ribonucleic acid sequences,” Zentralblatt f¨ur Bakteriologie, vol. 279, no. 1, pp. 35–44, 1993. [15] H. Christensen, P. Kuhnert, J. E. Olsen, and M. Bisgaard, “Comparative phylogenies of the housekeeping genes atpD, infB and rpoB and the 16S rRNA gene within the Pasteurellaceae,” International Journal of Systematic and Evolutionary Microbiology, vol. 54, no. 5, pp. 1601–1609, 2004. [16] B. Korczak, H. Christensen, S. Emler, J. Frey, and P. Kuhnert, “Phylogeny of the family Pasteurellaceae based on rpoB sequences,” International Journal of Systematic and Evolutionary Microbiology, vol. 54, no. 4, pp. 1393–1399, 2004. [17] P. Kuhnert and B. M. Korczak, “Prediction of whole-genome DNA-DNA similarity, determination of G + C content and phylogenetic analysis within the family Pasteurellaceae by multilocus sequence analysis (MLSA),” Microbiology, vol. 152, no. 9, pp. 2537–2548, 2006. [18] H. Christensen, P. Kuhnert, H. J. Busse, W. C. Frederiksen, and M. Bisgaard, “Proposed minimal standards for the description of genera, species and subspecies of the Pasteurellaceae,” International Journal of Systematic and Evolutionary Microbiology, vol. 57, no. 1, pp. 166–178, 2007. [19] M. P. D. Bonaventura, E. K. Lee, R. DeSalle, and P. J. Planet, “A whole-genome phylogeny of the family Pasteurellaceae,” Molecular Phylogenetics and Evolution, vol. 54, no. 3, pp. 950– 956, 2010. [20] H. S. Naushad and R. S. Gupta, “Molecular signatures (conserved indels) in protein sequences that are specific for the order Pasteurellales and distinguish two of its main clades,” Antonie van Leeuwenhoek, vol. 101, no. 1, pp. 105–124, 2012. [21] N. Nørskov-Lauritsen and M. Kilian, “Reclassification of Actinobacillus actinomycetemcomitans, Haemophilus aphrophilus, Haemophilus paraphrophilus and Haemophilus segnis as Aggregatibacter actinomycetemcomitans gen. nov., comb. nov., Aggregatibacter aphrophilus comb. nov. and Aggregatibacter segnis comb. nov., and emended description of Aggregatibacter aphrophilus to include V factor-dependent and V factorindependent isolates,” International Journal of Systematic and Evolutionary Microbiology, vol. 56, no. 9, pp. 2135–2146, 2006. [22] P. J. Blackall, H. Christensen, T. Beckenham, L. L. Blackall, and M. Bisgaard, “Reclassification of Pasteurella gallinarum, [Haemophilus] paragallinarum, Pasteurella avium and Pasteurella volantium as Avibacterium gallinarum gen. nov., comb. nov., Avibacterium paragallinarum comb. nov., Avibacterium avium comb. nov. and Avibacterium volantium comb. nov,” International Journal of Systematic and Evolutionary Microbiology, vol. 55, no. 1, pp. 353–362, 2005. [23] Ø. Angen, P. Ahrens, P. Kuhnert, H. Christensen, and R. Mutters, “Proposal of Histophilus somni gen. nov., sp. nov. for the three species incertae sedis ‘Haemophilus somnus’, ‘Haemophilus agni’ and ‘Histophilus ovis’,” International Journal of Systematic and Evolutionary Microbiology, vol. 53, no. 5, pp. 1449–1456, 2003.

International Journal of Genomics [24] S. Pohl, H. U. Bertschinger, W. Frederiksen, and W. Mannheim, “Transfer of Haemophilus pleuropneumoniae and the Pasteurella haemolytica-like organism causing porcine necrotic pleuropneumonia to the genus Actinobacillus (Actinobacillus pleuropneumoniae comb. nov.) on the basis of phenotypic and deoxyribonucleic acid relatedness,” International Journal of Systematic Bacteriology, vol. 33, no. 3, pp. 510–514, 1983. [25] H. Christensen, M. Bisgaard, A. M. Bojesen, R. Mutters, and J. E. Olsen, “Genetic relationship among avian isolates classified as Pasteurella haemolytica. ’Actinobacillus salpingitidis’ or Pasteurella anatis with proposal of Gallibacterium anatis gen. nov., comb. nov. and description of additional genomospecies with Gallibacterium gen. nov,” International Journal of Systematic and Evolutionary Microbiology, vol. 53, no. 1, pp. 275–287, 2003. [26] P. J. Blackall, A. M. Bojesen, H. Christensen, and M. Bisgaard, “Reclassification of [Pasteurella] trehalosi as Bibersteinia trehalosi gen. nov., comb. nov,” International Journal of Systematic and Evolutionary Microbiology, vol. 57, no. 4, pp. 666–674, 2007. [27] R. Mutters, S. Pohl, and W. Mannheim, “Transfer of Pasteurella ureae Jones 1962 to the genus Actinobacillus Brumpt 1910: Actinobacillus ureae comb. nov.,” International Journal of Systematic Bacteriology, vol. 36, no. 2, pp. 343–344, 1986. [28] H. Christensen and P. Kuhnert, “International committee on systematics of prokaryotes. Subcommittee on the taxonomy of Pasteurellaceae. Minutes of the meetings, 25 August 2011, Elsinore, Denmark,” International Journal of Systematic and Evolutionary Microbiology, vol. 62, no. 1, pp. 257–258, 2012. [29] J. Hedegaard, H. Okkels, B. Bruun, M. Kilian, K. K. Mortensen, and N. Nørskov-Lauritsen, “Phylogeny of the genus Haemophilus as determined by comparison of partial infB sequences,” Microbiology, vol. 147, no. 9, pp. 2599–2609, 2001. [30] N. Nørskov-Lauritsen, B. Bruun, and M. Kilian, “Multilocus sequence phylogenetic study of the genus Haemophilus with description of Haemophilus pittmaniae sp. nov,” International Journal of Systematic and Evolutionary Microbiology, vol. 55, no. 1, pp. 449–456, 2005. [31] V. Cattoir, O. Lemenand, J.-L. Avril, and O. Gaillot, “The sodA gene as a target for phylogenetic dissection of the genus Haemophilus and accurate identification of human clinical isolates,” International Journal of Medical Microbiology, vol. 296, no. 8, pp. 531–540, 2006. [32] M. Kilian, “Genus III. Haemophilus,” in Bergey’s Manual of Systematic Bacteriology, D. J. Brenner, N. R. Krieg, G. M. Garrity, and J. T. Staley, Eds., vol. 2, pp. 883–904, Springer, New York, NY, USA, 2nd edition, 2005. [33] N. Nørskov-Lauritsen, B. Bruun, C. Andersen, and M. Kilian, “Identification of haemolytic Haemophilus species isolated from human clinical specimens and description of Haemophilus sputorum sp. nov,” International Journal of Medical Microbiology, vol. 302, no. 2, pp. 78–83, 2012. [34] R. D. Fleischmann, M. D. Adams, O. White et al., “Wholegenome random sequencing and assembly of Haemophilus influenzae Rd,” Science, vol. 269, no. 5223, pp. 496–512, 1995. [35] A. R. Wattam, D. Abraham, O. Dalay et al., “PATRIC, the bacterial bioinformatics database and analysis resource,” Nucleic Acids Research, vol. 42, no. 1, pp. D581–D591, 2014. [36] NCBI, NCBI Genome Database, 2014, http://www.ncbi.nlm.nih .gov/genome/. [37] NCBI, NCBI Nucleotide Database, 2014, http://www.ncbi.nlm .nih.gov/nuccore/.

13 [38] R. C. Edgar, “MUSCLE: multiple sequence alignment with high accuracy and high throughput,” Nucleic Acids Research, vol. 32, no. 5, pp. 1792–1797, 2004. [39] K. Tamura, G. Stecher, D. Peterson, A. Filipski, and S. Kumar, “MEGA6: molecular evolutionary genetics analysis version 6.0,” Molecular Biology and Evolution, vol. 30, no. 12, pp. 2725–2729, 2013. [40] S. J. Foote, J. T. Boss´e, A. B. Bouevitch, P. R. Langford, N. M. Young, and J. H. E. Nash, “The complete genome sequence of Actinobacillus pleuropneumoniae L20 (serotype 5b),” Journal of Bacteriology, vol. 190, no. 4, pp. 1495–1496, 2008. [41] Z. Xu, Y. Zhou, L. Li et al., “Genome biology of Actinobacillus pleuropneumoniae JL03, an isolate of serotype 3 prevalent in China,” PLoS ONE, vol. 3, no. 1, Article ID e1450, 2008. [42] M. P. Di Bonaventura, R. DeSalle, M. Pop et al., “Complete genome sequence of Aggregatibacter (Haemophilus) aphrophilus NJ8700,” Journal of Bacteriology, vol. 191, no. 14, pp. 4693–4694, 2009. [43] C. Chen, W. Kittichotirat, Y. Si, and R. Bumgarner, “Genome sequence of Aggregatibacter actinomycetemcomitans serotype c strain D11S-1,” Journal of Bacteriology, vol. 191, no. 23, pp. 7378– 7379, 2009. [44] C. Chen, W. Kittichotirat, W. Chen, J. S. Downey, Y. Si, and R. Bumgarner, “Genome sequence of naturally competent Aggregatibacter actinomycetemcomitans serotype a strain D7S1,” Journal of Bacteriology, vol. 192, no. 10, pp. 2643–2644, 2010. [45] T. J. Johnson, C. Fernandez-Alarcon, A. M. Bojesen, L. K. Nolan, D. W. Trampel, and T. Seemann, “Complete genome sequence of Gallibacterium anatis strain UMN179, isolated from a laying hen with peritonitis,” Journal of Bacteriology, vol. 193, no. 14, pp. 3676–3677, 2011. [46] I. K. Jordan, A. B. Conley, I. V. Antonov et al., “Genome sequences for five strains of the emerging pathogen Haemophilus haemolyticus,” Journal of Bacteriology, vol. 193, no. 20, pp. 5879–5880, 2011. [47] F. R. Strouts, P. Power, N. J. Croucher et al., “Lineage-specific virulence determinants of Haemophilus influenzae biogroup aegyptius,” Emerging Infectious Diseases, vol. 18, no. 3, pp. 449– 457, 2012. [48] J. S. Hogg, F. Z. Hu, B. Janto et al., “Characterization and modeling of the Haemophilus influenzae core and supragenomes based on the complete genomic sequences of Rd and 12 clinical nontypeable strains,” Genome Biology, vol. 8, no. 6, article R103, 2007. [49] A. Harrison, D. W. Dyer, A. Gillaspy et al., “Genomic sequence of an otitis media isolate of nontypeable Haemophilus influenzae: comparative study with H. influenzae serotyped, strain KW20,” Journal of Bacteriology, vol. 187, no. 13, pp. 4627–4636, 2005. [50] M. Yue, F. Yang, J. Yang et al., “Complete genome sequence of Haemophilus parasuisSH0165,” Journal of Bacteriology, vol. 191, no. 4, pp. 1359–1360, 2009. [51] J. F. Challacombe, A. J. Duncan, T. S. Brettin et al., “Complete genome sequence of Haemophilus somnus (Histophilus somni) strain 129Pt and comparison to Haemophilus ducreyi 35000HP and Haemophilus influenzae Rd,” Journal of Bacteriology, vol. 189, no. 5, pp. 1890–1898, 2007. [52] S. H. Hong, J. S. Kim, S. Y. Lee et al., “The genome sequence of the capnophilic rumen bacterium Mannheimia succiniciproducens,” Nature Biotechnology, vol. 22, no. 10, pp. 1275–1281, 2004.

14 [53] J. Gioia, X. Qin, H. Jiang et al., “The genome sequence of Mannheimia haemolytica A1: insights into virulence, natural competence, and Pasteurellaceae phylogeny,” Journal of Bacteriology, vol. 188, no. 20, pp. 7257–7266, 2006. [54] B. J. May, Q. Zhang, L. L. Li, M. L. Paustian, T. S. Whittam, and V. Kapur, “Complete genomic sequence of Pasteurella multocida, Pm70,” Proceedings of the National Academy of Sciences of the United States of America, vol. 98, no. 6, pp. 3460–3465, 2001. [55] R. C. Edgar, “Search and clustering orders of magnitude faster than BLAST,” Bioinformatics, vol. 26, no. 19, Article ID btq461, pp. 2460–2461, 2010. [56] K. Katoh and D. M. Standley, “MAFFT multiple sequence alignment software version 7: improvements in performance and usability,” Molecular Biology and Evolution, vol. 30, no. 4, pp. 772–780, 2013. [57] M. N. Price, P. S. Dehal, and A. P. Arkin, “FastTree 2— approximately maximum-likelihood trees for large alignments,” PLoS ONE, vol. 5, no. 3, Article ID e9490, 2010. [58] S. Whelan and N. Goldman, “A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach,” Molecular Biology and Evolution, vol. 18, no. 5, pp. 691–699, 2001. [59] R. S. Gupta, “Identification of conserved indels that are useful for classification and evolutionary studies,” in Methods in Microbiology, vol. 41, pp. 153–182, Academic Press, 2014. [60] F. Jeanmougin, J. D. Thompson, M. Gouy, D. G. Higgins, and T. J. Gibson, “Multiple sequence alignment with Clustal X,” Trends in Biochemical Sciences, vol. 23, no. 10, pp. 403–405, 1998. [61] E. Stackebrandt and J. Ebers, “Taxonomic parameters revisited: tarnished gold standards,” Microbiology Today, vol. 33, no. 4, p. 152, 2006. [62] K. T. Konstantinidis and E. Stackebrandt, “Defining taxonomic ranks,” in The Prokaryotes, pp. 229–254, Springer, 2013. [63] R. J. Redfield, W. A. Findlay, J. Boss´e, J. S. Kroll, A. D. S. Cameron, and J. H. E. Nash, “Evolution of competence and DNA uptake specificity in the Pasteurellaceae,” BMC Evolutionary Biology, vol. 6, article 82, 2006. [64] A. Rokas, B. I. Williams, N. King, and S. B. Carroll, “Genomescale approaches to resolving incongruence in molecular phylogenies,” Nature, vol. 425, no. 6960, pp. 798–804, 2003. [65] D. Wu, P. Hugenholtz, K. Mavromatis et al., “A phylogenydriven genomic encyclopaedia of Bacteria and Archaea,” Nature, vol. 462, no. 7276, pp. 1056–1060, 2009. [66] P. Yilmaz, L. W. Parfrey, P. Yarza et al., “The SILVA and ’sll-species living tree project (LTP)’ taxonomic frameworks,” Nucleic Acids Research, 2013. [67] B. Gao, R. Mohan, and R. S. Gupta, “Phylogenomics and protein signatures elucidating the evolutionary relationships among the Gammaproteobacteria,” International Journal of Systematic and Evolutionary Microbiology, vol. 59, no. 2, pp. 234–247, 2009. [68] A. M. Cuti˜no-Jim´enez, M. Martins-Pinheiro, W. C. Lima, A. Mart´ın-Tornet, O. G. Morales, and C. F. M. Menck, “Evolutionary placement of Xanthomonadales based on conserved protein signature sequences,” Molecular Phylogenetics and Evolution, vol. 54, no. 2, pp. 524–534, 2010. [69] H. S. Naushad and R. S. Gupta, “Phylogenomics and molecular signatures for species from the plant pathogen-containing order Xanthomonadales,” PLoS ONE, vol. 8, no. 2, Article ID e55216, 2013.

International Journal of Genomics [70] R. S. Gupta, “Applications of conserved indels for understanding microbial phylogeny,” in Molecular Phylogeny of Microorganisms, A. Oren and R. T. Papke, Eds., pp. 135–150, Caister Academic Press, Norfolk, UK, 2010. [71] H. S. Naushad, B. Lee, and R. S. Gupta, “Conserved signature indels and signature proteins as novel tools for understanding microbial phylogeny and systematics: Identification of molecular signatures that are specific for the phytopathogenic genera Dickeya, Pectobacterium and Brenneria,” International Journal of Systematic and Evolutionary Microbiology, vol. 64, no. 2, pp. 366–383, 2014. [72] R. S. Gupta, “Protein phylogenies and signature sequences: a reappraisal of evolutionary relationships among archaebacteria, eubacteria, and eukaryotes,” Microbiology and Molecular Biology Reviews, vol. 62, no. 4, pp. 1435–1491, 1998. [73] A. Rokas and P. W. H. Holland, “Rare genomic changes as a tool for phylogenetics,” Trends in Ecology and Evolution, vol. 15, no. 11, pp. 454–459, 2000. [74] R. S. Gupta and R. Lali, “Molecular signatures for the phylum Aquificae and its different clades: Proposal for division of the phylum Aquificae into the emended order Aquificales, containing the families Aquificaceae and Hydrogenothermaceae, and a new order Desulfurobacteriales ord. nov., containing the family Desulfurobacteriaceae,” Antonie van Leeuwenhoek, vol. 104, no. 3, pp. 349–368, 2013. [75] R. S. Gupta, S. Mahmood, and M. Adeolu, “A phylogenomic and molecular signature based approach for characterization of the phylum spirochaetes and its major clades: proposal for a taxonomic revision of the phylum,” Frontiers in Microbiology, vol. 4, article 217, 2013. [76] M. Adeolu and R. S. Gupta, “A phylogenomic and molecular marker based proposal for the division of the genus Borrelia into two genera: the emended genus Borrelia containing only the members of the relapsing fever Borrelia, and the genus Borreliella gen. nov. containing the members of the Lyme disease Borrelia (Borrelia burgdorferi sensu lato complex),” Antonie van Leeuwenhoek, vol. 105, no. 6, pp. 1049–1072, 2014. [77] V. Bhandari and R. S. Gupta, “Molecular signatures for the phylum (class) Thermotogae and a proposal for its division into three orders (Thermotogales, Kosmotogales ord. Nov. and Petrotogales ord. Nov.) containing four families (Thermotogaceae, Fervidobacteriaceae fam. Nov., Kosmotogaceae fam. Nov. and Petrotogaceae fam. Nov.) and a new genus Pseudothermotoga gen. Nov. with five new combinations,” Antonie van Leeuwenhoek, International Journal of General and Molecular Microbiology, vol. 105, no. 1, pp. 143–168, 2014. [78] I. Olsen, “Recent approaches to the chemotaxonomy of the Actinobacillus-Haemophilus-Pasteurella group (family Pasteurellaceae),” Oral Microbiology and Immunology, vol. 8, no. 6, pp. 327–336, 1993. [79] N. Hayashimoto, M. Ueno, A. Tkakura, and T. Itoh, “Biochemical characterization and phylogenetic analysis based on 16S rRNA sequences for V-factor dependent members of Pasteurellaceae derived from laboratory rats,” Current Microbiology, vol. 54, no. 6, pp. 419–423, 2007. [80] R. Mutters, W. Frederiksen, and W. Mannheim, “Taxonomy of the group,” in Pasturella and Pasteurellosis, C. Adlam and J. M. Rutter, Eds., pp. 3–34, Academic Press, London, UK, 1989. [81] K. P. Snipes and E. L. Biberstein, “Pasteurella testudinis sp. nov.: a parasite of desert tortoises (Gopherus agassizi),” International Journal of Systematic Bacteriology, vol. 32, no. 2, pp. 201–210, 1982.

International Journal of Genomics [82] Ø. Angen, P. Ahrens, and M. Bisgaard, “Phenotypic and genotypic characterization of Mannheimia (Pasteurella) haemolytica-like strains isolated from diseased animals in Denmark,” Veterinary Microbiology, vol. 84, no. 1-2, pp. 103–114, 2002. [83] N. Z. Ahmod, R. S. Gupta, and H. N. Shah, “Identification of a Bacillus anthracis specific indel in the yeaC gene and development of a rapid pyrosequencing assay for distinguishing B. anthracis from the B. cereus group,” Journal of Microbiological Methods, vol. 87, no. 3, pp. 278–285, 2011. [84] S. Y. Wong, A. Paschos, R. S. Gupta, and H. E. Schellhorn, “Insertion/deletion-based approach for the detection of Escherichia coli O157:H7 in freshwater environments,” Environmental Science & Technology, vol. 48, no. 19, pp. 11462–11470, 2014. [85] B. Gao and R. S. Gupta, “Phylogenetic framework and molecular signatures for the main clades of the phylum Actinobacteria,” Microbiology and Molecular Biology Reviews, vol. 76, no. 1, pp. 66–112, 2012. [86] M. Howard-Azzeh, L. Shamseer, H. E. Schellhorn, and R. S. Gupta, “Phylogenetic analysis and molecular signatures defining a monophyletic clade of heterocystous cyanobacteria and identifying its closest relatives,” Photosynthesis Research, vol. 122, no. 2, pp. 171–185, 2014.

15

Phylogenomic and molecular demarcation of the core members of the polyphyletic pasteurellaceae genera actinobacillus, haemophilus, and pasteurella.

The genera Actinobacillus, Haemophilus, and Pasteurella exhibit extensive polyphyletic branching in phylogenetic trees and do not represent coherent c...
948KB Sizes 1 Downloads 7 Views

Recommend Documents