
Lack of Overt Genome Reduction in the Bryostatin-Producing Bryozoan Symbiont “Candidatus Endobugula sertula”

Ian J. Miller,ᵃ Niti Vanee,ᵇ Stephen S. Fong,ᵇ Grace E. Lim-Fong,ᶜ Jason C. Kwanᵃ

Pharmaceutical Sciences Division, University of Wisconsin–Madison, Madison, Wisconsin, USAᵃ; Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, Virginia, USAᵇ; Department of Biology, Randolph-Macon College, Ashland, Virginia, USAᶜ

ABSTRACT

The uncultured bacterial symbiont “Candidatus Endobugula sertula” is known to produce cytotoxic compounds called bryostatins, which protect the larvae of its host, Bugula neritina. The symbiont has never been successfully cultured, and it was thought that its genome might be significantly reduced. Here, we took a shotgun metagenomics and metatranscriptomics approach to assemble and characterize the genome of “Ca. Endobugula sertula.” We found that it had specific metabolic deficiencies in the biosynthesis of certain amino acids but few other signs of genome degradation, such as small size, abundant pseudogenes, and low coding density. We also identified homologs to genes associated with insect pathogenesis in other gammaproteobacteria, and these genes may be involved in host-symbiont interactions and vertical transmission. Metatranscriptomics revealed that these genes were highly expressed in a reproductive host, along with bry genes for the biosynthesis of bryostatins. We identified two new putative bry genes fragmented from the main bry operon, accounting for previously missing enzymatic functions in the pathway. We also determined that a gene previously assigned to the pathway, bryS, is not expressed in reproductive tissue, suggesting that it is not involved in the production of bryostatins. Our findings suggest that “Ca. Endobugula sertula” may be able to live outside the host if its metabolic deficiencies are alleviated by medium components, which is consistent with recent findings that it may be possible for “Ca. Endobugula sertula” to be transmitted horizontally.

IMPORTANCE

The bryostatins are potent protein kinase C activators that have been evaluated in clinical trials for a number of indications, including cancer and Alzheimer’s disease. There is, therefore, considerable interest in securing a renewable supply of these compounds, which is currently only possible through aquaculture of Bugula neritina and total chemical synthesis. However, these approaches are labor-intensive and low-yielding and thus preclude the use of bryostatins as viable therapeutic agents. Our genome assembly and transcriptome analysis for “Ca. Endobugula sertula” shed light on the metabolism of this symbiont, potentially aiding isolation and culturing efforts. Our identification of additional bry genes may also facilitate efforts to express the complete pathway heterologously.

Received 15 June 2016. Accepted 25 August 2016. Accepted manuscript posted online 2 September 2016.
Citation: Miller IJ, Vanee N, Fong SS, Lim-Fong GE, Kwan JC. 2016. Lack of overt genome reduction in the bryostatin-producing bryozoan symbiont “Candidatus Endobugula sertula.” Appl Environ Microbiol 82:6573–6583. doi:10.1128/AEM.01800-16.
Editor: H. L. Drake, University of Bayreuth
Address correspondence to Jason C. Kwan, [email protected].
Supplemental material for this article may be found at http://dx.doi.org/10.1128/AEM.01800-16.
Copyright © 2016, American Society for Microbiology. All Rights Reserved.
Applied and Environmental Microbiology, November 2016, Volume 82, Number 22 (aem.asm.org)

Marine invertebrates, such as tunicates and sponges, are known to harbor symbiotic communities of bacteria. Because these animals are sessile and have limited physical defenses, their microbiomes are thought to serve defensive functions, and bacterial symbionts in these invertebrates have been implicated in the production of many bioactive small molecules (1, 2). Such small molecules can possess therapeutically relevant activities, so there is great interest in studying the biosynthetic potential and symbiotic functions of these defensive microorganisms (1, 2). However, most symbionts (and most environmental bacteria in general [1, 3, 4]) are difficult to culture, which renders their bioactive metabolites challenging to access on an industrial or clinically useful scale. Due to fastidious or cryptic growth requirements, the study of these systems is possible only through culture-independent methods, such as direct sequencing of environment-derived DNA (termed shotgun metagenomics).

In the present work, we used shotgun metagenomics to gain a greater understanding of the uncultured bacterial symbiont “Candidatus Endobugula sertula,” which resides within the marine bryozoan Bugula neritina (Fig. 1A), where it is known to produce defensive compounds called bryostatins (5–8) (Fig. 1B). Although the adult bryozoan is covered in a protective layer of chitin, the larvae require chemical defense after release (9–11), and high concentrations of “Ca. Endobugula sertula” have been observed within free-swimming larvae (10). Despite many unsuccessful attempts to culture the producing organism in laboratory settings, the bryostatins remain a target of therapeutic interest (7), as they are cytotoxic, highly potent protein kinase C activators that have been investigated in clinical trials for the treatment of cancer, Alzheimer’s disease, and HIV infection (Fig. 1B) (7).

The bry pathway for bryostatin production is a trans-acyltransferase (AT)-type polyketide synthase (PKS) pathway (8, 12). PKSs are related to fatty acid synthases, and they construct a molecule in a similar fashion from two-carbon units derived from malonate (13), using multiple enzymatic domains in a typical PKS protein. Most PKS pathways include AT domains within PKS proteins that are used to load the activated S-coenzyme A (CoA) thioester form of malonate, but in trans-AT systems, the AT exists on a separate protein. Portions of the bry pathway for the bryostatins were recovered previously using clone library methods (6), but the complete genome sequence, including missing pieces of the bry pathway, has remained elusive due to recalcitrance of the symbiont to culture. A number of factors are thought to contribute to this recalcitrance, including the possibility of substantial genome reduction (7), which has been reported in other invertebrate symbioses (14–16). Indeed, the divergence of symbionts in genetically isolated hosts (17), as well as restriction to B. neritina and allied species, the vertical mode of symbiont transmission, and the ongoing inability to isolate and culture “Ca. Endobugula sertula” suggested a lifestyle of extreme host dependence that can lead to gene loss and genome reduction (18). However, we recently found that under some circumstances, “Ca. Endobugula sertula” may be horizontally transferred between B. neritina individuals (19), which is not consistent with strict host dependency that is associated with extreme genome reduction (18).

Our objective was to assemble the genome of “Ca. Endobugula sertula” in order to determine whether the symbiosis has evolved to a state of codependency, evidenced by bacterial genome degradation, as in some other defensive symbioses (14–16). Shotgun DNA and RNA sequencing revealed a genome with few signs of reduction, largely intact mainstream metabolic pathways, and a putative mechanism of vertical transmission of “Ca. Endobugula sertula” to host larvae.

FIG 1 Morphology and chemotype of the Bugula neritina shallow genotype (type S). (A) Micrograph showing the morphology of zooids and ovicells in a B. neritina colony. Colonies consist of individual animals (zooids) that are specialized for feeding, substrate attachment, or reproduction. Food particles are captured by feeding zooids using pulsating projections called lophophores. Reproductive zooids brood developing embryos in chambers called ovicells before they are released as free-swimming larvae. (B) Bryostatin structures found in the shallow genotype (type S) Bugula neritina (73, 74). Bryostatins 4 to 11, 13, 14, and 16 to 18 are associated with this genotype, and these structures vary at positions R1 and R2, with some possessing the ester side chains shown. Bryostatins 19 and 20 possess an extra cyclization of one β-branch, and they are also found in type S individuals.

MATERIALS AND METHODS

Collection of biological material. Collection of samples AB1_ovicells and MHD_larvae has been described elsewhere (76). Briefly, an adult individual (AB1) was collected by hand from the sides of a floating dock in November 2013 near Morehead City, NC, at site Atlantic Beach (AB; 34°42′24.527″N 76°44′18.286″W). Mature larval brooding chambers (ovicells) were dissected from AB1 and combined to form sample AB1_ovicells. Additional B. neritina individuals were collected at Morehead City Docks (MHD; 34°43′8.879″N 76°42′49.838″W), also in November 2013. Approximately 20 of these B. neritina colonies were used for combined larval collection. The collected larvae were combined to form sample MHD_larvae. Both samples were genotyped using Linneman et al.’s protocol (19) and found to be the “shallow” (S) genotype. Both samples were preserved in RNAlater and stored at −80°C.

Nucleic acid extraction, sequencing, and assembly. DNA was extracted from RNAlater-preserved tissue samples using a procedure previously optimized for tunicate shotgun metagenomics (20). A portion of AB1_ovicells was ground with a mortar and pestle under liquid nitrogen before being resuspended in 5 ml of 2 mg/ml lysozyme in Tris-EDTA (TE) buffer. A portion of MHD_larvae was added directly to 5 ml of 2 mg/ml lysozyme in TE. In both cases, extractions were incubated at 30°C, with shaking, for 1 h. After this time, 1.2 ml of 0.5 M EDTA was added to each tube along with proteinase K (final concentration, 0.2 mg/ml; Qiagen), and mixtures were incubated at 30°C for 5 min. After the addition of 650 µl of 10% SDS, the mixtures were incubated at 37°C with shaking overnight. NaCl (1.2 ml of 5 M) was then added to each tube, along with 1.0 ml of cetyltrimethylammonium bromide (CTAB)-NaCl solution (10% CTAB in 0.7 M NaCl), and the tubes were incubated at 65°C for 20 min. Mixtures were extracted twice with 1:1 phenol-chloroform, and 1 volume of isopropanol was added to the aqueous fraction, which was then stored at 4°C overnight. Tubes were spun down at 3,220 × g for 30 min at 0°C.
Supernatants were carefully removed, and 2 ml of 70% ethanol in water was added to each tube before they were spun down again. The supernatants were removed, and tubes were inverted for 20 min before 500 µl of TE was added. The tubes were left overnight at 4°C to allow DNA to dissolve, before extractions were subjected to repurification by Genomic-tip 100/G (Qiagen), according to the manufacturer’s instructions. This procedure yielded 9.4 µg of DNA from AB1_ovicells and 882 ng of DNA from MHD_larvae (both in 500 µl, with the concentration measured by the PicoGreen assay). For metagenomic analysis, TruSeq libraries were prepared with ~300-bp inserts and then subjected to sequencing in 2 × 100-bp runs on a HiSeq 2000 (Illumina).

RNA was extracted from approximately 40 mg of AB1_ovicells. After grinding with a mortar and pestle under liquid nitrogen, the material was resuspended in 600 µl of buffer RLT (Qiagen) containing 6 µl of β-mercaptoethanol. The mixture was homogenized by drawing a sterile 20-G needle up and down 15 times before being spun down at 16,800 × g for 3 min. Total RNA was then purified from the crude lysate using the RNeasy minikit (Qiagen), utilizing the optional DNase step. This procedure yielded 1.6 µg of RNA (in 100 µl). The resulting RNA was flash-frozen in liquid nitrogen and stored at −80°C. Prokaryotic and eukaryotic rRNA was depleted with the RiboZero rRNA removal (epidemiology) kit (Epicentre), and eukaryotic polyadenylated transcripts were depleted with poly(T) beads. RNA in the resulting eluate was recovered and purified with Agencourt RNAClean XP beads. Stranded RNA sequencing (RNA-seq) Illumina libraries were prepared with ~300-bp inserts and subjected to two Illumina HiSeq 2000 sequencing runs, one at 2 × 101 bp and one at 2 × 151 bp. Sequence yields are shown in Table S1 in the supplemental material.

Reads were quality filtered and then assembled with SPAdes (21), and contigs were taxonomically classified by using MEGAN (22) to process the results of BLASTP (protein-protein Basic Local Alignment Search Tool) searches of all predicted open reading frames (ORFs) against the NCBI nonredundant protein database (NR), as previously described (76). These classifications allowed bacterial contigs to be separated from eukaryotic and “unclassified kingdom” contigs likely originating from the host genome. AB1_ovicells and MHD_larvae metagenomes were compared by aligning raw MHD_larvae sequence reads to bacterial AB1_ovicells contigs of >3 kbp in length. Contigs with >1× read coverage in the resulting alignment were visualized in R (23) as points on a graph of G+C% versus coverage (see Fig. S1 in the supplemental material). Outlier contigs to the single apparent cluster were removed using mclust (24) in R (23). This cluster was determined to be the “Ca. Endobugula sertula” genome, based on phylogeny and inclusion of known “Ca. Endobugula sertula” genes (see Results).

PCR amplification and screening to confirm connectivity between contigs. Primers (see Table S3 in the supplemental material) were designed, manually and using the Primer3 algorithm (25), to have an annealing temperature of ~55°C and were used to test various aspects of the genomic assemblies. For a 10-µl reaction mixture, the following volumes and concentrations of each component were used: 5 µl of 2× KOD buffer, 2 µl of 2 mM deoxynucleoside triphosphates (dNTPs), 0.2 µl of KOD Xtreme Hot Start DNA polymerase (Merck Group, Darmstadt, Germany), 1 µl of 3 µM forward primer, 1 µl of 3 µM reverse primer, and 0.8 µl of template (1:10 dilution of the AB1_ovicells DNA extraction). Reactions were carried out on a Bio-Rad (Hercules, CA) C1000 Touch thermal cycler in 200-µl 8-strip tubes using a thermocycling program consisting of 94°C for 2 min; 35 cycles of 98°C for 10 s, 55°C for 30 s, and custom extension time (1 min per kbp expected product) at 68°C; and then 68°C for 10 min with an indefinite hold at 12°C upon thermocycling completion.

“Candidatus Endobugula sertula” genome annotation, RNA-seq alignment, and functional category assignment. RNA-seq data were filtered with SeqyClean (https://github.com/ibest/seqyclean), using the parameter “-polyat.” The resulting filtered reads were aligned with Bowtie 2 (26) to contigs in the “Ca. Endobugula sertula” assembly using the end-to-end alignment options “--very-sensitive --no-discordant --no-unal.” Contigs were annotated with the Prokka pipeline (27) and combined with RNA-seq alignment files in Geneious (Biomatters Ltd., San Francisco, CA) to visualize transcript abundance.
Reads aligned to each ORF were counted in Geneious, and for each gene, the number of reads per kilobase pair of gene per million reads aligning to annotated ORFs in the “Ca. Endobugula sertula” genome (RPKM) (28) was calculated. MEGAN (22) was used to assign functional Kyoto Encyclopedia of Genes and Genomes (KEGG) categories to predicted protein sequences. KEGG trees were uncollapsed two levels in MEGAN, and all assignments except for “organismal systems” and “human diseases” were exported to a .csv file (with the columns “read name” and “KEGG name”). Calculated RPKM values and the MEGAN .csv table were used to calculate proportions of the “Ca. Endobugula sertula” transcriptome that corresponded to each KEGG category. Where multiple KEGG categories were assigned to one predicted gene, that gene’s RPKM value was split equally among the assigned categories.

Homolog analysis in “Ca. Endobugula sertula.” Predicted genes in the draft “Ca. Endobugula sertula” assembly with homologs in other gammaproteobacteria (Fig. 2; see also Fig. S4 and Table S2 in the supplemental material) were identified with a procedure previously used to determine homologs conserved between an intracellular symbiont and its closest free-living relative (47). Predicted protein sequences from the “Ca. Endobugula sertula” assembly were used as queries in BLASTP searches against the NR database, and the ranks of proteins from target bacteria were tabulated with a custom script. Genes were counted as having homologs in the target sequence set if a target sequence had a rank of <100 in the BLAST results. The best BLAST hit from the target sequence set was counted as the homolog of the protein query from the “Ca. Endobugula sertula” assembly.

Bioinformatic analysis for metabolic modeling. Genes identified by annotation and homology were referenced to UniProt to associate and confirm Enzyme Commission (EC) numbers with each gene. The identified EC numbers were mapped to biochemical pathways and compared using BioCyc and KEGG pathways. The functional continuity of biochemical pathways was tested using a draft constraint-based model of “Ca. Endobugula sertula” containing 1,774 biochemical reactions using a biomass objective function and flux balance analysis (FBA).
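As an illustration of the RPKM normalization and the KEGG category apportioning described in the annotation section above, the following minimal Python sketch may be helpful. It is not the authors' script: the function names, input dictionaries, and toy values are hypothetical, and expressing each category as a fraction of the total RPKM of category-assigned genes is an assumption about how the proportions were normalized.

```python
from collections import defaultdict


def rpkm(read_counts, gene_lengths_bp):
    """Reads per kilobase pair of gene per million reads aligned to annotated ORFs."""
    total_reads = sum(read_counts.values())
    return {
        gene: count * 1e9 / (gene_lengths_bp[gene] * total_reads)
        for gene, count in read_counts.items()
    }


def category_proportions(rpkm_values, gene_categories):
    """Split each gene's RPKM equally among its categories, then normalize.

    Assumption: proportions are relative to genes with at least one category.
    """
    totals = defaultdict(float)
    for gene, categories in gene_categories.items():
        if not categories:
            continue  # genes without an assignment are ignored here
        share = rpkm_values[gene] / len(categories)
        for category in categories:
            totals[category] += share
    grand_total = sum(totals.values())
    return {category: value / grand_total for category, value in totals.items()}


# Toy example with two hypothetical genes.
counts = {"geneA": 100, "geneB": 300}
lengths = {"geneA": 1000, "geneB": 3000}
values = rpkm(counts, lengths)
proportions = category_proportions(
    values, {"geneA": ["X"], "geneB": ["X", "Y"]}
)
```

In this toy case both genes have the same RPKM, and because geneB is shared between two categories, category X receives three quarters of the assigned signal and category Y one quarter.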


[Fig. 2 graphic: (A) phylogenomic tree (scale bar, 0.1); (B) Venn diagram of protein homologs in “Ca. E. sertula” (n = 3,049); (C) Tc gene cluster on NODE28 aligned with the Yen-Tc locus (chi1, yenA1, yenA2, chi2, yenB, yenC1, yenC2; scale bar, 5 kb).]

FIG 2 Phylogenomic analysis of “Ca. Endobugula sertula.” (A) An approximately maximum-likelihood tree generated by FastTree 2 from 29 concatenated single-copy-marker gene protein sequences from the “Ca. Endobugula sertula” genome and 1,336 other reference genomes. Bootstrap proportions greater than 70% are expressed to the left of each node as a percentage of 1,000 replicates. (B) Venn diagram of protein-coding genes in the “Ca. Endobugula sertula” genome, showing homologs of predicted proteins in T. turnerae T7901, gammaproteobacterium IMCC1989, and Photorhabdus/Xenorhabdus. (C) Comparison of Tc gene cluster on NODE28 with a chitinase-containing Tc locus in the insect pathogen Y. entomophaga, showing corresponding regions of protein sequence homology between the two clusters (65). The levels of shading represent amino acid identity.
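The homolog counts summarized in panel B rest on the rank criterion given in Materials and Methods (a target sequence with a BLAST rank of <100, with the best-ranked target hit taken as the homolog). A minimal sketch of that rule is given below; the authors' custom script is not shown in the text, so the data layout, function name, and the convention that ranks start at 1 are assumptions made for illustration.

```python
def find_homologs(ranked_hits, target_taxa, max_rank=100):
    """Return the best target-set hit per query if its rank is below max_rank.

    ranked_hits maps each query protein to a list of (subject_id, taxon)
    tuples ordered from best to worst BLAST hit (hypothetical layout).
    """
    homologs = {}
    for query, hits in ranked_hits.items():
        for rank, (subject_id, taxon) in enumerate(hits, start=1):
            if rank >= max_rank:
                break  # only ranks below 100 are considered
            if taxon in target_taxa:
                homologs[query] = subject_id  # first match is the best hit
                break
    return homologs


# Toy example: 98 non-target hits pad the rankings.
TARGET = "Teredinibacter turnerae T7901"
filler = [("x%d" % i, "other") for i in range(98)]
hits = {
    "q1": [("s1", "other"), ("s2", TARGET), ("s3", TARGET)],
    "q2": filler + [("t99", TARGET)],                    # target at rank 99
    "q3": filler + [("x98", "other"), ("t100", TARGET)],  # target at rank 100
}
result = find_homologs(hits, {TARGET})
```

Here q1 and q2 are counted as having homologs, whereas q3 is not, because its only target hit falls at rank 100.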

Construction of “Ca. Endobugula sertula” phylogenomic tree. AMPHORA2 (30, 31) was used to scan the “Ca. Endobugula sertula” assembly for 104 phylogenetic marker genes, which were extracted and manually examined to resolve instances of multiple hits. Genes from gammaproteobacterium IMCC1989 (32) were manually added, as this bacterium was not present in the AMPHORA2 database. The marker genes were individually aligned to the internal reference database supplied with AMPHORA2. The set of individual marker alignments was filtered such that only reference genomes with >75% of the marker genes were retained, and then marker genes represented in <75% of these genomes were removed. The resulting 29 marker alignments were concatenated, and residues not aligning to AMPHORA2’s hidden Markov models (HMMs), signified by lowercase residues in the resulting alignment file, were removed from the alignment. Trees were then constructed with FastTree 2 (33) using the parameters “-gamma -slow -spr 10 -mlacc 3 -bionj.” After FastTree 2 runs were complete, accession numbers were substituted for strain designations according to entries in the RefSeq database. The resulting tree was rooted arbitrarily at the divergence of the phylum Deinococcus-Thermus and other bacteria, as others have done previously (30). The tree was manipulated using the Interactive Tree of Life server (34).

Accession number(s). Raw shotgun metagenome reads for AB1_ovicells and MHD_larvae were deposited in the Sequence Read Archive (SRA). The accession numbers for AB1_ovicells are SRR4020077, SRR4020078, SRR4020081, SRR4020082, SRR4020083, and SRR4020084; the accession numbers for MHD_larvae are SRR4020079, SRR4020080, SRR4020086, SRR4020087, and SRR4020088. Metatranscriptomic RNA-seq reads for AB1 were deposited in the SRA under accession numbers SRR4020085 and SRR4125604. The annotated draft genome for AB1_endobugula assembly is accessible through the NCBI under the accession number MDLC00000000. All these data sets are accessible under BioProject PRJNA322176 and BioSample numbers SAMN05039512 (AB1_ovicells) and SAMN05513359 (MHD_larvae).
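The two-step filtering of marker alignments and the removal of lowercase residues described for the phylogenomic tree can be sketched as follows. This is an illustrative sketch only and does not reproduce AMPHORA2 or the authors' workflow: the function names and data structures are hypothetical, and padding a missing marker with gap characters during concatenation is an assumption.

```python
def filter_markers(presence, n_markers, cutoff=0.75):
    """presence maps genome name -> set of marker genes detected in it."""
    # Step 1: retain genomes carrying more than `cutoff` of all marker genes.
    genomes = sorted(
        g for g, found in presence.items() if len(found) / n_markers > cutoff
    )
    if not genomes:
        return [], []
    # Step 2: drop markers represented in fewer than `cutoff` of those genomes.
    all_markers = set().union(*(presence[g] for g in genomes))
    markers = sorted(
        m for m in all_markers
        if sum(m in presence[g] for g in genomes) / len(genomes) >= cutoff
    )
    return genomes, markers


def strip_unaligned(aligned_seq):
    """Remove lowercase residues, i.e., positions not aligned to the HMM."""
    return "".join(ch for ch in aligned_seq if not ch.islower())


def concatenate(alignments, genomes, markers):
    """alignments maps marker -> {genome: aligned sequence}."""
    parts = {g: [] for g in genomes}
    for marker in markers:
        cleaned = {g: strip_unaligned(s) for g, s in alignments[marker].items()}
        width = max(len(s) for s in cleaned.values())
        for g in genomes:
            # Assumption: a genome lacking this marker is padded with gaps.
            parts[g].append(cleaned.get(g, "-" * width))
    return {g: "".join(p) for g, p in parts.items()}


# Toy example with five hypothetical markers (a to e).
presence = {
    "g1": {"a", "b", "c", "d", "e"},
    "g2": {"a", "b", "c", "d"},
    "g3": {"a", "b", "c", "d"},
    "g4": {"a", "b", "c", "d"},
    "g5": {"a"},
}
kept_genomes, kept_markers = filter_markers(presence, n_markers=5)
joined = concatenate(
    {"a": {"g1": "MKv", "g2": "MR"}, "b": {"g1": "AT"}}, ["g1", "g2"], ["a", "b"]
)
```

In the toy data, genome g5 is discarded in the first step and marker e in the second, which mirrors how 104 candidate markers could shrink to a smaller concatenated set.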

RESULTS

Assembly of “Ca. Endobugula sertula” genome. DNA from two separate samples was sequenced: free-swimming larvae from a pooled sample of B. neritina colonies (MHD_larvae), and ovicells dissected from a single adult colony (AB1_ovicells). Reference-based assembly of 16S sequences using EMIRGE (35, 36) reconstructed a 16S rRNA sequence with 99% identity to that of “Ca. Endobugula sertula” (type S strain, also termed BnSP; accession number AF006606 [37]) in both AB1_ovicells and MHD_larvae, but it suggested the presence of multiple bacteria in the AB1_ovicells sample that were not found in MHD_larvae. Through 16S rRNA amplicon sequencing, it was found that this “Ca. Endobugula sertula” sequence accounted for 2.4% and 5.6% of reads in AB1_ovicells and MHD_larvae, respectively, and apart from this symbiont, there was virtually no overlap in the wider microbiome between these two samples (76). Reads from both AB1_ovicells and MHD_larvae were assembled separately using SPAdes (21). Contigs from the AB1_ovicells assembly, which had generally higher-quality sequence assembly characteristics (larger overall assembly size, higher N50, etc.) than the pooled MHD_larvae sample (see Table S1 in the supplemental material), were classified taxonomically based on the homology of their open reading frames (ORFs) to the NCBI NR database. To simplify the overall metagenomic assembly of AB1_ovicells, contigs smaller than 3,000 bp and contigs that were classified as eukaryotic (~135 Mbp in total) or unclassified at the kingdom level based on these taxonomic assignments were removed. Because the EMIRGE-reconstructed 16S sequences suggested that “Ca. Endobugula” was the only shared bacterium between the AB1_ovicells and MHD_larvae assemblies, contigs with read coverage in both samples were identified, which should include sequences belonging to “Ca. Endobugula sertula” and any host genome contigs misclassified as bacteria. AB1_ovicells contigs that had MHD_larvae coverage appeared to form two discrete clusters based on G+C% and coverage (see Fig. S1 in the supplemental material). These two clusters were automatically separated using a technique where data are fitted to a certain number of groups based on discrete distribution patterns, known as normal-mixture modeling (24). The accuracy of this clustering approach was evaluated using the assigned ORF (see Fig. S2 in the supplemental material) and contig-based (see Fig. S3 in the supplemental material) taxonomy of these two clusters, as well as single-copy-marker analysis (Table 1). The contigs belonging to one of these groups (clusters 1 and 2) contained bry pathway components and had ORF taxonomic classifications consistent with previous 16S-based taxonomies of “Ca. Endobugula sertula,” such as known relatives gammaproteobacterium IMCC1989 (32) and Teredinibacter turnerae (38).

Phylogenomic and symbiotic characteristics of the “Candidatus Endobugula sertula” genome. The draft genome of “Ca. Endobugula sertula” is 3.4 Mbp in size (112 contigs), has 41.2% G+C content, and is estimated to be 100% complete by single-copy-marker gene analysis (39) (Table 1). The “Ca. Endobugula sertula” 16S rRNA gene sequence reconstructed by EMIRGE was joined to one of the putative “Ca. Endobugula sertula” contigs through PCR and Sanger sequencing. In addition, a whole-genome phylogenetic tree constructed from concatenated bacterial marker genes placed the genome in a clade with known “Ca. Endobugula sertula” relatives gammaproteobacterium IMCC1989, T. turnerae, Saccharophagus degradans, and Cellvibrio japonicus (Fig. 2A) (17, 38, 40). All of these relatives are free living except for T. turnerae, which is an intracellular symbiont of wood-feeding shipworms (38), although it should be noted that T. turnerae can be grown in the laboratory, in contrast to “Ca. Endobugula sertula.” “Candidatus Endobugula sertula” possesses many proteins with homologs in related gammaproteobacteria (Fig. 2B), but a large portion, including the bry pathway, is not shared with the closest free-living relatives. Teredinibacter turnerae (38), S. degradans (41), and C. japonicus (42) are all specialized in the degradation of complex polysaccharides, but “Ca. Endobugula sertula” does not appear to possess homologs of the ~100 glycosyl hydrolase (GH) domains found in T. turnerae, which are used to digest wood for its host. “Candidatus Endobugula sertula” also does not appear to possess homologs of the nitrogen fixation genes found in T. turnerae (38), which compensate for a nitrogen-poor wood diet. However, a group of genes were identified that showed homology to proteins found in Photorhabdus and/or Xenorhabdus species (Fig. 2B), which are predominantly insect pathogens that associate with nematode vectors (43). These genes included several toxin complex (Tc) gene clusters (Fig. 2C), which have been associated with both host specificity and pathogenesis (44). One of these loci, on contig NODE28, was found to contain two chitinase genes.

TABLE 1 “Candidatus Endobugula sertula” genome assembly characteristics

Genome featureᵃ                                             Value
No. of contigs                                              112
Assembly size (bp)                                          3,350,348
G+C content (%)                                             41.63
N50 (bp)                                                    49,663
Longest contig (bp)                                         153,360
Coverage                                                    6.2
No. of tRNA genes                                           49
No. of rRNA genes                                           1
No. of CDSs                                                 2,850
Coding density (%)                                          84.2
No. of conserved single-copy genes (expected/observed)      139/139
No. of repeated conserved single-copy genes                 1
No. of ORFs with functional annotation                      3,049
No. of ORFs without functional prediction                   1,576
Avg CDS length (bp)                                         972

ᵃ The coverage quoted here is k-mer coverage reported by the SPAdes assembler, where k = 77. CDSs, protein-coding genes.

Reductive evolution and metabolic and metatranscriptomic analysis. “Candidatus Endobugula sertula” has, to our knowledge, never been cultured in the laboratory, and it has been previously suggested that its genome might be reduced (7), similar to some long-term symbionts of insects (15, 18) and marine tunicates (14, 16). Although the “Ca. Endobugula sertula” genome is the smallest out of several close relatives (Table 2), there is little other evidence to indicate genome degradation or reduction. It is 3.4 Mbp in size, larger than the previously predicted 2 Mbp based on flow cytometry (7), and despite the high A+T content (~70%) of certain regions in the bryostatin pathway (6, 7), the genome as a whole is only 58.4% A+T.
Furthermore, the coding density of the “Ca. Endobugula sertula” genome is 85.1%, which falls into the average bacterial coding density range (85 to 90% [45]). We compared the lengths of genes in T. turnerae and gammaproteobacterium IMCC1989 to homologs in “Ca. Endobugula sertula” in an attempt to identify potential pseudogenes that were ⬍80%

Applied and Environmental Microbiology

aem.asm.org

6577

Miller et al.

acid biosynthesis was aspartate transaminase (AB835_06295), which is a multifunctional enzyme that is involved in aspartate, tyrosine, phenylalanine, and tryptophan metabolism. While the pathways for histidine and lysine appeared to be complete, present, and transcribed, lower expression levels of certain enzymes in these pathways (Fig. 3; see also Data Set S1 in the supplemental material) make the actual level of amino acid biosynthesis more uncertain, given that transcript levels often do not accurately predict downstream protein abundance (48). De novo assembly and repeat resolution of the bryostatin biosynthetic pathway. The bry pathway was previously sequenced as a contiguous locus in the type S genotype of “Ca. Endobugula sertula,” containing five genes encoding PKS proteins (BryBCXDA) and various accessory genes, including the trans-AT bryP and genes involved in ␤-branching (bryQR, Fig. 5C) (6). The known sequence included several exact repeats, which complicated de novo assembly of this region in the AB1_ovicells metagenome. A procedure devised by Albertsen et al. (49) was used to resolve the repeats by identifying paired reads aligning at the ends of two (or more) contigs (Fig. 5A). Relative read coverage of repeats versus unique regions, as well as contig orientation, was used to construct a connection map and restrict the sequence to two possible structures (Fig. 5B). The correct bry locus structure, as determined by PCR (see Tables S3 and S4 in the supplemental material), was in agreement with the published sequence (accession no. EF032014). The published pathway, however, is missing some key components needed for ␤-branching to produce the methylated esters seen in the final bryostatin structures. 
In polyketides made by trans-AT PKS pathways, ␤-alkyl chains are commonly introduced by a set of proteins minimally including a standalone acyl-carrier protein (ACP), a decarboxylative ketosynthase (KS), a 3-hydroxy-3-methylglutaryl-CoA synthase (HMGS) and 1-2 enoyl-CoA-hydratases (ECH) (50) (Fig. 5C). A previously unknown ECH and an adjacent ACP were

TABLE 2 Comparison of “Ca. Endobugula sertula” genome with those of close relatives

Genome (reference)                    Size (Mbp)   G+C content (%)   No. of genes   Habitat       Lifestyle
“Ca. Endobugula sertula”              3.34         41.6              2,850          Marine        Symbiont
Gammaproteobacterium IMCC1989 (32)    3.94         42.1              2,844          Marine        Free-living
T. turnerae T7901 (38)                5.19         50.8              4,690          Marine        Symbiont
S. degradans 2-40 (41)                5.06         45.8              4,017          Marine        Free-living
C. japonicus Ueda107 (42)             4.58         52.0              3,790          Terrestrial   Free-living
H. chejuensis KCTC 2396 (75)          7.22         54.8              6,783          Marine        Free-living
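As a rough illustration of how the figures in Table 2 compare, the sketch below computes gene density (genes per Mbp) for each genome. The sizes and gene counts are transcribed from the table. The function name and the use of gene density as a crude stand-in for coding density are assumptions made here for illustration; this is not an analysis reported in the paper.

```python
# Genome size (Mbp) and gene count transcribed from Table 2.
TABLE_2 = {
    "Ca. Endobugula sertula": (3.34, 2850),
    "Gammaproteobacterium IMCC1989": (3.94, 2844),
    "T. turnerae T7901": (5.19, 4690),
    "S. degradans 2-40": (5.06, 4017),
    "C. japonicus Ueda107": (4.58, 3790),
    "H. chejuensis KCTC 2396": (7.22, 6783),
}


def genes_per_mbp(size_mbp, gene_count):
    """Gene density; a crude proxy only, not the paper's coding density."""
    return gene_count / size_mbp


# Hypothetical helper mapping: genome name -> genes per Mbp.
densities = {name: genes_per_mbp(*row) for name, row in TABLE_2.items()}

if __name__ == "__main__":
    for name, density in sorted(densities.items(), key=lambda kv: kv[1]):
        print(f"{name}: {density:.0f} genes/Mbp")
```

On these numbers, the gene density of the symbiont falls inside the range spanned by its relatives, which is in line with the absence of overt genome reduction discussed in the text.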

of the length of their homologs (46, 47). A low number of genes, 30, were identified as potential pseudogenes, and most of these were annotated with functions associated with motility and regulatory sensors (see Fig. S4 and Table S2 in the supplemental material). Bioinformatic analysis and metabolic modeling, however, indicated potential metabolic insufficiencies in amino acid metabolism, as we could not find complete pathways to synthesize methionine and threonine (Fig. 3). To gain a functional snapshot of “Ca. Endobugula sertula,” we extracted and sequenced RNA from the AB1_ovicells sample (Fig. 4). In “Ca. Endobugula sertula,” 12.1% of the functionally assigned transcriptome is dedicated collectively to putative symbiotic functions: bryostatin synthesis (3.6%) and the expression of Tc genes (8.5%). Despite apparent insufficiencies in amino acid biosynthesis, the single largest fraction (14.7%), by KEGG category, of the symbiont’s transcriptome was dedicated to processes in amino acid metabolism (Fig. 4). A number of enzymes in the amino acid metabolism category are involved in arginine biosynthesis. For instance, argininosuccinate synthase (AB835_04910) and acetylornithine aminotransferase (AB835_00040) are the two most highly expressed enzymes in this amino acid metabolism category. Another highly expressed enzyme involved in amino

FIG 3 Amino acid biosynthesis pathways found in the “Ca. Endobugula sertula” genome. Enzyme commission (EC) numbers are colored based on presence and RPKM values.
[Figure 3 graphic: pathway diagram. Color key: absent; RPKM < 100; 100 < RPKM < 1000; RPKM > 1000. Labeled nodes: erythrose-4-P, shikimate, chorismate, 5-phosphoribosyl 1-pyrophosphate, pyruvate, 2-ketovaline, homoserine, 2-oxobutanoate, cystathionine, homocysteine, ornithine, tyrosine, tryptophan, phenylalanine, histidine, leucine, cysteine, valine, isoleucine, threonine, methionine, glutamate, arginine, aspartate, lysine.]
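The color key of Fig. 3 amounts to a simple binning rule over per-enzyme RPKM values, and a pathway can be read as complete only if none of its steps is absent. The following is a minimal sketch of that reading. The function names, the handling of values that fall exactly on 100 or 1000, and the example pathway are all assumptions made here; the step names and RPKM values are invented and are not data from the study.

```python
def expression_bin(rpkm):
    """Map an enzyme's RPKM (None if no gene was found) to a Fig. 3 key label."""
    if rpkm is None:
        return "Absent"
    if rpkm < 100:
        return "RPKM < 100"
    if rpkm <= 1000:  # treatment of the exact boundaries is an assumption
        return "100 < RPKM < 1000"
    return "RPKM > 1000"


def pathway_summary(steps):
    """steps maps a step name -> RPKM or None; report completeness and bins."""
    bins = {step: expression_bin(value) for step, value in steps.items()}
    missing = sorted(step for step, label in bins.items() if label == "Absent")
    return {"complete": not missing, "missing": missing, "bins": bins}


# Hypothetical pathway with placeholder step names and invented RPKM values.
example = {"step_1": 1500.0, "step_2": 80.0, "step_3": None}
summary = pathway_summary(example)
```

In this toy case the pathway is flagged as incomplete because `step_3` has no gene, while `step_2` is present but falls in the lowest expression bin, the situation the text describes for certain enzymes in the histidine and lysine pathways.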


aem.asm.org

Applied and Environmental Microbiology

November 2016 Volume 82 Number 22


FIG 4 Metatranscriptomic analysis of “Ca. Endobugula sertula.” Proportion of normalized reads, expressed as reads per kilobase of gene per million reads aligning to annotated ORFs in the “Ca. Endobugula sertula” draft genome (RPKM; 28), aligned to genes with assigned function.
[Figure 4 graphic: chart of transcriptome proportion by category. Categories: Carbohydrate Metabolism; Translation; Energy Metabolism; Folding, Sorting and Degradation; Lipid Metabolism; Replication and Repair; Nucleotide Metabolism; Membrane Transport; Amino Acid Metabolism; Signal Transduction; Metabolism of Other Amino Acids; Transport and Catabolism; Glycan Biosynthesis and Metabolism; Cell Motility; Metabolism of Cofactors and Vitamins; Cell Growth and Death; Metabolism of Terpenoids and Polyketides; Cell Communication; Biosynthesis of Other Secondary Metabolites; Bry; Xenobiotics Biodegradation and Metabolism; Chitinase and Tc; Transcription.]
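The normalization named in the Fig. 4 caption can be sketched as follows: RPKM divides a gene's read count by its length in kilobases and by the number of mapped reads in millions, and a category's share is its summed RPKM over the total for genes with an assigned function. The function names, the dictionary layout, and the toy counts are assumptions made here for illustration; the counts are invented and the authors' actual pipeline may differ in detail.

```python
from collections import defaultdict


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def category_fractions(genes, total_mapped_reads):
    """Share of summed RPKM per category, over genes with an assigned function."""
    totals = defaultdict(float)
    for gene in genes:
        if gene["category"] is None:  # genes without an assigned function are excluded
            continue
        totals[gene["category"]] += rpkm(
            gene["reads"], gene["length_bp"], total_mapped_reads
        )
    grand_total = sum(totals.values())
    return {category: value / grand_total for category, value in totals.items()}


# Invented counts for illustration only; not data from the study.
toy_genes = [
    {"category": "Amino Acid Metabolism", "reads": 300, "length_bp": 1000},
    {"category": "Bry", "reads": 200, "length_bp": 2000},
    {"category": "Translation", "reads": 600, "length_bp": 1000},
    {"category": None, "reads": 900, "length_bp": 1500},
]
fractions = category_fractions(toy_genes, total_mapped_reads=1_000_000)
```

Here `total_mapped_reads` stands for the reads aligning to annotated ORFs in the draft genome, as in the caption. Dividing by gene length matters for long genes such as PKS genes, which would otherwise appear overrepresented simply because they collect more reads.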

found in a separate locus (bryTU), and these are predicted to act in the dehydration of the new side chain and as a carrier for the malonate β-branch, respectively. These genes are homologous to corresponding genes in other trans-AT PKS pathways, with the closest homologs to bryT and bryU being batD (batumin/kalimantacin pathway [51], 53% amino acid identity) and calX (calyculin pathway [52], 45% amino acid identity), respectively. Interestingly, mRNA transcripts of all bry genes were detected through metatranscriptomics, except for bryS, which encodes a methyltransferase previously thought to carry out the O-methylation of β-branches (6). Unlike other bry genes, bryS has a homolog in gammaproteobacterium IMCC1989 that is annotated as a tRNA-methyltransferase (accession no. WP_040805166), and it therefore may not be part of the bry pathway, contrary to the previous prediction.

DISCUSSION

Reductive evolution is known to occur in a number of settings, including intracellular symbiosis (18, 53), intracellular pathogenesis (54), and free-living pelagic microbes (55). In strictly intracellular organisms, reduction is thought to occur due to genetic drift as a consequence of genetic isolation and strict vertical transmission, low effective population sizes, and frequent bottlenecks (18). In free-living examples, it is thought that reduction is driven by selection rather than drift (56), potentially due to a selective advantage in not investing in the production of a metabolite that has become a “public good” in the community (57). In the “Black Queen” model (57), when a metabolic pathway becomes rare enough in the community, further loss is selected against to prevent loss of community fitness as a whole.

Features in the “Ca. Endobugula sertula” genome were inconsistent with extreme reduction due to sequence drift. The early stages of such a process are characterized by a proliferation of pseudogenes and an accompanying low coding density (18). In contrast to the hundreds of pseudogenes identified in Sodalis glossinidius, a tsetse fly symbiont in the early stages of genome degradation (58), only 30 potential pseudogenes were found in the “Ca. Endobugula sertula” genome. Furthermore, the “Ca. Endobugula sertula” genome did not have a particularly low G+C content, had a coding density within the typical bacterial range (45), and had a genome size substantially larger than the previously predicted 2 Mbp (7). However, careful examination of the metabolic capabilities implied by the “Ca. Endobugula sertula” genome suggested that it lacked a number of enzymes required for the synthesis of certain amino acids (Fig. 3), potentially explaining its dependence on its host environment and, perhaps, other microbial constituents of the host microbiome. Metatranscriptomic analysis suggested that the symbiont was actively involved in metabolic processes related to amino acid metabolism, such as arginine biosynthesis. High expression levels of the arginine biosynthesis pathway may help explain the observation from a previous study that B. neritina larvae do not become depleted of nitrogen during the swimming stage prior to settlement, in contrast to the larvae of other Bugula species that are aposymbiotic (59). The host’s larval nutrition may therefore be augmented through nitrogen recycling by “Ca. Endobugula sertula” in the form of ammonia assimilation as carbamoyl phosphate (60) before storage as arginine (61).

Previously, it was found that different populations and sibling species of Bugula harbor different strains of “Ca. Endobugula sertula” (7), which is a pattern of distribution consistent with a vertical mode of symbiont transmission, where respective populations have been genetically isolated since host divergence. However, we previously found evidence that horizontal transmission is likely also possible (19). We found that two sibling species of B. neritina, northern (N) and shallow (S), previously thought to be allopatrically distributed, actually coexist along the Western Atlantic. Type N populations are divergent from type S animals and are aposymbiotic in their typical northern range. However, in Western Atlantic populations, both type N and type S individuals can be found harboring 100% identical “Ca. Endobugula sertula” strains (as measured by 16S and internal transcribed spacer [ITS] sequences). The most parsimonious explanation for this observation is the horizontal transfer of symbionts from type S to type N individuals. The noncongruence of host and symbiont phylogenies (17, 38, 40) also argues against a long evolutionary history of strict vertical transmission. Although “Ca. Endobugula sertula” has proven difficult to isolate and culture, the lack of extensive genome reduction suggests that a transient host-free existence may be possible, a plausible explanation for the symbiont acquisition by type N hosts found alongside symbiotic type S hosts at low latitudes (19). A


[Figure 5 graphic: (A) connection map of contig ends (“s” and “e”) for contigs 1 to 7 and repeats A to C; (B) two candidate arrangements, the upper labeled “PCR amplicons not observed” and the lower labeled “PCR amplicons observed,” with primers 166_R, 12197_F, 12197_R, 6199_F, 6199_R, 139658_F, and 139658_R marked; (C) reaction scheme with labels BryP, BryQ, BryR, BryT, BryU, ACP, AT1, KS, ECH, HMGS, and PKS.]

FIG 5 De novo assembly and repeat resolution of the bryostatin pathway, and discovery of new bry genes. Contigs aligning to the published bry gene cluster from type S “Ca. Endobugula sertula” (accession no. EF032014) were identified in the AB1_ovicells metagenome. Paired reads were realigned to those contigs, and connections were identified as suggested by paired alignments to multiple contig ends, using a script published by Albertsen et al. (49), and visualized in Cytoscape (72). (A) Connection map of bry contigs 1 to 7 and repeats A to C (see Table S4 in the supplemental material for details on these contigs). Each contig is represented by two dots connected by a red line, arbitrarily marked “s” (start) and “e” (end). Dot size is scaled to contig length, and the color is scaled according to coverage (white = ~12×, black = ~6× k-mer coverage reported by SPAdes, where k = 77). Green lines represent connections between assembled contigs suggested by alignment of paired reads across the indicated ends. (B) Diagram showing the two possible resolutions to the connection map in panel A, given the assumption that the correct resolution is a single contiguous sequence. The two resolutions differ in the placement of contigs 4 and 5. Primers were designed for the locations shown, and PCR amplicons were only observed for the primer pairings consistent with the lower solution, agreeing with EF032014. All amplicons were confirmed by Sanger end sequencing. (C) Scheme showing predicted functions of newly discovered bry genes bryT and bryU, which may be involved in forming the β-branches seen in bryostatin structures (positions highlighted in the bottom left structure). BryU is proposed as a standalone acyl-carrier protein (ACP), which carries a malonate unit as it is decarboxylated by BryQ and before the resulting acetate is condensed onto the PKS-bound substrate by BryR.
BryT is an enoyl-CoA dehydratase, which is proposed to generate the α,β-unsaturated acid from the intermediate shown. AT, acyltransferase; ECH, enoyl-CoA hydratase; HMGS, 3-hydroxy-3-methylglutaryl-CoA synthase; KS, ketosynthase; PKS, polyketide synthase.
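The idea behind the connection map in panel A can be sketched in a few lines: read pairs whose two mates align near the ends of different contigs are counted as evidence for a link between those ends, and contigs with roughly double the typical coverage are flagged as candidate repeats. This is an illustrative sketch only and is not the script published by Albertsen et al. (49); the end window, the minimum pair count, the coverage ratio, the input layout, and the toy data are all assumptions made here.

```python
from collections import Counter
from statistics import median

END_WINDOW = 500  # bp; assumed distance from a contig end that counts as "at the end"


def contig_end(position, contig_length, window=END_WINDOW):
    """Return 's' or 'e' if position is within `window` bp of that end, else None."""
    if position < window:
        return "s"
    if position >= contig_length - window:
        return "e"
    return None


def connection_map(mate_pairs, lengths, min_pairs=2):
    """Count read pairs whose mates align near the ends of two different contigs."""
    counts = Counter()
    for (contig_1, pos_1), (contig_2, pos_2) in mate_pairs:
        if contig_1 == contig_2:
            continue
        end_1 = contig_end(pos_1, lengths[contig_1])
        end_2 = contig_end(pos_2, lengths[contig_2])
        if end_1 and end_2:
            edge = tuple(sorted([(contig_1, end_1), (contig_2, end_2)]))
            counts[edge] += 1
    return {edge: n for edge, n in counts.items() if n >= min_pairs}


def likely_repeats(coverage, ratio=1.5):
    """Flag contigs whose coverage is well above the median as candidate repeats."""
    baseline = median(coverage.values())
    return sorted(name for name, cov in coverage.items() if cov >= ratio * baseline)


# Invented toy data: contig "A" plays the role of a repeat between contigs "1" and "2".
lengths = {"1": 5000, "A": 3000, "2": 5000}
mate_pairs = (
    [(("1", 4900), ("A", 50))] * 3
    + [(("A", 2950), ("2", 100))] * 2
    + [(("1", 2500), ("2", 2500))]  # mid-contig pair, ignored
)
edges = connection_map(mate_pairs, lengths)
repeats = likely_repeats({"1": 6.0, "2": 6.0, "A": 12.0})
```

A map built this way can leave more than one arrangement consistent with the links, as in panel B, which is why the final choice between the two candidate structures was made by PCR rather than from read data alone.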


similar situation is found in the tunicate Lissoclinum patella, which harbors a photosynthetic symbiont, Prochloron didemni (62). As with “Ca. Endobugula sertula,” P. didemni has never been cultivated in the laboratory, but its large genome size and the lack of evidence for genetic drift and accelerated evolution suggest that it might not be genetically isolated inside individual hosts (63).

“Candidatus Endobugula sertula” appears to have specific metabolic deficiencies, and these deficiencies might be the reason that “Ca. Endobugula sertula” is dependent on the host. If the symbiont is indeed able to withstand short periods of time outside the host (i.e., it is not genetically isolated within individual hosts), these metabolic deficiencies may have come about due to selection for not investing in “public goods” that are supplied by the host (57). However, over evolutionary time scales, further metabolic deficiencies may develop if the symbiont becomes restricted to living within the host and undergoes genome degradation and reduction, which would likely prevent horizontal transmission of the symbiont.

Despite few signs of genome degradation, the “Ca. Endobugula sertula” genome shows some potential adaptations to symbiotic life. Several pseudogenes were annotated with functions in flagellar assembly and motility, and similar functions are lost in intracellular Rickettsiales (64). The chitinase-containing Tc locus that was found on NODE28 is related to the Yen-Tc locus from the insect pathogen Yersinia entomophaga (65) and might also be involved in symbiosis. The chitinases in Yen-Tc are thought to associate with the secreted Tc assembly, which exhibits chitinase activity in vitro, to contribute to Y. entomophaga pathogenicity (65). The chitinase-containing Tc locus in “Ca. Endobugula sertula” is highly expressed in the ovicells, consistent with its use in allowing “Ca. Endobugula sertula” to move from the funicular cords within the adult host (5) through the potentially chitinaceous ectocyst (66, 67), thus ensuring vertical transmission to the larvae. The high expression of Tc loci in ovicells suggests that these proteins are important to the symbiont during the reproductive phase of its host. Other symbiotic systems have been shown to coopt mechanisms used in virulence and immunogenicity of pathogens, such as during the acquisition of the light-producing symbiont Vibrio fischeri by the Hawaiian bobtail squid, where the immunogenic bacterial products lipopolysaccharide and peptidoglycan are central to symbiont-host interactions (68, 69). In a similar fashion, the other Tc loci in the “Ca. Endobugula sertula” genome might also be involved in aspects of recognition and communication with the bryozoan host, and their presence might signify that ancestors of “Ca. Endobugula sertula” were pathogenic.

In the present work, a draft assembly of the “Ca. Endobugula sertula” genome was recovered using comparative shotgun metagenomic data sets and estimated to be 100% complete by single-copy-marker analysis (39). Adapting a method developed by Albertsen et al. (49), the sequence of the repeat-heavy bry biosynthetic pathway was confirmed to agree with the sequence previously constructed using a clone library method and Sanger sequencing (6). Although additional pathway components were identified, enzymes in the “Ca. Endobugula sertula” genome that could be responsible for the addition of the ester side chains could not be found (Fig. 1B), suggesting that the symbiont employs either a novel mechanism that could not be identified bioinformatically or enzymes annotated with roles in primary metabolism, as others have observed (16, 70). Alternatively, the side


chains may be installed by the host, rather than the symbiont. Additionally, because bryS showed no RNA-seq coverage, this gene may not be involved in the biosynthesis of bryostatins. If BryS is not responsible for O-methylation of β-branches, this conversion is most likely to be carried out by methyltransferase domains in the BryA and BryB polyketide synthases.

Interestingly, although the majority of bacterial secondary metabolite pathways consist of genes clustered into one chromosome region (71), several symbiotic bacteria have been found to contain fragmented pathways (14–16, 29), which include the previously known portions of the bry pathway found to be split into two separate loci in the type D genotype of “Ca. Endobugula sertula” (7). The AB1 draft genome of “Ca. Endobugula sertula” does not contain any other PKS pathway, apart from bry, that would explain the presence of newly identified bryTU genes. Identifying these genes, which are likely implicated in bryostatin biosynthesis, highlights the importance of recovering complete or near-complete genomes and, thus, the use of unbiased shotgun sequencing in capturing and understanding the fragmented biosynthetic pathways of uncultured microorganisms. Our identification of these two additional bry genes, along with the metabolic analysis of “Ca. Endobugula sertula,” may facilitate further studies into the renewable supply of bryostatins, either through heterologous expression or targeted symbiont culture.

ACKNOWLEDGMENTS
This research was performed in part using the computer resources and assistance of the UW-Madison Center For High Throughput Computing (CHTC) in the Department of Computer Sciences. We thank Niels Lindquist (UNC) for assistance with field collections, Ben Oyserman (UW-Madison) for assistance with RNA-seq analysis, and Ahron Flowers (RMC) for assistance with B. neritina genotyping.
We thank the University of Wisconsin Biotechnology Center DNA Sequencing Facility for providing sequencing facilities and library preparation services. The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, the Wisconsin Institutes for Discovery, and the National Science Foundation, and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy’s Office of Science. Additionally, this work utilized computer resources at Future Grid, which is supported by National Science Foundation grant 0910812. This work was supported by grant R21AI121704-01 from NIAID, as well as funding from The Thomas F. and Kate Miller Jeffress Memorial Trust (Bank of America, Trustee) and the American Foundation for Pharmaceutical Education (to I.J.M.), as well as the School of Pharmacy, the Graduate School, and the Institute for Clinical & Translational Research at the University of Wisconsin-Madison. G.E.L.-F. and J.C.K. collected and prepared samples for analysis; I.J.M. and J.C.K. designed and performed the research; I.J.M., N.V., G.E.L.-F., S.S.F., and J.C.K. analyzed data; and I.J.M. and J.C.K. wrote the paper.

FUNDING INFORMATION
This work, including the efforts of Jason Christopher Kwan, was funded by Division of Intramural Research, National Institute of Allergy and Infectious Diseases (DIR, NIAID) (R21AI121704-01). This work, including the efforts of Ian J. Miller, was funded by American Foundation for Pharmaceutical Education (AFPE). This work, including the efforts of Stephen Fong, Grace E. Lim-Fong, and Jason Christopher Kwan, was funded by Thomas F. and Kate Miller Jeffress Memorial Trust (Jeffress Trust).

REFERENCES
1. Piel J. 2009. Metabolites from symbiotic bacteria. Nat Prod Rep 26:338–362. http://dx.doi.org/10.1039/B703499G.
2. Flórez LV, Biedermann PHW, Engl T, Kaltenpoth M. 2015. Defensive symbioses of animals with prokaryotic and eukaryotic microorganisms. Nat Prod Rep 32:904–936. http://dx.doi.org/10.1039/C5NP00010F.
3. Staley JT, Konopka A. 1985. Measurement of in situ activities of nonphotosynthetic microorganisms in aquatic and terrestrial habitats. Annu Rev Microbiol 39:321–346. http://dx.doi.org/10.1146/annurev.mi.39.100185.001541.
4. Rappé MS, Giovannoni SJ. 2003. The uncultured microbial majority. Annu Rev Microbiol 57:369–394. http://dx.doi.org/10.1146/annurev.micro.57.030502.090759.
5. Sharp KH, Davidson SK, Haygood MG. 2007. Localization of “Candidatus Endobugula sertula” and the bryostatins throughout the life cycle of the bryozoan Bugula neritina. ISME J 1:693–702. http://dx.doi.org/10.1038/ismej.2007.78.
6. Sudek S, Lopanik NB, Waggoner LE, Hildebrand M, Anderson C, Liu H, Patel A, Sherman DH, Haygood MG. 2007. Identification of the putative bryostatin polyketide synthase gene cluster from “Candidatus Endobugula sertula,” the uncultivated microbial symbiont of the marine bryozoan Bugula neritina. J Nat Prod 70:67–74. http://dx.doi.org/10.1021/np060361d.
7. Trindade-Silva AE, Lim-Fong GE, Sharp KH, Haygood MG. 2010. Bryostatins: biological context and biotechnological prospects. Curr Opin Biotechnol 21:834–842. http://dx.doi.org/10.1016/j.copbio.2010.09.018.
8. Piel J. 2010. Biosynthesis of polyketides by trans-AT polyketide synthases. Nat Prod Rep 27:996–1047. http://dx.doi.org/10.1039/b816430b.
9. Lindquist N, Hay ME. 1996. Palatability and chemical defense of marine invertebrate larvae. Ecol Monogr 66:431–450. http://dx.doi.org/10.2307/2963489.
10. Lopanik N, Lindquist N, Targett N. 2004. Potent cytotoxins produced by a microbial symbiont protect host larvae from predation. Oecologia 139:131–139. http://dx.doi.org/10.1007/s00442-004-1487-5.
11. Lopanik N, Gustafson KR, Lindquist N. 2004. Structure of bryostatin 20: a symbiont-produced chemical defense for larvae of the host bryozoan, Bugula neritina. J Nat Prod 67:1412–1414. http://dx.doi.org/10.1021/np040007k.
12. Helfrich EJN, Piel J. 2016. Biosynthesis of polyketides by trans-AT polyketide synthases. Nat Prod Rep 33:231–316. http://dx.doi.org/10.1039/C5NP00125K.
13. Hertweck C. 2009. The biosynthetic logic of polyketide diversity. Angew Chem Int Ed Engl 48:4688–4716. http://dx.doi.org/10.1002/anie.200806121.
14. Kwan JC, Donia MS, Han AW, Hirose E, Haygood MG, Schmidt EW. 2012. Genome streamlining and chemical defense in a coral reef symbiosis. Proc Natl Acad Sci U S A 109:20655–20660. http://dx.doi.org/10.1073/pnas.1213820109.
15. Nakabachi A, Ueoka R, Oshima K, Teta R, Mangoni A, Gurgui M, Oldham NJ, van Echten-Deckert G, Okamura K, Yamamoto K, Inoue H, Ohkuma M, Hongoh Y, Miyagishima S-Y, Hattori M, Piel J, Fukatsu T. 2013. Defensive bacteriome symbiont with a drastically reduced genome. Curr Biol 23:1478–1484. http://dx.doi.org/10.1016/j.cub.2013.06.027.
16. Schofield MM, Jain S, Porat D, Dick GJ, Sherman DH. 2015. Identification and analysis of the bacterial endosymbiont specialized for production of the chemotherapeutic natural product ET-743. Environ Microbiol 17:3964–3975. http://dx.doi.org/10.1111/1462-2920.12908.
17. Lim-Fong GE, Regali LA, Haygood MG. 2008. Evolutionary relationships of “Candidatus Endobugula” bacterial symbionts and their Bugula bryozoan hosts. Appl Environ Microbiol 74:3605–3609. http://dx.doi.org/10.1128/AEM.02798-07.
18. McCutcheon JP, Moran NA. 2012. Extreme genome reduction in symbiotic bacteria. Nat Rev Microbiol 10:13–26. http://dx.doi.org/10.1038/nrmicro2670.
19. Linneman J, Paulus D, Lim-Fong G, Lopanik NB. 2014. Latitudinal variation of a defensive symbiosis in the Bugula neritina (Bryozoa) sibling species complex. PLoS One 9:e108783. http://dx.doi.org/10.1371/journal.pone.0108783.
20. Schmidt EW, Donia MS. 2009. Cyanobactin ribosomally synthesized peptides—a case of deep metagenome mining. Methods Enzymol 458:575–596. http://dx.doi.org/10.1016/S0076-6879(09)04823-X.
21. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. http://dx.doi.org/10.1089/cmb.2012.0021.
22. Huson DH, Weber N. 2013. Microbial community analysis using MEGAN. Methods Enzymol 531:465–485. http://dx.doi.org/10.1016/B978-0-12-407863-5.00021-6.
23. R Development Core Team. 2009. The R project for statistical computing. The R Foundation, Vienna, Austria. https://www.r-project.org/.
24. Fraley C, Raftery AE. 2002. Model-based clustering, discriminant analysis, and density estimation. J Am Stat Assoc 97:611–631. http://dx.doi.org/10.1198/016214502760047131.
25. Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, Rozen SG. 2012. Primer3–new capabilities and interfaces. Nucleic Acids Res 40:e115. http://dx.doi.org/10.1093/nar/gks596.
26. Langmead B, Salzberg SL. 2012. Fast gapped-read alignment with Bowtie 2. Nat Methods 9:357–359. http://dx.doi.org/10.1038/nmeth.1923.
27. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. http://dx.doi.org/10.1093/bioinformatics/btu153.
28. Mandlik A, Livny J, Robins WP, Ritchie JM, Mekalanos JJ, Waldor MK. 2011. RNA-Seq-based monitoring of infection-linked changes in Vibrio cholerae gene expression. Cell Host Microbe 10:165–174. http://dx.doi.org/10.1016/j.chom.2011.07.007.
29. Piel J, Wen G, Platzer M, Hui D. 2004. Unprecedented diversity of catalytic domains in the first four modules of the putative pederin polyketide synthase. Chembiochem 5:93–98. http://dx.doi.org/10.1002/cbic.200300782.
30. Wu M, Eisen JA. 2008. A simple, fast, and accurate method of phylogenomic inference. Genome Biol 9:R151. http://dx.doi.org/10.1186/gb-2008-9-10-r151.
31. Wu M, Scott AJ. 2012. Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2. Bioinformatics 28:1033–1034. http://dx.doi.org/10.1093/bioinformatics/bts079.
32. Jang Y, Oh H-M, Kim H, Kang I, Cho J-C. 2011. Genome sequence of strain IMCC1989, a novel member of the marine Gammaproteobacteria. J Bacteriol 193:3672–3673. http://dx.doi.org/10.1128/JB.05202-11.
33. Price MN, Dehal PS, Arkin AP. 2010. FastTree 2–approximately maximum-likelihood trees for large alignments. PLoS One 5:e9490. http://dx.doi.org/10.1371/journal.pone.0009490.
34. Letunic I, Bork P. 2011. Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res 39:W475–W478. http://dx.doi.org/10.1093/nar/gkr201.
35. Miller CS, Baker BJ, Thomas BC, Singer SW, Banfield JF. 2011. EMIRGE: reconstruction of full-length ribosomal genes from microbial community short read sequencing data. Genome Biol 12:R44. http://dx.doi.org/10.1186/gb-2011-12-5-r44.
36. Miller CS, Handley KM, Wrighton KC, Frischkorn KR, Thomas BC, Banfield JF. 2013. Short-read assembly of full-length 16S amplicons reveals bacterial diversity in subsurface sediments. PLoS One 8:e56018. http://dx.doi.org/10.1371/journal.pone.0056018.
37. Haygood MG, Davidson SK. 1997. Small-subunit rRNA genes and in situ hybridization with oligonucleotides specific for the bacterial symbionts in the larvae of the bryozoan Bugula neritina and proposal of “Candidatus Endobugula sertula.” Appl Environ Microbiol 63:4612–4616.
38. Yang JC, Madupu R, Durkin AS, Ekborg NA, Pedamallu CS, Hostetler JB, Radune D, Toms BS, Henrissat B, Coutinho PM, Schwarz S, Field L, Trindade-Silva AE, Soares CAG, Elshahawi S, Hanora A, Schmidt EW, Haygood MG, Posfai J, Benner J, Madinger C, Nove J, Anton B, Chaudhary K, Foster J, Holman A, Kumar S, Lessard PA, Luyten YA, Slatko B, Wood N, Wu B, Teplitski M, Mougous JD, Ward N, Eisen JA, Badger JH, Distel DL. 2009. The complete genome of Teredinibacter turnerae T7901: an intracellular endosymbiont of marine wood-boring bivalves (shipworms). PLoS One 4:e6085. http://dx.doi.org/10.1371/journal.pone.0006085.
39. Rinke C, Schwientek P, Sczyrba A, Ivanova NN, Anderson IJ, Cheng J-F, Darling A, Malfatti S, Swan BK, Gies EA, Dodsworth JA, Hedlund BP, Tsiamis G, Sievert SM, Liu W-T, Eisen JA, Hallam SJ, Kyrpides NC, Stepanauskas R, Rubin EM, Hugenholtz P, Woyke T. 2013. Insights into the phylogeny and coding potential of microbial dark matter. Nature 499:431–437. http://dx.doi.org/10.1038/nature12352.
40. Lim GE, Haygood MG. 2004. “Candidatus Endobugula glebosa,” a specific bacterial symbiont of the marine bryozoan Bugula simplex. Appl Environ Microbiol 70:4921–4929. http://dx.doi.org/10.1128/AEM.70.8.4921-4929.2004.
41. Weiner RM, Taylor LE, Jr, Henrissat B, Hauser L, Land M, Coutinho PM, Rancurel C, Saunders EH, Longmire AG, Zhang H, Bayer EA, Gilbert HJ, Larimer F, Zhulin IB, Ekborg NA, Lamed R, Richardson PM, Borovok I, Hutcheson S. 2008. Complete genome sequence of the complex carbohydrate-degrading marine bacterium, Saccharophagus degradans strain 2-40 T. PLoS Genet 4:e1000087. http://dx.doi.org/10.1371/journal.pgen.1000087.
42. DeBoy RT, Mongodin EF, Fouts DE, Tailford LE, Khouri H, Emerson JB, Mohamoud Y, Watkins K, Henrissat B, Gilbert HJ, Nelson KE. 2008. Insights into plant cell wall degradation from the genome sequence of the soil bacterium Cellvibrio japonicus. J Bacteriol 190:5455–5463. http://dx.doi.org/10.1128/JB.01701-07.
43. Chaston JM, Suen G, Tucker SL, Andersen AW, Bhasin A, Bode E, Bode HB, Brachmann AO, Cowles CE, Cowles KN, Darby C, de Léon L, Drace K, Du Z, Givaudan A, Herbert Tran EE, Jewell KA, Knack JJ, Krasomil-Osterfeld KC, Kukor R, Lanois A, Latreille P, Leimgruber NK, Lipke CM, Liu R, Lu X, Martens EC, Marri PR, Médigue C, Menard ML, Miller NM, Morales-Soto N, Norton S, Ogier J-C, Orchard SS, Park D, Park Y, Qurollo BA, Sugar DR, Richards GR, Rouy Z, Slominski B, Slominski K, Snyder H, Tjaden BC, van der Hoeven R, Welch RD, Wheeler C, Xiang B, Barbazuk B, et al. 2011. The entomopathogenic bacterial endosymbionts Xenorhabdus and Photorhabdus: convergent lifestyles from divergent genomes. PLoS One 6:e27909. http://dx.doi.org/10.1371/journal.pone.0027909.
44. Meusch D, Gatsogiannis C, Efremov RG, Lang AE, Hofnagel O, Vetter IR, Aktories K, Raunser S. 2014. Mechanism of Tc toxin action revealed in molecular detail. Nature 508:61–65. http://dx.doi.org/10.1038/nature13015.
45. Kuo C-H, Moran NA, Ochman H. 2009. The consequences of genetic drift for bacterial genome complexity. Genome Res 19:1450–1454. http://dx.doi.org/10.1101/gr.091785.109.
46. Lerat E, Ochman H. 2005. Recognizing the pseudogenes in bacterial genomes. Nucleic Acids Res 33:3125–3132. http://dx.doi.org/10.1093/nar/gki631.
47. Kwan JC, Schmidt EW. 2013. Bacterial endosymbiosis in a chordate host: long-term co-evolution and conservation of secondary metabolism. PLoS One 8:e80822. http://dx.doi.org/10.1371/journal.pone.0080822.
48. Vogel C, Marcotte EM. 2012. Insights into the regulation of protein abundance from proteomic and transcriptomic analyses. Nat Rev Genet 13:227–232. http://dx.doi.org/10.1038/nrg3185.
49. Albertsen M, Hugenholtz P, Skarshewski A, Nielsen KL, Tyson GW, Nielsen PH. 2013. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat Biotechnol 31:533–538. http://dx.doi.org/10.1038/nbt.2579.
50. Calderone CT, Kowtoniuk WE, Kelleher NL, Walsh CT, Dorrestein PC. 2006. Convergence of isoprene and polyketide biosynthetic machinery: isoprenyl-S-carrier proteins in the pksX pathway of Bacillus subtilis. Proc Natl Acad Sci U S A 103:8977–8982. http://dx.doi.org/10.1073/pnas.0603148103.
51. Mattheus W, Gao L-J, Herdewijn P, Landuyt B, Verhaegen J, Masschelein J, Volckaert G, Lavigne R. 2010. Isolation and purification of a new kalimantacin/batumin-related polyketide antibiotic and elucidation of its biosynthesis gene cluster. Chem Biol 17:149–159. http://dx.doi.org/10.1016/j.chembiol.2010.01.014.
52. Wakimoto T, Egami Y, Nakashima Y, Wakimoto Y, Mori T, Awakawa T, Ito T, Kenmoku H, Asakawa Y, Piel J, Abe I. 2014. Calyculin biogenesis from a pyrophosphate protoxin produced by a sponge symbiont. Nat Chem Biol 10:648–655. http://dx.doi.org/10.1038/nchembio.1573.
53. Bennett GM, Moran NA. 2015. Heritable symbiosis: the advantages and perils of an evolutionary rabbit hole. Proc Natl Acad Sci U S A 112:10169–10176. http://dx.doi.org/10.1073/pnas.1421388112.
54. Casadevall A. 2008. Evolution of intracellular pathogens. Annu Rev Microbiol 62:19–33. http://dx.doi.org/10.1146/annurev.micro.61.080706.093305.
55. Sun Z, Zhiyi S, Blanchard JL. 2014. Strong genome-wide selection early in the evolution of Prochlorococcus resulted in a reduced genome through the loss of a large number of small effect genes. PLoS One 9:e88837. http://dx.doi.org/10.1371/journal.pone.0088837.
56. Martínez-Cano DJ, Reyes-Prieto M, Martínez-Romero E, Partida-Martínez LP, Latorre A, Moya A, Delaye L. 2014. Evolution of small prokaryotic genomes. Front Microbiol 5:742. http://dx.doi.org/10.3389/fmicb.2014.00742.
57. Morris JJ, Lenski RE, Zinser ER. 2012. The Black Queen hypothesis: evolution of dependencies through adaptive gene loss. mBio 3(2):e00036-12. http://dx.doi.org/10.1128/mBio.00036-12.
58. Toh H, Weiss BL, Perkin SAH, Yamashita A, Oshima K, Hattori M, Aksoy S. 2006. Massive genome erosion and functional adaptations provide insights into the symbiotic lifestyle of Sodalis glossinidius in the tsetse host. Genome Res 16:149–156. http://dx.doi.org/10.1101/gr.4106106.
59. Wendt DE. 2000. Energetics of larval swimming and metamorphosis in four species of Bugula (Bryozoa). Biol Bull 198:346–356. http://dx.doi.org/10.2307/1542690.
60. Nelson DL, Lehninger AL, Cox MM. 2008. Lehninger principles of biochemistry, 5th ed. W. H. Freeman, New York, NY.
61. Llácer JL, Fita I, Rubio V. 2008. Arginine and nitrogen storage. Curr Opin Struct Biol 18:673–681. http://dx.doi.org/10.1016/j.sbi.2008.11.002.
62. Donia MS, Fricke WF, Partensky F, Cox J, Elshahawi SI, White JR, Phillippy AM, Schatz MC, Piel J, Haygood MG, Ravel J, Schmidt EW. 2011. Complex microbiome underlying secondary and primary metabolism in the tunicate-Prochloron symbiosis. Proc Natl Acad Sci U S A 108:E1423–E1432. http://dx.doi.org/10.1073/pnas.1111712108.
63. Lin Z, Torres JP, Tianero MD, Kwan JC, Schmidt EW. 2016. Origin of chemical diversity in Prochloron-tunicate symbiosis. Appl Environ Microbiol 82:3450–3460. http://dx.doi.org/10.1128/AEM.00860-16.
64. Martijn J, Schulz F, Zaremba-Niedzwiedzka K, Viklund J, Stepanauskas R, Andersson SGE, Horn M, Guy L, Ettema TJG. 2015. Single-cell genomics of a rare environmental alphaproteobacterium provides unique insights into Rickettsiaceae evolution. ISME J 9:2373–2385. http://dx.doi.org/10.1038/ismej.2015.46.
65. Busby JN, Landsberg MJ, Simpson RM, Jones SA, Hankamer B, Hurst MRH, Lott JS. 2012. Structural analysis of Chi1 chitinase from Yen-Tc: the multisubunit insecticidal ABC toxin complex of Yersinia entomophaga. J Mol Biol 415:359–371. http://dx.doi.org/10.1016/j.jmb.2011.11.018.
66. Woollacott RM, Zimmer RL. 1975.
A simplified placenta-like system for the transport of extraembryonic nutrients during embryogenesis of Bugula neritina (Bryozoa). J Morphol 147:355–377. http://dx.doi.org/10 .1002/jmor.1051470308. Moosbrugger M, Schwaha T, Walzl MG, Obst M, Ostrovsky AN. 2012. The placental analogue and the pattern of sexual reproduction in the cheilostome bryozoan Bicellariella ciliata (Gymnolaemata). Front Zool 9:29. http://dx.doi.org/10.1186/1742-9994-9-29. Troll JV, Bent EH, Pacquette N, Wier AM, Goldman WE, Silverman N, McFall-Ngai MJ. 2010. Taming the symbiont for coexistence: a host PGRP neutralizes a bacterial symbiont toxin. Environ Microbiol 12:2190 – 2203. http://dx.doi.org/10.1111/j.1462-2920.2009.02121.x. Brennan CA, Hunt JR, Kremer N, Krasity BC, Apicella MA, McFallNgai MJ, Ruby EG. 2014. A model symbiosis reveals a role for sheathedflagellum rotation in the release of immunogenic lipopolysaccharide. eLife 3:e01579. http://dx.doi.org/10.7554/eLife.01579. Peng C, Pu J-Y, Song L-Q, Jian X-H, Tang M-C, Tang G-L. 2012. Hijacking a hydroxyethyl unit from a central metabolic ketose into a nonribosomal peptide assembly line. Proc Natl Acad Sci U S A 109:8540 – 8545. http://dx.doi.org/10.1073/pnas.1204232109. Osbourn A. 2010. Secondary metabolic gene clusters: evolutionary toolkits for chemical innovation. Trends Genet 26:449 – 457. http://dx.doi.org /10.1016/j.tig.2010.07.001. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. 2003. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13: 2498 –2504. http://dx.doi.org/10.1101/gr.1239303. Davidson SK, Haygood MG. 1999. Identification of sibling species of the bryozoan Bugula neritina that produce different anticancer bryostatins and harbor distinct strains of the bacterial symbiont “Candidatus Endobugula sertula.” Biol Bull 196:273–280. http://dx.doi.org/10.2307/1542952. Mackie JA, Keough MJ, Christidis L. 2006. 
Invasion patterns inferred from cytochrome oxidase I sequences in three bryozoans, Bugula neritina, Watersipora subtorquata, and Watersipora arcuata. Mar Biol 149:285–295. http://dx.doi.org/10.1007/s00227-005-0196-x. Jeong H, Yim JH, Lee C, Choi S-H, Park YK, Yoon SH, Hur C-G, Kang H-Y, Kim D, Lee HH, Park KH, Park S-H, Park H-S, Lee HK, Oh TK, Kim JF. 2005. Genomic blueprint of Hahella chejuensis, a marine microbe producing an algicidal agent. Nucleic Acids Res 33:7066 –7073. http://dx .doi.org/10.1093/nar/gki1016. Miller IJ, Weyna TR, Fong SS, Lim-Fong GE, Kwan JC. 2016. Single sample resolution of rare microbial dark matter in a marine invertebrate metagenome. Sci Rep 6:34362. http://dx.doi.org/10.1038/srep34362.

Applied and Environmental Microbiology

aem.asm.org

