JGV Papers in Press. Published July 30, 2014 as doi:10.1099/vir.0.067553-0
Characterization and complete genome sequence analysis of novel bacteriophage IME-EFm1 infecting Enterococcus faecium
Yahui Wang1,3§, Wei Wang2,3§, Yongqiang Lv4§, Wangliang Zheng3, Zhiqiang Mi3, Guangqian Pei3, Xiaoping An3, Xiaomeng Xu2,3, Chuanyin Han3, Jie Liu5*, Changlin Zhou1*, Yigang Tong3*
1 School of Life Science & Technology, China Pharmaceutical University, 24 Tong Jia Xiang, Nanjing 210009, P.R. China
2 Anhui Medical University, Hefei 230032, China
3 State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Beijing 100071, China
4 Department of Laboratory, Dalian Beihai Hospital, Dalian Liaoning 116021, China
5 The General Hospital of Beijing Military Command, Beijing 100041, China
Subject category: Phage
Running title: Genome analysis of novel phage IME-EFm1
GenBank accession number: KJ010489.1
*Corresponding authors
§These authors contributed equally to this work
E-mail addresses:
Yigang Tong: [email protected]
Changlin Zhou: [email protected]
Jie Liu: [email protected]
Yahui Wang: [email protected]
Wei Wang: [email protected]
Yongqiang Lv: [email protected]
Number of words in abstract: 231.
Number of words in main text (including figure legends): 4407.
Abstract
We isolated and characterized a novel virulent bacteriophage, IME-EFm1, that specifically infects multiple-drug resistant Enterococcus faecium. IME-EFm1 is morphologically similar to members of the family Siphoviridae. It was capable of lysing a wide range of the E. faecium strains in our collection, including two strains resistant to vancomycin. One-step growth tests revealed the host lysis activity of phage IME-EFm1, with a latent time of 30 min and a large burst size of 116 plaque-forming units (PFU)/cell. These biological characteristics suggest that IME-EFm1 has the potential to be used as a therapeutic agent. The complete genome of IME-EFm1 is a linear, terminally non-redundant double-stranded DNA, 42,597 bp in length, with a G+C content of 35.2%. The termini of the phage genome were determined from next-generation sequencing data and were further confirmed by nuclease digestion analysis. To our knowledge, this is the first report of a complete genome sequence of a bacteriophage infecting E. faecium. IME-EFm1 exhibited low similarity to other phages in terms of genome organization and structural protein amino acid sequences. The coding region corresponds to 90.7% of the genome. Seventy putative open reading frames were deduced, of which 29 could be functionally identified based on their homology to previously characterized proteins. A predicted metallo-beta-lactamase gene was detected in the genome sequence. The identification of an antibiotic resistance gene emphasizes the necessity of complete genome sequencing of a phage to ensure that it is free of any undesirable genes.
Keywords
Bacteriophage; Enterococcus faecium; complete genome sequence; antibiotic resistance; genome termini
Background
Enterococcus faecium is a Gram-positive facultative anaerobe that is part of the normal flora of human and animal digestive tracts, and is widely distributed in the environment. It is also an opportunistic bacterial pathogen that causes a variety of serious diseases in humans, notably nosocomial and secondary infections (Arias and Murray, 2012; de Been et al., 2013). Although these infections can be successfully cured by broad-spectrum antibiotics, long-term and occasionally needless use of antibiotics in humans enhances the spread of antibiotic-resistant bacterial strains, and causes them to eventually dominate populations of human microorganisms. E. faecium strains are robust and adaptable, with a particular ability to survive under harsh conditions and at a wide range of temperatures (from 10°C to >45°C) (Arias and Murray, 2012). In the last decade, enterococcal hospital-acquired infections have increasingly been associated with E. faecium compared with other Enterococcus species (Brueggemann et al., 2007). Extensive studies have revealed that E. faecium is more resistant to most drugs than other Enterococcus species (de Been et al., 2013). Furthermore, the increasing number of E. faecium strains with resistance to multiple antibiotics is a major public health concern.
Bacteriophages are viruses that specifically infect and lyse bacteria. They are ubiquitous throughout the environment and are the most genetically diverse biological entities in the biosphere (Abedon, 2008; Hemminga et al., 2010; Lima-Mendez, Toussaint, and Leplae, 2007). Recently, phages have been suggested as the most promising alternative therapeutic agents against multiple-drug resistant bacterial infections, including vancomycin-resistant E. faecium (Abedon, 2011; Anisimov and Amoako, 2006; Biswas et al., 2002; Debarbieux, 2008; Matsuzaki et al., 2005). Bacteriophages have shown highly potent and often species-specific bacteriolytic activity, and a notable lack of bacterial resistance, when used in medicine for the treatment and prophylaxis of infections (Salifu et al., 2013; Yele et al., 2012). Technologies employing phages in other pathogen-related applications, including detection and decontamination, have also been patented (Dorval Courchesne, Parisien, and Lan, 2009).
In the current study, we selected E. faecium strains isolated by the clinical laboratory at the Chinese People's Liberation Army Hospital 307 (Beijing, China) as indicator bacteria. Phages active against these strains were isolated from Hospital 307 sewage water. We then undertook characterization and genetic analysis of a newly isolated lytic E. faecium bacteriophage, designated IME-EFm1. To our knowledge, the complete nucleotide sequence of an E. faecium bacteriophage has not been previously reported. Analysis of the complete genome provided insights into the features of IME-EFm1, which contribute to our knowledge of the interactions between phages and host bacteria. This investigation also provides experimental evidence to support the clinical application of E. faecium phages (Debarbieux, 2008).
Results and discussion
Morphological properties of IME-EFm1
A bacteriophage designated IME-EFm1 was isolated from hospital sewage. IME-EFm1 formed clear plaques that did not produce a halo (Figure 1(a)). Transmission electron microscopy analysis revealed that IME-EFm1 had an isometric head and a non-contractile tail (Figure 1(b)). The diameter of the isometric head was about 50 nm, and the tail length was about 192 nm. According to the guidelines of the International Committee on Taxonomy of Viruses (Viruses and Fauquet, 2005), IME-EFm1 was classified as belonging to the Siphoviridae family (order Caudovirales).
Optimal multiplicity of infection (MOI) and one-step growth curve
Samples infected at an MOI of 1 generated the maximum number of phage progeny (Supplementary file 1). IME-EFm1 had a latent period of 30 min, a long release period, and an average burst size of 116 plaque-forming units (PFU)/cell (Figure 2). The plateau phase was reached after 90 min, following a 60 min burst period. The interval between the eclipse phase and the latent period was only 10 min, which is short compared with other bacteriophages (Raytcheva et al., 2011).
Phage host range test
As shown in Supplementary file 2, IME-EFm1 could infect 17 of the 22 (77.3%) clinical E. faecium isolates, including two strains resistant to vancomycin. However, IME-EFm1 did not infect any of the tested Enterococcus faecalis, Staphylococcus aureus or Escherichia coli strains.
Determination of IME-EFm1 genomic termini using NGS data
Complete genome sequencing of phage IME-EFm1 was conducted using next-generation sequencing (NGS). About 3.9 megabases (Mb) of generated data were used to assemble a 42,597-bp linear chromosome with more than 50-fold genome coverage, using the Roche Newbler 2.8 assembler.
We have established an approach to predict genomic termini based on NGS data (Li et al., 2014). The high-frequency sequences (HFSs) derived from high-throughput sequencing represent the termini of the sequenced genome. To identify the termini of the IME-EFm1 genome, we mapped the raw sequence reads onto the assembled IME-EFm1 genome. The results revealed that two sequences of extremely high frequency were mapped at the ends of the assembled genome (Figure 3), representing the 5′ and 3′ ends of the phage genome. The average occurrence of a read was calculated to be 13.6 ((total reads)/(genome length × 2 directions) = 1154818/(42597 × 2)). Relative to this average, the ratios of the highest forward and reverse frequencies were 324.4 (4412/13.6) and 234.9 (3194/13.6), respectively, which further suggested that these HFSs were the genomic termini. Manual extension of these HFSs showed that they could only be extended in one direction, again confirming that they are the termini. The occurrence of the top 10 highest-frequency sequences in the raw data was counted in both forward and reverse directions (Supplementary file 3). The occurrence of sequences in the 20 bp surrounding the highest-frequency sequences was also counted in both directions. The HFSs showed a far higher peak than the surrounding 20-bp area in both forward and reverse directions (Figure 3). Together, these results suggest that these HFSs are the termini of the assembled genome, which therefore has distinct termini at both ends. The above results also demonstrated that IME-EFm1 has a linear, “terminally non-redundant” double-stranded DNA genome, which is a distinct feature not seen in most previously identified double-stranded DNA bacteriophages. This characteristic suggests that IME-EFm1 has a unique DNA replication mechanism.
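The read-frequency arithmetic quoted above can be reproduced with a few lines of Python. This is a minimal sketch that uses only the figures given in the text; the function and variable names are ours and are not part of the published pipeline (Li et al., 2014).

```python
# Figures quoted in the text above.
TOTAL_READS = 1_154_818
GENOME_LENGTH_BP = 42_597
TOP_FORWARD_COUNT = 4412   # occurrences of the most frequent forward sequence
TOP_REVERSE_COUNT = 3194   # occurrences of the most frequent reverse sequence


def average_read_occurrence(total_reads: int, genome_length: int) -> float:
    """Average occurrence of a read per position, counting both directions."""
    return total_reads / (genome_length * 2)


def enrichment(count: int, average: float) -> float:
    """How many times more frequent a sequence is than the average."""
    return count / average


# The text rounds the average to one decimal place before taking the ratios.
average = round(average_read_occurrence(TOTAL_READS, GENOME_LENGTH_BP), 1)
forward_ratio = enrichment(TOP_FORWARD_COUNT, average)
reverse_ratio = enrichment(TOP_REVERSE_COUNT, average)

print(f"average occurrence: {average:.1f}")
print(f"forward HFS ratio:  {forward_ratio:.1f}")
print(f"reverse HFS ratio:  {reverse_ratio:.1f}")
```

Both candidate termini occur more than two hundred times as often as an average read start, which is the basis for treating them as fixed physical ends rather than random fragmentation points.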
To further confirm the above inference regarding the termini of the IME-EFm1 genome, a restriction assay and terminal run-off sequencing were carried out. The restriction endonuclease Stu I, which has only one cut site (at position 3658 nt) in the genomic DNA, was used to digest the DNA, yielding two bands on agarose gel electrophoresis (data not shown). The results of terminal run-off sequencing are shown in Figure 4(b) and Figure 4(c). No signal was detected after the terminal sequence “CCTTTTTATAACGAATTAAT” in the positive strand, or after the terminal sequence “GAATTTCGTGCGAAGAAGAG” in the negative strand. These results demonstrated the linearity of the genome and further confirmed that “CTCTTCTTCGCACGAAATTC…” and “ATTAATTCGTTATAAAAAGG…” (Supplementary file 3) are the true physical ends of the genome.
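The two run-off end sequences and the two high-frequency sequences quoted above are reverse complements of one another, which is what one would expect if both methods identify the same physical ends. A minimal sketch of this check follows; the helper name and variable names are ours.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Last 20 nt read by run-off sequencing on each strand (from the text).
POSITIVE_STRAND_END = "CCTTTTTATAACGAATTAAT"
NEGATIVE_STRAND_END = "GAATTTCGTGCGAAGAAGAG"

# First 20 nt of the two high-frequency sequences (Supplementary file 3).
HFS_A = "CTCTTCTTCGCACGAAATTC"
HFS_B = "ATTAATTCGTTATAAAAAGG"

print(reverse_complement(POSITIVE_STRAND_END) == HFS_B)
print(reverse_complement(NEGATIVE_STRAND_END) == HFS_A)
```

Each run-off read therefore stops exactly where a high-frequency sequence begins on the opposite strand.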
Overview of phage IME-EFm1 genome
Restriction assay and NGS sequencing revealed that IME-EFm1 has a double-stranded, terminally non-redundant genome, 42,597 bp in length with a low G+C content (35.2%) and a nucleotide content of 32.4% A, 32.4% T, 17.5% G, and 17.7% C. A total of 146 putative promoters and 120 putative rho-independent terminators were predicted in the genome. Sequence analysis revealed 70 putative open reading frames (ORFs) (Table 1), from 114 to 3225 bp in length with ATG as the main start codon, encoding proteins of 38–1075 amino acids (aa) in length. Together, the ORFs covered 38,633 bp, resulting in a coding density of 90.7%. A map of predicted ORFs was then generated with gene features (Figure 5). The majority of the ORFs (40 ORFs, 57.14%) were transcribed on the negative strand. Interestingly, the start codon of 14 ORFs (20.2% of the total) overlapped with the stop codon of the previous gene, indicating possible transcriptional interactions between neighboring genes. Putative tRNA genes were searched using tRNAscan-SE (v. 1.21) and no tRNA was detected. On the basis of homology comparisons, 29 ORFs were assigned significant similarity (E value ≤ 1E-4; Supplementary file 4) to other proteins in the GenBank database, and 23 of the 29 ORFs were found in phages from other Enterococcus species. Based on the genome size and sequence similarity, the closest relatives of IME-EFm1 were identified as E. faecalis phages EFAP-1, IME-EF4, EFRM31 and IME-EF3.
Functional ORF prediction
The functionally identified ORFs were classified into three groups: structure and packaging, replication and regulation, and lysis (Figure 5). BLASTP analysis revealed that ORF2 and ORF3 were the most likely candidates for the terminase small subunit and terminase large subunit, respectively. Based on their homology to proteins in other Enterococcus species phages, these ORFs were designated as coding for terminase proteins belonging to pfam05119 and pfam03354, respectively.
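Several of the summary figures given in the genome overview can be re-derived from the raw numbers reported there. The sketch below does so using only values stated in the text; the variable names are ours.

```python
# Values reported in the genome overview.
GENOME_LENGTH_BP = 42_597
CODING_BP = 38_633
BASE_PERCENT = {"A": 32.4, "T": 32.4, "G": 17.5, "C": 17.7}
TOTAL_ORFS = 70
NEGATIVE_STRAND_ORFS = 40

# G+C content is the sum of the G and C percentages.
gc_content = BASE_PERCENT["G"] + BASE_PERCENT["C"]

# Coding density is the fraction of the genome covered by ORFs.
coding_density = 100 * CODING_BP / GENOME_LENGTH_BP

# Share of ORFs transcribed on the negative strand.
negative_share = 100 * NEGATIVE_STRAND_ORFS / TOTAL_ORFS

# The quoted protein lengths equal the ORF lengths divided by three.
shortest_aa = 114 // 3
longest_aa = 3225 // 3

print(f"G+C content:     {gc_content:.1f}%")
print(f"coding density:  {coding_density:.1f}%")
print(f"negative strand: {negative_share:.2f}%")
print(f"protein lengths: {shortest_aa}-{longest_aa} aa")
```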
The morphogenesis module of IME-EFm1 is located next to the packaging module in the genome. A portal protein gene (ORF5) of 1206 bp (402 aa) in length was detected, encoding the portal protein that forms a channel through which bacteriophages inject their genome into host cells. ORF7 of IME-EFm1 showed similarities to major capsid proteins belonging to pfam05065. The proteins encoded by ORFs 8–11 are involved in the formation and connection of the head and tail structure. ORFs 10, 11 and 12 had amino acid similarity to three parts of the head-tail joining protein of Enterococcus faecalis phage EFAP-1, respectively. ORF9 showed similarity to the head-tail adaptor protein of enterococcal bacteriophage EFRM31. ORF16 was similar to tail length tape-measure proteins, and the predicted protein product of ORF16 was the longest in the IME-EFm1 genome. Interestingly, the following cluster of packaging and structural genes was found in both the IME-EFm1 (ORFs 2–12) and EFRM31 (gp12–23) bacteriophages (Fard et al., 2010), in the same order: terminase small subunit, terminase large subunit, a hypothetical protein, portal protein, prohead protease, major capsid protein, hypothetical protein, head-tail joining protein, head-tail adaptor protein, two head-tail joining proteins and the major tail protein. This suggests a close relationship between the two bacteriophages.
The replication and regulation module of the IME-EFm1 genome showed significant similarity to that in the phage EfaCPT1 genome. ORF43 of IME-EFm1 encodes prophage Lp4 protein 7, and was predicted to have a primase-polymerase (prim-pol) domain. The helicase protein and DNA primase encoded by ORF45 and ORF63, respectively, are also within this module.
ORF38 was found to be related to metallo-β-lactamase domains and showed homology to putative metallo-β-lactamase genes from previously identified phages. Several other reports have analyzed antibiotic resistance genes in phage DNA (Desiere et al., 2002; Muniesa et al., 2004; Parsley et al., 2010). Bacteriophages have the potential to carry antibiotic resistance genes and thereby accelerate their dispersal (Muniesa et al., 2004; Petrovski, Seviour, and Tillett, 2011; Witte, 2004). The predicted metallo-β-lactamase of the phage may confer antibiotic resistance on the host, thereby enhancing adaptation to antibiotic stress (Marti, Variatza, and Balcázar, 2013). Although these assumptions have not been further validated, complete genome sequence analysis of potentially therapeutic phages is necessary. The function of the putative metallo-β-lactamase gene should be clarified before IME-EFm1 can be used for therapeutic purposes.
Another characteristic of the phage IME-EFm1 genome is that it contains a gene for a protein toxin, haemolysin (ORF22), followed by holin and endolysin genes (ORF23 and ORF24). ORF22 possesses an XhlA domain (pfam10779) and encodes a putative membrane-associated protein (Krogh, Jørgensen, and Devine, 1998). This putative haemolysin protein can insert into cellular membranes and form pores. Holin is a protein that inserts into and perforates the bacterial cell membrane (Wang, Smith, and Young, 2000). Expression of both haemolysin and holin is necessary to effect host cell lysis (Krogh, Jørgensen, and Devine, 1998). The endolysin function was attributed to ORF24. The phage likely uses the holin-endolysin strategy to lyse the host cell and liberate progeny virions (Wang, Smith, and Young, 2000). N-Acetylmuramoyl-L-alanine amidase is specifically dedicated to lysis, and holin contributes to the activation of the amidase at a precisely defined time (Pastagia et al., 2013; Wang, Smith, and Young, 2000).
Phylogenetic tree analysis of phage IME-EFm1
To analyze the evolutionary relationship between phage IME-EFm1 and other phages, the large terminase protein sequence was used to construct a phylogenetic tree (Figure 6(a)). IME-EFm1 was clustered with E. faecalis phages EFRM31, EfaCPT1, and EFAP-1, while some other Enterococcus species phages were clustered into different subgroups. Another phylogenetic tree was constructed based on amino acid sequences of the portal protein (Figure 6(b)). Notably, these two phylogenetic trees were in perfect accord, showing the same clustering of phages EFRM31, EfaCPT1, EFAP-1, IME-EF3, IME-EF4 and IME-EFm1.
Genome-wide comparison
Genome sequencing results showed that the IME-EFm1 genome has partial identity to the complete sequences of the enterococcal bacteriophages IME-EF3 [GenBank: NC_023595.2], IME-EF4 [GenBank: NC_023551.1], and EfaCPT1 [GenBank: JX193904.1]. Figure 7 shows the distribution of the homology across the genomes, generated using Easyfig software with amino acid sequence comparison. The regions showing the most significant similarity included the head capsid, tail, and DNA replication gene clusters.
Nucleotide sequence accession number
The complete genome sequence of E. faecium phage IME-EFm1 was deposited in the NCBI GenBank database under accession number [GenBank: KJ010489.1].
Conclusions
Bacterial infections that are recalcitrant to currently available antibiotics are a serious clinical problem. The increasing prevalence of antibiotic-resistant bacteria is mainly a result of the extensive and often unnecessary use of antibiotics. Therefore, the search for alternatives to antibiotics is a pressing public concern. Enterococci are inherently resistant to antimicrobials and are a key source of antibacterial resistance determinants for other members of the intestinal microflora.
We successfully isolated a novel phage that has a broad host range amongst E. faecium strains, including vancomycin-resistant enterococci. The phage, designated IME-EFm1, is morphologically similar to phages of the family Siphoviridae, and has a latent time of 30 min and a large burst size of 116 PFU/cell. These characteristics mean that IME-EFm1 has significant potential for use in veterinary and human medicine for the treatment and prophylaxis of E. faecium infections. However, because this phage also contains a putative metallo-β-lactamase gene in its genome, it will be necessary to clarify the function of this putative antibiotic resistance gene before the phage can be used in clinical applications.
The complete nucleotide sequence of IME-EFm1 is the first reported for an Enterococcus faecium phage. It has low similarity to other phages at the nucleic acid and amino acid sequence levels, including phages from other Enterococcus species. Homology analysis revealed no highly homologous sequences in the database, suggesting that phage IME-EFm1 is novel. The determination of the phenotypic features and genetic properties of IME-EFm1 provides useful information for future studies, including those on host specificity, propagation dynamics, and adaptation to bacterial defense systems, and will assist in programs to exploit bacteriophages as therapeutic agents against bacterial pathogens. Therefore, this study should provide basic data for the future application of E. faecium phages.
Methods
Bacterial strains
Thirty bacterial strains isolated from clinical urine specimens by the clinical laboratory at the Chinese People's Liberation Army Hospital 307 (Beijing, China) were used. This collection included 22 E. faecium strains, 4 E. faecalis strains, 2 S. aureus strains and 2 E. coli strains. Bacterial isolates were identified using the automatic bacteria identification analysis system (VITEK, bioMerieux, France) and 16S rDNA PCR amplification and sequencing. PCR was carried out using 2×EasyTaq PCR SuperMix (TransGen, Beijing, China) and primers 27F (AGAGTTTGATCMTGGCTCAG) and 1492R (CGGTTACCTTGTTACGACTT) (Weisburg et al., 1991). Drug susceptibility testing was carried out according to Clinical and Laboratory Standards Institute guidelines. Bacteria were stored at −70°C in BHI (Brain Heart Infusion broth, Becton Dickinson, USA) supplemented with 25% glycerol. All strains were cultivated in liquid BHI medium at 37°C with aeration.
Isolation and purification of bacteriophages
Bacteriophage IME-EFm1, which specifically targets E. faecium strain 383, was isolated from sewage collected from the Chinese PLA Hospital 307 using enrichment cultures (Adams, 1959). Purification, concentration, and replication were carried out by standard methods as described previously (Carlson, 2005). The bacteriophage titer was assessed using the double-layer agar technique according to methods described previously (Adams, 1959).
Transmission electron microscopy
The lysate of purified phage was recovered by centrifugation at 4800×g for 5 min. The supernatant was filtered through a 0.45-μm filter to remove bacterial cells and other debris. A 20-μl aliquot of purified bacteriophage sample was placed on carbon-coated copper grids and allowed to adsorb for 15 min. Subsequently, the sample was negatively stained with 2% (w/v) phosphotungstate. Images were obtained using a transmission electron microscope (JEM-1200EX, Japan Electron Optics Laboratory Co., Japan) at an acceleration voltage of 100 kV.
MOI and one-step growth assay
Serial dilutions of E. faecium strain 383 in growth phase were added to aliquots of IME-EFm1 stock solution, each containing the same number of bacteriophages. After 10 min of adsorption at 37°C, the mixtures were centrifuged at 7000×g for 5 min. The phage-cell complexes were sedimented and resuspended in 5 ml BHI medium. The phage titer was determined following 4 h of incubation at 37°C, as described previously (Gallet, Shao, and Wang, 2009; Zhu et al., 2010).
One-step growth curve experiments were performed as previously described (Ellis and Delbrück, 1939). Briefly, the initial phage titer was determined by adding serial dilutions of high-titer phage lysates to lawns of the host bacterial strain (Adams, 1959). Then, 100 μl of phage solution (5×10⁷ PFU/ml) was added to 1 ml of bacterial suspension (5×10⁷ colony-forming units/ml; multiplicity of infection = 0.1). This mixture was incubated at 37°C for 5 min to allow phage adsorption. After 5 min, the mixture was centrifuged at 4800×g for 5 min, the supernatant was removed, and the pellets were resuspended in 10 ml BHI broth. Subsequently, 100-μl samples were taken at 0, 5, 10, 15, 20, 25, 30, 40, 50, 60, 90, 120 and 150 min and titered using the double-layer agar method (Adams, 1959). The first set of samples was immediately diluted and plated for phage titer determination. The second set of samples was treated with 1% (v/v) chloroform prior to phage titration to release intracellular phages and so determine the eclipse phase (Pajunen, Kiljunen, and Skurnik, 2000). The phage titer was then plotted against time to estimate the latent period and burst size.
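The multiplicity of infection stated above follows directly from the volumes and titers used, and the burst size is conventionally estimated as the plateau phage titer divided by the number of infected cells. The sketch below illustrates both calculations. The function names are ours, and the titers passed to `burst_size` are hypothetical values chosen for illustration only; they are not measurements from this study.

```python
def multiplicity_of_infection(phage_pfu_per_ml: float, phage_volume_ml: float,
                              cells_cfu_per_ml: float, cell_volume_ml: float) -> float:
    """Ratio of phage particles added to bacterial cells present."""
    total_phage = phage_pfu_per_ml * phage_volume_ml
    total_cells = cells_cfu_per_ml * cell_volume_ml
    return total_phage / total_cells


def burst_size(plateau_pfu_per_ml: float, infected_cells_per_ml: float) -> float:
    """Average number of progeny phages released per infected cell."""
    return plateau_pfu_per_ml / infected_cells_per_ml


# Values from the text: 100 ul of 5x10^7 PFU/ml added to 1 ml of 5x10^7 CFU/ml.
moi = multiplicity_of_infection(5e7, 0.1, 5e7, 1.0)

# HYPOTHETICAL titers, for illustration only (not data from this study).
example_burst = burst_size(plateau_pfu_per_ml=1.16e8, infected_cells_per_ml=1.0e6)

print(f"MOI: {moi:.2f}")
print(f"example burst size: {example_burst:.0f} PFU/cell")
```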
Host range determination
Host range was determined by spot testing, a rapid and efficient method, with the above-mentioned 30 bacterial strains (Kutter, 2009). To assess the lytic range of the phage, the bacterial strains grown in liquid BHI medium were mixed with semi-solid BHI medium and transferred directly onto plates already containing a layer of solid BHI medium. After drying, a drop of the phage suspension was put on the bacterial layer. After 8 h of incubation at 37°C, plates were checked for plaques on bacterial lawns (Kutter, 2009). Phages able to infect a particular host type formed plaques on the bacterial lawn (Carlson, 2005).
Genomic DNA extraction
Bacteriophage genomic DNA was extracted from the phage lysate through standard phenol-chloroform extraction protocols as described previously (Brabban, Hite, and Callaway, 2005; Lu et al., 2013; Wilcox, Toder, and Foster, 1996), with minor modifications. Phage stock solution was treated with 1 μg/ml DNase I and 1 μg/ml RNase A (Thermo Scientific, USA) and incubated overnight at 37°C to remove contaminating bacterial DNA and RNA. Samples were then incubated at 80°C for 15 min to deactivate the DNase I. Lysis buffer (final concentration, 0.5% sodium dodecyl sulfate, 20 mM EDTA, and 50 μg/ml proteinase K) was added to samples, which were then incubated for 1 h at 56°C in a water bath, after which an equal volume of phenol was added to extract the viral DNA. Following centrifugation at 7000×g for 5 min, the aqueous layer was transferred to a fresh tube containing an equal volume of phenol-chloroform-isoamyl alcohol (25:24:1) and centrifuged at 7000×g for 5 min to remove proteins and polysaccharides. The aqueous layer was collected, mixed with an equal volume of isopropanol, and stored at −20°C for 3 h. The mixture was then centrifuged at 4°C for 20 min at 10000×g, and the resulting DNA pellet was washed with 75% ethanol. The DNA pellet was air dried at room temperature, resuspended in deionized water, and stored at −20°C until use (Sambrook and Russell, 2001).
Library Preparation and Genome Sequencing
Sequencing of the purified genomic DNA was performed on the Personal Genome Machine (PGM) sequencer. An adapter-ligated library was made following the manufacturer's NEBNext Fast DNA Library Prep Set for Ion Torrent protocol (NEB #E6270L). Briefly, 100 ng of purified DNA was dissolved in deionized water to a total volume of 50 μl and fragmented with the Bioruptor Sonication System to a size distribution of 300-400 bp. The sonicated fragments were end-repaired and ligated with Ion Torrent adapters P1 and A. Then, the 350-370 bp adapter-ligated fragments were selected with E-Gel Size Select 2% agarose (Invitrogen). The selected product was PCR-amplified under the following conditions: 98°C for 30 s (initial denaturation), followed by 9 cycles of 98°C for 10 s (denaturation), 58°C for 30 s (annealing), and 72°C for 30 s (extension), with a final extension at 72°C for 5 min. The concentration of the amplified product was determined with a Qubit 2.0 fluorometer (Life Technologies). Prior to sequencing, the fragment size distribution of the constructed library was checked for quality control with a Bioanalyser 2100 (Agilent Technologies, USA).
Template preparation was carried out with the Ion One-Touch 200 Template Kit v2 DL (Life Technologies) according to the manufacturer's instructions (Catalog Number 4480285). Briefly, the library was diluted to 3 ng/ml and attached to the surface of Ion Sphere particles (ISPs), which served as templates for clonal amplification during emulsion PCR. Emulsion breaking and enrichment were then carried out. The quality of the enriched ISPs was estimated with the Ion Sphere Quality Control Kit (Life Technologies). The Ion PGM Sequencing 300 Kit (Life Technologies) was used with the PGM sequencer according to the Ion PGM Sequencing 300 Kit protocol (Catalog Number 4480445). Enriched ISPs were loaded onto an Ion 318 chip and sequenced on the PGM for 640 flows, resulting in an average read length of >300 bp. Newbler version 2.8 was used to assemble raw sequencing reads. The genome was visualized in the CLC Genomics Workbench software (CLCbio, Aarhus, Denmark).
Terminal run-off sequencing
Intact, unfragmented IME-EFm1 genomic DNA was used as the template for terminal run-off sequencing (Sanger sequencing, ABI 3730XL). The process was described previously (Lu et al., 2013). Figure 4(a) shows the positions of primer F (GCAACCACTATGCGAGGTATGC) and primer R (GCGTGTCTGCCCAGTTGAC) in the genome.
Genome annotation and comparison and phylogenetic tree reconstruction
Initial gene prediction was carried out with the Rapid Annotation using Subsystem Technology (RAST) annotation server (Aziz et al., 2008). All predicted ORFs were verified manually using the results of searches against the NCBI non-redundant database by PSI-BLAST (Altschul et al., 1997) with an E-value cutoff of 1E-4 (E value ≤ 1E-4). Conserved domains were searched against PFAM (Bateman et al., 2004), CDD (Marchler-Bauer et al., 2005), and COG (Tatusov et al., 2000) using RPS-BLAST (Marchler-Bauer et al., 2009). Missed ORFs, frameshifts, and pseudogenes were identified using the online program (http://www.ncbi.nlm.nih.gov/genomes/frameshifts/frameshifts.cgi). Putative promoter regions were identified using the Neural Network Promoter Prediction tool (Reese, 2001) of the Berkeley Drosophila Genome Project (minimum promoter score: 0.9). The FindTerm program (Solovyev and Salamov, 2011) was used to predict rho-independent transcription terminators (energy threshold value: -11). Putative tRNA-encoding genes were searched using tRNAscan-SE (v. 1.21) (Lowe and Eddy, 1997). A physical map of the annotated IME-EFm1 genome was generated using DNAPlotter (Rosseel et al., 2012). Easyfig software (Sullivan, Petty, and Beatson, 2011) was used for construction of multiple amino acid sequence alignments. The neighbor-joining phylogenetic tree was built using the Poisson model with 1000 bootstrap replications in MEGA 5.0 (Tamura et al., 2011).
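For readers unfamiliar with the Poisson model used for the tree, the distance between two aligned protein sequences is d = −ln(1 − p), where p is the proportion of aligned sites that differ. The following is a minimal sketch of that calculation and does not reproduce MEGA's implementation. The function names are ours, the two short sequences are hypothetical, and skipping any site with a gap in either sequence is our simplifying assumption.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped (an assumption
    of this sketch, similar to pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in compared if a != b)
    return differences / len(compared)


def poisson_distance(p: float) -> float:
    """Poisson-corrected distance: d = -ln(1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p)


# HYPOTHETICAL aligned fragments, for illustration only.
toy_a = "MKTLVAGE"
toy_b = "MKSLVAGD"

p = p_distance(toy_a, toy_b)
d = poisson_distance(p)
print(f"p-distance = {p:.3f}, Poisson distance = {d:.3f}")
```

The correction makes d slightly larger than p because it allows for multiple substitutions at the same site; a neighbor-joining tree is then built from the matrix of pairwise d values.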
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
Yigang Tong, Changlin Zhou and Jie Liu conceived and designed the experiments and critically evaluated the manuscript. Yahui Wang isolated and identified the phage, carried out experiments and data analysis, and wrote the manuscript. Wei Wang was responsible for sequence analysis and drafting the manuscript. Yongqiang Lv collected and identified clinical bacteria. Wangliang Zheng collected clinical bacteria and conducted the biological characterization experiments. Zhiqiang Mi advised on data analysis and critically evaluated the manuscript. Guangqian Pei and Xiaoping An conducted the sequencing experiments. Xiaomeng Xu was responsible for sequence analysis. Chuanyin Han carried out biological characterization experiments. All authors read and approved the final manuscript.
Acknowledgments
This research was supported by grants from the National Hi-Tech Research and Development (863) Program of China (No. 2012AA022003 and No. 2014AA021402), the China Mega-Project on Infectious Disease Prevention (No. 2013ZX10004605, No. 2011ZX10004001, No. 2013ZX10004607-004 and No. 2013ZX10004217-002-003) and the State Key Laboratory of Pathogen and BioSecurity Program (No. SKLPBS1113).
References
Abedon, S. (2008). "Bacteriophage Ecology: Population Growth, Evolution and Impact of Bacterial Viruses. Advances in Molecular and Cellular Microbiology." Cambridge University Press.
Abedon, S. (2011). Phage therapy pharmacology: calculating phage dosing. Advances in applied microbiology 77, 1-40.
Adams, M. (1959). Bacteriophages. Interscience, New York, 445-447.
Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic acids research 25(17), 3389-3402.
Anisimov, A. P., and Amoako, K. K. (2006). Treatment of plague: promising alternatives to antibiotics. Journal of medical microbiology 55(11), 1461-1475.
Arias, C. A., and Murray, B. E. (2012). The rise of the Enterococcus: beyond vancomycin resistance. Nature Reviews Microbiology 10(4), 266-278.
Aziz, R. K., Bartels, D., Best, A. A., DeJongh, M., Disz, T., Edwards, R. A., Formsma, K., Gerdes, S., Glass, E. M., and Kubal, M. (2008). The RAST Server: rapid annotations using subsystems technology. BMC genomics 9(1), 75.
Bateman, A., Coin, L., Durbin, R., Finn, R. D., Hollich, V., Griffiths-Jones, S., Khanna, A., Marshall, M., Moxon, S., and Sonnhammer, E. L. (2004). The Pfam protein families database. Nucleic acids research 32(suppl 1), D138-D141.
Biswas, B., Adhya, S., Washart, P., Paul, B., Trostel, A. N., Powell, B., Carlton, R., and Merril, C. R. (2002). Bacteriophage therapy rescues mice bacteremic from a clinical isolate of vancomycin-resistant Enterococcus faecium. Infection and immunity 70(1), 204-210.
Brabban, A., Hite, E., and Callaway, T. (2005). Evolution of foodborne pathogens via temperate bacteriophage-mediated gene transfer. Foodborne Pathogens & Disease 2(4), 287-303.
Brueggemann, A. B., Pai, R., Crook, D. W., and Beall, B. (2007). Vaccine escape recombinants emerge after pneumococcal vaccination in the United States. PLoS Pathogens 3(11), e168.
Carlson, K. (2005). Appendix: working with bacteriophages: common techniques and methodological approaches. Bacteriophages: biology and applications, 437-494.
de Been, M., van Schaik, W., Cheng, L., Corander, J., and Willems, R. J. (2013). Recent recombination events in the core genome are associated with adaptive evolution in Enterococcus faecium. Genome biology and evolution 5(8), 1524-1535.
Debarbieux, L. (2008). [Experimental phage therapy in the beginning of the 21st century]. Medecine et maladies infectieuses 38(8), 421-425.
Desiere, F., Lucchini, S., Canchaya, C., Ventura, M., and Brüssow, H. (2002). Comparative genomics of phages and prophages in lactic acid bacteria. Antonie Van Leeuwenhoek 82(1-4), 73-91.
Dorval Courchesne, N. M., Parisien, A., and Lan, C. Q. (2009). Production and application of bacteriophage and bacteriophage-encoded lysins. Recent patents on biotechnology 3(1), 37-45.
Ellis, E. L., and Delbrück, M. (1939). The growth of bacteriophage. The Journal of general physiology 22(3), 365-384.
Fard, R. M. N., Barton, M. D., Arthur, J. L., and Heuzenroeder, M. W. (2010). Whole-genome sequencing and gene mapping of a newly isolated lytic enterococcal bacteriophage EFRM31. Archives of virology 155(11), 1887-1891.
Gallet, R., Shao, Y., and Wang, N. (2009). High adsorption rate is detrimental to bacteriophage fitness in a biofilm-like environment. BMC evolutionary biology 9(1), 241.
Hemminga, M. A., Vos, W. L., Nazarov, P. V., Koehorst, R. B., Wolfs, C. J., Spruijt, R. B., and Stopar, D. (2010). Viruses: incredible nanomachines. New advances with filamentous phages. European Biophysics Journal 39(4), 541-550.
Krogh, S., Jørgensen, S. T., and Devine, K. M. (1998). Lysis Genes of the Bacillus subtilis Defective Prophage PBSX. Journal of bacteriology 180(8), 2110-2117.
Kutter, E. (2009). Phage host range and efficiency of plating. Bacteriophages, Springer, 141-149.
Li, S., Fan, H., An, X., Fan, H., Jiang, H., Chen, Y., and Tong, Y. (2014). Scrutinizing Virus Genome Termini by High-Throughput Sequencing. PloS one 9(1), e85806.
Lima-Mendez, G., Toussaint, A., and Leplae, R. (2007). Analysis of the phage sequence space: the benefit of structured information. Virology 365(2), 241-249.
Lowe, T. M., and Eddy, S. R. (1997). tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic acids research 25(5), 955-964.
Lu, S., Le, S., Tan, Y., Zhu, J., Li, M., Rao, X., Zou, L., Li, S., Wang, J., and Jin, X. (2013). Genomic and Proteomic Analyses of the Terminally Redundant Genome of the Pseudomonas aeruginosa Phage PaP1: Establishment of Genus PaP1-Like Phages. PloS one 8(5), e62933.
Marchler-Bauer, A., Anderson, J. B., Cherukuri, P. F., DeWeese-Scott, C., Geer, L. Y., Gwadz, M., He, S., Hurwitz, D. I., Jackson, J. D., and Ke, Z. (2005). CDD: a Conserved Domain Database for protein classification. Nucleic acids research 33(suppl 1), D192-D196.
Marchler-Bauer, A., Anderson, J. B., Chitsaz, F., Derbyshire, M. K., DeWeese-Scott, C., Fong, J. H., Geer, L. Y., Geer, R. C., Gonzales, N. R., and Gwadz, M. (2009). CDD: specific functional annotation with the Conserved Domain Database. Nucleic acids research 37(suppl 1), D205-D210.
Marti, E., Variatza, E., and Balcázar, J. L. (2013). Bacteriophages as a reservoir of extended-spectrum β-lactamase and fluoroquinolone resistance genes in the environment. Clinical Microbiology and Infection 10.1111/1469-0691.12446.
Matsuzaki, S., Rashel, M., Uchiyama, J., Sakurai, S., Ujihara, T., Kuroda, M., Ikeuchi, M., Tani, T., Fujieda, M., and Wakiguchi, H. (2005). Bacteriophage therapy: a revitalized therapy against bacterial infectious diseases. Journal of infection and chemotherapy 11(5), 211-219.
Muniesa, M., García, A., Miró, E., Mirelis, B., Prats, G., Jofre, J., and Navarro, F. (2004). Bacteriophages and diffusion of β-lactamase genes. Emerging infectious diseases 10(6), 1134.
Pajunen, M., Kiljunen, S., and Skurnik, M. (2000). Bacteriophage φYeO3-12, Specific for Yersinia enterocolitica Serotype O:3, Is Related to Coliphages T3 and T7. Journal of bacteriology 182(18), 5114-5120.
Parsley, L. C., Consuegra, E. J., Kakirde, K. S., Land, A. M., Harper, W. F., and Liles, M. R. (2010). Identification of diverse antimicrobial resistance determinants carried on bacterial, plasmid, or viral metagenomes from an activated sludge microbial assemblage. Applied and environmental microbiology 76(11), 3753-3757.
Pastagia, M., Schuch, R., Fischetti, V. A., and Huang, D. B. (2013). Lysins: the arrival of pathogen-directed anti-infectives. Journal of medical microbiology 62(Pt 10), 1506-1516.
Petrovski, S., Seviour, R. J., and Tillett, D. (2011). Genome sequence and characterization of the Tsukamurella bacteriophage TPA2. Applied and environmental microbiology 77(4), 1389-1398.
Raytcheva, D. A., Haase-Pettingell, C., Piret, J. M., and King, J. A. (2011). Intracellular assembly of cyanophage Syn5 proceeds through a scaffold-containing procapsid. Journal of virology 85(5), 2406-2415.
Reese, M. G. (2001). Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome. Computers & chemistry 26(1), 51-56.
Rosseel, T., Scheuch, M., Höper, D., De Regge, N., Caij, A. B., Vandenbussche, F., and Van Borm, S. (2012). DNase SISPA-next generation sequencing confirms Schmallenberg virus in Belgian field samples and identifies genetic variation in Europe. PloS one 7(7), e41967.
Salifu, S., Valero-Rello, A., Campbell, S., Inglis, N., Scortti, M., Foley, S., and Vázquez-Boland, J. (2013). Genome and proteome analysis of phage E3 infecting the soil-borne actinomycete Rhodococcus equi. Environmental microbiology reports 5(1), 170.
Sambrook, J., and Russell, D. (2001). Molecular Cloning: A Laboratory Manual. 3rd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
Solovyev, V., and Salamov, A. (2011). Automatic annotation of microbial genomes and metagenomic sequences, p 61–78. Metagenomics and its applications in agriculture, biomedicine and environmental studies. Nova Science Publishers, Hauppauge, NY.
Sullivan, M. J., Petty, N. K., and Beatson, S. A. (2011). Easyfig: a genome comparison visualizer. Bioinformatics 27(7), 1009-1010.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011). MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular biology and evolution 28(10), 2731-2739.
Tatusov, R. L., Galperin, M. Y., Natale, D. A., and Koonin, E. V. (2000). The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic acids research 28(1), 33-36.
Viruses, I. C. o. T. o., and Fauquet, C. (2005). "Virus Taxonomy: Classification and Nomenclature of Viruses: Eighth Report of the International Committee on Taxonomy of Viruses." Elsevier.
Wang, I.-N., Smith, D. L., and Young, R. (2000). Holins: the protein clocks of bacteriophage infections. Annual Reviews in Microbiology 54(1), 799-825.
Weisburg, W. G., Barns, S. M., Pelletier, D. A., and Lane, D. J. (1991). 16S ribosomal DNA amplification for phylogenetic study. Journal of bacteriology 173(2), 697-703.
Wilcox, S., Toder, R., and Foster, J. (1996). Rapid isolation of recombinant lambda phage DNA for use in fluorescence in situ hybridization. Chromosome Research 4(5), 397-404.
Witte, W. (2004). International dissemination of antibiotic resistant strains of bacterial pathogens. Infection, Genetics and Evolution 4(3), 187-191.
Yele, A. B., Thawal, N. D., Sahu, P. K., and Chopade, B. A. (2012). Novel lytic bacteriophage AB7-IBB1 of Acinetobacter baumannii: isolation, characterization and its effect on biofilm. Archives of virology 157(8), 1441-1450.
Zhu, J., Rao, X., Tan, Y., Xiong, K., Hu, Z., Chen, Z., Jin, X., Li, S., Chen, Y., and Hu, F. (2010). Identification of lytic bacteriophage MmP1, assigned to a new member of T7-like phages infecting Morganella morganii. Genomics 96(3), 167.

Figures
Figure 1:
(a) Plaques of phage IME-EFm1
Arrow indicates a phage plaque. The IME-EFm1 stock solution (0.1 ml) was mixed with E. faecium strain 383 (0.5 ml, OD600=0.6) in 5 ml semi-solid BHI medium (0.75% agar) and transferred directly onto solidified base nutrient agar (1.5% agar). After 5 h of incubation at 37°C, plates were checked for phage plaques. The plaque diameter is about 1 mm.
(b) Morphology of phage IME-EFm1 as revealed by transmission electron microscopy
Scale bar represents 100 nm.

Figure 2: One-step growth curve of phage IME-EFm1
The two sets of data represent samples treated with chloroform (black line) and samples without chloroform (gray line), respectively. Each curve represents average results from three experiments.

Figure 3: Distribution of the top 10 forward and reverse HFSs in the phage IME-EFm1 genome
One HFS is on the left end while the other is on the right end. Their frequencies are 2207 and 471, respectively. Black rhombus: forward; gray square: reverse.

Figure 4: Terminal run-off sequencing of IME-EFm1
(a) Primer position in the genome.
(b) Sequencing result for the positive strand.
(c) Sequencing result for the negative strand.
The underlined base sequences are the natural termini of the IME-EFm1 genome.

Figure 5: Genome map of IME-EFm1
The linear genome of IME-EFm1 is depicted in a circularized format. The three circular tracks show (from inner to outer): GC skew ([G-C]/[G+C]), with inward peaks indicating a greater proportion of G; GC content, with inward peaks indicating below average GC content; ORFs and direction of transcription.
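
For readers who wish to reproduce the two inner tracks, the following sketch computes GC content and GC skew ([G-C]/[G+C]) in sliding windows along a linear sequence. The function name, window size and step are illustrative assumptions and are not the settings used to draw the figure; windows here do not wrap around the sequence ends.

```python
def gc_stats(seq, window=1000, step=500):
    """Return (start, gc_content, gc_skew) for each window along a sequence."""
    seq = seq.upper()
    results = []
    # If the sequence is shorter than one window, a single window covers it.
    last_start = max(len(seq) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # GC content: fraction of G and C bases in the window.
        gc_content = (g + c) / len(chunk) if chunk else 0.0
        # GC skew: (G - C) / (G + C); set to 0 when the window has no G or C.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        results.append((start, gc_content, gc_skew))
    return results


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the IME-EFm1 genome.
    for row in gc_stats("GGGGCCCCATATGGCA", window=4, step=4):
        print(row)
```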
Figure 6: Phylogenetic analysis of selected IME-EFm1 structural proteins
Phylogenetic trees were constructed from selected structural proteins of enterococcal phages using the neighbor-joining method with 1,000 bootstrap replicates, based on the amino acid sequences of the large terminase proteins (a) and portal proteins (b). Bootstrap support values (numbers on the lines) are indicated for selected internal branches.

Figure 7: Comparison of the complete genome sequences of IME-EFm1 with IME_EF3, IME-EF4 and EfaCPT1
The colored arrows indicate ORFs according to gene function. Comparisons were performed with the BLAST algorithm. The colored lines between boxes represent homologous regions present in each genome. Darker purple fills indicate lower E values.

Table
Table 1: Summary of ORFs and predicted functions in IME-EFm1

ORF(a)  Start  End    Strand  Length (nt)  Size (aa)  Start codon  Predicted function(b)
1       261    443    +       183          61         ATG          hypothetical protein
2       448    912    +       465          155        ATG          terminase small subunit (pfam 05119)
3       1515   3290   +       1776         592        ATG          terminase large subunit
4       3384   3560   +       177          59         ATG          sensor histidine kinase
5       3564   4769   +       1206         402        ATG          portal protein (pfam 04860)
6       4714   5265   +       552          184        ATG          prohead protease (pfam 04586)
7       5335   6546   +       1212         404        TTG          capsid protein
8       6623   6892   +       270          90         ATG          head-tail joining protein
9       6892   7227   +       336          112        ATG          head-tail adaptor protein (pfam 05135)
10      7206   7526   +       321          107        ATG          head-tail joining protein (pfam 06264)
11      7600   7965   +       366          122        ATG          head-tail joining protein
12      8037   8600   +       564          188        ATG          major tail protein
13      8707   9054   +       348          116        ATG          hypothetical protein
14      9059   9265   +       207          69         ATG          tail tape measure chaperone frameshift protein
15      9332   10360  +       1029         343        ATG          tail tape measure protein
16      10420  13644  +       3225         1075       GTG          phage tail length tape-measure protein
17      13714  16848  +       3135         1045       ATG          hypothetical protein (pfam 01464)
18      16915  17793  +       879          293        TTG          minor tail protein
19      17985  18338  +       354          118        TTG          hypothetical protein
20      18352  20166  +       1815         605        ATG          phage tail assembly
21      20185  20430  +       246          82         ATG          hypothetical protein
22      20586  20864  +       279          93         ATG          hemolysin XhlA family protein (pfam 10779)
23      20877  21158  +       282          94         ATG          holin
24      21175  22200  +       1026         342        ATG          N-acetylmuramoyl-L-alanine amidase (pfam 01510)
25      22568  22347  -       222          74         ATG          hypothetical protein
26      24923  22632  -       2292         764        ATG          DNA polymerase
27      25507  25109  -       399          133        ATG          HNH homing endonuclease
28      25710  25504  -       207          69         ATG          hypothetical protein
29      26277  25726  -       552          184        ATG          hypothetical protein
30      26624  26292  -       333          111        ATG          hypothetical protein (pfam 05154)
31      26871  26659  -       213          71         ATG          hypothetical protein
32      27654  26962  -       693          231        ATG          hypothetical protein
33      27938  27720  -       219          73         ATG          hypothetical protein
34      28744  27935  -       810          270        ATG          hypothetical protein
35      28901  28734  -       168          56         GTG          hypothetical protein
36      29130  28903  -       228          76         ATG          hypothetical protein
37      29320  29153  -       168          56         ATG          hypothetical protein
38      30032  29358  -       675          225        ATG          metallo-beta-lactamase domain protein
39      30436  30170  -       267          89         TTG          HNH endonuclease family protein
40      30901  30437  -       465          155        ATG          HNH homing endonuclease-like protein
41      31370  30891  -       480          160        ATG          hypothetical protein
42      31707  31507  -       201          67         ATG          hypothetical protein
43      32472  31726  -       747          249        ATG          prim-pol domain protein
44      32693  32532  -       162          54         ATG          hypothetical protein
45      33988  32690  -       1299         433        ATG          helicase (pfam 00271)
46      34463  33978  -       486          162        ATG          HNH homing endonuclease (pfam 07463)
47      34576  34463  -       114          38         ATG          hypothetical protein
48      34770  34573  -       198          66         ATG          hypothetical protein
49      35113  34763  -       351          117        ATG          hypothetical protein
50      35319  35113  -       207          69         ATG          hypothetical protein
51      35413  35321  -       93           31         ATG          hypothetical protein
52      35719  35441  -       279          93         ATG          hypothetical protein
53      35984  35802  -       183          61         ATG          hypothetical protein
54      36156  35977  -       180          60         ATG          hypothetical protein
55      36596  36156  -       441          147        ATG          hypothetical protein
56      36819  36658  -       162          54         ATG          hypothetical protein
57      37130  36831  -       300          100        ATG          hypothetical protein
58      37426  37142  -       285          95         ATG          hypothetical protein
59      37764  37438  -       327          109        ATG          hypothetical protein
60      37994  37758  -       237          79         ATG          hypothetical protein
61      38343  38059  -       285          95         ATG          hypothetical protein
62      38647  38432  -       216          72         ATG          hypothetical protein
63      40329  38650  -       1680         560        ATG          DNA primase
64      40689  40411  -       279          93         ATG          hypothetical protein
65      41191  41394  +       204          68         ATG          hypothetical protein
66      41391  41537  +       147          49         ATG          hypothetical protein
67      41539  41721  +       183          61         ATG          hypothetical protein
68      41733  41924  +       192          64         ATG          hypothetical protein
69      41924  42082  +       159          53         ATG          hypothetical protein
70      42126  42518  +       393          131        ATG          HNH endonuclease

(a) ORFs were numbered consecutively.
(b) Predicted function is based on amino acid sequence identity, conserved motifs, and gene location within functional modules.
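
The columns of Table 1 are arithmetically related: in the rows shown, the nucleotide length equals |End - Start| + 1, the size in amino acids equals that length divided by three, and the strand is "-" when Start exceeds End. The sketch below checks these relations for a handful of rows copied from the table; the function names are illustrative assumptions and the check is not part of the original annotation pipeline.

```python
def orf_length_nt(start, end):
    """Inclusive length in nucleotides of an ORF given its two coordinates."""
    return abs(end - start) + 1


def check_row(row):
    """Return True if a Table 1 row is internally consistent."""
    orf, start, end, strand, length_nt, size_aa = row
    length = orf_length_nt(start, end)
    # Start < End is listed for the + strand, Start > End for the - strand.
    expected_strand = "+" if start < end else "-"
    return (
        length == length_nt
        and length % 3 == 0
        and length // 3 == size_aa
        and strand == expected_strand
    )


# (ORF, Start, End, Strand, Length nt, Size aa), copied from Table 1.
ROWS = [
    (1, 261, 443, "+", 183, 61),
    (16, 10420, 13644, "+", 3225, 1075),
    (25, 22568, 22347, "-", 222, 74),
    (63, 40329, 38650, "-", 1680, 560),
    (70, 42126, 42518, "+", 393, 131),
]

if __name__ == "__main__":
    for row in ROWS:
        print(row[0], "consistent" if check_row(row) else "inconsistent")
```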
Supplementary files
Supplementary file 1: Determination of optimal multiplicity of infection (MOI)
Supplementary file 2: Lytic spectrum of IME-EFm1
Supplementary file 3: Ten most frequent forward and reverse sequences (starting 20 bp) in the phage IME-EFm1 genome
Supplementary file 4: ORF analysis of the IME-EFm1 genome