YMPEV 4865

No. of Pages 7, Model 5G

16 April 2014 Molecular Phylogenetics and Evolution xxx (2014) xxx–xxx 1

Contents lists available at ScienceDirect

Molecular Phylogenetics and Evolution

journal homepage: www.elsevier.com/locate/ympev

Evolution of the viral hemorrhagic septicemia virus: Divergence, selection and origin


Mei He a, Xue-Chun Yan b, Yang Liang a, Xiao-Wen Sun b, Chun-Bo Teng a,⇑

a College of Life Science, Northeast Forestry University, Harbin 150040, China
b Heilongjiang River Fisheries Research Institute, Chinese Academy of Fishery Sciences, Harbin 150070, China

Article info

Article history: Received 14 December 2013; Revised 26 March 2014; Accepted 1 April 2014; Available online xxxx

Keywords: VHSV; Substitution rate; TMRCA; Divergence

Abstract

Viral hemorrhagic septicemia virus (VHSV) is an economically significant rhabdovirus that affects an increasing number of freshwater and marine fish species. Extensive studies have been conducted on the molecular epizootiology, genetic diversity, and phylogeny of VHSV. However, there are discrepancies between the reported estimates of the nucleotide substitution rate for the G gene and the divergence times for the genotypes. Herein, Bayesian coalescent analyses were applied to the time-stamped entire coding sequences of the six VHSV genes. Rate estimates based on the G gene indicated that the marine genotypes/subtypes might not all evolve more slowly than their major European freshwater counterpart. Age calculations on the six genes revealed that the first bifurcation event of the analyzed isolates might have taken place within the last 300 years, which is much more recent than previously thought. Selection analyses suggested that two codons of the G gene might be positively selected. Surveys of codon usage bias showed that the P, M and NV genes exhibited genotype-specific variations. Furthermore, we propose that VHSV originated from the Pacific Northwest of North America.

© 2014 Elsevier Inc. All rights reserved.


1. Introduction


Viral hemorrhagic septicemia, historically known as Egtved disease, is a deadly disease that infects more than 80 fish species from diverse families, including rainbow trout (Oncorhynchus mykiss), turbot (Scophthalmus maximus), and yellow perch (Perca flavescens), which are important in commerce and recreation (Kurath, 2012; Smail and Snow, 2011). First observed in Germany by Schäperclaus (1938), the disease was believed to be restricted to freshwater fish in continental Europe until 1988, when it was also discovered among Pacific anadromous salmonids in North America (Brunson et al., 1989; Hopper, 1989). It is now known to circulate throughout the Northern Hemisphere. With such high infectivity, broad host range and wide distribution, the disease places a heavy burden on the global fish farming industry, especially on European trout aquaculture, which has suffered significant losses from massive die-offs during outbreaks (Smail and Snow, 2011).

Viral hemorrhagic septicemia virus (VHSV), the causative agent identified in 1962 (Jensen, 1963), belongs to the Novirhabdovirus genus in the Rhabdoviridae family (Walker et al., 2000). Its enveloped bullet-shaped virion encapsidates a non-segmented, negative-sense, single-stranded RNA of approximately 11,000 nucleotides. The linear genome contains six genes encoding a non-virion protein (NV) and five structural proteins: nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G), and RNA polymerase (L), which are organized as 3′-N–P–M–G–NV–L-5′ (Schutze et al., 1999). NV, unique to the genus, is capable of suppressing apoptosis at the early stage of viral infection (Ammayappan and Vakharia, 2011), whereas the other five proteins are common to rhabdoviruses and have analogous functions (Kurath, 2012; Kuzmin et al., 2009).

So far, extensive studies have been conducted on the molecular epizootiology, genetic diversity, and phylogeny of VHSV. Phylogenetic analyses based on the N or G gene sequences of global VHSV isolates have defined four genotypes designated with Roman numerals I to IV (Einer-Jensen et al., 2004; Snow et al., 2004). Further, genotypes I and IV are divided into five (Ia–Ie) and three (IVa–IVc) subtypes, respectively. As illustrated in Fig. 1, the genotypes/subtypes of VHSV have different geographic distributions. Among the European lineages, Ia is predominantly composed of freshwater trout isolates from the mainland; Ib primarily circulates within the Baltic and North Sea water system; Ic is a small group of old isolates from freshwater rainbow trout; Id infects rainbow trout reared in fresh or brackish water; Ie prevails in the marine/estuarine Black Sea; II and III are recovered from the Baltic Sea and the North Atlantic (and connected waters), respectively.

⇑ Corresponding author. Fax: +86 451 8219 1784. E-mail address: [email protected] (C.-B. Teng).

http://dx.doi.org/10.1016/j.ympev.2014.04.002
1055-7903/© 2014 Elsevier Inc. All rights reserved.

Please cite this article in press as: He, M., et al. Evolution of the viral hemorrhagic septicemia virus: Divergence, selection and origin. Mol. Phylogenet. Evol. (2014), http://dx.doi.org/10.1016/j.ympev.2014.04.002


M. He et al. / Molecular Phylogenetics and Evolution xxx (2014) xxx–xxx

Fig. 1. Sketch of the geographical distribution of VHSV. The four major genotypes (I–IV) are depicted with different symbols and their subtypes (Ia–e and IVa–c) are distinguished with different colors. The first isolate of each lineage and the special isolates (ME03, GH30 and KRRV9601) are indicated. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


The distinct genotype IV has a much wider range, from North America to East Asia: IVa includes marine isolates from East Asia and the Pacific Northwest of North America; IVb contains freshwater isolates from the North American Great Lakes region; IVc comprises seaboard isolates from the Atlantic coast of Canada (Einer-Jensen et al., 2004; Kurath, 2012; Pierce and Stepien, 2012). Notably, in 2005, VHSV was identified for the first time in China from diseased flounder (Paralichthys olivaceus) (Zhu and Zhang, 2013). This isolate is closely related to the Korean strains, suggesting new colonization by this severe fish pathogen.

Moreover, the genotypes/subtypes are not equally well characterized. In Europe, both freshwater VHSV and the marine reservoir have been very well characterized owing to large-scale surveillance of farmed fish and extensive ocean cruises to survey wild fish. As can be seen in Fig. 1, most isolates were collected there, especially in continental Europe. Outside Europe, IVa is also reasonably well defined, but on the basis of detection during surveillance of cultured fish and smaller-scale surveillance of wild fish; IVb has also been surveyed quite thoroughly, but only since its outbreak in 2005; IVc, however, is less well defined, being based so far on 5 isolations in one report.

The nucleotide substitution rates of the N, G and NV genes of VHSV, as well as the divergence times of the genotypes according to the G calibration, have already been assessed (Pierce and Stepien, 2012). Although the rate estimate is influenced by panel composition, their result, based on partial G sequences, was 2.58 × 10⁻⁴ subs/site/year, much lower than the previously reported range of 7.06 × 10⁻⁴ to 1.74 × 10⁻³ based on a full-length G dataset lacking IVb, IVc and later isolates (Einer-Jensen et al., 2004). Consequently, a discrepancy in the divergence times of the genotypes has arisen. The primary bifurcation event was calculated to have occurred 697 years ago, nearly 200 years earlier than the former estimate (Einer-Jensen et al., 2004). Therefore, to see whether the G gene of VHSV evolves at a relatively slow rate and whether the genotypes diverged such a long time ago, the Bayesian coalescent method was applied to the time-stamped entire coding sequences of each VHSV gene, with emphasis on the divergence history of the genotypes/subtypes. In addition, to better understand the processes governing the evolution of VHSV, selection analyses and surveys of codon usage bias were also carried out.


2. Materials and methods


Complete coding sequences of the six VHSV genes were retrieved from GenBank. Dataset compilation, Bayesian estimates, selection analyses and surveys of codon usage bias were performed as previously described (He et al., 2013). Dates of isolation were supplemented from the literature (Einer-Jensen et al., 2004; Elsayed et al., 2006; Gagné et al., 2007; Raja-Halli et al., 2006; Reichert et al., 2013). For the Markov chain Monte Carlo (MCMC) analyses (Drummond et al., 2012), Bayes factor tests (Baele et al., 2012; Suchard et al., 2001) identified the uncorrelated exponential clock as the best-fitting model. The exponential growth model was chosen for its better performance and its conformity to the known history of VHSV in Europe. Independent analyses of 5–25 million MCMC iterations (with 10% burn-in) were combined to ensure convergence in the estimates of the nucleotide substitution rates and the times to the most recent common ancestor (TMRCA). Information on the analyzed isolates is given in each of the maximum clade credibility (MCC) trees.

Moreover, representative entire G coding sequences of the other three recognized novirhabdoviruses, Infectious haematopoietic necrosis virus (IHNV), Hirame rhabdovirus (HIRRV), and Snakehead rhabdovirus (SHRV), were also retrieved from GenBank and aligned with CLUSTAL W (Thompson et al., 1997). A phylogenetic tree of the genus was constructed in MEGA 5.1 (Tamura et al., 2011) using the ML method with 1000 bootstrap replicates under the best-fit nucleotide substitution model GTR+I+G, as determined by MODELTEST in HyPhy (Pond et al., 2005).
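The pooling of independent runs after burn-in removal, and the summary of the pooled samples as a mean with a 95% HPD interval, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `pool_chains` and `hpd_interval` are hypothetical, the toy chains are simulated values rather than data from this study, and the HPD is taken here simply as the shortest interval containing 95% of the sorted samples.

```python
import random


def pool_chains(chains, burn_in=0.10):
    """Drop the first `burn_in` fraction of each chain, then pool the rest."""
    pooled = []
    for chain in chains:
        start = int(len(chain) * burn_in)
        pooled.extend(chain[start:])
    return pooled


def hpd_interval(samples, mass=0.95):
    """Return the shortest interval that contains `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, int(round(mass * n)))  # number of samples inside the interval
    best_low, best_high = ordered[0], ordered[window - 1]
    for i in range(1, n - window + 1):
        low, high = ordered[i], ordered[i + window - 1]
        if high - low < best_high - best_low:
            best_low, best_high = low, high
    return best_low, best_high


if __name__ == "__main__":
    rng = random.Random(0)
    # Two toy "chains" of rate samples (simulated, NOT taken from the paper).
    chains = [[rng.gauss(5.9e-4, 0.7e-4) for _ in range(2000)] for _ in range(2)]
    pooled = pool_chains(chains)
    mean_rate = sum(pooled) / len(pooled)
    low, high = hpd_interval(pooled)
    print(f"mean = {mean_rate:.2e}, 95% HPD = ({low:.2e}, {high:.2e})")
```

For a skewed posterior, the interval returned this way differs from an equal-tailed interval, which may be one reason HPD bounds such as those reported in Tables 1 and 2 are not symmetric around the mean.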


3. Results


3.1. Nucleotide substitution rates of the VHSV genes and genotypes


As listed in Table 1, when the entire coding sequences of the G gene from 277 worldwide VHSV isolates spanning 49 years were subjected to Bayesian analysis, the average rate was 5.91 × 10⁻⁴ subs/site/year, with 95% highest probability density (HPD) values ranging from 4.59 × 10⁻⁴ to 7.22 × 10⁻⁴. This was slightly higher than the rate for N, 4.72 × 10⁻⁴ (2.25 × 10⁻⁴–7.38 × 10⁻⁴), calculated from 35 isolates spanning 42 years. Moreover, among the six genes, NV and M displayed the highest and lowest rates, respectively. Since the gene panels were composed of different numbers of isolates (Table 1), which might bias the rate estimates, we compiled identical datasets consisting of only the isolates with all six gene sequences available (n = 17) and observed similar rate differences among the genes (data not shown). To assess rate differences among the genotypes/subtypes, the G alignment was partitioned accordingly and analyzed. The average rates varied from 2.80 × 10⁻⁴ to


Table 1. Details of datasets and estimates of the six VHSV genes.

| Parameter (a, b) | N | P | M | G | NV | L |
|---|---|---|---|---|---|---|
| Sequence length (nt) | 1215 | 669 | 606 | 1524 | 369 | 5955 |
| No. of sequences | 35 | 22 | 22 | 277 | 57 | 17 |
| Time span | 1968–2010 | 1970–2010 | 1970–2010 | 1962–2011 | 1962–2010 | 1970–2010 |
| Best-fit substitution model | GTR+G | GTR+G | GTR+G | GTR+I+G | GTR+I | GTR+G |
| Mean substitution rate (subs/site/year) | 4.72 × 10⁻⁴ | 5.11 × 10⁻⁴ | 4.39 × 10⁻⁴ | 5.91 × 10⁻⁴ | 7.38 × 10⁻⁴ | 5.89 × 10⁻⁴ |
| 95% HPD rate | 2.25–7.38 × 10⁻⁴ | 1.17–9.55 × 10⁻⁴ | 1.57–7.41 × 10⁻⁴ | 4.59–7.22 × 10⁻⁴ | 3.94–11.1 × 10⁻⁴ | 1.57–10.7 × 10⁻⁴ |
| Mean TMRCA, years (95% HPD) | 225 (89–422) | 276 (71–617) | 231 (77–458) | 201 (95–345) | 250 (108–445) | 207 (57–442) |
| Age of genotype I, years (95% HPD) | 80 (51–120) | 81 (45–139) | 82 (48–134) | 82 (59–114) | 87 (58–123) | n/a |
| dN/dS ratio | 0.17 | 0.14 | 0.14 | 0.20 | 0.33 | 0.06 |
| Nc | 42.94–50.22 | 44.42–51.51 | 53.85–60.82 | 52.81–57.19 | 43.91–55.31 | 50.80–51.52 |
| GC3S | 0.632–0.724 | 0.574–0.620 | 0.546–0.607 | 0.533–0.607 | 0.500–0.669 | 0.554–0.593 |

a HPD: highest probability density; TMRCA: time to the most recent common ancestor; dN/dS ratio: mean ratio of nonsynonymous to synonymous substitutions per site; Nc: the effective number of codons used by a gene; GC3S: the frequency of (G+C) at the synonymous third codon position.
b The P, M and L datasets lack genotype II.


1.72 × 10⁻³ subs/site/year (Table 2). Ic and Id, two subtypes of genotype I, had the highest and lowest mean rates, respectively. Among the four major genotypes, III was estimated to evolve faster than the other three, at more than twice their mean rates. Among the five marine lineages, Ib, II, III, IVa, and IVc, the latter four were estimated to have rates similar to or higher than that of Ia, the major European freshwater counterpart. However, except for IVa, these marine lineages were all represented by small numbers of isolates (Table 2); thus, more complete sequence data are required for confirmation.
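The comparison of marine lineages with Ia can be made explicit by dividing each mean rate by that of Ia. The sketch below uses only the mean values reported in Table 2; the function name `ratio_to_reference` is illustrative, and the comparison deliberately ignores the 95% HPD intervals, which are wide for the sparsely sampled lineages.

```python
# Mean G-gene substitution rates (subs/site/year) copied from Table 2.
MEAN_RATE = {
    "Ia": 6.01e-4, "Ib": 3.89e-4, "II": 5.86e-4,
    "III": 1.63e-3, "IVa": 5.60e-4, "IVc": 1.16e-3,
}
MARINE = ["Ib", "II", "III", "IVa", "IVc"]


def ratio_to_reference(rates, lineages, reference="Ia"):
    """Return each lineage's mean rate divided by the reference lineage's rate."""
    ref = rates[reference]
    return {name: rates[name] / ref for name in lineages}


if __name__ == "__main__":
    for name, ratio in ratio_to_reference(MEAN_RATE, MARINE).items():
        print(f"{name}: {ratio:.2f} x Ia")
```

On these point estimates, Ib falls clearly below Ia, II and IVa lie within about 10% of it, and III and IVc are roughly two to three times higher, which matches the verbal description above.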


3.2. TMRCAs of the VHSV genes and genotypes


Regardless of whether genotype II was included in the panel, the average TMRCAs calculated for the six VHSV genes ranged from 201 (95–345, before 2011) to 276 (71–617, before 2010) years (Table 1). Therefore, the first diversification event of the analyzed VHSV isolates might have taken place no more than 300 years ago. Notably, unlike the overall mean TMRCAs, the age estimates for genotype I from the five applicable datasets were much more similar to one another (Table 1). According to the more reliable G calibration, the average TMRCA of I was 82 (59, 114) years. Thus, this large European group might have acquired high pathogenicity to rainbow trout and begun to expand around 1929, nearly a decade before the first observation of the disease. However, the isolates of the other three genotypes were all collected more recently because the asymptomatic marine strains were not noticed until 1988. As a result, their average TMRCAs were much younger than that of genotype I. The most recent common ancestor (MRCA) of III and that of IV both emerged in the mid-1960s, whereas that of II was only 19 years old (up to 2011) (Table 2). It should be noted that the TMRCA is not the emergence time of the genotype; thus, a young MRCA does not mean that the genotype is young. In fact, according to the time-scaled MCC trees (Fig. 2), all three genotypes with young MRCAs have existed for more than a century; that is, II, III and IV are newly recognized old genotypes, supporting the marine origin of VHSV.

Among the four analyzed subtypes of genotype I, Ia was calculated to have a mean TMRCA of 52 (45, 60) years; that is, this continental European group might have started to diversify around 1959, three years before the first isolation of VHSV and four years before the birth of Id's MRCA. However, the MRCAs of the other two subtypes, Ib and Ic, did not appear until the early 1970s. As for the three subtypes of genotype IV, the MRCA of IVa was estimated to be 26 years of age (up to 2011), more than 10 years older than those of IVb and IVc.
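The calendar years quoted in this section follow from subtracting each TMRCA from the most recent sampling year of the dataset (2011 for the G gene). A minimal sketch of that arithmetic, using values copied from Table 2, is given below; the function name `calendar_interval` is illustrative and not part of the original analysis.

```python
# Mean TMRCA and 95% HPD bounds (years before 2011) for the G gene, from Table 2.
G_TMRCA = {
    "I": (82, 59, 114),
    "Ia": (52, 45, 60),
    "II": (19, 15, 25),
    "III": (45, 29, 68),
    "IV": (44, 29, 64),
    "IVa": (26, 23, 31),
}


def calendar_interval(mean_age, low_age, high_age, reference_year=2011):
    """Convert an age and its HPD bounds into (mean, earliest, latest) calendar years."""
    return (reference_year - mean_age,
            reference_year - high_age,
            reference_year - low_age)


if __name__ == "__main__":
    for lineage, ages in G_TMRCA.items():
        mean_year, earliest, latest = calendar_interval(*ages)
        print(f"{lineage}: MRCA around {mean_year} ({earliest} to {latest})")
```

Note that the upper HPD bound on the age gives the earliest calendar year, so the bounds swap places when converting.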


3.3. Divergence history of the VHSV genotypes


The evolutionary scenario of the VHSV genotypes was revealed by the MCC cladograms (Fig. 2). According to the six phylogenies, the primary bifurcation of the analyzed isolates might have happened less than three centuries ago, between the single ancestor of the European lineages and that of the North American one. In Europe, I and III might be sister lineages that diverged from each other 150 years ago. However, II clustered with the progenitor of I and III in both the G and NV trees but fell into the IV branch in the N phylogeny, suggesting an intermediate position. According to the G MCC tree (Figs. 2D and S1), the five subtypes of I had all diverged before 1962. Among them, Ie, the first one, had branched off before the disease was reported in 1938. Ic, the second one, bifurcated earlier than DK-F1, the first isolate (Jensen, 1963). Subsequently, Id diverged from the progenitor of the two sister taxa, Ia and Ib. Notably, three French isolates (07-71, 02-84 and 14-58) were close to the ancestral nodes of the three subclades of Ia, suggesting that Ia might have originated in


Table 2. Estimates of the G subsets based upon the VHSV genotypes and subtypes. (a)

| Genotype | No. of sequences | Mean substitution rate, subs/site/year (95% HPD) | Mean TMRCA, years (95% HPD) | dN/dS ratio |
|---|---|---|---|---|
| I | 201 | 5.79 × 10⁻⁴ (4.83 × 10⁻⁴–6.81 × 10⁻⁴) | 82 (59, 114) | 0.27 |
| Ia | 146 | 6.01 × 10⁻⁴ (4.69 × 10⁻⁴–7.36 × 10⁻⁴) | 52 (45, 60) | 0.27 |
| Ib | 19 | 3.89 × 10⁻⁴ (1.56 × 10⁻⁴–6.40 × 10⁻⁴) | 38 (32, 48) | 0.28 |
| Ic | 5 | 1.72 × 10⁻³ (3.81 × 10⁻⁴–2.97 × 10⁻³) | 37 (30, 44) | 0.17 |
| Id | 28 | 2.80 × 10⁻⁴ (8.84 × 10⁻⁵–4.85 × 10⁻⁴) | 48 (43, 56) | 0.51 |
| II | 12 | 5.86 × 10⁻⁴ (7.23 × 10⁻⁵–1.17 × 10⁻³) | 19 (15, 25) | 0.21 |
| III | 7 | 1.63 × 10⁻³ (4.62 × 10⁻⁴–2.66 × 10⁻³) | 45 (29, 68) | 0.19 |
| IV | 57 | 6.16 × 10⁻⁴ (3.83 × 10⁻⁴–8.67 × 10⁻⁴) | 44 (29, 64) | 0.21 |
| IVa | 48 | 5.60 × 10⁻⁴ (3.95 × 10⁻⁴–7.33 × 10⁻⁴) | 26 (23, 31) | 0.24 |
| IVb | 4 | 9.75 × 10⁻⁴ (2.46 × 10⁻⁵–1.65 × 10⁻³) | 11 (8, 14) | 0.27 |
| IVc | 5 | 1.16 × 10⁻³ (1.93 × 10⁻⁴–2.29 × 10⁻³) | 14 (12, 17) | 0.14 |

a Abbreviations are as in Table 1.


Fig. 2. Maximum clade credibility phylogenies of the N (A), P (B), M (C), G (D), NV (E) and L (F) genes of VHSV. The trees are time-scaled and were generated under the relaxed uncorrelated exponential molecular clock. Nodes correspond to mean TMRCAs. Isolate information (name/year/origin/accession) and genotype classification are shown on the right (an identical tree of the G gene with virus information is available in Fig. S1). Subtypes are labeled above the nodes. The genotypes/subtypes are also color-coded as in Fig. 1.


Fig. 3. Nc plot of the six genes of VHSV. Nc (the effective number of codons) vs. GC3S (the GC content at the synonymous third codon position) is plotted for the N (violet), P (sky blue), M (yellow), G (blue), NV (purplish red) and L (red) genes of each isolate. The continuous curve represents the expected Nc values under assumptions of no selection. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


France. Moreover, as for the three subtypes of IV, the concurrent IVb and IVc might be sisters diverging in the late 1980s, whose ancestor had separated from that of IVa two decades earlier.


3.4. Selection on the VHSV genes


The mean ratios of nonsynonymous (dN) to synonymous (dS) substitutions per site estimated for the six VHSV genes as well as the genotypes/subtypes were all less than 1.0 (Tables 1 and 2). This suggested that purifying selection occurred rather than positive selection. However, two codons (258 and 476) of the G gene might be positively selected (P < 0.05). Surprisingly, 120 codons of the G gene were under significantly negative selection (P < 0.05).
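The gene-wide reading of dN/dS used here can be sketched in a few lines. The ratios are copied from Table 1; the function names are illustrative, and the threshold of 1 is the conventional cut-off rather than a statistical test.

```python
# Gene-wide mean dN/dS ratios copied from Table 1.
MEAN_DN_DS = {"N": 0.17, "P": 0.14, "M": 0.14, "G": 0.20, "NV": 0.33, "L": 0.06}


def selection_regime(dn_ds):
    """Label a mean dN/dS ratio using the conventional threshold of 1."""
    if dn_ds < 1.0:
        return "purifying"
    if dn_ds > 1.0:
        return "positive"
    return "neutral"


def rank_by_constraint(ratios):
    """Order genes from most constrained (lowest dN/dS) to least constrained."""
    return sorted(ratios, key=ratios.get)


if __name__ == "__main__":
    for gene in rank_by_constraint(MEAN_DN_DS):
        print(f"{gene}: dN/dS = {MEAN_DN_DS[gene]:.2f} -> {selection_regime(MEAN_DN_DS[gene])}")
```

A gene-wide mean below 1 averages over all codons, so it does not exclude positive selection at individual sites; this is why site-wise tests can still flag codons 258 and 476 of the G gene while the gene as a whole appears to be under purifying selection.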


3.5. Codon usage bias of each VHSV gene


The extent of codon usage bias was examined in each of the six VHSV genes. A predominance of C over U at most of the synonymous third codon positions was observed from the overall relative synonymous codon usage (RSCU) values (Table S1). This pattern was similar to those observed in IHNV and several major fish hosts (He et al., 2013), suggesting that it might be shaped by host adaptation. Moreover, the Nc (effective number of codons used by a gene) value of each gene was more than 40 (Table 1), indicating only a slight bias in codon usage (Nc ranges from 20 for maximum bias to 61 for no bias). However, as was apparent in the Nc-plot (a plot of Nc vs. GC3S) (Fig. 3), the points of the P, M and NV genes were much more dispersed than those of the N, G and L genes, which suggested different selection constraints on these genes. In addition, the distribution patterns of these three genes were correlated with genotype. All analyzed IV-type isolates had distinct M and NV points, whilst the II and III isolates had discrete NV and P points, respectively (Table S2). Notably, the M points of the IV-type viruses clustered right on the curve, reflecting no selection other than GC composition.
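The continuous curve in an Nc-plot is commonly drawn from Wright's (1990) approximation, Nc = 2 + s + 29 / (s² + (1 − s)²), where s is GC3S. The sketch below assumes that this standard formula is the one behind Fig. 3, which the text does not state explicitly; the function names and the example point are hypothetical.

```python
def expected_nc(gc3s):
    """Expected Nc when only GC3s composition shapes codon usage (Wright, 1990)."""
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def nc_deficit(observed_nc, gc3s):
    """Distance of an observed point below the expected curve (positive = below)."""
    return expected_nc(gc3s) - observed_nc


if __name__ == "__main__":
    # Hypothetical isolate point, NOT taken from the paper.
    gc3s, observed_nc = 0.60, 52.0
    print(f"expected Nc = {expected_nc(gc3s):.2f}, "
          f"deficit = {nc_deficit(observed_nc, gc3s):.2f}")
```

A point with a deficit near zero lies on the curve, as described for the M gene of the IV-type isolates, whereas a clearly positive deficit indicates codon usage that is more biased than GC composition alone would predict.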


3.6. Phylogeny of the novirhabdoviruses


It was evident in the ML phylogeny of the genus (Fig. 4) that VHSV and SHRV shared a common ancestor that was sister to that of IHNV and HIRRV. Roughly calculated (data not shown), the bifurcation between VHSV and SHRV might have taken place over one thousand years ago, not to mention the more ancient primary event of the genus. Notably, IHNV has a clear origin in the Pacific Northwest, whereas HIRRV and SHRV are both endemic in Asia Pacific (He et al., 2013; Kurath, 2012). Thus, each of the two branching events experienced by VHSV involved a Pacific relative.


4. Discussion


In our study, dated sequences of the six VHSV genes were subjected to Bayesian coalescent analyses. The nucleotide substitution rate and the TMRCA of each gene were estimated. NV, coding for the non-virion protein, exhibited the highest evolutionary pace, conforming to its lowest level of sequence homology among the novirhabdoviruses (Kim et al., 2005). In contrast, M, the gene of the multifunctional protein essential for virion assembly and release, had the lowest mutation rate, consistent with its greatest identity among VHSV isolates (Ammayappan and Vakharia,


Fig. 4. Phylogenetic relationship of the novirhabdoviruses. Based on the entire G gene sequences of the four acknowledged species: Infectious haematopoietic necrosis virus (IHNV), Hirame rhabdovirus (HIRRV), Snakehead rhabdovirus (SHRV), and Viral hemorrhagic septicemia virus (VHSV), the tree is constructed using the Maximum Likelihood (ML) method under the best-fit GTR+I+G nucleotide substitution model. Branches supported by >70% bootstrap value (1000 replicates) are shown. The Pacific nodes are indicated. Information about each isolate used is given as name/origin/accession/type. ‘‘’’: LR-73 belongs to the intermediate genogroup between L and U.


2009). Notably, although only a small number of complete sequences were available for the P, M and L genes, their age estimates were close to those of the other three (Table 1), supporting the theoretical result that large samples are not necessary to date old events in a gene's history (Templeton, 2006). The timing result from the G gene is certainly the most reliable one. However, it should be noted that even if more representative samples were analyzed, the date obtained would still be approximate rather than exact, since the entire virus population can never be sampled.

Based on the G gene, nucleotide substitution rates and TMRCAs of the four genotypes and their subtypes were also calculated. The average rates of the genotypes/subtypes of VHSV varied from 2.80 × 10⁻⁴ to 1.72 × 10⁻³ subs/site/year, reflecting different evolutionary modes and selective pressures. The overall mean rate calculated for the entire G gene was 5.91 × 10⁻⁴, which was lower than that of Einer-Jensen et al. (7.06 × 10⁻⁴–1.74 × 10⁻³) but higher than that of Pierce and Stepien (2.58 × 10⁻⁴). As exemplified by the variation among the partitioned G datasets, panel composition has an impact on the rate estimate; however, it did not account for the difference between our estimate and that of Einer-Jensen et al. (2004). When Bayesian analysis was conducted under the strict clock model on their Ia dataset composed of 29 isolates, the average rate was nearly 6 × 10⁻⁴ subs/site/year. This was similar to ours (6.01 × 10⁻⁴) but lower than theirs (1.74 × 10⁻³), despite the similar age calculations (44 vs. 50, before 2000). It seems that the large European freshwater subtype evolves at a steady speed, which implies a persistently heavy viral burden. Furthermore, the marine reservoirs may be evolving at similar or even higher rates, which is a warning sign for new emergences. A good example is the virulent freshwater subtype IVb that is emerging in the Great Lakes (Elsayed et al., 2006). In addition, we disagree with their view concerning the relationship between Ia and Ib. According to the topologies of genotype I, Ib should be a sister rather than a progenitor of Ia, and both of them descended from a freshwater virus pathogenic to rainbow trout that diverged around 1950.

As for the difference between our estimate and that of Pierce and Stepien (2012), it was most likely due to underestimation resulting from the use of partial sequences. In contrast, for the N gene, our rate estimate was very similar to theirs (4.72 × 10⁻⁴ vs. 4.26 × 10⁻⁴), both of which were based on complete sequences. As a consequence of the underestimated G rate, the age of the analyzed VHSV isolates was overestimated at 697 years according to the G calibration. This was apparent in their paper, as TMRCAs were calculated to be 298 years for the N gene and 267 years for the NV gene based on full-length sequences. In fact, the latter two were in agreement with our finding that the primary bifurcation might have occurred within the last 300 years. Thus, divergence of VHSV did not begin long before aquaculture was established on the two continents, and the formation of the three European genotypes was possibly related to the farming activities of that time.

A higher frequency of synonymous substitutions than of nonsynonymous ones (dN/dS < 1) suggested that the principal evolutionary force on each VHSV gene was purifying selection. However, positive selection did play a role in the evolution of certain codons, as two positively selected sites in the G gene were identified by SLAC analysis. As in other rhabdoviruses, G is the major antigen important for protective immunity (Kuzmin et al., 2009). One major neutralizing epitope has been mapped by monoclonal antibodies to the amino acid region 254–259 (Bearzotti et al., 1995). Thus, one positively selected site, codon 258, is located in the antigenic region and may have been driven by the host immune reaction. As for the second site, codon 476, it is possible that it is recognized by the fish antibody response, which differs from that in mice. Moreover, neither site is analogous to those identified in IHNV (He et al., 2013), suggesting differences in evolutionary niche and possible diversity in protein function between the two relatives.

As noted previously (He et al., 2013), the Nc-plot is effective in investigating the patterns of synonymous codon usage (Wright, 1990). When codon choice is dictated only by uneven GC composition, the point will fall on or just below the curve of the expected values. Here, the M points of the IV-type viruses were the case (Fig. 3). Then, for the other points lying well below the curve, their codon usage variation should have influences in addition to mutational bias. Interestingly, the P, M and NV points exhibited genotype-specific distributions in the Nc-plot, which adds one more clue that the genotypes are confronted with different selective pressure. To date, the origin and the transmission routes of VHSV remain controversial. Studer and Janies (2011) visualized westward spreading paths from Europe to North America and Asia. They proposed that VHSV was transmitted from the North Atlantic and/or Baltic Sea to the Atlantic coast of North America, then independently to the Great Lakes and the Pacific Northwest, and from latter to Asia and Alaska. However, such routes were established on the premise that the European isolates were older because VHSV outbreaks occurred elsewhere decades later. In fact, there was no doubt that divergence between the European ancestor and the overseas one had taken place long before the first observation of the disease in Europe. Which one was older could not be asserted according to the time of notice especially when the marine strains were once neglected for the lack of clinical signs. Judging from the G MCC tree, four candidates were feasible for marine origin of VHSV: the Pacific Northwest, the Atlantic coast of Canada, the North Atlantic and the Baltic Sea. Currently, we know little about the second one as IVc has not been well defined and might have been missed in past routine health surveillance which focused predominantly on salmonids before 1997 (Gagné et al., 2007). 
Moreover, whether there is an extensive reservoir of VHSV in wild fish off the Atlantic coast of North America has not been looked into yet. Here, we ruled it out considering that it was the transcontinental IVa but not the adjacent IVc that emerged in the Atlantic coast of the USA, as ME03, collected in seawater near the state of Maine in 2003 (Elsayed et al., 2006), was a progeny of Makah (Fig. S1), the first IVa-type isolate (Brunson et al., 1989). Then, all American isolates should share a common origin in the Pacific Northwest, so did the Asian isolates except KRRV9601 which might be accidentally introduced from Europe (Nishizawa et al., 2002), as was clear in the phylogenies (Figs. 2 and S1). Unquestionably, the North Atlantic is one of the viral reservoirs of continental Europe; however, it is likely not the reservoir of North America. Notably, III-type isolates (e.g. GH30, Fig. 1) were found at the Flemish Cap in 1994 (Dopazo et al., 2002; Einer-Jensen et al., 2005); however, the spread to North America has not been observed to date and all American isolates are pure descendants of the single ancestor. Such case demonstrates that the Atlantic lineage has been geographically close to the New World but has not colonized it yet. In fact, transmission between the two continents is restrained and may have happened only once thus far resulting in the primary bifurcation of VHSV. Due to the absence of history relevant to this transmission incident, whether VHSV was introduced from Europe to North America or vice versa could not be asserted. Then, either the North Atlantic/ Baltic Sea or the Pacific Northwest could be the origin of the VHSV genotypes. Clearly, VHSV has more lineages and greater diversity in the North East Atlantic region. This may lead Pierce and Stepien (2012) to propose the North Atlantic as the origin. However, these traits may instead suggest that the European lineages are younger. 
The European freshwater group of VHSV itself illustrates this: it is not only the earliest observed and isolated, but also the largest and most diversified; yet it is the marine origin, not the freshwater origin, that is widely acknowledged.

Please cite this article in press as: He, M., et al. Evolution of the viral hemorrhagic septicemia virus: Divergence, selection and origin. Mol. Phylogenet. Evol. (2014), http://dx.doi.org/10.1016/j.ympev.2014.04.002


Moreover, the Pacific origin is supported by the phylogenetic analysis of the genus. The cladograms of the four novirhabdoviruses (Fig. 4) (Kuzmin et al., 2009) revealed that VHSV has undergone two branching events, first from the progenitor of IHNV and HIRRV, then from SHRV. As IHNV originated in the Pacific Northwest and its global spread took place within decades, while both HIRRV and SHRV are enzootic in the Asia Pacific (He et al., 2013; Kurath, 2012), it is more likely that these events occurred in the Pacific region. Notably, IHNV was accidentally introduced to continental Europe only once, in the 1980s, laying another heavy burden on the trout farming industry. More subgroups and higher diversity were also observed for the younger European E genogroup, which derived from a single North American M genogroup ancestor (He et al., 2013; Kurath, 2012).

Taken together, we propose that VHSV originated in the Pacific Northwest and then radiated in both directions: westward to East Asia, and eastward to the North Atlantic/Baltic Sea, the Great Lakes and Atlantic Canada water system, and the Atlantic coast of the USA. In addition, it should be noted that, as the eastward spread of VHSV would be blocked by the North American continent, these transmissions were most likely associated with human activities.
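The Nc-plot reasoning used earlier in this discussion can be illustrated with a minimal sketch. It assumes the standard expected-value curve of Wright (1990), Nc = 2 + s + 29/(s^2 + (1 - s)^2), where s is the GC content at synonymous third codon positions (GC3s). The function names and the example values below are hypothetical and are not taken from the analyses reported here.

```python
def expected_nc(gc3s: float) -> float:
    """Expected effective number of codons when codon choice depends
    only on GC3s (curve of Wright, 1990). gc3s is a fraction in [0, 1]."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def nc_deficit(observed_nc: float, gc3s: float) -> float:
    """Distance of an observed Nc below the expected curve.
    A value near zero is consistent with mutational bias alone;
    a clearly positive value suggests additional influences."""
    return expected_nc(gc3s) - observed_nc


# Hypothetical gene: GC3s = 0.5 and an observed Nc of 50.0 (illustrative only).
print(expected_nc(0.5))       # expected value on the curve
print(nc_deficit(50.0, 0.5))  # how far the point lies below the curve
```

A point whose deficit is close to zero would fall on or just below the curve, whereas a large positive deficit corresponds to a point lying well below it.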


Acknowledgments


We are deeply grateful to the editor and the reviewers for their helpful comments and advice on improving our manuscript, and to http://english.freemap.jp/ for the free world map. This work was supported by the Fundamental Research Funds for the Central Universities of China (No. DL13EA06).


Appendix A. Supplementary material


Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2014.04.002.


References


Ammayappan, A., Vakharia, V.N., 2009. Molecular characterization of the Great Lakes viral hemorrhagic septicemia virus (VHSV) isolate from USA. Virol. J. 6, 171.
Ammayappan, A., Vakharia, V.N., 2011. Nonvirion protein of novirhabdovirus suppresses apoptosis at the early stage of virus infection. J. Virol. 85, 8393–8402.
Baele, G., Lemey, P., Bedford, T., Rambaut, A., Suchard, M.A., Alekseyenko, A.V., 2012. Improving the accuracy of demographic and molecular clock model comparison while accommodating phylogenetic uncertainty. Mol. Biol. Evol. 29, 2157–2167.
Bearzotti, M., Monnier, A.F., Vende, P., Grosclaude, J., de Kinkelin, P., Benmansour, A., 1995. The glycoprotein of viral hemorrhagic septicemia virus (VHSV): antigenicity and role in virulence. Vet. Res. 26, 413–422.
Brunson, R., True, K., Yancey, J., 1989. VHS Virus Isolated at Makah National Fish Hatchery. American Fisheries Society, Fish Health Section Newsletter, vol. 17, pp. 3–4.
Dopazo, C.P., Bandin, I., Lopez-Vazquez, C., Lamas, J., Noya, M., Barja, J.L., 2002. Isolation of viral hemorrhagic septicemia virus from Greenland halibut Reinhardtius hippoglossoides caught at the Flemish Cap. Dis. Aquat. Organ. 50, 171–179.
Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973.
Einer-Jensen, K., Ahrens, P., Forsberg, R., Lorenzen, N., 2004. Evolution of the fish rhabdovirus viral haemorrhagic septicaemia virus. J. Gen. Virol. 85, 1167–1179.


Einer-Jensen, K., Winton, J., Lorenzen, N., 2005. Genotyping of the fish rhabdovirus, viral haemorrhagic septicaemia virus, by restriction fragment length polymorphisms. Vet. Microbiol. 106, 167–178.
Elsayed, E., Faisal, M., Thomas, M., Whelan, G., Batts, W., Winton, J., 2006. Isolation of viral haemorrhagic septicaemia virus from muskellunge, Esox masquinongy (Mitchill), in Lake St Clair, Michigan, USA reveals a new sublineage of the North American genotype. J. Fish Dis. 29, 611–619.
Gagné, N., Mackinnon, A.M., Boston, L., Souter, B., Cook-Versloot, M., Griffiths, S., Olivier, G., 2007. Isolation of viral haemorrhagic septicaemia virus from mummichog, stickleback, striped bass and brown trout in eastern Canada. J. Fish Dis. 30, 213–223.
He, M., Ding, N.Z., He, C.Q., Yan, X.C., Teng, C.B., 2013. Dating the divergence of the infectious hematopoietic necrosis virus. Infect. Genet. Evol. 18, 145–150.
Hopper, K., 1989. The Isolation of VHSV from Chinook Salmon at Glenwood Springs, Orcas Island, Washington. American Fisheries Society, Fish Health Section Newsletter, vol. 17, pp. 1–2.
Jensen, M.H., 1963. Preparation of fish tissue cultures for virus research. Bull. Off. Int. Epizoot. 59, 131–134.
Kim, D.H., Oh, H.K., Eou, J.I., Seo, H.J., Kim, S.K., Oh, M.J., Nam, S.W., Choi, T.J., 2005. Complete nucleotide sequence of the hirame rhabdovirus, a pathogen of marine fish. Virus Res. 107, 1–9.
Kurath, G., 2012. Fish Novirhabdoviruses. In: Dietzgen, R.G., Kuzmin, I.V. (Eds.), Rhabdoviruses: Molecular Taxonomy, Evolution, Genomics, Ecology, Host–Vector Interactions, Cytopathology and Control. Caister Academic Press, pp. 89–116.
Kuzmin, I.V., Novella, I.S., Dietzgen, R.G., Padhi, A., Rupprecht, C.E., 2009. The rhabdoviruses: biodiversity, phylogenetics, and evolution. Infect. Genet. Evol. 9, 541–553.
Nishizawa, T., Iida, H., Takano, R., Isshiki, T., Nakajima, K., Muroga, K., 2002. Genetic relatedness among Japanese, American and European isolates of viral hemorrhagic septicemia virus (VHSV) based on partial G and P genes. Dis. Aquat. Organ. 48, 143–148.
Pierce, L.R., Stepien, C.A., 2012. Evolution and biogeography of an emerging quasispecies: diversity patterns of the fish Viral Hemorrhagic Septicemia virus (VHSv). Mol. Phylogenet. Evol. 63, 327–341.
Pond, S.L., Frost, S.D., Muse, S.V., 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21, 676–679.
Raja-Halli, M., Vehmas, T.K., Rimaila-Parnanen, E., Sainmaa, S., Skall, H.F., Olesen, N.J., Tapiovaara, H., 2006. Viral haemorrhagic septicaemia (VHS) outbreaks in Finnish rainbow trout farms. Dis. Aquat. Organ. 72, 201–211.
Reichert, M., Matras, M., Skall, H.F., Olesen, N.J., Kahns, S., 2013. Trade practices are main factors involved in the transmission of viral haemorrhagic septicaemia. J. Fish Dis. 36, 103–114.
Schäperclaus, W., 1938. Die Schädigungen der deutschen Fischerei durch Fischparasiten und Fischkrankheiten. Allg. Fischztg 41, 256–270.
Schutze, H., Mundt, E., Mettenleiter, T.C., 1999. Complete genomic sequence of viral hemorrhagic septicemia virus, a fish rhabdovirus. Virus Genes 19, 59–65.
Smail, D.A., Snow, M., 2011. Viral haemorrhagic septicemia. In: Woo, P.T.K., Bruno, D.W. (Eds.), Fish Diseases and Disorders: Viral, Bacterial and Fungal Infections. CABI Publishing, pp. 110–142.
Snow, M., Bain, N., Black, J., Taupin, V., Cunningham, C.O., King, J.A., Skall, H.F., Raynard, R.S., 2004. Genetic population structure of marine viral haemorrhagic septicaemia virus (VHSV). Dis. Aquat. Organ. 61, 11–21.
Studer, J., Janies, D.A., 2011. Global spread and evolution of viral haemorrhagic septicaemia virus. J. Fish Dis. 34, 741–747.
Suchard, M.A., Weiss, R.E., Sinsheimer, J.S., 2001. Bayesian selection of continuous-time Markov chain evolutionary models. Mol. Biol. Evol. 18, 1001–1013.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Templeton, A.R., 2006. Population Genetics and Microevolutionary Theory. Wiley-Liss, Hoboken, 705 p.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Walker, P.J., Benmansour, A., Calisher, C.H., Dietzgen, R.G., Fang, R.X., Jackson, A.O., Kurath, G., Leong, J.C., Nadin-Davies, S., Tesh, R.B., Tordo, N., 2000. Family Rhabdoviridae. In: The 8th Report of the International Committee on Taxonomy of Viruses. Springer-Verlag, New York.
Wright, F., 1990. The ‘effective number of codons’ used in a gene. Gene 87, 23–29.
Zhu, R.L., Zhang, Q.Y., 2013. Determination and analysis of the complete genome sequence of Paralichthys olivaceus rhabdovirus (PORV). Arch. Virol. http://dx.doi.org/10.1007/s00705-013-1716-5.
