Infection, Genetics and Evolution 32 (2015) 51–59


Infection, Genetics and Evolution journal homepage: www.elsevier.com/locate/meegid

Identifying the pattern of molecular evolution for Zaire ebolavirus in the 2014 outbreak in West Africa

Si-Qing Liu a,b, Cheng-Lin Deng a,b, Zhi-Ming Yuan b, Simon Rayner a, Bo Zhang a,b,*

a Key Laboratory of Etiology and Biosafety for Emerging and Highly Infectious Diseases, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China
b Key Laboratory of Agricultural and Environmental Microbiology, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan 430071, China

Article info

Article history: Received 6 December 2014; received in revised form 16 February 2015; accepted 24 February 2015; available online 4 March 2015

Keywords: Zaire ebolavirus; Positive selection; Evolutionary rate; West Africa

Abstract

The current Ebola virus disease (EVD) epidemic has killed more people than all previous Ebola outbreaks combined and, even as efforts appear to be bringing the outbreak under control, the threat of reemergence remains. The availability of new whole-genome sequences from the 2014 outbreak in West Africa, together with those from earlier outbreaks, provides an opportunity to investigate the genetic characteristics, the epidemiological dynamics and the evolutionary history of Zaire ebolavirus (ZEBOV). To investigate the evolutionary properties of ZEBOV in this outbreak, we examined amino acid mutations, positive selection, and evolutionary rates on the basis of 123 ZEBOV genome sequences. The estimated phylogenetic relationships within ZEBOV revealed that viral sequences from the same period or location formed a distinct cluster. The West Africa viruses probably derived from Middle Africa, consistent with results from previous studies. Analysis of the seven protein regions of ZEBOV revealed evidence of positive selection acting on the GP and L genes. Interestingly, all putatively positively selected sites identified in GP are located within the mucin-like domain of the solved structure of the protein, suggesting a possible role in the immune evasion properties of ZEBOV. Compared with earlier outbreaks, the evolutionary rate of the GP gene was estimated to have accelerated significantly in the 2014 outbreak, suggesting that more ZEBOV variants are being generated for human-to-human transmission during this sweeping epidemic. However, a more balanced sample set and next-generation sequencing datasets would help achieve a clearer understanding, at the genetic level, of how the virus is evolving and adapting to new conditions.

© 2015 Elsevier B.V. All rights reserved.

* Corresponding author at: Xiao Hong Shan Zhong Qu 44, Wuhan, Hubei 430071, China. Tel.: +86 27 87197822.
E-mail addresses: [email protected] (S.-Q. Liu), [email protected] (C.-L. Deng), [email protected] (Z.-M. Yuan), [email protected] (S. Rayner), [email protected] (B. Zhang).
http://dx.doi.org/10.1016/j.meegid.2015.02.024
1567-1348/© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Ebola viruses (EBOV) are non-segmented, single-stranded, linear, negative-sense RNA viruses that belong to the family Filoviridae. The complete EBOV genomes, which are about 18.9 kb long, contain seven protein-coding genes and various intergenic regions (Sanchez et al., 2007). The functions performed by these genes have been studied in the process of viral replication, virus–host interaction, and virion assembly (Feldmann et al., 1999; Hoenen et al., 2010b; Mühlberger et al., 1998; Mateo et al., 2011; Nanbo et al., 2010, 2013). The nucleoprotein (NP) is associated with virus replication and viral assembly, together with the RNA-dependent RNA polymerase cofactor (VP35), the transcriptional activator (VP30), the RNA-dependent RNA polymerase (L), and the minor matrix protein (VP24) (Becker et al., 1998; Mühlberger et al., 1998, 1999; Watanabe et al., 2006). The surface spike-like glycoprotein (GP) plays an important role in virus entry into cells by mediating receptor binding and fusion (Feldmann et al., 1999; Lee et al., 2008; Nanbo et al., 2010). Finally, the major matrix protein (VP40), along with VP24, plays a critical role in the regulation of viral replication and transcription (Hoenen et al., 2010a,b), as well as viral assembly (Hoenen et al., 2010b; Mateo et al., 2011).

Ebola viruses are known to be the etiological agents of hemorrhagic fever, which has a high mortality rate (Baron et al., 1983; Suzuki and Gojobori, 1997). Since 1976, the Zaire ebolavirus (ZEBOV) has been associated with outbreaks in the Democratic Republic of Congo (DRC), Gabon, and the Republic of Congo (Baize et al., 2014; Feldmann and Geisbert, 2011). The most recent outbreak in West Africa, which began in February of 2014, is the largest recorded outbreak, spreading through Guinea, Liberia and Sierra Leone, with incursions into other countries including Nigeria, the USA and Mali (Alexander et al., 2014; Gire et al., 2014). The role of human population density in the spread of many viruses is well known and can lead to shorter infection cycles and the evolution of higher-virulence strains (Holmes, 2004). Social factors can also impact an epidemic. For example, a recent study highlighted the role of an improved transportation infrastructure, growing social activities and a lack of public awareness in the reemergence of rabies in China in the 1990s (Song et al., 2014). In West Africa, population growth has been dramatic, with population densities increasing by nearly 200% in Guinea, Sierra Leone, and Liberia in the past five decades (Alexander et al., 2014). In addition, an overburdened public health network, delayed response and coordination challenges have also facilitated the spread of ZEBOV.

A previous study demonstrated that the ZEBOV outbreaks occurring after the Yambuku outbreak in DRC in 1976 resulted from either direct or closely related descendants of a Yambuku-like virus (Wahl-Jensen et al., 2005). Many phylogenetic studies based on the sequences of available ZEBOV strains have placed the viruses from this earliest recorded outbreak at the tree root (Calvignac-Spencer et al., 2014; Dudas and Rambaut, 2014; Wahl-Jensen et al., 2005). Furthermore, these derived phylogenetic relationships indicate that the viruses of later outbreaks evolved from those of earlier outbreaks, implying an epidemiological connection amongst these outbreaks (Wahl-Jensen et al., 2005). This ladder-like phylogenetic structure has been repeatedly reconstructed in different studies (Calvignac-Spencer et al., 2014; Dudas and Rambaut, 2014; Gire et al., 2014; Wahl-Jensen et al., 2005). This structure has also been observed in many other RNA viruses and has been attributed to the pressure of positive selection (Grenfell et al., 2004; Holmes, 2004), suggesting that a similar mechanism is driving the evolution of ZEBOV.
Furthermore, the restricted genetic diversity and rapidly generated variants observed in earlier ZEBOV outbreaks could also be the consequence of continuous positive selection (Wahl-Jensen et al., 2005). Before the ZEBOV outbreak recorded in 2007, the nucleotide substitution rate of isolates was estimated to be constant over time (Wahl-Jensen et al., 2005). However, compared with retroviruses and influenza A virus, ZEBOV seems to be evolving relatively slowly (Cox et al., 1983; Suzuki and Gojobori, 1997). Three explanations have been proposed for this: (1) the RNA-dependent RNA polymerase of ZEBOV may not be as error-prone as those of other viruses (Suzuki and Gojobori, 1997); (2) the replication frequency was relatively low in the natural (reservoir) host(s) for the 20 years between the 1976 and 1995 outbreaks (Suzuki and Gojobori, 1997); (3) given the limited ZEBOV sampling and overburdened public health resources in parts of West Africa, as well as limited RNA genome sequencing (Matranga et al., 2014), the degree of mutational differences in earlier outbreaks could be underestimated.

ZEBOV has invaded West Africa from Middle Africa in a wave-like pattern within the last decade (Wahl-Jensen et al., 2005; Baize et al., 2014). While the change of environment and specific conditions in communities in West Africa could facilitate the spread of ZEBOV disease from human to human, the distinct evolutionary properties of ZEBOV in the current outbreak have not been addressed. Our aim in the present study was to investigate the genetic variation and evolutionary pattern of ZEBOV amongst the different outbreaks based on evolutionary analyses, followed by the interpretation of ZEBOV adaptation to new conditions. Specifically, we estimated amino acid mutations, tested the hypothesis of positive selection, and compared the evolutionary rates of the GP gene among the outbreaks in an attempt to identify the evolutionary properties of ZEBOV during the large epidemic occurring in West Africa.

2. Materials and methods

2.1. Sequence processing and phylogenetic analysis of ZEBOV genomes

We collected all available whole-genome sequences (as of 16 September 2014) of Z. ebolavirus for the period 1976–2014 from the GenBank database, of which 99 were nearly complete genomes derived from the Sierra Leone outbreak in West Africa in 2014. We then removed the sequences with one or more ambiguous nucleotides within the protein-coding regions. Three datasets were used for inference of phylogenetic trees: combined non-coding sequences only (5′ UTR, 3′ UTR, and six intergenic regions); combined coding sequences only (seven protein-coding genes, NP, VP35, VP40, GP, VP30, VP24, and L, corresponding to an alignment of 14,516 bp); and non-coding plus coding sequences. Nucleotide sequence variation and amino acid mutation were estimated using the MEGA 4.0 software package (Tamura et al., 2007). Nucleotide sequences were initially aligned using the online MUSCLE program through the NIAID Virus Pathogen Database and Analysis Resource (ViPR, http://www.viprbrc.org).

Phylogenetic analysis was performed for each of the three datasets using the neighbor-joining (NJ) and partitioned Bayesian Analysis (BA) methods. The NJ trees were constructed in MEGA 4.0, and bootstrap analysis with 1000 replicates was used to evaluate support values for phylogenetic relationships (Felsenstein, 1985). BA analyses were implemented in MrBayes 3.1.1 (Huelsenbeck and Ronquist, 2001). Prior to analyses, the most appropriate model of nucleotide substitution and parameter values for each dataset were estimated under a nested array of substitution models using the Akaike Information Criterion (AIC) as implemented in Modeltest 3.7 (Posada and Crandall, 1998). In the Bayesian analyses, two independent searches were conducted for each dataset. Four independent Markov chain Monte Carlo (MCMC) chains were run for 2,000,000 generations, with one tree sampled every 100 generations in each run. The first 1000 trees, with non-stationary log likelihood values, represented the "burn-in" and were discarded. Posterior probabilities (PP) of phylogenetic inferences were determined from the remaining trees. Trees shown herein represent 50% majority-rule consensus trees and BA PP for each node.
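As a quick check on the sampling scheme described above, the following minimal Python sketch tallies how many trees each Bayesian run would contribute to the consensus. It assumes that the 1000-tree burn-in was applied to each of the two runs separately and it ignores any extra sample taken at generation 0; the function name `retained_trees` is illustrative and is not part of MrBayes.

```python
def retained_trees(generations, sample_every, burn_in_trees, runs):
    """Return (sampled per run, kept per run, kept in total) for an MCMC analysis."""
    if sample_every <= 0 or runs <= 0:
        raise ValueError("sample_every and runs must be positive")
    sampled_per_run = generations // sample_every
    if burn_in_trees > sampled_per_run:
        raise ValueError("burn-in exceeds the number of sampled trees")
    kept_per_run = sampled_per_run - burn_in_trees
    return sampled_per_run, kept_per_run, kept_per_run * runs


if __name__ == "__main__":
    # Settings reported in Section 2.1; applying the burn-in per run is an assumption.
    sampled, kept, total = retained_trees(2_000_000, 100, 1000, 2)
    print(f"sampled per run: {sampled}")
    print(f"kept per run after burn-in: {kept}")
    print(f"kept across both runs: {total}")
    print(f"burn-in fraction: {1000 / sampled:.1%}")
```

Under these assumptions the burn-in corresponds to 5% of the trees sampled in each run.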

2.2. Investigation of positive selection

The CODEML program within the PAML software package was used to assess parameters in models of sequence evolution and to test relevant hypotheses (Yang, 2007). We compared three pairs of codon-based substitution models to assess the ratio of non-synonymous to synonymous substitution rates (denoted as the dN/dS ratio or ω ratio) for all ZEBOV codon sites and all branches of the phylogeny: M0 (one-ratio) vs. M3 (discrete ω), M1a (nearly neutral) vs. M2a (positive selection), and M7 (β distribution) vs. M8 (β distribution and a fraction of sites with ω > 1). Likelihood ratio tests (LRTs) were performed to compare the fit of the two models in each pair. It is assumed that twice the log likelihood difference between nested models (2ΔlnL) follows a chi-squared distribution with a number of degrees of freedom equal to the difference in the number of free parameters (Whelan and Goldman, 1999). When LRTs indicated positive selection, we used the Bayes empirical Bayes (BEB) approach (Yang et al., 2005) to calculate posterior probabilities for identifying sites under positive selection. To confirm the results of the PAML analyses, the datasets were re-analyzed using the Datamonkey web server (Delport et al., 2010; Kosakovsky Pond and Frost, 2005), which implements HyPhy, a molecular evolution analysis platform (Kosakovsky Pond et al., 2005). The single likelihood ancestor counting (SLAC) method was used for detecting sites under selection, as it is able to process large alignments.

2.3. Detection of increase in dN/dS

Evaluation of dN/dS values was used to determine whether genes have historically experienced different selective pressures along different, evolutionarily independent lineages. The GP gene, having been identified as possessing the greatest intraspecific genetic variation (Table 1), was selected, and dN/dS values were estimated for each branch of the phylogenetic tree using the free-ratio model implemented in PAML. For the analysis of changes in dN/dS, branches with dN ≠ 0 and dS = 0 were excluded to avoid extreme outliers. The dN/dS values were classified by outbreak year according to the phylogenetic lineages. Thus, three groups of dN/dS values calculated for 1976, 1994–1996, and 2007 were compared with those calculated for 2014. All statistical analyses were performed in R version 3.1.1 (R Core Team, 2013), using the Kruskal–Wallis rank sum test implemented in the core package. The multiple comparison test following the Kruskal–Wallis rank sum test was implemented in the 'pgirmess' package.

3. Results

3.1. Complete genome and phylogenetic analysis

We obtained 27 complete genome (18,959 bp) and 96 partial genome (≥18,613 bp) sequences of Z. ebolavirus that were available in GenBank as of 16 September 2014. A summary of genome structure and genetic characteristics is given in Table 1. Among the seven aligned protein-encoding genes, no indels could be detected in the new 2014 West Africa ZEBOV strains. The total length of the concatenated sequences for the seven coding sequences was 14,516 bp. The 5′ UTR, 3′ UTR, and six intergenic region sequences constituted an aligned data matrix of 4443 bp. The best-fit models of nucleotide substitution for the combined coding and intergenic sequences were identified as GTR + gamma and TIM + gamma, respectively.

Phylogenetic relationships of ZEBOV were reconstructed separately based on three datasets: the entire genome sequence; combined protein-encoding gene sequences; and combined UTR and intergenic sequences. The entire genome sequences and combined coding sequences yielded almost the same topologies (Figs. 1 and 2). In both NJ and BA trees, the sequences from the oldest outbreak in 1976 were placed at the basal position. Viral sequences from the same period or location formed a distinct cluster, compatible with a severe bottleneck at each new outbreak. The branch lengths of ZEBOV lineages were correlated with the time of the recorded outbreak.

Table 1 Genetic information of the seven protein-coding genes of Zaire ebolavirus.

| Gene | Length (nucleotide/AA) | Variable (nucleotide/AA) | Genetic distance (%) | dS | dN |
|------|------------------------|--------------------------|----------------------|-------|-------|
| NP | 2220/740 | 156/41 | 1.0 | 0.037 | 0.003 |
| VP35 | 1023/341 | 44/10 | 0.5 | 0.018 | 0.002 |
| VP40 | 981/327 | 46/7 | 0.6 | 0.019 | 0.001 |
| GP (a) | 2031/677 | 138/48 | 1.1 | 0.025 | 0.005 |
| VP30 | 867/289 | 34/8 | 0.5 | 0.015 | 0.000 |
| VP24 | 756/252 | 41/10 | 0.8 | 0.027 | 0.001 |
| L | 6639/2213 | 374/66 | 0.9 | 0.027 | 0.002 |

(a) To account for the RNA editing site, we generated a full-length glycoprotein sequence alignment of 2031 bp by inserting an adenosine residue into the run of 7 consecutive adenosines in the mRNA sequences.
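The summary figures quoted from Table 1 in the text (the total number of variable sites, the share contributed by NP, GP and L, and the per-protein variability) can be recomputed directly from the table. The sketch below is only an illustrative check in Python; the tuple layout and function names are our own, and the alignment length of 14,516 bp is taken from the text rather than from the table.

```python
# (gene, nucleotide length, amino acid length, variable nucleotides, variable amino acids)
# transcribed from Table 1.
TABLE1 = [
    ("NP", 2220, 740, 156, 41),
    ("VP35", 1023, 341, 44, 10),
    ("VP40", 981, 327, 46, 7),
    ("GP", 2031, 677, 138, 48),
    ("VP30", 867, 289, 34, 8),
    ("VP24", 756, 252, 41, 10),
    ("L", 6639, 2213, 374, 66),
]

# Length of the concatenated coding alignment as reported in the text (bp).
# The Table 1 lengths sum to 14,517; the one-base difference presumably reflects
# the adenosine inserted into GP (an assumption, not stated by the authors).
CODING_ALIGNMENT_BP = 14516


def total_variable_nt(rows):
    """Total number of variable nucleotide sites across all genes."""
    return sum(row[3] for row in rows)


def variable_nt_share(rows, genes):
    """Fraction of all variable nucleotide sites that fall in the named genes."""
    in_genes = sum(row[3] for row in rows if row[0] in genes)
    return in_genes / total_variable_nt(rows)


def variable_aa_fraction(rows):
    """Map each gene to the fraction of its amino acid positions that vary."""
    return {row[0]: row[4] / row[2] for row in rows}


if __name__ == "__main__":
    total = total_variable_nt(TABLE1)
    pct = 100 * total / CODING_ALIGNMENT_BP
    print(f"variable sites: {total} ({pct:.2f}% of the alignment)")
    print(f"NP + GP + L share: {variable_nt_share(TABLE1, {'NP', 'GP', 'L'}):.1%}")
    fractions = variable_aa_fraction(TABLE1)
    for gene, frac in sorted(fractions.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{gene:5s} {frac:.3f} (about 1 variable residue per {1 / frac:.0f} amino acids)")
```

Ranking the proteins by the fraction of variable residues, rather than by raw counts, avoids favouring the long L protein simply because of its length.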


Although ZEBOV sequences from earlier outbreaks are relatively undersampled compared to the current outbreak, these human-derived sequences nevertheless provide insight into the intraspecific history of the emerging virus. Before 2014, Z. ebolavirus was mainly found in countries within Central Africa, such as the DRC and Gabon. There are three independent lineages associated with DRC, while two lineages are associated with, and diverge separately in, adjacent Gabon. The sequences of ZEBOV extracted from the 2014 outbreak could be divided into two subclades, and all Sierra Leone viruses might have evolved from a common ancestor in Guinea. For the phylogenetic relationships estimated from combined intergenic sequences (Fig. 3), the major lineages are consistent with those based on the entire genome and combined coding datasets, except for the relationship between Guinea and Sierra Leone. The NJ and BA trees indicate that the three Guinea strains are clustered with viruses of the Sierra Leone lineage, suggesting some uncertainty regarding the relationship of the virus lineages between these two regions.

3.2. Synonymous and non-synonymous mutations with amino acid

The alignment of combined coding sequences (14,516 bp) identified 833 variable sites (5.74%). The mutations in the NP, GP, and L sequences accounted for more than 80% of these sites (Table 1). At the amino acid level, the GP protein is the most variable, with 48 non-synonymous substitutions located within its 677 amino acids, followed by the NP protein with 1 substitution per 18 amino acids. The most highly conserved protein is VP40 (Table 1).

3.3. Positive selection and sites under selection in the proteins for West Africa ZEBOV

In order to investigate whether distinct environmental pressures are driving the evolution of ZEBOV in specific outbreaks, we used likelihood ratio tests (LRTs) in the software package PAML 4.0 to identify the presence of positive selection. Since there are seven protein-coding genes in the ebolavirus genome, the PAML calculation was implemented for each of these seven genes (Tables A.1–A.7). For the GP gene, we generated a full-length glycoprotein alignment of 2031 bp in which an adenosine residue was inserted into the run of 7 consecutive adenosines in the mRNA sequences. All genes other than GP and L had very low LRT statistics and high p values (Table 2), indicating predominantly purifying or neutral selection. In contrast, positive selection was detected in the GP and L sequences based on the comparison of models M1a and M2a (Table 2). The model M2a (positive selection) had a significantly better fit than M1a (nearly neutral) (p < 0.05) according to the LRTs. The BEB method identified a small number of amino acid sites under positive selection, although none of them reached a posterior probability (Pr) of 95%. For the GP gene, there were five sites predicted to have ω values >1 (Table 2), which were distributed within the mucin-like domain of the solved 3D structure of the protein (Lee et al., 2008). The mucin-like domain and glycan cap sit together as an external domain to the viral attachment and fusion subunits, suggesting a possible mechanism for immune evasion (Lee et al., 2008). Interestingly, two of the positively selected sites (331E and 455Y) are characteristic of West Africa ZEBOV. For the L gene, four sites, 1405Q, 1607H, 1610F, and 1662G, were detected to be under positive selection. The RNA-dependent RNA polymerase protein (i.e. L protein), together with NP and VP35, constitutes the ribonucleoprotein complex of the Ebola virus (Feldmann and Klenk, 1996; Ishihama and Barbier, 1994; Volchkov et al., 1999). The positively selected genes are associated with viral genome synthesis and evasion of the host immune system.


[Figure 1: phylogenetic tree. Clade labels: Sierra Leone, 2014; Guinea, 2014; DRC, 2007; Gabon, 2002; Gabon, 1996/1994; DRC, 1995; DRC, 1976/1977. Scale bar: 0.01.]

Fig. 1. Phylogenetic tree based on complete genomes of Zaire ebolavirus isolated from different outbreaks. Topology is based on partitioned Bayesian Analysis (BA). Numbers on nodes before/after slash represent posterior probabilities and bootstrap values, respectively. Years are given for each outbreak. DRC = the Democratic Republic of Congo. The samples highlighted in gray represent the first batch of EVD samples from 12 patients in Sierra Leone.


[Figure 2: phylogenetic tree. Clade labels: Sierra Leone, 2014; Guinea, 2014; DRC, 2007; Gabon, 2002; Gabon, 1996/1994; DRC, 1995; DRC, 1976/1977. Scale bar: 0.01.]

Fig. 2. Phylogenetic tree based on concatenated coding sequences of Zaire ebolavirus isolated from different outbreaks. Topology is based on partitioned Bayesian Analysis (BA). Numbers on nodes before/after slash represent posterior probabilities and bootstrap values, respectively. Years are given for each outbreak. DRC = the Democratic Republic of Congo.


[Figure 3: phylogenetic tree. Clade labels: West Africa (Sierra Leone, 2014; Guinea, 2014); DRC, 2007; Gabon, 2002; Gabon, 1996/1994; DRC, 1995; DRC, 1976/1977. Scale bar: 0.1.]

Fig. 3. Phylogenetic tree based on concatenated intergenic sequences of Zaire ebolavirus isolated from different outbreaks. Topology is based on partitioned Bayesian Analysis (BA). Numbers on nodes before/after slash represent posterior probabilities and bootstrap values, respectively. Years are given for each outbreak. DRC = the Democratic Republic of Congo. The samples highlighted in gray represent three Guinea strains.


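Table 2 Likelihood ratio tests (LRTs) between selected CODEML codon substitution models for each gene data of Zaire ebolavirus.

| Gene | Compared models | df | 2ΔlnL | p value | Positive selected sites |
|------|-----------------|----|----------|---------|-------------------------|
| GP | M3(K = 3)–M0 | 4 | 0.335598 | 0.987 | – |
| GP | M2a–M1a | 2 | 7.594066 | 0.022* | 331E, 377P, 430L, 443S, 455Y |
| GP | M8–M7 | 2 | 0.143498 | 0.931 | 331E, 377P, 430L, 443S, 455Y |
| L | M3(K = 3)–M0 | 4 | 1.609296 | 0.807 | – |
| L | M2a–M1a | 2 | 6.349854 | 0.042* | 1405Q, 1607H, 1610F, 1662G |
| L | M8–M7 | 2 | 5.060692 | 0.080 | 1405Q, 1607H, 1610F, 1662G |
| VP24 | M3(K = 3)–M0 | 4 | 1.205978 | 0.877 | – |
| VP24 | M2a–M1a | 2 | 1.346396 | 0.510 | 163K |
| VP24 | M8–M7 | 2 | 1.283322 | 0.526 | 163K |
| NP | M3(K = 3)–M0 | 4 | 0.50813 | 0.973 | – |
| NP | M2a–M1a | 2 | 1.266658 | 0.531 | 87Y, 525T |
| NP | M8–M7 | 2 | 0.306972 | 0.858 | 87Y, 525T |
| VP30 | M3(K = 3)–M0 | 4 | 0.040532 | 1.000 | – |
| VP30 | M2a–M1a | 2 | 8E-06 | 1.000 | – |
| VP30 | M8–M7 | 2 | 0.11419 | 0.945 | – |
| VP35 | M3(K = 3)–M0 | 4 | 2.421866 | 0.659 | – |
| VP35 | M2a–M1a | 2 | 2.41735 | 0.299 | 68M |
| VP35 | M8–M7 | 2 | 0.12579 | 0.939 | 68M |
| VP40 | M3(K = 3)–M0 | 4 | 0.126502 | 0.998 | – |
| VP40 | M2a–M1a | 2 | 2.870964 | 0.238 | 324V |
| VP40 | M8–M7 | 2 | 0.132896 | 0.936 | 324V |

* Indicates the difference is significant at p < 0.05.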
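The p values in Table 2 follow from the 2ΔlnL statistics and degrees of freedom under the chi-squared approximation described in Section 2.2. As an illustration, the Python sketch below recomputes them for the GP and L rows. It relies on the closed form of the chi-squared upper tail that holds for even degrees of freedom, which covers the df values of 2 and 4 used here; the function and variable names are ours and are not part of PAML.

```python
import math


def chi2_sf_even_df(stat, df):
    """Upper-tail probability of a chi-squared variable, valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    # sf = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# (gene, comparison, df, 2*delta lnL) copied from the GP and L rows of Table 2.
LRT_ROWS = [
    ("GP", "M3-M0", 4, 0.335598),
    ("GP", "M2a-M1a", 2, 7.594066),
    ("GP", "M8-M7", 2, 0.143498),
    ("L", "M3-M0", 4, 1.609296),
    ("L", "M2a-M1a", 2, 6.349854),
    ("L", "M8-M7", 2, 5.060692),
]


def lrt_p_values(rows):
    """Map (gene, comparison) to the chi-squared p value of its LRT statistic."""
    return {(gene, comp): chi2_sf_even_df(stat, df) for gene, comp, df, stat in rows}


if __name__ == "__main__":
    for (gene, comp), p in lrt_p_values(LRT_ROWS).items():
        flag = "*" if p < 0.05 else ""
        print(f"{gene:3s} {comp:8s} p = {p:.3f}{flag}")
```

Only the M2a–M1a comparisons for GP and L fall below 0.05 in this recomputation, in agreement with the asterisks in Table 2.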

Fig. 4. Multiple comparison test of mean dN/dS values for GP gene among different outbreaks. According to the lineages defined by the phylogenetic tree of ZEBOV (Fig. 1), the interior and terminal branches were divided into four groups with the branch leading to KC242800 (Gabon/2002) excluded. * Indicates the difference is significant at p < 0.05.
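The comparison summarised in Fig. 4 rests on two steps from Section 2.3: discarding branches whose ratio is undefined, and applying a Kruskal–Wallis rank sum test across outbreak groups. The Python sketch below illustrates both steps under stated assumptions. It reads the exclusion rule as "dN ≠ 0 with dS = 0", it computes the H statistic without the tie correction that statistical packages usually apply, and every branch value in the usage example is invented for illustration and is not taken from the study.

```python
from collections import defaultdict


def drop_undefined_ratios(branches):
    """Remove branches with dN != 0 and dS == 0, whose dN/dS would be unbounded."""
    return [b for b in branches if not (b["dN"] != 0 and b["dS"] == 0)]


def group_omegas(branches):
    """Group the retained per-branch dN/dS (omega) values by outbreak label."""
    groups = defaultdict(list)
    for b in drop_undefined_ratios(branches):
        groups[b["outbreak"]].append(b["omega"])
    return dict(groups)


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H (no tie correction) for a dict of non-empty value lists."""
    labels, values = [], []
    for label, vals in groups.items():
        labels.extend([label] * len(vals))
        values.extend(vals)
    n = len(values)
    rank_sums = defaultdict(float)
    for label, rank in zip(labels, average_ranks(values)):
        rank_sums[label] += rank
    weighted = sum(rank_sums[g] ** 2 / len(groups[g]) for g in groups)
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


if __name__ == "__main__":
    # Hypothetical branch records, for illustration only.
    branches = [
        {"outbreak": "1976", "dN": 0.001, "dS": 0.010, "omega": 0.10},
        {"outbreak": "1976", "dN": 0.002, "dS": 0.012, "omega": 0.17},
        {"outbreak": "2007", "dN": 0.001, "dS": 0.008, "omega": 0.13},
        {"outbreak": "2007", "dN": 0.003, "dS": 0.000, "omega": 999.0},  # excluded
        {"outbreak": "2014", "dN": 0.004, "dS": 0.005, "omega": 0.80},
        {"outbreak": "2014", "dN": 0.003, "dS": 0.004, "omega": 0.75},
    ]
    groups = group_omegas(branches)
    print({k: len(v) for k, v in groups.items()})
    print(f"H = {kruskal_wallis_h(groups):.3f}")
```

With k groups, H is compared against a chi-squared distribution with k − 1 degrees of freedom; a dedicated statistics package would normally be used for the p value and for the follow-up multiple comparison.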

This suggests that these mutations could be of significance in the current outbreak in West Africa. For further confirmation, the GP and L gene datasets were reanalyzed using the SLAC method in Datamonkey. The results predicted that AA331, AA430, and AA443 of the GP gene, and AA1405, AA1607, and AA1662 of the L gene were under positive selection with p value ≤ 0.5, consistent with the results of the PAML analysis (Table 2).

3.4. Accelerated evolutionary rate for the GP gene in 2014 ZEBOV outbreak

Given that the GP gene has the greatest intraspecific divergence among the seven genes (Table 1) and is probably experiencing positive selection (Table 2), we then investigated whether the evolutionary rate of this gene is accelerating. Because there is only a single genome from the 2002 outbreak, we excluded this sequence (KC242800) from the analysis. The results revealed that the evolutionary rates (dN/dS) of the GP gene for the remaining four lineages were significantly different (Kruskal–Wallis rank sum test, p < 0.001). To compare the differences between pairs of outbreaks, a multiple comparison test was then used. The values of dN/dS for the 2014 lineage were significantly higher than those of either the earliest outbreak (1976) or the previous outbreak (2007) lineages (Fig. 4). An increase in dN/dS values is considered to be related to strong positive selection or relaxed purifying selection (Kawahara and Imanishi, 2007; Toll-Riera et al., 2010; Yang, 1998).

4. Discussion

Since the current outbreak started in Guinea in West Africa in February 2014, two studies have investigated the position of the root within Z. ebolavirus isolates (Calvignac-Spencer et al., 2014; Dudas and Rambaut, 2014). The different methodologies implemented in these studies all support placing the Z. ebolavirus of the first recorded outbreak in 1976 at the root of the ZEBOV tree. The phylogenetic trees in our study also showed that the branch leading to the West Africa outbreak is extremely long (Figs. 1–3). Thus, an unnoticed long-branch attraction to an outgroup could result in a misperception regarding the phylogenetic history of ZEBOV from West Africa (Bergsten, 2005; Calvignac-Spencer et al., 2014). To identify the evolutionary processes driving the virus, we used the strains from the outbreak in 1976 to root the ZEBOV tree (Figs. 1–3). All the phylogenetic trees identified five main clades in ZEBOV, approximately corresponding to the five outbreaks, with strong nodal support (Figs. 1–3). Viral sequences from the same period or location formed a distinct cluster. These results suggest that bottleneck and founder effects have occurred in the ZEBOV population, which is also supported by previous research (Biek et al., 2006).

Our results from the analysis of complete genomes support the conclusion that the outbreak in West Africa was likely caused by a Z. ebolavirus lineage that has spread from Central Africa into Guinea in recent decades, consistent with previous studies (Calvignac-Spencer et al., 2014; Dudas and Rambaut, 2014; Gire et al., 2014). Furthermore, the sequences of ZEBOV extracted from the 2014 outbreak could be divided into two subclades. One subclade is composed of two strains from Guinea, while the other subclade consists of one Guinea strain and all Sierra Leone strains (Figs. 1 and 2). All Sierra Leone viruses form a large cluster and are a sister group to the Guinea strain (KJ660346), implying that the Sierra Leone viruses originate from a common ancestor in Guinea. Epidemiological investigation also revealed that 12 of the first EVD patients sampled in Sierra Leone had attended the funeral of an EVD case from Guinea (Gire et al., 2014). This was consistent with our results, which placed these isolates at relatively 'ancient' positions within the Sierra Leone lineage (Fig. 1, with the exception of isolate KM233049, which was sampled on 31 May, while the remaining strains were isolated on 25–28 May according to the records of Gire et al. (2014)). This result suggests that the


evolutionary rates of ZEBOV could be fairly rapid, and associated with the rapid spread of the outbreak in West Africa. Our estimated phylogenetic relationship of ZEBOV between Guinea and Sierra Leone in the new outbreak based on combined non-coding sequences (Fig. 3) was slightly different from that estimated from combined coding sequences (Fig. 2). However, the length of the nucleotide alignment in the protein-coding dataset (14,516 bp) was over 3 times longer than the non-coding dataset (4443 bp) and the latter may contain insufficient information to resolve the relationship of closely related ZEBOV variants generated in a short time course. In addition, the inconsistency may also be associated with the function of non-coding sequences. If the non-coding regions possess important (but undetermined) function, then they might be subjected to strong purifying selection, which would make it difficult to construct a reliable phylogenetic tree based on these sequences. PAML and Datamonkey were used to investigate selection pressure acting on coding sequences in ZEBOV. For PAML, based on the comparisons of models M1a and M2a, positive selection was detected in GP and L sequences (p = 0.022 and 0.042, respectively), consistent with estimates for Datamonkey. For the GP gene, five amino acid sites predicted to be under positive selection (Table 2) are distributed within the mucin-like domain of the solved 3D structure of the protein (Lee et al., 2008). Although these sites were estimated with less than 95% posterior probabilities, the possibility that the evolution of the GP gene is affected by host environment should not be excluded; there is evidence that the GP protein plays an important role in the attachment and fusion to host cells (Takada et al., 1997; Volchkov et al., 1998), spread of infection (Feldmann et al., 1999) and decrease in endothelial cell barrier function (Wahl-Jensen et al., 2005). 
The mucin-like domain constitutes an external domain of the glycoprotein involved in viral attachment and fusion subunits (Lee et al., 2008), and also promotes interaction of GP with human cellular factors (Takada et al., 2004). Thus, the mucin-like domain of the GP protein may play a crucial role in immune evasion and ZEBOV infection, and it may follow that positive selection acting on this domain could be important in the adaptive evolution of ZEBOV.

For the L gene, previous comparative analysis of the amino acid sequence of the L protein confirmed the existence of six conserved amino acid regions, interspersed with more variable regions (Volchkov et al., 1999). In the present study, three sites predicted to be under positive selection (AA1405, AA1607, AA1610; Table 2) are located in the variable regions, while only one (AA1662) lies in the conserved region of aa 1649–1800 in the filovirus L protein (Volchkov et al., 1999). Functions of the L protein are associated with enzymatic activities in transcription and replication of RNA viruses and could depend on these concatenated conserved domains, such as the three polymerase motifs A–C (Poch et al., 1990; Volchkov et al., 1999). Such highly variable sequences under putative natural selection are good candidates for carrying out specialized functions developed by the virus, and spatial redistribution of replicative functions could occur in filovirus L proteins (Poch et al., 1990).

The current study demonstrates that the GP gene of ZEBOV has a significantly different evolutionary rate (dN/dS) between the 2014 West Africa outbreak and earlier outbreaks (p < 0.001; Fig. 4), with increasing dN/dS values in the new outbreak lineage. These findings are consistent with our understanding of host-driven evolution. When a virus adapts to a new host, not only amino acid variation but also a higher nucleotide evolutionary rate is commonly observed (Liu et al., 2014).
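To make the dN/dS quantity concrete, the following is a simplified pairwise counting sketch in the spirit of the Nei–Gojobori approach. It is not the procedure used in this study and is offered only as an illustration. All function names are hypothetical, the two sequences are invented toy data, and three simplifications are assumed: codons that differ at more than one position are ignored, codons containing gaps or ambiguous bases are skipped, and changes to stop codons count as nonsynonymous.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_sites(codon):
    """Expected number of synonymous sites in one codon (between 0 and 3)."""
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1.0 / 3.0
    return total


def jukes_cantor(p):
    """Jukes-Cantor correction of a proportion of observed differences."""
    if p >= 0.75:
        return float("nan")  # distance is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(seq_a, seq_b):
    """Return (dN, dS) for two aligned, in-frame coding sequences."""
    seq_a = seq_a.upper().replace("U", "T")
    seq_b = seq_b.upper().replace("U", "T")
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned and a multiple of 3 long")
    syn_sites = 0.0
    codons_used = 0
    syn_diffs = nonsyn_diffs = 0
    for i in range(0, len(seq_a), 3):
        ca, cb = seq_a[i:i + 3], seq_b[i:i + 3]
        if ca not in CODON_TABLE or cb not in CODON_TABLE:
            continue  # skip gaps and ambiguous bases
        n_diff = sum(x != y for x, y in zip(ca, cb))
        if n_diff > 1:
            continue  # simplification: multi-hit codons are ignored
        codons_used += 1
        syn_sites += (synonymous_sites(ca) + synonymous_sites(cb)) / 2.0
        if n_diff == 1:
            if CODON_TABLE[ca] == CODON_TABLE[cb]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    nonsyn_sites = 3 * codons_used - syn_sites
    if syn_sites == 0 or nonsyn_sites == 0:
        raise ValueError("not enough informative codons")
    return (jukes_cantor(nonsyn_diffs / nonsyn_sites),
            jukes_cantor(syn_diffs / syn_sites))


# Toy sequences, invented for illustration (not ZEBOV data).
d_n, d_s = dn_ds("ATGGCTAAAGAT", "ATGGCCAAAGAA")
print(f"dN = {d_n:.4f}, dS = {d_s:.4f}, dN/dS = {d_n / d_s:.4f}")
```

A ratio above 1 is conventionally read as a signal of positive selection and a ratio below 1 as purifying selection. Lineage-level comparisons such as the one in Fig. 4 are normally made with codon models fitted on a phylogeny rather than with pairwise counts like these.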
Similar findings have also been reported for the envelope gene of hepatitis C virus (HCV) during the acute phase of infection (Booth et al., 1998; Kuntzen et al., 2007). The ZEBOV envelope glycoprotein (i.e. GP) is responsible for binding to the receptors on target cells and decreases endothelial barrier function (Wahl-Jensen et al., 2005). The accelerated evolutionary rate of the GP gene in the new outbreak is likely to be driven by the host immune response (Kuntzen et al., 2007; Liu et al., 2010). In this respect, more ZEBOV variants are generated during this sweeping epidemic, which, in turn, could have contributed to adaptation to new conditions in West Africa.

Bats are putative ZEBOV reservoirs, and all virus samples from both humans and bats isolated between 2002 and 2003 can be traced back to a recent common ancestor based on L gene genealogy (Biek et al., 2006). Three bat species are considered to be associated with the 2014 West African outbreak (Leroy et al., 2005; World Health Organization, 2014), and movement of ZEBOV by bat colonies from Middle Africa into West Africa could be a possible explanation for the geographical displacement of the virus (Alexander et al., 2014). Thus, viral sequences from infected bat populations both within and outside the affected areas would help to further the understanding of the evolutionary and epidemiological patterns of ZEBOV.

Although many human samples were collected from the most recent outbreak, they predominantly originated from Sierra Leone; relatively few viruses were isolated from Guinea and none were isolated from Liberia. This limits the insight that can be obtained from the data. While collection of samples is obviously challenging and faces many obstacles, a more balanced sampling in terms of geographical location and collection time would yield far greater insight into the dynamics and history of the ZEBOV outbreak. Additionally, investigation of virus samples using NGS RNA-Seq and ChIP-Seq would provide deeper insight into virus evolution and virus–host interaction (Heilmann et al., 2012; Portal et al., 2013). In this way, we can better evaluate the potential threat from the emergence of new prevalent lineages.
Acknowledgements

We thank the Core Facility and Technical Support, Wuhan Institute of Virology and Wuhan Key Laboratory on Emerging Infectious Diseases and Biosafety for helpful support during the course of the work. This work was supported by the National Basic Research Program of China (Grants 2011CB504701, 2012CB518904) and the National Natural Science Foundation of China (Grant 31170158). We thank Hai-Zhou Liu for help with data analysis. We are grateful to Prof. Zhi-Hong Hu for helpful discussion and advice in the preparation of this manuscript.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.meegid.2015.02.024.

References

Alexander, K., Sanderson, C., Marathe, M., Lewis, B., Rivers, C., Shaman, J., Drake, J., Lofgren, E., Dato, V., Eisenberg, M., 2014. What factors might have led to the emergence of Ebola in West Africa? PLoS Blogs (Nov. 11).
Baize, S., Pannetier, D., Oestereich, L., Rieger, T., Koivogui, L., Magassouba, N.F., Soropogui, B., Sow, M.S., Keïta, S., De Clerck, H., Tiffany, A., Dominguez, G., Loua, M., Traoré, A., Kolié, M., Malano, E.R., Heleze, E., Bocquin, A., Mély, S., Raoul, H., Caro, V., Cadar, D., Gabriel, M., Pahlmann, M., Tappe, D., Schmidt-Chanasit, J., Impouma, B., Diallo, A.K., Formenty, P., Van Herp, M., Günther, S., 2014. Emergence of Zaire Ebola virus disease in Guinea. N. Engl. J. Med. 371, 1418–1425.
Baron, R.C., McCormick, J.B., Zubeir, O.A., 1983. Ebola virus disease in southern Sudan: hospital dissemination and intrafamilial spread. Bull. World Health Organ. 61, 997–1003.
Becker, S., Rinne, C., Hofsäss, U., Klenk, H.D., Mühlberger, E., 1998. Interactions of Marburg virus nucleocapsid proteins. Virology 249, 406–417.
Bergsten, J., 2005. A review of long-branch attraction. Cladistics 21, 163–193.
Biek, R., Walsh, P.D., Leroy, E.M., Real, L.A., 2006. Recent common ancestry of Ebola Zaire virus found in a bat reservoir. PLoS Pathog. 2, 885–886.

Booth, J.C., Kumar, U., Webster, D., Monjardino, J., Thomas, H.C., 1998. Comparison of the rate of sequence variation in the hypervariable region of E2/NS1 region of hepatitis C virus in normal and hypogammaglobulinemic patients. Hepatology 27, 223–227.
Calvignac-Spencer, S., Schulze, J.M., Zickmann, F., Renard, B.Y., 2014. Clock rooting further demonstrates that Guinea 2014 EBOV is a member of the Zaïre lineage. PLoS Curr. (June 16).
Cox, N.J., McCormick, J.B., Johnson, K.M., Kiley, M.P., 1983. Evidence for two subtypes of Ebola virus based on oligonucleotide mapping of RNA. J. Infect. Dis. 147, 272–275.
Delport, W., Poon, A.F.Y., Frost, S.D.W., Kosakovsky Pond, S.L., 2010. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology. Bioinformatics 26, 2455–2457.
Dudas, G., Rambaut, A., 2014. Phylogenetic analysis of Guinea 2014 EBOV ebolavirus outbreak. PLoS Curr. (May 2).
Feldmann, H., Geisbert, T.W., 2011. Ebola haemorrhagic fever. Lancet 377, 849–862.
Feldmann, H., Klenk, H.D., 1996. Marburg and Ebola viruses. In: Karl Maramorosch, F.A.M., Aaron, J.S. (Eds.), Advances in Virus Research. Academic Press, pp. 1–52.
Feldmann, H., Volchkov, V., Volchkova, V., Klenk, H., 1999. The Glycoproteins of Marburg and Ebola Virus and their Potential Roles in Pathogenesis. Springer.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Gire, S.K., Goba, A., Andersen, K.G., Sealfon, R.S.G., Park, D.J., Kanneh, L., Jalloh, S., Momoh, M., Fullah, M., Dudas, G., Wohl, S., Moses, L.M., Yozwiak, N.L., Winnicki, S., Matranga, C.B., Malboeuf, C.M., Qu, J., Gladden, A.D., Schaffner, S.F., Yang, X., Jiang, P.P., Nekoui, M., Colubri, A., Coomber, M.R., Fonnie, M., Moigboi, A., Gbakie, M., Kamara, F.K., Tucker, V., Konuwa, E., Saffa, S., Sellu, J., Jalloh, A.A., Kovoma, A., Koninga, J., Mustapha, I., Kargbo, K., Foday, M., Yillah, M., Kanneh, F., Robert, W., Massally, J.L.B., Chapman, S.B., Bochicchio, J., Murphy, C., Nusbaum, C., Young, S., Birren, B.W., Grant, D.S., Scheiffelin, J.S., Lander, E.S., Happi, C., Gevao, S.M., Gnirke, A., Rambaut, A., Garry, R.F., Khan, S.H., Sabeti, P.C., 2014. Genomic surveillance elucidates Ebola virus origin and transmission during the 2014 outbreak. Science 345, 1369–1372.
Grenfell, B.T., Pybus, O.G., Gog, J.R., Wood, J.L.N., Daly, J.M., Mumford, J.A., Holmes, E.C., 2004. Unifying the epidemiological and evolutionary dynamics of pathogens. Science 303, 327–332.
Heilmann, A.M.F., Calderwood, M.A., Portal, D., Lu, Y., Johannsen, E., 2012. Genome-wide analysis of Epstein–Barr virus Rta DNA binding. J. Virol. 86, 5151–5164.
Hoenen, T., Biedenkopf, N., Zielecki, F., Jung, S., Groseth, A., Feldmann, H., Becker, S., 2010a. Oligomerization of Ebola virus VP40 is essential for particle morphogenesis and regulation of viral transcription. J. Virol. 84, 7053–7063.
Hoenen, T., Jung, S., Herwig, A., Groseth, A., Becker, S., 2010b. Both matrix proteins of Ebola virus contribute to the regulation of viral genome replication and transcription. Virology 403, 56–66.
Holmes, E.C., 2004. The phylogeography of human viruses. Mol. Ecol. 13, 745–756.
Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755.
Ishihama, A., Barbier, P., 1994. Molecular anatomy of viral RNA-directed RNA polymerases. Arch. Virol. 134, 235–258.
Kawahara, Y., Imanishi, T., 2007. A genome-wide survey of changes in protein evolutionary rates across four closely related species of Saccharomyces sensu stricto group. BMC Evol. Biol. 7, 1–13.
Kosakovsky Pond, S.L., Frost, S.D.W., 2005. Datamonkey: rapid detection of selective pressure on individual sites of codon alignments. Bioinformatics 21, 2531–2533.
Kosakovsky Pond, S.L., Frost, S.D.W., Muse, S.V., 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21, 676–679.
Kuntzen, T., Timm, J., Berical, A., Lewis-Ximenez, L.L., Jones, A., Nolan, B., Schulze zur Wiesch, J., Li, B., Schneidewind, A., Kim, A.Y., Chung, R.T., Lauer, G.M., Allen, T.M., 2007. Viral sequence evolution in acute hepatitis C virus infection. J. Virol. 81, 11658–11668.
Lee, J.E., Fusco, M.L., Hessell, A.J., Oswald, W.B., Burton, D.R., Saphire, E.O., 2008. Structure of the Ebola virus glycoprotein bound to an antibody from a human survivor. Nature 454, 177–182.
Leroy, E.M., Kumulungui, B., Pourrut, X., Rouquet, P., Hassanin, A., Yaba, P., Delicat, A., Paweska, J.T., Gonzalez, J.P., Swanepoel, R., 2005. Fruit bats as reservoirs of Ebola virus. Nature 438, 575–576.
Liu, L., Fisher, B.E., Dowd, K.A., Astemborski, J., Cox, A.L., Ray, S.C., 2010. Acceleration of hepatitis C virus envelope evolution in humans is consistent with progressive humoral immune selection during the transition from acute to chronic infection. J. Virol. 84, 5067–5077.
Liu, H., Han, N., Fang, W., Adams, J., Zheng, K., Li, T., Hu, Z., Rayner, S., 2014. The limited number of available nucleotide and protein sequence data from the recent H7N9 cases in China impeded investigation and characterization of the outbreak. Virol. Sin. 29, 126–127.
Mateo, M., Carbonnelle, C., Martinez, M.J., Reynard, O., Page, A., Volchkova, V.A., Volchkov, V.E., 2011. Knockdown of Ebola virus VP24 impairs viral nucleocapsid assembly and prevents virus replication. J. Infect. Dis. 204, S892–S896.

Matranga, C., Andersen, K., Winnicki, S., Busby, M., Gladden, A., Tewhey, R., Stremlau, M., Berlin, A., Gire, S., England, E., Moses, L., Mikkelsen, T., Odia, I., Ehiane, P., Folarin, O., Goba, A., Khan, S.H., Grant, D., Honko, A., Hensley, L., Happi, C., Garry, R., Malboeuf, C., Birren, B., Gnirke, A., Levin, J., Sabeti, P., 2014. Enhanced methods for unbiased deep sequencing of Lassa and Ebola RNA viruses in clinical and biological samples. Genome Biol. 15, 519.
Mühlberger, E., Lötfering, B., Klenk, H.D., Becker, S., 1998. Three of the four nucleocapsid proteins of Marburg virus, NP, VP35, and L, are sufficient to mediate replication and transcription of Marburg virus-specific monocistronic minigenomes. J. Virol. 72, 8756–8764.
Mühlberger, E., Weik, M., Volchkov, V.E., Klenk, H.D., Becker, S., 1999. Comparison of the transcription and replication strategies of Marburg virus and Ebola virus by using artificial replication systems. J. Virol. 73, 2333–2342.
Nanbo, A., Imai, M., Watanabe, S., Noda, T., Takahashi, K., Neumann, G., Halfmann, P., Kawaoka, Y., 2010. Ebolavirus is internalized into host cells via macropinocytosis in a viral glycoprotein-dependent manner. PLoS Pathog. 6, e1001121.
Nanbo, A., Watanabe, S., Halfmann, P., Kawaoka, Y., 2013. The spatio-temporal distribution dynamics of Ebola virus proteins and RNA in infected cells. Sci. Rep. 3, 1206.
Poch, O., Blumberg, B.M., Bougueleret, L., Tordo, N., 1990. Sequence comparison of five polymerases (L proteins) of unsegmented negative-strand RNA viruses: theoretical assignment of functional domains. J. Gen. Virol. 71, 1153–1162.
Portal, D., Zhou, H., Zhao, B., Kharchenko, P.V., Lowry, E., Wong, L., Quackenbush, J., Holloway, D., Jiang, S., Lu, Y., Kieff, E., 2013. Epstein–Barr virus nuclear antigen leader protein localizes to promoters and enhancers with cell transcription factors and EBNA2. Proc. Natl. Acad. Sci. U.S.A. 110, 18537–18542.
Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
R Core Team, 2013. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, 3-900051-07-0. Available from: .
Sanchez, A., Geisbert, T., Feldmann, H., 2007. Filoviridae: Marburg and Ebola viruses. In: Knipe, D., Howley, P. (Eds.), Fields Virology, fifth ed. Lippincott/Williams & Wilkins Co., Philadelphia, PA, pp. 1409–1448.
Song, M., Tang, Q., Rayner, S., Tao, X.Y., Li, H., Guo, Z.Y., Shen, X.X., Jiao, W.T., Fang, W., Wang, J., Liang, G.D., 2014. Human rabies surveillance and control in China, 2005–2012. BMC Infect. Dis. 14, 212.
Suzuki, Y., Gojobori, T., 1997. The origin and evolution of Ebola and Marburg viruses. Mol. Biol. Evol. 14, 800–806.
Takada, A., Robison, C., Goto, H., Sanchez, A., Murti, K.G., Whitt, M.A., Kawaoka, Y., 1997. A system for functional analysis of Ebola virus glycoprotein. Proc. Natl. Acad. Sci. U.S.A. 94, 14764–14769.
Takada, A., Fujioka, K., Tsuiji, M., Morikawa, A., Higashi, N., Ebihara, H., Kobasa, D., Feldmann, H., Irimura, T., Kawaoka, Y., 2004. Human macrophage C-type lectin specific for galactose and N-acetylgalactosamine promotes filovirus entry. J. Virol. 78, 2943–2947.
Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599.
Toll-Riera, M., Laurie, S., Albà, M.M., 2010. Lineage-specific variation in intensity of natural selection in mammals. Mol. Biol. Evol. 28, 383–398.
Volchkov, V.E., Feldmann, H., Volchkova, V.A., Klenk, H.D., 1998. Processing of the Ebola virus glycoprotein by the proprotein convertase furin. Proc. Natl. Acad. Sci. U.S.A. 95, 5762–5767.
Volchkov, V.E., Volchkova, V.A., Chepurnov, A.A., Blinov, V.M., Dolnik, O., Netesov, S.V., Feldmann, H., 1999. Characterization of the L gene and 5′ trailer region of Ebola virus. J. Gen. Virol. 80, 355–362.
Wahl-Jensen, V.M., Afanasieva, T.A., Seebach, J., Ströher, U., Feldmann, H., Schnittler, H.J., 2005. Effects of Ebola virus glycoproteins on endothelial cell activation and barrier function. J. Virol. 79, 10442–10450.
Watanabe, S., Noda, T., Kawaoka, Y., 2006. Functional mapping of the nucleoprotein of Ebola virus. J. Virol. 80, 3743–3751.
Whelan, S., Goldman, N., 1999. Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics. Mol. Biol. Evol. 16, 1292.
World Health Organization, 2014. WHO Risk Assessment Human Infections with Zaïre Ebola virus in West Africa 24 June 2014 [Online]. Available: …WestArica_WHO_RiskAssessment_20140624.pdf (accessed 19.09.14.).
Yang, Z., 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15, 568–573.
Yang, Z.H., 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591.
Yang, Z.H., Wong, W.S.W., Nielsen, R., 2005. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 22, 1107–1118.
