Molecular Phylogenetics and Evolution 76 (2014) 172–180

Contents lists available at ScienceDirect

Molecular Phylogenetics and Evolution journal homepage: www.elsevier.com/locate/ympev

Multilocus approach to clarify species status and the divergence history of the Bemisia tabaci (Hemiptera: Aleyrodidae) species complex Chia-Hung Hsieh a,b, Chiun-Cheng Ko b, Cheng-Han Chung a, Hurng-Yi Wang a,⇑ a b

Graduate Institute of Clinical Medicine, National Taiwan University, Taipei 100, Taiwan Department of Entomology, National Taiwan University, Taipei 106, Taiwan

a r t i c l e

i n f o

Article history: Received 4 September 2013 Revised 24 January 2014 Accepted 21 March 2014 Available online 29 March 2014 Keywords: Invasive pest Cryptic speciation Bayesian species delimitation Approximate Bayesian computation

a b s t r a c t The sweet potato whitefly, Bemisia tabaci, is a highly differentiated species complex. Despite consisting of several morphologically indistinguishable entities and frequent invasions on all continents with important associated economic losses, the phylogenetic relationships, species status, and evolutionary history of this species complex is still debated. We sequenced and analyzed one mitochondrial and three singlecopy nuclear genes from 9 of the 12 genetic groups of B. tabaci and 5 closely related species. Bayesian species delimitation was applied to investigate the speciation events of B. tabaci. The species statuses of the different genetic groups were strongly supported under different prior settings and phylogenetic scenarios. Divergence histories were estimated by a multispecies coalescence approach implemented in *BEAST. Based on mitochondrial locus, B. tabaci was originated 6.47 million years ago (MYA). Nevertheless, the time was 1.25 MYA based on nuclear loci. According to the method of approximate Bayesian computation, this difference is probably due to different degrees of migration among loci; i.e., although the mitochondrial locus had differentiated, gene flow at nuclear loci was still possible, a scenario similar to parapatric mode of speciation. This is the first study in whiteflies using multilocus data and incorporating Bayesian coalescence approaches, both of which provide a more biologically realistic framework for delimiting species status and delineating the divergence history of B. tabaci. Our study illustrates that gene flow during species divergence should not be overlooked and has a great impact on divergence time estimation. Ó 2014 Elsevier Inc. All rights reserved.

1. Introduction The sweet potato whitefly, Bemisia tabaci (Gennadius), is one of the world’s top 100 invasive pests in the IUCN (International Union for Conservation of Nature) list and is recognized as a species complex consisting of morphologically indistinguishable entities (Boykin et al., 2007; Perring, 2001), of which the genetic delimitation among them remains obscure. This polyphagous pest species not only transmits viruses but also induces plant physiological disorders and causes agricultural damage throughout tropical and subtropical areas (Perring, 2001). Because of its confused taxonomic status and frequent worldwide invasions with associated huge economic losses, many studies have focused on the phylogeny, biotype identification, and geographic distribution of B. tabaci. Phylogenetic analysis of nuclear ribosomal internal transcribed spacer 1 (ITS1) indicates that B. tabaci should be divided into six races: Asia, Bali, Australia, sub-Saharan Africa, Mediterranean/Asia ⇑ Corresponding author. E-mail address: [email protected] (H.-Y. Wang). http://dx.doi.org/10.1016/j.ympev.2014.03.021 1055-7903/Ó 2014 Elsevier Inc. All rights reserved.

Minor/Africa, and New World (De Barro et al., 2005). According to mitochondrial cytochrome oxidase I (mtCOI) differentiation and geographic distributions, Boykin et al. (2007) divided B. tabaci into 12 well-resolved genetic groups: Mediterranean/Asia Minor/Africa invasive (MAMAI), Mediterranean invasive (MI), Indian Ocean (IDO), Sub-Saharan Africa silverleafing (SSA), Asia I (ASI), Asia II (ASII), Australia (AUS), China (CN), Italy (ITA), New World (NW), Sub-Saharan Africa non-silverleafing (SSANS), and Uganda sweet potato (USP), all of which probably represent different cryptic species (Boykin et al., 2007; Dinsdale et al., 2010). However, a substantial amount of mtCOI diversity was still observed within genetic groups (Dinsdale et al., 2010), and species or genetic groups delimited by only mitochondrial markers without other information, such as nuclear differentiation, intercrossing, or ecological specialization, may be problematic. Among these genetic groups, MAMAI (=B biotype) and MI (=Q biotype) are global invasive pest. The former has higher fecundity and broad host range but lower temperature tolerance than the latter (Pascual and Callejas, 2004). Besides, they have different insecticide resistance (Horowitz et al., 2005; Khasdan et al.,

173

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

2005). Based on mating behavior, allozymic frequencies, and DNA differentiation, MAMAI was proposed as a new species (B. argentifolii) (Bellows et al., 1994; Perring et al., 1993) and a sister group of B. tabaci (Bellows et al., 1994; Perring, 2001; Perring et al., 1993). Nevertheless, the species status of the B. argentifolii was not supported by phylogenetic analyses based on ITS1 (De Barro et al., 2005). Morphologically, the distinction between B. tabaci and B. argentifolii was obscure (Bellows et al., 1994). Furthermore, incomplete reproductive isolation was observed by cross-mating experiments between MAMAI and MI (Pascual and Callejas, 2004). Typically, species delimitation by genetic data relies upon genetic distances or branching patterns, i.e., reciprocal monophyly (Sites and Marshall, 2003, 2004), both of which often require subjective decisions regarding thresholds that demarcate the species boundary (Hey, 2009). Because speciation is a continuous process (de Queiroz, 1998), delimiting species using genetic data is usually accompanied by some degree of uncertainty, which can be substantial at the recent time scales (Knowles and Carstens, 2007). In light of this issue, a recently developed Bayesian species delimitation method (Yang and Rannala, 2010), which deals with a multilocus data and multispecies coalescent framework while accommodating gene tree uncertainty, provides progress toward an accurate estimation of a speciation event. In addition to species boundaries, the divergence history of B. tabaci is not well understood. Such information is crucial because it provides a framework for studying the origin of genetic variations and formation of current population distributions, both of which are in turn important for conservation decision-making and invasive pest management (Bickford et al., 2007; Folino-Rorem et al., 2009; Trontelj and Fiser, 2009). Dating the time of divergence and understanding the speciation process are two central and interconnected issues in the study of the evolutionary history of organisms. However, obtaining accurate estimates of divergence and depicting the speciation process are notoriously difficult with two main sources of error: stochasticity of the coalescent process, and mode of species formation, i.e., whether or not gene flow occurred during speciation (Edwards and Beerli, 2000; Wu, 2001). The divergence time of a gene between species is the species divergence (t) plus its coalescence time, the latter is exponentially distributed with a mean equal to 2Ne generation times (Ne being the effective population size at the time of speciation). Thus, variation in Ne can enhance variance in divergence time among loci. A recently developed Bayesian Markov chain Monte Carlo (MCMC)

method, which deals with multispecies coalescence, provides progress toward a precise and accurate understanding of the timing of speciation (Heled and Drummond, 2010). In addition, most divergence dating methods assume no gene flow during speciation, i.e., allopatric speciation (Heled and Drummond, 2010; McCormack et al., 2011). In the face of gene flow, or parapatric speciation, divergence between species will be more complex. For genomic regions associated with reproductive incompatibility, early cessation of gene flow is likely. For regions free of such association, gene flow may continue until relatively late (Osada and Wu, 2005). Therefore, the test of gene flow is important not only for divergence dating but also for understanding the mode of speciation. Recently, an efficient and flexible method, approximate Bayesian computation (ABC), has been developed to provide a means to test alternative hypotheses for a complex speciation history under a likelihood search algorithm (Beaumont, 2010; Beaumont et al., 2002). The ABC method, which provides statistical support for competing hypotheses, is now being widely applied to phylogeographic studies (Bertorelle et al., 2010; NadachowskaBrzyska et al., 2013). By using the ABC model selection procedure, the underlying predominant speciation process between diverging sister taxa can be reconstructed confidently with statistical supports. In order to study phylogenetic relationships, species statuses, and divergence histories of the B. tabaci species complex, we sequenced one mitochondrial and three nuclear genes from 55 individuals belonging to nine genetic groups of B. tabaci and five selected outgroups. Speciation events were examined by the Bayesian species delimitation method using multilocus data. The species trees and divergence times were jointly estimated by a multispecies coalescent approach under a Bayesian framework using *BEAST (Drummond et al., 2012). Potential speciation models and divergence histories were evaluated by ABC. Based on our findings, a divergence model that explains when and how B. tabaci originated and diverged is proposed for further investigation. 2. Materials and methods 2.1. Sample preparation, PCR amplification, and sequencing Multiple individuals from 9 of the 12 genetic groups of B. tabaci (Boykin et al., 2007) and closely related species were included in this study (Tables 1 and S1). Whiteflies were collected from weeds

Table 1 Number of chromosomes (n), segregating site (S), average number of mutations per site (h), nucleotide diversity (p), and results of Tajima’s D (D) test of different Bemesia species. Taxa

n

B. tabaci

shaw (510 bp)

RNApyII (961 bp)

mtCOI1 (563 bp)

prp8 (1060 bp)

S

h (%)

p (%)

D

S

h (%)

p (%)

D

S

h (%)

p (%)

0.09 0.40 0 0.50 0 0 0.07 0.09 0

0.11 0.29 0 0.51 0 0 0.04 0.07 0

0.85 1.02 – 0.15 – – 1.11 0.93 –

1 7 5 24 0 3 0 6 4

0.05 0.24 0.23 0.79 0 0.10 0 0.27 0.18

0.06 0.28 0.26 0.74 0 0.09 0 0.33 0.18

0.85 0.69 0.71 0.26 – 0.17 – 1.25 0.06

1 9 0 35 2 11 3 3 0

0.04 0.29 0 1.07 0.08 0.33 0.10 0.12 0

0.03 0.25 0 1.35 0.10 0.34 0.07 0.15 0

1.69% 6 0 0 0 16

(0.2–2.8%) 0.27 0.32 0 0 0 0 0 0 0.68 0.79

0.96 – – – 0.83

2.25% 3 0 1 0 –

(0.5–4.8%) 0.12 0.13 0 0 0.04 0.04 0 0 – –

AUS ASI CN ASII ITA MI MAMAI NW SSANS

6 12 6 14 6 14 10 6 6

1 6 0 8 0 0 1 1 0

Average divergence B. formosana B. vernoniae B. breyniae B. emiliae B. porteri

6 2 8 6 8

0.65% (0.2–1.3%) 1 0.09 0.07 0 0 0 0 0 0 0 0 0 0 0 0

0.93 – – – –

S

h (%)

p (%)

D

0 3.66 0 8.05 0 0.80 0 1.42 0

0 2.87 0 8.17 0 0.69 0 1.42 0



– 1.15 1.03 0.17 1.03 1.12 –

0 47 0 103 0 11 0 12 0

0.34 – 0.33 – –

17.50% 2 0 1 4 0

(9.4–22.5%) 0.24 0.24 0 0 0.10 0.12 0.47 0.47 0 0

D 0.93 0.53

1.38* – 0.09 – 0.70 – – – – – 1.63 – –

Shaw: shaker cognate w; RNApyII: RNA polymerase II; prp8: pre-mRNA processing factor 8; mtCOI: mitochondrial cytochrome oxidase subunit I; 1 the number of chromosomes for mitochondrial is n/2; * Tajima’s D and Fay and Wu’s H (= 15.2) for mtCOI of ASI were significantly negative with p < 0.05.

174

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

and crops in the field or obtained from other researchers, preserved in 95% ethanol, and stored at 20 °C prior to analysis. Genomic DNA extractions were conducted using the BuccalAMP™ DNA Extraction kit (Epicentre, Madison, WI, USA). Four genes, including mitochondrial cytochrome oxidase subunit I (mtCOI), and nuclear shaker cognate w (shaw), RNA polymerase II (RNApyII), and pre-mRNA processing factor 8 (prp8) were targeted for amplifications. The details of the primer used and PCR conditions are listed in Table S2. PCR products were purified using a PCR Clean Up System (Viogene, Taipei, Taiwan) and sequenced on an ABI 3730 DNA Analyzer (Applied Biosystems, CA, USA) using an ABI PRISM Terminator Cycle Sequencing Ready Reaction Kit, ver. 3.1 (Applied Biosystems, CA, USA). Sequencing reactions were carried out in the core facility of the College of Medicine, National Taiwan University. 2.2. Sequence analyses and phylogenetic reconstruction Sequences were assembled using Seqman software (Lasergene, Madison, WI, USA) and aligned using Clustal X 2.0 (Larkin et al., 2007). Haplotypes of nuclear genes were reconstructed using PHASE 2.1 (Stephens et al., 2001) implemented in DnaSP version 5.10.1 (Librado and Rozas, 2009). This software was also used to estimate the average number of mutations per site (h) and nucleotide diversity (p). To evaluate whether the population was under mutation-drift equilibrium, Tajima’s D (Tajima, 1989) and Fay and Wu’s H (Fay and Wu, 2000) tests were carried out using DnaSP. The significance of the tests was assessed by 104 coalescent simulations. Genetic distances between whiteflies were estimated using the maximum composite likelihood method (Tamura et al., 2004) implemented in MEGA vers. 5.05 (Tamura et al., 2011). Nucleotide substitution models for phylogenetic reconstruction were determined by Akaike Information Criterion (AIC) (Posada and Buckley, 2004) using MrMODELTEST vers. 2.3 (Nylander, 2008). Maximum-likelihood (ML) trees were constructed by PhyML 3.0 (Guindon et al., 2010) with 100 bootstrap replications for nodal supports. Phylogenetic analyses based on Bayesian inference (BI) were performed using MrBayes ver. 3.1.2 (Huelsenbeck and Ronquist, 2001). Analyses were initiated with random starting trees, and Metropolis-coupled Markov chain Monte Carlo (MCMC) analyses were run for 107 generations and sampled every 1000 generations with the initial 10% discarded as burn-in. Two independent runs were carried out. The results were compared to confirm that both sampled the same distribution and were then combined. Mean branch lengths were presented on a 50% majority-rule consensus tree. 2.3. Bayesian species delimitation Three nuclear genes were used to assess the species status of B. tabaci by the Bayesian species delimitation method implemented in BPP 2.0 (Yang and Rannala, 2010). Bayesian species delimitation uses reversible-jump Markov chain Monte Carlo (rjMCMC) in conjunction with a user-specified guide tree to estimate the posterior distribution for species delimitation models containing different numbers of species. In this method, gamma distribution is used to specify priors for ancestral population size (hS) and age to the root of the tree (s0). The gamma G [a, b] has mean m = a/b and variance s2 = a/b2. Two approaches were applied to examine the robustness of species delimitation. First, four alternative trees, including the mtCOI tree of Boykin et al. (2007), Dinsdale et al. (2010), the present study, and the nuclear gene tree (shaw + RNApyII + prp8) derived in this study (see Section 3) were applied to test the potential influence of guide trees. Second, we used three sets of hS and s0 to examine the effect of different priors. They were comprised of a large ancestral population size and deep divergence

time (G[1, 10] for both hS and s0), a small ancestral population size and shallow divergence time (G [2, 2000] for both hS and s0), a large ancestral population size with a shallow divergence time (G [1, 10] for hS and G [2, 2000] for s0), and a small ancestral population size with a deep divergence time (G [2, 2000] for hS and G [1, 10] for s0) (Leache and Fujita, 2010). A Dirichlet prior was assigned to the other divergence time parameters (Yang and Rannala, 2010). The rjMCMC algorithm was run for 500,000 generations, sampled every 5 generations, and the initial 10% was discarded as burn-in. 2.4. Divergence time estimation Divergence times and the species trees of B. tabaci were estimated by a Bayesian MCMC approach in *BEAST version 1.70 (Drummond et al., 2012). *BEAST uses multilocus data and the multispecies coalescent approach to infer species trees. Molecular clocks were estimated from two Drosophila species, D. melanogaster and D. simulans. Assuming the two species diverged about one million years ago with 10 generations per year (Cutter, 2008), the molecular clock for mtCOI, shaw, RNApyII, and prp8 are 1.98  10 8, 7.95  10 9, 8.15  10 9, and 1.03  10 8/site/year, respectively. Analyses were performed using the GTR model of nucleotide substitution and the species tree prior was set to the Yule process. The birth rate (k) of Yule prior was set to be 1 with the uniform distribution between 0 and infinity. A relaxed clock with an uncorrelated log-normal model of rate variation was applied. *BEAST also incorporates an estimate of population sizes for each branch into species-tree estimation. We focused only on the timing of separation because our main concern was divergence dating. The length of the MCMC chain was set to 1.5  109 generations and sampled every 10,000 generations, of which the first 10% were discarded as burn-in. Two independent runs were carried out. The results were compared to confirm that both sampled the same distribution and were then combined. Effective sample size (ESS) values were verified to reach the stationary stage based on the log file using Tracer version 1.5. A summary tree was produced using TreeAnnotator version 1.70. 2.5. Approximate Bayesian computation (ABC) for inferring demographic models In order to evaluate the divergence history of B. tabaci, four demographic models – isolation, isolation with migration, early gene flow, and secondary contact models (Fig. S1) – were constructed and simulated by msABC (Pavlidis et al., 2010), a modified version of the ms program (Hudson, 2002). In the isolation model, two genetic groups diverged Tdiv generations ago with no gene flow after divergence. In the isolation with migration model, gene flow (4Nm) persisted between genetic groups after their divergence. In the early gene flow model, genetic groups diverged in the presence of gene flow Tdiv to Tml generations ago, and subsequently diverged under absolute reproductive isolation. Finally, in the secondary contact model, genetic groups had no gene flow after divergence until Tm2 generations ago when they once again experienced gene flow. The priors and parameters for simulations were determined based on our data (see Section 3). Recombination events were evaluated by the four-gamete test (Hudson, 2002) implemented in DnaSP. Divergence time was sampled from a normal distribution and the rest of the parameters were sampled from a uniform distribution. One million simulated data sets were generated for each model using the msABC program. The posterior probability for each model was calculated by weighted multinomial logistic regression using Beaumont’s R script calmod (Beaumont, 2008). The tolerance rate of 0.002 was used for testing the robustness of the posterior probability for each model. The best model was

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

determined by the Bayes factor based on the ratio of posterior probabilities between two models (Kass and Raftery, 1995), and a Bayes factor >3 was regarded as significant (Jeffreys, 1961; Kass and Raftery, 1995). After model selection, an additional 10 million simulations were generated for the best model to estimate the parameters of interest. Partial least-squares (PLSs) components were applied to convert summary statistics using the PLS package in R script (Mevik and Wehrens, 2007). Ten PLS complements were constructed from 10,000 simulations to determine the optimum number of PLS complements. In addition, observed and simulated data sets were transformed to 10 PLS complements using the Transformer program in the ABCtoolbox package (Wegmann et al., 2010). Demographic parameters were estimated based on posterior probability distribution using ABCestimator in the ABCtoolbox package.

3. Results 3.1. Sequence variations and phylogenetic relationships of B. tabaci For mtCOI, the ML and BI methods yielded concordant trees, and all genetic groups of B. tabaci were strongly supported (BI and bootstrap values >90; Fig. 1). Nevertheless, relationships among them were often poorly resolved. In general, the mtCOI phylogeny revealed three major nodes. Node 1 consisted of three species previously recognized as the genus Lipaleyrodes; i.e., B. vernoniae, B. emiliae, and B. breyniae. These species, along with B. formosana, were grouped within B. tabaci, suggesting that this

175

species complex is a paraphyletic group. Node 2 contained MI and MAMAI and node 3 included ASI, ASII, CN, AUS, and ITA. Phylogenetic relationships derived from shaw (Fig. S2A), RNApyII (Fig. S2B), and prp8 (Fig. S2C) showed great variation among each other and to Fig. 1. Many genetic groups that were strongly supported by mtCOI were not resolved by nuclear markers, suggesting incomplete lineage sorting among them. For example, the monophyletic origins of ASI, ASII, CN, and MI were not revealed by shaw. ASI and ASII also failed to form a distinct group in the trees derived from RNApyII and prp8. In analyses that concatenated all nuclear loci (Fig. 2), eight out of nine genetic groups had a monophyletic origin excluding ASII. Nodes 1 and 2 were strongly supported. NW and B. formosana were clustered with genetic groups of node 3 with strong BI and bootstrap supports. Mitochondrial COI divergence between genetic groups ranged 9.4–22.5% with an average of 17.5% (Tables 1 and S3). Within each group, nucleotide diversity (p) ranged from 0.00% to 8.17%. Tajima’s D and Fay and Wu’s H were not significant in all genetic groups and markers, except ASI, in which both D and H were significantly negative, indicating that it was not under mutation-drift equilibrium. For nuclear genes, average sequence divergences between groups were 0.65% for shaw (Table S4), 1.69% for RNApyII (Table S5), and 2.25% for prp8 (Table S6). The mean p values within groups were 0.11% (0.00–0.51%), 0.22% (0.00–0.74%), and 0.25% (0.00–1.35%) for shaw, RNApyII, and prp8, respectively. Sequence divergences of nuclear loci were only 1/100–1/10 of mitochondrial divergence between different genetic groups (Table 1). This higher rate of mtDNA evolution as compared to nuclear DNA evolution has been noticed in B. tabaci (Gueguen et al., 2010), and could be

Fig. 1. Bayesian phylogeny of Bemisia tabaci species complex constructed by mitochondrial cytochrome oxidase I (mtCOI). Numbers close to the nodes are Bayesian posterior probabilities (above slash) and maximum-likelihood bootstrap values (below slash).

176

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

The divergence time estimated by mtCOI and nuclear loci also showed great differences. Based on mtCOI, species included in this study diverged 6.47 million years ago (MYA), but nuclear species tree suggested that it was 1.25 MYA (Fig. 4). Divergence times of nodes 1, 2, and 3 were 3.89, 2.38, and 4.63 MYA, respectively, in the mtCOI species tree. In contrast, divergence estimates derived from nuclear data were considerably closer to the present. The differentiation of nodes 1, 2, and 3 were at 0.38, 0.52, and 0.63 MYA, respectively. 3.3. Approximate Bayesian computation analysis

Fig. 2. Maximum-likelihood (ML) tree of Bemisia tabaci species complex constructed by concatenation of three nuclear genes. Numbers close to the nodes are Bayesian posterior probabilities (above slash) and maximum-likelihood bootstrap values (below slash).

due to (1) a high mutation rate in mtDNA (Schlotterer et al., 1994), (2) endosymbiotic infections (Hurst and Jiggins, 2005), and (3) geographic/historical events (Galtier et al., 2009). We will come back to this issue in Section 4. 3.2. Bayesian species delimitation and divergence dating Phylogenies derived from nuclear loci suggested incomplete lineage sorting among genetic groups of B. tabaci. We next used the Bayesian species delimitation method to evaluate species status. BPP 2.0 only allows a maximum of ten taxa. We thus excluded B. vernoniae, B. emiliae, and B. breyniae in the analysis, as they are morphologically distinct from B. tabaci complex and there has been no dispute over their species statuses. Because mitochondrial and nuclear genes yielded different phylogenies, which might have a great impact on the outcome (Yang and Rannala, 2010), we applied four guide trees to test their influences. Guide trees 1 and 2 are phylogenies based on mitochondrial loci and were proposed by Boykin et al. (2007) and Dinsdale et al. (2010), respectively. Guide trees 3 and 4 are phylogenies of mtCOI (Fig. 1) and nuclear loci (Fig. 2). For each guide tree, four different prior settings were tested (see Section 2). The species statuses of different genetic groups were strongly supported under different guide trees and priors with speciation probabilities of 0.99–1.00 on all nodes (Fig. 3). Although ASII was separated into two lineages, ASII-1 and ASII-2, in Guide tree 4, their speciation probabilities were all 1.0 under different settings, indicating that they could be further classified into different species (De Barro et al., 2011; Dinsdale et al., 2010). According to the results of Bayesian species delimitation, different genetic groups within B. tabaci were treated as distinct species in the following analyses. We used the species tree ancestral reconstruction method implemented in *BEAST to estimate the divergence history of this species complex. Species trees constructed by mtCOI (Fig. 4A) and nuclear loci (Fig. 4B) showed substantial differences to each other and to Figs. 1 and 2. For example, while the SSANS was at the root position in Figs. 1 and 2, it was clustered with B. formosana in the mtCOI species tree. In Fig. 4B, SSANS became the sister group of node 2, and node 1 was a sister group of B. tabaci + B. formosana.

Inconsistent tree topologies and divergence dating derived from different methods and markers could be the result of (1) a complicated divergence history of B. tabaci due to ancestral polymorphism or gene flow during speciation, or (2) improper prior settings (e.g., mutation rates) in the analyses. In order to reveal the detail of their speciation processes, four models (i.e., isolation, isolation with migration, early gene flow, and secondary contact models) (Fig. S1) were proposed and simulated by msABC. Because msABC involves too many parameters for simulation, we first focused on MAMI and MI (node 2) as their sister relationship was strongly supported in all markers and methods analyzed. The summary statistics of three nuclear loci are listed in Table S7. Information about mtCOI was not included in the simulation because mitochondrial and nuclear genomes may have very different evolutionary histories (see Section 4). Prior distributions are in Table S8. Because nuclear and mitochondrial markers showed large variations in divergence, a broad divergence prior, from 0 to 6 million years, was implemented. One million simulated data sets were first generated for each model to calculate the posterior probability for model selection procedure. Among the proposed models, early gene flow was favored over the rest of the models (probability = 0.90; Bayes factor (BF) = 10) (Table 2). Based on the selected model, an additional 10 million simulations were generated to estimate the parameters of interest. The speciation of MAMI and MI were initiated at 2.29 MYA (Tdiv; Table 3 and Fig. 5A) with a substantial amount of migration, ca. 50 individuals per generation, across the two diverging groups. Gene flow lasted about 3/4 of their divergence history and stopped at 0.56 MYA (Tml; Fig. 5B). The effective population sizes of MI and MAMI were 0.33 and 0.38 million, respectively (Fig. 5E and F). The ancestral effective population size (NA) of 4.42 million was more than 10 times larger than the current populations. The close relationships of the four genetic groups (ASI, ASII, AUS, CN) were also selected to test speciation scenarios. As their phylogenetic relationships were not resolved, we did pair-wise comparisons among them (Table 2). In a total of six cases, four favored the early gene flow model, ASI/II (BF = 23.25), ASI/AUS (BF = 8), ASII/ AUS (BF = 2.53), and AUS/CN (BF = 5.18), one suggested the isolation model, ASI/CN (BF = 1.30), and the last one suggested the model of isolation with migration, ASII/CN (BF = 1.77). B. emiliae and B. breyniae were morphologically different species and were also tested under the four speciation scenarios. The early gene flow model was also favored in this case, with BF = 2.66. In summary, gene flow was suggested in seven out of eight cases with four of them strongly supported by the model selection process (BF > 3). 4. Discussion 4.1. Species definition and identification The species statuses of different genetic groups within the B. tabaci species complex have been debated for years. This study

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

177

Fig. 3. Four guide trees used for Bayesian species delimitation. (A) mtCOI tree of Boykin et al (2007); (B) mtCOI tree of Dinsdale et al. (2010); (C) mtCOI tree derived from Fig. 1; (D) Nuclear gene tree of Fig. 2. Numbers close to the nodes, from left to right, are the speciation probabilities from three combinations of priors: G (1, 10) for hS and s0, G (2, 2000) for hS and s0, G (1, 10) for hS and G (2, 2000) for s0, and G (2, 2000) for hS and G (1, 10) for s0.

provides the first evidence based on multiple nuclear loci, which supports a monophyletic origin for most of the genetic groups. In addition, results derived from the Bayesian species delimitation method supports the species status of all genetic groups under various priors and guide trees. Therefore, we feel that the different genetic groups of B. tabaci should be considered distinct species. There is no consensus among systematists to judge whether a population is a species (Wiens, 2004). Ideally, it is best to use phenotypic, genetic, and ecological information together to define species (Leache et al., 2009). Although we are not aware of any morphological trait that differentiates these lineages, some ecological differences between them were noticed. For example, MAMAI and MI showed differences in life history traits such as mortality, fecundity, and development (Pascual and Callejas, 2004). In addition, MI exhibited a wider thermal tolerance than ASII that results

in better temperature adaptation (Yu et al., 2012). Finally, different courtship and mating behaviors were also noticed in MAMAI and NW (Perring and Symmes, 2006). Our conclusion may be affected by several factors, including sample size, guide tree, gene flow between groups, and prior setting. It has been shown by simulations that the correct species model can be inferred with only one or two loci when 5–10 sequences are sampled from each population (Zhang et al., 2011). Therefore, the three nuclear loci with 6–14 sequences from each genetic group of B. tabaci that were applied here should provide enough information to infer speciation events. While the relationships among genetic groups are not well resolved, their species statuses are strongly supported by different guide trees, indicating a robust conclusion under a wide range of phylogenetic scenarios. In addition, different combination of priors did not affect the result

178

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180 Table 3 Parameter estimates under the early gene flow model between MAMAI and MI genetic groups.

Mode 95% HPD lower 95% HPD upper

N1

N2

NA

Tdiv

Tm1

M

0.33 0.000023 1.46

0.38 0.000025 1.41

4.42 0.45 9.65

2.29 0.15 4.37

0.56 0.03 1.10

50.25 4.52 95.98

Mode: posterior mode; N1 and N2: effective population size (in millions) of MAMAI and MI, respectively; NA: ancestral effective population size (in millions); Tdiv: the divergence time (in million years) between two groups; Tm1: the time of migration stop (in million years) between two groups; M: effective migration rate (4Nm; in individuals).

4.2. Divergence history of B. tabaci species complex

Fig. 4. Bayesian species phylogenies for Bemesia species contrasting dated divergence times from (A) mitochondrial and (B) nuclear species trees. Numbers close to the node are Bayesian posterior probabilities; only those >50% are shown.

of species delimitation. Although Bayesian species delimitation is based on the biological species concept, assuming complete cessation of gene flow following species divergence (Yang and Rannala, 2010), results of computer simulation showed that low levels of migration, with the expected number of immigrants per generation at M = Nm < 0.1, have virtually no impact on species delimitation (Zhang et al., 2011). According to msABC simulations, even though contemporary gene flow may exist between ASII and CN (Table 2), they were still concluded as different species. After excluding possible inference factors, our results should be reasonable estimates. Additional data from different aspects may strengthen the evidence that these species are distinct or recognize more unidentified species. For example, perhaps more species would be finally recognized within ASII, as distinct nuclear lineages were recovered from that group (Figs. 2 and 3D). Species may evolve, diversify, or go extinct due to fluctuations in demography and climate, and species assignments might change accordingly. Nevertheless, the contemporary scenario supports the concept that different genetic groups of the B. tabaci species complex are best treated as distinct species.

Divergence dating from different markers yielded quite different results. This could be a result of improper molecular clocks being applied, as mitochondrial and nuclear substitution rates used herein were both derived from closely related Drosophila species, D. melanogaster and D. simulans, but which may not be suitable for B. tabaci. In addition, unconformity in divergence dating can also be due to variations in divergence time (t) or in coalescence time (Osada and Wu, 2005). According to coalescent theory, gene copies take a longer time to coalesce in large populations than in small ones. As mitochondrial genes have an effective population size of 1/4 that of nuclear genes, coalescence times for the latter will average 4 times longer than the former. Therefore, the more ancient divergence of mtCOI is not due to variations in coalescence time. Computer simulations based on nuclear genes suggest the existence of gene flow at an early stage of speciation. It is noteworthy that the divergence time between MI and MAMAI, which was examined in a greater detail, was initiated 2.29 MYA (Tdiv) (Table 3) and was similar to the divergence time estimated by the mtCOI species tree (2.38 MYA; Fig. 4A). Gene flow stopped at 0.56 MYA, which was also close to the dating derived from the nuclear species tree (0.52 MYA; Fig. 4B). Because the species tree method assumes no gene flow during speciation (Heled and Drummond, 2010), divergence dating based on this approach should be close to the time when the two diverging groups were completely isolated. Therefore, Fig. 4 and Table 3 taken together reveal that gene flow at nuclear loci was still possible between different genetic groups of B. tabaci while the mitochondrial locus had differentiated. Consequently, the great discrepancy in divergence dating by mitochondrial and nuclear loci is due to different degrees of gene flow among loci during the speciation process, i.e., variations in gene divergence time (t), and not to the application of improper molecular clocks. This scenario is similar to the parapatric mode of speciation, under which the genome of the two species would be a mosaic in the extent of their divergence (Wu, 2001). That is because genomic regions corresponding to the speciation process

Table 2 Posterior probabilities of four speciation models between different pairs of genetic groups.

MI ASI

ASII AUS B. emiliae a

MAMAI ASII AUS CN AUS CN CN B. breyniae

Isolation model

Isolation with migration model

Early gene flow model

Secondary contact model

Bayes factora (BF)

0.09 1.33E 18 0.11 0.56 0.28 2.31E 02 0.16 0.27

1.33E 0.01 8.91E 1.88E 2.18E 0.62 1.49E 3.19E

0.90 0.93 0.88 0.43 0.71 2.88E 03 0.83 0.72

1.14E 0.04 2.98E 1.28E 2.48E 0.35 3.42E 5.41E

10 23.25 8 1.30 2.53 1.77 5.18 2.66

08 08 05 04 04 09

07 06 04 03 04 08

Bayes factor is the ratio of the highest probability to the second highest probability. Models with highest probability are in bold type.

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

179

Fig. 5. Prior (dashed lines) and posterior (solid lines) distributions of parameters for early gene flow model between MAMAI and MI genetic groups of Bemisia tabaci; (A) divergence time (Tdiv), (B) time of migration stop (Tm1), (C) migration rate, (D) ancestral population size, (E) current population size of the MAMAI genetic group, and (F) current population size of the MI genetic group.

would be differentiated, whereas the rest of the genome would not. Thus, different loci have different levels of divergence.

Science Council (NSC) 101-2313-B-002-054 and NSC 102-2313-B002-062 to HY Wang, and NSC 102-2811-B-002-065 – to CH Hsieh.

5. Conclusion

Appendix A. Supplementary material

This study, using multilocus data and incorporating Bayesian coalescence approaches, provides a comprehensive assessment of the species status and speciation history of B. tabaci. Different nuclear markers yielded different phylogenies, indicating the complex history of lineage sorting. Nevertheless, the species statuses of the different genetic groups are strongly supported by the Bayesian species delimitation method under different prior settings and phylogenetic scenarios. Genetic distances of nuclear loci were only 1/100–1/10 of mtCOI distance between different genetic groups. In addition, the origin of B. tabaci estimated by mtCOI (6.47 MYA) was far more ancient than that estimated by nuclear loci (1.25 MYA). Finally, according to ABC simulations, there was a substantial amount of gene flow between diverging groups of B. tabaci. We propose the parapatric mode of speciation for B. tabaci. During this process, mtDNA between different groups would differentiate first, while migrations between most parts of the nuclear genome were still possible. As a result, we observed a higher degree of divergence in mtDNA than in nuclear DNA. This study illustrates that gene flow during species divergence should not be overlooked and has a great impact on divergence time estimation.

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2014.03. 021.

Acknowledgments We thank Dr. I. Bedford (John Innes Centre, Norwich, UK), Dr. J.K. Brown (University of Arizona, Tucson, AZ, USA), Dr. P. Caciagli (Istituto di Virologia Vegetale, CNR, Torino, Italy), Dr. P.J. De Barro (CSIRO Entomology, Indooroopilly, Australia), Dr. B.L. Qiu (Entomology Department, South China Agricultural University, Guangzhou, China), and Dr. S.S. Liu (Institute of Applied Entomology, Zhejiang University, China) for providing whitefly samples. This paper was supported in part by Grants from the National

References Beaumont, M.A., 2008. Joint determination of topology, divergence time, and immigration in population trees. In: Matsumura, S., Forster, P., Renfrew, C. (Eds.), McDonald Institute for Archaeological Research, Cambridge, pp. 135– 154. Beaumont, M.A., 2010. Approximate Bayesian computation in evolution and ecology. Annu. Rev. Ecol. Evol. Syst. 41, 379–406. Beaumont, M.A., Zhang, W.Y., Balding, D.J., 2002. Approximate Bayesian computation in population genetics. Genetics 162, 2025–2035. Bellows, T.S., Perring, T.M., Gill, R.J., Headrick, D.H., 1994. Description of a species of Bemisia (Homoptera, Aleyrodidae). Ann. Entomol. Soc. Am. 87, 195–206. Bertorelle, G., Benazzo, A., Mona, S., 2010. ABC as a flexible framework to estimate demography over space and time: some cons, many pros. Mol. Ecol. 19, 2609– 2625. Bickford, D., Lohman, D.J., Sodhi, N.S., Ng, P.K.L., Meier, R., Winker, K., Ingram, K.K., Das, I., 2007. Cryptic species as a window on diversity and conservation. Trends Ecol. Evol. 22, 148–155. Boykin, L.M., Shatters Jr., R.G., Rosell, R.C., McKenzie, C.L., Bagnall, R.A., De Barro, P., Frohlich, D.R., 2007. Global relationships of Bemisia tabaci (Hemiptera : Aleyrodidae) revealed using Bayesian analysis of mitochondrial COI DNA sequences. Mol. Phylogenet. Evol. 44, 1306–1319. Cutter, A.D., 2008. Divergence times in Caenorhabditis and Drosophila inferred from direct estimates of the neutral mutation rate. Mol. Biol. Evol. 25, 778–786. De Barro, J., Trueman, J.W.H., Frohlich, D.R., 2005. Bemisia argentifolii is a race of B. tabaci (Hemiptera : Aleyrodidae): the molecular genetic differentiation of B. tabaci populations around the world. Bull. Entomol. Res. 95, 193–203. De Barro, P.J., Liu, S.-S., Boykin, L.M., Dinsdale, A.B., 2011. Bemisia tabaci: a statement of species status. Annu. Rev. Entomol. 56, 1–19. de Queiroz, K., 1998. The general lineage concept of species, species criteria, and the process of speciation: a conceptual unification and terminological recommendations. In: Howard, D.J., Berlocher, S.H. (Eds.), Species and speciation. Oxford University Press, New York, pp. 57–75. Dinsdale, A., Cook, L., Riginos, C., Buckley, Y.M., De Barro, P., 2010. Refined global analysis of Bemisia tabaci (Hemiptera: Sternorrhyncha: Aleyrodoidea: Aleyrodidae) mitochondrial cytochrome oxidase 1 to identify species level genetic boundaries. Ann. Entomol. Soc. Am. 103, 196–208.

180

C.-H. Hsieh et al. / Molecular Phylogenetics and Evolution 76 (2014) 172–180

Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A., 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 29, 1969–1973. Edwards, S.V., Beerli, P., 2000. Perspective: gene divergence, population divergence, and the variance in coalescence time in phylogeographic studies. Evolution 54, 1839–1854. Fay, J.C., Wu, C.I., 2000. Hitchhiking under positive Darwinian selection. Genetics 155, 1405–1413. Folino-Rorem, N.C., Darling, J.A., D’Ausilio, C.A., 2009. Genetic analysis reveals multiple cryptic invasive species of the hydrozoan genus Cordylophora. Biol. Invasions 11, 1869–1882. Galtier, N., Nabholz, B., Glemin, S., Hurst, G.D.D., 2009. Mitochondrial DNA as a marker of molecular diversity: a reappraisal. Mol. Ecol. 18, 4541–4550. Gueguen, G., Vavre, F., Gnankine, O., Peterschmitt, M., Charif, D., Chiel, E., Gottlieb, Y., Ghanim, M., Zchori-Fein, E., Fleury, F., 2010. Endosymbiont metacommunities, mtDNA diversity and the evolution of the Bemisia tabaci (Hemiptera: Aleyrodidae) species complex. Mol. Ecol. 19, 4365–4378. Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., Gascuel, O., 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59, 307–321. Heled, J., Drummond, A.J., 2010. Bayesian inference of species trees from multilocus data. Mol. Biol. Evol. 27, 570–580. Hey, J., 2009. On the arbitrary identification of real species. In: Butlin R.K., Bridle J.R., Schluter, D. (Eds.), Speciation and patterns of diversity. Cambridge University Press, Cambridge, UK, pp. 15–28. Horowitz, A.R., Kontsedalov, S., Khasdan, V., Ishaaya, I., 2005. Biotypes B and Q of Bemisia tabaci and their relevance to neonicotinoid and pyriproxyfen resistance. Arch. Insect. Biochem. 58, 216–225. Hudson, R.R., 2002. Generating samples under a Wright–Fisher neutral model of genetic variation. Bioinformatics 18, 337–338. Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755. Hurst, G.D.D., Jiggins, F.M., 2005. Problems with mitochondrial DNA as a marker in population, phylogeographic and phylogenetic studies: the effects of inherited symbionts. Proc. R. Soc. B 272, 1525–1534. Jeffreys, H., 1961. Theory of probability. Oxford University Press, USA, London. Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773–795. Khasdan, V., Levin, I., Rosner, A., Morin, S., Kontsedalov, S., Maslenin, L., Horowitz, A.R., 2005. DNA markers for identifying biotypes B and Q of Bemisia tabaci (Hemiptera : Aleyrodidae) and studying population dynamics. Bull. Entomol. Res. 95, 605–613. Knowles, L.L., Carstens, B.C., 2007. Delimiting species without monophyletic gene trees. Syst. Biol. 56, 887–895. Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., Valentin, F., Wallace, I.M., Wilm, A., Lopez, R., Thompson, J.D., Gibson, T.J., Higgins, D.G., 2007. Clustal W and clustal X version 2.0. Bioinformatics 23, 2947–2948. Leache, A.D., Fujita, M.K., 2010. Bayesian species delimitation in West African forest geckos (Hemidactylus fasciatus). Proc. R. Soc. B 277, 3071–3077. Leache, A.D., Koo, M.S., Spencer, C.L., Papenfuss, T.J., Fisher, R.N., McGuire, J.A., 2009. Quantifying ecological, morphological, and genetic variation to delimit species in the coast horned lizard species complex (Phrynosoma). Proc. Natl. Acad. Sci. USA 106, 12418–12423. Librado, P., Rozas, J., 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–1452. McCormack, J.E., Heled, J., Delaney, K.S., Peterson, A.T., Knowles, L.L., 2011. Calibrating divergence times on species trees versus gene trees: implications for speciation history of Aphelocoma jays. Evolution 65, 184–202.

Mevik, B.H., Wehrens, R., 2007. The pls package: principal component and partial least squares regression in R. J. Stat. Softw. 18, 1–24. Nadachowska-Brzyska, K., Burri, R., Olason, P.I., Kawakami, T., Smeds, L., Ellegren, H., 2013. Demographic divergence history of pied flycatcher and collared flycatcher inferred from whole-genome re-sequencing data. PLoS Genet. 9, e1003942. Nylander, J.A.A., 2008. MrModeltest v2.3 Program distributed by the author. Evolution Biology Centre, Uppsala University, Uppsala, Sweden. Osada, N., Wu, C.I., 2005. Inferring the mode of speciation from genomic data: a study of the great apes. Genetics 169, 259–264. Pascual, S., Callejas, C., 2004. Intra- and interspecific competition between biotypes B and Q of Bemisia tabaci (Hemiptera : Aleyrodidae) from Spain. Bull. Entomol. Res. 94, 369–375. Pavlidis, P., Laurent, S., Stephan, W., 2010. MsABC: a modification of Hudson’s ms to facilitate multi-locus ABC analysis. Mol. Ecol. Resour. 10, 723–727. Perring, T.M., 2001. The Bemisia tabaci species complex. Crop Prot. 20, 725–737. Perring, T.M., Symmes, E.J., 2006. Courtship behavior of Bemisia argentifolii (Hemiptera : Aleyrodidae) and whitefly mate recognition. Ann. Entomol. Soc. Am. 99, 598–606. Perring, T.M., Cooper, A.D., Rodriguez, R.J., Farrar, C.A., Bellows, T.S., 1993. Identification of a whitefly species by genomic and behavioral studies. Science 259, 74–77. Posada, D., Buckley, T.R., 2004. Model selection and model averaging in phylogenetics: advantages of akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53, 793–808. Schlotterer, C., Hauser, M.T., Vonhaeseler, A., Tautz, D., 1994. Comparative evolutionary analysis of rDNA ITS regions in Drosophila. Mol. Biol. Evol. 11, 513–522. Sites, J.W., Marshall, J.C., 2003. Delimiting species: a renaissance issue in systematic biology. Trends Ecol. Evol. 18, 462–470. Sites, J.W., Marshall, J.C., 2004. Operational criteria for delimiting species. Annu. Rev. Ecol. Evol. Syst. 35, 199–227. Stephens, M., Smith, N.J., Donnelly, P., 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68, 978–989. Tajima, F., 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123, 585–595. Tamura, K., Nei, M., Kumar, S., 2004. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc. Natl. Acad. Sci. USA 101, 11030– 11035. Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739. Trontelj, P., Fiser, C., 2009. Cryptic species diversity should not be trivialised. Syst. Biodivers. 7, 1–3. Wegmann, D., Leuenberger, C., Neuenschwander, S., Excoffier, L., 2010. ABCtoolbox: a versatile toolkit for approximate Bayesian computations. BMC Bioinformatics 11, 116. Wiens, J.J., 2004. Speciation and ecology revisited: phylogenetic niche conservatism and the origin of species. Evolution 58, 193–197. Wu, C.I., 2001. The genic view of the process of speciation. J. Evol. Biol. 14, 851–865. Yang, Z., Rannala, B., 2010. Bayesian species delimitation using multilocus sequence data. Proc. Natl. Acad. Sci. USA 107, 9264–9269. Yu, H., Wan, F.H., Guo, J.Y., 2012. Different thermal tolerance and hsp gene expression in invasive and indigenous sibling species of Bemisia tabaci. Biol. Invasions 14, 1587–1595. Zhang, C., Zhang, D.X., Zhu, T., Yang, Z., 2011. Evaluation of a Bayesian coalescent method of species delimitation. Syst. Biol. 60, 747–761.

Multilocus approach to clarify species status and the divergence history of the Bemisia tabaci (Hemiptera: Aleyrodidae) species complex.

The sweet potato whitefly, Bemisia tabaci, is a highly differentiated species complex. Despite consisting of several morphologically indistinguishable...
925KB Sizes 0 Downloads 4 Views