Molecular Ecology (2014) 23, 902–920

doi: 10.1111/mec.12641

Species delimitation and phylogeny in the genus Nasutitermes (Termitidae: Nasutitermitinae) in French Guiana VIRGINIE ROY,* REGINALDO CONSTANTINO,† VINCENT CHASSANY,*1 STEPHANIE G I U S T I - M I L L E R , * M I C H E L D I O U F , * P H I L I P P E M O R A * and M Y R I A M H A R R Y ‡ § *UMR 7618 Bioemco-Ibios, Universite Paris-Est Creteil, 61 avenue du General de Gaulle, 94010, Creteil Cedex, France, †Departamento de Zoologia, Universidade de Brasılia, 70910-970 Brasılia, DF, Brasil, ‡Laboratoire Evolution, Genomes et Speciation, UPR 9034 CNRS, IRD, Universite Paris Sud-11, avenue de la Terrasse, B^atiment 13, 91198 Gif sur Yvette, France, §UFR de Sciences, Universite Paris Sud-11, 91400 Orsay, France

Abstract Species delimitation and identification can be arduous for taxa whose morphologic characters are easily confused, which can hamper global biodiversity assessments and pest species management. Exploratory methods of species delimitation that use DNA sequence as their primary information source to establish group membership and estimate putative species boundaries are useful approaches, complementary to traditional taxonomy. Termites of the genus Nasutitermes make interesting models for the application of such methods. They are dominant in Neotropical primary forests but also represent major agricultural and structural pests. Despite the prevalence, pivotal ecological role and economical impact of this group, the taxonomy of Nasutitermes species mainly depends on unreliable characters of soldier external morphology. Here, we generated robust species hypotheses for 79 Nasutitermes colonies sampled throughout French Guiana without any a priori knowledge of species affiliation. Sequence analysis of the mitochondrial cytochrome oxidase II gene was coupled with exploratory species-delimitation tools, using the automatic barcode gap discovery method (ABGD) and a generalized mixed Yule-coalescent model (GMYC) to propose primary species hypotheses (PSHs). PSHs were revaluated using phylogenetic analyses of two more loci (mitochondrial 16S rDNA and nuclear internal transcribed spacer 2) leading to 16 retained secondary species hypotheses (RSSH). Seven RSSHs, represented by 44/79 of the sampled colonies, were morphologically affiliated to species recognized as pests in the Neotropics, where they represent a real invasive pest potential in the context of growing ecosystem anthropization. Multigenic phylogenies based on combined alignments (1426–1784 bp) were also reconstructed to identify ancestral ecological niches and major-pest lineages, revealing that Guyanese pest species do not form monophyletic groups. Keywords: 16S rDNA, automatic barcode gap discovery method, cytochrome oxidase II, general mixed Yule-coalescent, internal transcribed spacer 2, Nasutitermes Received 15 January 2013; revision received 5 December 2013; accepted 13 December 2013

Introduction Correspondence: Virginie Roy, Fax: +33145171505; E-mail: [email protected] 1 Present address: Universite Paris Diderot, 4 rue Lagroua Weill-Halle, 75205 Paris cedex 13, France

Diverse and taxonomically understudied taxa are particularly challenging in terms of species delimitation and identification (Puillandre et al. 2012b). This taxonomic hurdle represents a real constraint to biodiversity © 2013 John Wiley & Sons Ltd

S P E C I E S D E L I M I T A T I O N I N N A S U T I T E R M E S S P E C I E S 903 assessment (Deca€ens et al. 2013) and can impede pest management. In this domain, molecular data can be useful and complementary to traditional expertise in several ways. For example, they can help to rapidly identify referenced pest species, to delimit genetic units within large-scale collections of samples for subsequent morphological identification and to identify phylogenetic relationships between potential sibling pests. For the bulk of undescribed biodiversity, the singlegene DNA barcoding approach may be used, not to identify specimens, but to propose primary species hypotheses (PSHs) for approximating species descriptions (Goldstein & DeSalle 2011; Puillandre et al. 2012b). Methods of species delimitation use DNA sequence itself as the primary information source to establish group membership and estimate putative species boundaries. There are only two formalized methods, referred to as exploratory methods, specifically designed to delimit species from single-locus data without taking into account any kind of a priori species or population delimitation (Coissac et al. 2012): the automatic barcode gap discovery (ABGD) (Puillandre et al. 2012a) and general mixed Yule coalescent (GMYC) (Pons et al. 2006; Monaghan et al. 2009) procedures. The use of barcodes and single-gene approaches to species delimitation (especially those based on mtDNA) has been a source of controversy (Hebert et al. 2003; Moritz & Cicero 2004; Blaxter et al. 2005; Rubinoff & Holland 2005), and some major pitfalls linked to the use of a sole mitochondrial gene should be mentioned. First, DNA barcoding can overestimate the number of species when nuclear mitochondrial pseudogenes (numts) are coamplified (Song et al. 2008). Second, introgression events and/or incomplete lineage sorting can commonly affect mtDNA-based phylogenetic analyses, resulting in paraphyletic and polyphyletic relationships (Funk & Omland 2003). Finally, maternally inherited endosymbionts, such as the proteobacteriae Wolbachia, may cause linkage disequilibrium with mtDNA, resulting in a homogenization of mtDNA haplotypes (Hurst & Jiggins 2005). One strategy that has been used to avoid single-gene pitfalls is to increase the gene sampling to two or more, preferably unlinked, genes (Knowles 2009; Leache & Fujita 2010; O’Meara 2010; Yang & Rannala 2010). Comparison of different sources of data has been thus suggested as the most effective way to understand the evolutionary history of a group (Rubinoff & Holland 2005). Among insects, termites (Blattodea: Termitoidae) represent interesting models for exploratory species-delimitation methods for a number of reasons. In tropical and subtropical ecosystems, termites are particularly abundant, frequently exceeding 1000 individuals/m2 or 2000 mg/m2 (Engel et al. 2009). They play a major role © 2013 John Wiley & Sons Ltd

in the decomposition of organic matter, nutrient recycling, aeration and drainage of soils (Garnier-Sillam & Harry 1995; Lavelle et al. 1997) and are particularly sensitive to ecosystem perturbations (Fonseca de Souza & Brown 1994; Dupont et al. 2008). Additionally, about 10% of species are xylophagous or foraging species, which are described as pests for constructions or cultures. Until recently, species delimitation in termites was essentially based on three types of character, which were rarely used together: morphological observations, analytical chemistry (chemotaxonomy) and molecular phylogeny. Then, recently, Monaghan et al. (2009) applied an exploratory method (GMYC) to delimitate termite species from Madagascar. Their results indicated that the GMYC model captures species boundaries that are comparable to those from traditional methods, opening a new avenue for the exploration of termite biodiversity. The genus Nasutitermes Dudley, 1890 (Termitidae: Nasutitermitinae) comprises 243 currently described species, including approximately 71 Neotropical species in which the soldier caste possesses a frontal projection (nasus) and vestigial mandibles. Most species build arboreal nests and are wood-feeders. They occur in a variety of habitats, including primary and secondary forests, cropland and urban areas. Some Nasutitermes species play a major role in soil ecological processes, consuming up to 3% of the annual production of wood litter in Brazilian forests (Vasconcellos & Moura 2010). However, several species of Nasutitermes are important pests in South America (Constantino 2002), causing significant damage in agriculture (e.g. on coffee, maize, cotton, eucalyptus, fruit trees, rice and sugar cane) and to wood and other cellulosic materials. For example, Nasutitermes nigriceps, N. ephratae and N. surinamensis are well known to ravage timber wood, and N. corniger is a major urban pest in Brazil and Argentina (Constantino 2002). In French Guiana, three pest species of timber wood, N. nigriceps, N. ephratae and N. surinamensis, have been reported among the twenty species of Nasutitermes recorded from this region (Lefeuve 1990; Constantino 2002; Ensaf et al. 2003; Ensaf & Eggleton 2004; Ensaf 2010). As in many other termite genera, the taxonomy of the genus Nasutitermes is extremely confused, making species identification remains difficult. Specific identification of soldiers within the genus Nasutitermes usually depends upon differences in pilosity, shape of the head, colour and size (Holmgren 1910; Emerson 1925). Few studies have used information other than that on morphology to address taxonomic or phylogenetic problems in the genus Nasutitermes. Nevertheless, the use of cuticular hydrocarbons made it possible to differentiate very similar species such as Nasutitermes acajutlae and

904 V . R O Y E T A L . N. nigriceps (Thorne et al. 1994) or N. corniger and N. ephratae (Howard et al. 1988), and mitochondrial markers have successfully demonstrated the synonymy of N. corniger, N. costalis and N. polygynus (Scheffrahn et al. 2005a,b). Finally, phylogenetic relationships between Nasutitermes species are poorly known, and molecular studies have rarely focused on Neotropical species (but see Bergamaschi et al. 2007 and Miura et al. 2000 for Nasutitermes phylogenies from Australia and the Pacific tropics, respectively). The aims of the present study were to generate robust species hypotheses for 79 Nasutitermes colonies sampled throughout French Guiana, without a priori knowledge of species affiliation, thus comparing ABGD and GMYC results in termites for the first time, and to reconstruct phylogenies for Guianese species in order to retrace the history of Nasutitermes ecological niche evolution and pest species emergence.

Materials and methods Samples Nasute samples were represented by 79 colonies and collected from 16 sites in French Guiana referred as Awala-Yalimapo (AWA), Bellevue (BEL), Cacao (CAC), Counami (COU), Elahe (ELA), Iles du Salut (IDS), Ilet la Mere (ILM), Marais de Yiyi (MDY), Maripasoula (MAR), Matoury (MAT), Montjoly (MON), Nouragues (NOU), Patagai (PAT), Rocoucoua (ROC), Route de St Elie (RSE) and Sa€ ul (SAU) (Table 1 and Fig. S1, Supporting information). The sampling procedure for BEL, CAC, PAT and ROC was detailed in Dupont et al. (2008). Other sites were sampled quantitatively (1–2 days per site). Samples were conserved in absolute ethanol until molecular analyses. Sample affiliation to the Nasutitermes genus was checked using the soldier description of Mathews (1977). Two colonies belonging to Syntermitinae genera (Cyrilliotermes and Silvestritermes) were chosen as outgroups for phylogenetic reconstructions.

DNA extraction, sequencing and alignments Total genomic DNA was extracted from one soldier per colony (N = 79) by dissecting the head, thorax and legs, and removing the abdomen. DNA extraction was realized using the DNeasy Blood & Tissue Kit (Qiagen, France) according to the manufacturer’s instructions. The mitochondrial cytochrome oxidase II (COII) gene was sequenced and applied to a species-delimitation procedure. One additional mitochondrial gene, 16S rDNA, and one nuclear sequence, internal transcribed spacer 2 (ITS2), were sequenced to consolidate the

species-delimitation procedure and to concatenate loci in phylogenetic analyses. PCR amplifications were performed in 40 lL mixture using Taq PCR Master Mix Kit (Qiagen) and GoTaqâ Flexi DNA Polymerase (Promega, France) following the manufacturer protocol. The PCR primers and amplification protocols are listed in Table S1 (Supporting information). Amplification products were sent to Beckman Coulter UK Ltd for sequencing. The sequences generated for this study were submitted to GenBank, and the accession numbers are provided in Table 1. The standard barcode gene, cytochrome oxidase I (COI), was also tested for amplification and sequencing, but returned only very partial results. It failed to amplify or gave unusable sequences for numerous samples despite various modifications in the protocol, including the use of Phusion High-Fidelity DNA Polymerase (NEB, France) and nested PCRs. Furthermore, some COI sequences showed protein sequences that were highly divergent and two stop codons in the reading frame that could be attributed to parts of the mitochondrial genome that were transferred to the nucleus (nuclear mitochondrial sequences or NUMTs). Consequently, COI was not retained for further analyses. For ten samples sequenced for ITS2 nuclear marker, electropherograms showed convoluted DNA trace produced by direct sequencing of a template containing heterozygous insertions/deletions. For these samples, phase determination in length variant heterozygotes was performed by direct sequencing of mixed PCR products by combining for each individual the complementary information contained in its forward and reverse chromatograms (Flot et al., 2006) with the help of Champuru v1.0 (Flot 2007, available online at http://www.mnhn.fr/jfflot/champuru/). For all other samples, heterozygous positions were identified by a secondary peak in the electropherograms that reached at least 50% of the intensity of the primary peak, using CodonCode Aligner (CodonCode Corporation). The low-sensitivity option was used when searching for mutations to reduce false positives. Heterozygous sites were coded using the standard IUPAC codes for combinations of bases, and for all samples where two or more heterozygous sites were found in a sequence, we determined the gametic phase of alleles using the program PHASE 2.1 as implemented in DNASP v.5 (Librado & Rozas 2009). PHASE uses a Bayesian approach to infer haplotypes from diploid genotypic data accounting for both recombination and linkage disequilibrium. An initial alignment for individual loci was produced with the ClustalW2 algorithm (Thompson et al. 1994). Manual adjustments and concatenation were made using SEAVIEW v4.3.0 (Gouy et al. 2010). © 2013 John Wiley & Sons Ltd

Collection locality (collectors)

Awala-Yalimapo (VC) Bellevue (VR, MH) Cacao (VR, MH) Ilet la Mere (VC) Matoury (VC) Montjoly (VR, MH) Counami (VR, MH) Elahe (VR, MH) Bellevue (VR, MH) Route St Elie (VR, MH) Route St Elie (VR, MH) Marais de Yiyi (VC) Bellevue (VR, MH) Rocoucoua (VR, MH) F. Guiana NI (VR, MH) F. Guiana NI (VR, MH) Nouragues (VC) Nouragues (VC) Nouragues (VC) Pataga€ı (VR, MH) Pataga€ı (VR, MH) Pataga€ı (VR, MH) Route St Elie (VC) Nouragues (PM) Bellevue (VR, MH) Bellevue (VR, MH) Elahe (VR, MH) Cacao (VR, MH) Matoury (VC) Nouragues (PM) Nouragues (VC) Nouragues (VC) Pataga€ı (VR, MH) Pataga€ı (VR, MH) Sa€ ul (VC) Nouragues (VC) Nouragues (PM) Nouragues (VC) Nouragues (VC) Nouragues (VC) Pataga€ı (VR, MH) Nouragues (VC)

Sample

AWA1 BEL2 CAC2 ILM1 MAT1 (10) MON1 COU1 ELA2 BEL3 RSE3 RSE2 MDY1 (10) BEL4 ROC1 NIS2 NIS1 NOU1 NOU2 (10) NOU3 PAT2 PAT3 PAT4 RSE1 NOU25 (10) BEL1 (10) BEL5 ELA3 CAC3 MAT2 (10) NOU26 NOU4 NOU5 PAT5 PAT6 SAU1 NOU6 (10) NOU28 NOU17 NOU16 NOU15 (10) PAT7 NOU20

F C F F F F F C S S S S F F MD MD F F F F F F F F S C F F F F F F F F F F F F F F F F

Ecological niche

1 1 1 1 1 1 1 1 2a 2a 2a 2b 3 3 3 3 3 3 3 3 3 3 3 4 5 6 6 6 6 6 6 6 6 6 6 7 8 8 8 8 8 8

PSH

KC630989 KC630989 KC630989 KC630989 KC630989 KC630989 KC630990 KC630991 KC630992 KC630993 KC630993 KC630994 KC630995 KC630995 KC630996 KC630996 KC630996 KC630996 KC630996 KC630997 KC630998 KC630999 KC631000 KC631001 KC631002 KC631003 KC631003 KC631004 KC631004 KC631004 KC631004 KC631004 KC631004 KC631004 KC631004 KC631005 KC631006 KC631006 KC631006 KC631006 KC631006 KC631007

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

COII GenBank ID (COII haplotype)

1.1) 1.1) 1.1) 1.1) 1.1) 1.1) 1.2) 1.3) 2a.1) 2a.2) 2a.2) 2b.1) 3.1) 3.1) 3.2) 3.2) 3.2) 3.2) 3.2) 3.3) 3.4) 3.5) 3.6) 4.1) 5.1) 6.1) 6.1) 6.2) 6.2) 6.2) 6.2) 6.2) 6.2) 6.2) 6.2) 7.1) 8.1) 8.1) 8.1) 8.1) 8.1) 8.2)

KF724731 KF724732 MD KF724733 KF724731 KF724734 MD MD KF724735 MD KF724736 KF724737 MD KF724738 KF724739 MD KF724740 KF724739 KF724739 KF724741 KF724739 KF724738 KF724739 KF724742 KF724743 MD KF724744 KF724745 KF724745 KF724745 KF724745 KF724746 KF724747 KF724747 KF724745 KF724748 KF724749 KF724749 KF724750 KF724749 MD KF724751

© 2013 John Wiley & Sons Ltd

6.1) 6.2) 6.2) 6.2) 6.2) 6.3) 6.4) 6.4) 6.2) 7.1) 8.1) 8.1) 8.2) 8.1)

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

(haplotype 8.3)

3.3) 3.2) 3.2) 3.4) 3.2) 3.1) 3.2) 4.1) 5.1)

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

(haplotype 3.1) (haplotype 3.2)

(haplotype 2a.2) (haplotype 2b.1)

(haplotype 2a.1)

(haplotype 1.3) (haplotype 1.1) (haplotype 1.4)

(haplotype 1.1) (haplotype 1.2)

16S GenBank ID

KF724765 KF724766 MD KF724767 KF724768 KF724769 MD KF724770 KF724771 KF724772 KF724773 KF724774 KF724775 KF724776 KF724777 KF724778 KF724779 KF724780 KF724781 KF724782 KF724783 KF724784 KF724785 KF724786 KF724787 MD KF724788 KF724789 KF724790 KF724791 KF724792 KF724793 KF724794 KF724795 KF724796 KF724797 KF724798 KF724799 KF724800 KF724801 KF724802 KF724803

ITS2 GenBank ID

Table 1 Sample labels, collection localities, collectors, ecological niches and PSHs proposed for the 79 Guianese Nasutitermes samples used in the species-delimitation analysis

S P E C I E S D E L I M I T A T I O N I N N A S U T I T E R M E S S P E C I E S 905

Nouragues (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Pataga€ı (VR, MH) Counami (VR, MH) F. Guiana NI (VR, MH) Nouragues (PM) Nouragues (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Awala-Yalimapo (VC) Cacao (VR, MH) Counami (VR, MH) F. Guiana NI (VR, MH) Iles du Salut (VC) Ilet la Mere (VC) Ilet la Mere (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Nouragues (VC) Rocoucoua (VR, MH) Matoury (VC) Sa€ ul (VC) Elahe (VR, MH) Maripasoula (VR, MH) Pataga€ı (VR, MH) Cacao (VR, MH) Iles du Salut (VR, MH) Iles du Salut (VR, MH) Iles du Salut (VR, MH) Iles du Salut (VC) Iles du Salut (VC) Nouragues (PM) Rocoucoua (VR, MH)

NOU18 NOU19 NOU8 NOU7 (10) NOU9 PAT8 (2) COU3 (10) NIS4 NOU27 NOU10 NOU11 NOU12 NOU13 (10) NOU14 AWA2 CAC4 COU2 NIS3 IDS3 ILM2 ILM3 NOU21 NOU22 (10) NOU23 NOU24 ROC2 MAT3 SAU2 ELA1 MAR1 (10) PAT1 CAC1 (10) IDS4 IDS5 IDS6 IDS1 IDS2 Silvestritermes sp. Cyrilliotermes sp.

F F F F F F F MD F F F F F F F F F MD F F F F F F F C F F C C F F F F F F F — —

Ecological niche

8 8 9 9 9 10 11 11 12 12 12 12 12 12 13 13 13 13 13 13 13 13 13 13 13 13 13 13 14 14 14 15 15 15 15 15 15 — —

PSH

KC631007 KC631007 KC631008 KC631008 KC631008 KC631009 KC631010 KC631011 KC631012 KC631012 KC631012 KC631012 KC631012 KC631013 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631014 KC631015 KC631016 KC631017 KC631017 KC631018 KC631019 KC631020 KC631020 KC631021 KC631022 KC631022 KC631023 KC631024

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

COII GenBank ID (COII haplotype)

8.2) 8.2) 9.1) 9.1) 9.1) 10.1) 11.1) 11.2) 12.1) 12.1) 12.1) 12.1) 12.1) 12.2) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.2) 13.3) 14.1) 14.1) 14.2) 15.1) 15.2) 15.2) 15.3) 15.4) 15.4)

KF724752 KF724753 KF724754 KF724754 KF724754 KF724755 KF724756 MD KF724757 KF724757 KF724757 KF724757 KF724757 KF724757 KF724758 KF724758 MD MD KF724758 KF724758 KF724758 KF724758 KF724758 KF724758 KF724758 KF724758 KF724758 KF724758 KF724759 KF724759 MD KF724760 MD MD MD KF724761 KF724762 KF724763 KF724764 12.1) 12.1) 12.1) 12.1) 12.1) 12.1) 13.1) 13.1)

13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 13.1) 14.1) 14.1)

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

(haplotype 15.2) (haplotype 15.3)

(haplotype 15.1)

8.4) 8.5) 9.1) 9.1) 9.1) 10.1) 11.1)

(haplotype (haplotype (haplotype (haplotype (haplotype (haplotype (haplotype

16S GenBank ID

KF724804 KF724805 KF724806 KF724807 KF724808 KF724809 KF724810 KF724811 KF724812 KF724813 KF724814 KF724815 KF724816 KF724817 KF724818 KF724819 KF724820 KF724821 KF724822 KF724823 KF724824 KF724825 KF724826 KF724827 KF724828 KF724829 KF724830 KF724831 KF724832 KF724833 MD KF724834 KF724835 KF724836 KF724837 KF724838 KF724839 KF724840 KF724841

ITS2 GenBank ID

VC, V. Chassany; MH, M. Harry; PM, P. Mora; VR, V. Roy; F, forest; S, savannah; C, culture; NI, nonidentified site; MD, missing data; PSH, primary species hypotheses. Samples in bold are those used for morphological species affiliation, with numbers of individuals used for morphological analyses in brackets. GenBank accession numbers (AN) are indicated for COII haplotypes, 16S rDNA and ITS2 sequences.

Collection locality (collectors)

Sample

Table 1 Continued

906 V . R O Y E T A L .

© 2013 John Wiley & Sons Ltd

S P E C I E S D E L I M I T A T I O N I N N A S U T I T E R M E S S P E C I E S 907 DELIMITATION ABGD, GMYC ITS2 (N = 75) PSHs

COII (N = 79)

16S (N = 64)

BI, ML (monophyly + support)/shared haplotypes Haplo-network

CONSOLIDATION

Morphology

ITS2 (N = 64)

RSSHs

16S (N = 64)

BI on mit combined dataset

Morphological identification

BI on total combined dataset

PHYLOGENETIC RECONSTRUCTIONS

AFFILIATION

RECONSTRUCTION OF ANCESTRAL ECOLOGICAL NICHES

Species

(a)

COII (N = 64)

(b)

Fig. 1 Workflow diagram for species delimitation, phylogeny and reconstruction of ancestral states. (a) Delimitation analyses: PSHs were proposed using two methods for species delimitation (ABGD and GMYC procedures, applied to the COII data set). Phylogenetic analyses of COII, 16S rDNA and ITS2 data sets were used to choose the most likely option amongst alternate PSHs proposed by ABGD and GMYC. RSSHs were then affiliated to known species by detailed morphological identification of individuals belonging to the soldier caste. (b) Combined phylogenetic analyses: COII, 16S rDNA and ITS2 data sets were combined to reconstruct phylogenies and ancestral ecological niches. COII: cytochrome oxidase II, ITS2: internal transcribed spacers 2, ABGD: automatic barcode gap discovery method, GMYC: generalized mixed Yule-coalescent method, PSHs: primary species hypotheses, RSSHs: retained secondary species hypotheses, BI: Bayesian inference, ML: maximum likelihood, N: number of colonies sequenced/locus.

Species-delimitation procedure Principle. A workflow diagram is presented as Fig. 1. First, PSHs were constructed based on the pattern of diversity of COII as a single gene. COII is not the standard barcode gene, but it was demonstrated to be suitable for a single-locus species-delimitation procedure (Monaghan et al. 2009). Historically, COII has been preferentially sequenced in termites, and about five times more sequences are available for it in nucleotide databases than for COI. PSHs were proposed using two exploratory methods for species delimitation: ABGD and GMYC procedures applied to the COII gene. To consolidate PSHs, additional sequences, that is, mitochondrial 16S rDNA and nuclear ITS2 were phylogenetically analysed (i.e. monophyly, with support and shared haplotypes between PSHs set as criteria). This step made it possible to choose the most likely option amongst alternate PSHs proposed by ABGD and GMYC and to propose RSSHs (Puillandre et al. 2012b). Finally, RSSHs were affiliated to known species through to the detailed morphological expertise of individuals belonging to the soldier caste. ABGD method. The ABGD method (Puillandre et al. 2012a) uses several a priori thresholds to propose partitioning of specimens into PSHs based on the distribution of pairwise genetic distances. In the distribution of pairwise differences between sequences, one can observe a gap between intraspecific diversity and interspecific diversity; this gap has been named the © 2013 John Wiley & Sons Ltd

‘barcode gap’ and can be used as a threshold for delimiting primary species under the assumption that individuals within species are more similar than between species. The COII alignment was used to compute matrices of pairwise distances using the p-distance, the Kimura 2-parameter (K2P) and the Tamura–Nei (TN) models with MEGA5 (Tamura et al. 2011). Matrices were then used as inputs on the ABGD webpage (http://wwwabi.snv.jussieu.fr/public/abgd/abgdweb. html), using the default settings search (except for the relative gap width X which was set to 1 because only one group was found for the default value X = 1.5, prior maximal distance of 0.001) on a set of prior minimum genetic distances ranging from 0.001 to 0.1. GMYC procedure. The general mixed Yule-coalescent delimitation procedure is based on a coalescent approach and requires a chronometric phylogram in which branch lengths are approximately proportional to time. We applied the uncorrelated lognormal model implemented in BEAST v1.7.3 (Drummond et al. 2012) to the COII alignment. Three independent Markov chain Monte Carlo (MCMC) analyses were run for 20 million generations, sampling every 1000 with a 10% burn-in under the substitution model determined using MRMODELTEST (GTR + G + I). Convergence and effective sample size (ESS) of estimated parameters were inspected using TRACER v1.5, and a maximum clade credibility tree was reconstructed with the program TREEANNOTATOR v1.7.3

908 V . R O Y E T A L . with the default settings. The ultrametric phylogenies recovered with BEAST were then subjected to GMYC analysis. The GMYC model was used to perform analyses of species delimitation and propose PSHs (Pons et al. 2006). This method exploits the differences in the rate of lineage branching at the level of species and populations that can be visualized as a switch between slow and fast rates of branching events in a lineage-throughtime plot. A combined model that separately describes population (a neutral coalescent model) and speciation (a Yule model) processes is fitted on the ultrametric tree. The method optimizes a threshold position of switching from interspecific to intraspecific events such that nodes older than the threshold are considered to be diversification events and nodes younger than the threshold reflect coalescence occurring within each species. We generated a log-lineage-through-time plot to identify a sharp increase in lineage accumulation that represents the inferred threshold between speciation and coalescent processes. A standard likelihood ratio test (LRT) was used to assess whether the alternative model (i.e. the simple-threshold GMYC model) provided a better fit than the null model (i.e. a single PSH). These methods were implemented in R software v2.15.1 (R Development Core Team 2010), using the splits (available at http://r-forge.r-project.org/projects/ splits/) and ape (Paradis et al. 2004) packages. Consolidation of PSHs. As a first step, the monophyly and support of PSHs were evaluated from Bayesian inference (BI) and maximum likelihood (ML) based on the COII, 16S rDNA and ITS2 data sets. The appropriateness of partitioning the COII alignment by codon position was determined using the Bayes factor (BF) (Kass & Raftery 1995; Nylander 2004) on ‘gene nonpartitioned by codon’ and ‘gene partitioned by codon’ matrices. For this, we used the sump command in MRBAYES to obtain the log-transformed harmonic means. The BF can be calculated as the ratio of the harmonic means of likelihoods of the two analyses being tested. In this study, 2lnBF ≥ 10 was considered as very strong evidence supporting the alternative hypothesis based on hypothesized cut-off values (Kass & Raftery 1995). For the ITS2 phylogenetic reconstructions, we used sequences with IUPAC codes: all gaps and heterozygous positions were coded as ambiguous characters and were treated as missing data. The most appropriate likelihood models were determined with MRMODELTEST 2.0 (Nylander 2004) using the Akaike information criterion. Bayesian search was carried out with MRBAYES v.3.1.2 (Huelsenbeck & Ronquist 2001) using four simultaneous Markov chains, 10 million generations and sampling every 100 generations, resulting in 100 000

generations being saved. A burn-in of 20% was used. We used TRACER v1.5 (Drummond et al. 2012) to ascertain that our sampling of the posterior distribution had an adequate ESS. ML analyses were performed using PhyML (Guindon et al. 2010) with 1000 bootstraps. As a second step, shared haplotypes between PSHs were then identified using DNASP v.5 for COII and 16S rDNA alignments. We explored relationships between ITS2 haplotypes reconstructed by PHASE using a median joining network obtained in NETWORK 4.6 (http://www. fluxus-engineering.com) and identified shared ITS2 haplotypes. Identification to species level. RSSHs were affiliated to known species through to the detailed morphological examination of individuals belonging to the soldier caste. The 79 colonies could not all be analysed for reasons of conservation and availability of the different castes. Fifteen colonies underwent a detailed morphological identification (Table 1): 142 individuals were examined using a Leica M205C stereomicroscope under magnification up to 1009, with both incident and transmitted light. Soldier characters used for identification were shape of head capsule; number, size and distribution of hairs on head capsule, thoracic nota and tergites; gut morphology in situ (observed by transparency); number of antennal articles and their relative length; colour and colour pattern; and morphometric characters: length of head, width of head and length of posterior tibia. These were then compared with published descriptions (Holmgren 1910; Banks 1918; Emerson 1925; Mathews 1977; Bandeira & Fontes 1979) and with preserved specimens and images of type specimens. The Appendix S1 (Supporting information) presents more detailed information on the identification of each species.

Phylogenetic analyses Phylogenetic analyses were performed with BI partitioned by gene on mitochondrial combined (COII and 16s rDNA) and total combined alignments (COII, 16s rDNA and ITS2). Samples with at least one missing locus were discarded from the analyses, resulting in data sets of 64 Nasutitermes and two outgroup samples. The procedure for BI was identical to that described in the ‘Consolidation of PSHs’ section.

Reconstruction of ancestral ecological niches MESQUITE v. 2.75 (Maddison & Maddison 2011) was used to reconstruct the ecological state of the ancestral nodes (i.e. forest or savannah). The topology of the BI tree obtained with the total combined alignment was used as input in Mesquite. For each node, a likelihood

© 2013 John Wiley & Sons Ltd

S P E C I E S D E L I M I T A T I O N I N N A S U T I T E R M E S S P E C I E S 909 reconstruction method was used to find the state that maximized the probability of arriving at the observed states in the terminal taxa, given the model of evolution, and allowing the states at all other nodes to vary.

Results Delimitation of mtDNA clusters with ABGD and GMYC methods Genetic pairwise distances for COII gene were computed using three different substitution models: the p-distance, the K2P distance and the TN distance. The distribution of genetic distances, whatever the substitution model used, displayed three modes (Fig. 2a). The different ABGD a priori thresholds led to partitions with 20, 17, 16 or 14 PSHs when extreme a priori thresholds were excluded (Fig. 2c), but a first major barcode gap was evident at a priori genetic distance threshold of 0.0046 (estimated distance p = 0.010) (Fig. 2b). Thus, this prior intraspecific divergence value, supporting the presence of 16 PSHs, was favoured in further discussion about ABGD species delimitation. Five PSHs contained a single sequence, with these singletons representing 31% of delineated ABGD species (PSHs 2b, 4, 5, 7 and 10, longer black branches in Fig. 2d). A lineage-through-time plot based on the BEAST ultrametric tree revealed a sudden increase in branching rate towards the present, likely corresponding to the switch from interspecies to intraspecies branching events (Fig. 3). A LRT favoured the simple-threshold GMYC model over a null model of a single PSH (likelihood of null model: 625.1238, likelihood of GMYC model: 632.7517, likelihood ratio: 16.00316, LR test: 70 and BI Bayesian posterior probability (BPP) values >0.95, except PSH3 (BI BPP 70 and/or BI BPP >0.95 and no shared haplotypes with another PSH, for at least two genes. Consequently, 16 PSHs were converted into RSSHs (Fig. 5). Based on the detailed morphological analysis, RSSHs were linked to taxonomic names available in the

(a)

1.00/99.9

0.2

literature (Fig. 5). Each RSSH could be morphologically analysed, except RSSH 2a, which was designated as Nasutitermes sp., resulting in fourteen known species: RSSH 1-N. corniger, RSSH 2b-N. coxipoensis, RSSH 3-N. ephratae, RSSH 4-N. callimorphus, RSSH 5-N. intermedius, RSSH 7-N. guayanae, RSSH 9-N. obscurus RSSH 10-N. unduliceps, RSSH 11-N. wheeleri, RSSH 12-N. octopilis, RSSH 13-N. surinamensis, RSSH 14-N. acangussu and RSSH 15-N. acajutlae. RSSH 6 and RSSH 8 were both morphologically identified as N. similis.

Phylogenetic analyses Phylogenetic reconstruction for the combined mitochondrial alignment (66 sequences, 1426 bp, BI partitioned by gene, arithmetic mean –ln = 6245.64) is presented Fig. 6a.

(b) AWA1 ILM1 MAT1 1.00/100 BEL2 PSH 1 COU1

Species delimitation and phylogeny in the genus Nasutitermes (Termitidae: Nasutitermitinae) in French Guiana.

Species delimitation and identification can be arduous for taxa whose morphologic characters are easily confused, which can hamper global biodiversity...
610KB Sizes 0 Downloads 0 Views