Molecular Phylogenetics and Evolution 76 (2014) 202–210

Contents lists available at ScienceDirect

Molecular Phylogenetics and Evolution journal homepage: www.elsevier.com/locate/ympev

Integrating coalescent and phylogenetic approaches to delimit species in the lichen photobiont Trebouxia Anna D. Sadowska-Des´ a,b,⇑,1, Francesco Dal Grande b,1, H. Thorsten Lumbsch c, Andreas Beck d,e, Jürgen Otte b, Jae-Seoun Hur f, Jung A. Kim f, Imke Schmitt a,b,⇑ a

Department of Biological Sciences, Institute of Ecology, Evolution and Diversity, Goethe Universität, Max-von-Laue-Str. 13, D-60438 Frankfurt, Germany Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Senckenberganlage 25, D-60325 Frankfurt, Germany Science & Education, The Field Museum, 1400 S. Lake Shore Drive, Chicago, IL 60605, USA d Department of Lichenology and Bryology, Botanische Staatssammlung München, Menzinger Straße 67, D-80638 München, Germany e GeoBio-Center, Ludwig-Maximilians Universität München, Richard-Wagner-Str. 10, D-80333 München, Germany f Korean Lichen Research Institute, Sunchon National University, Suncheon 540-742, South Korea b c

a r t i c l e

i n f o

Article history: Received 7 October 2013 Revised 4 March 2014 Accepted 17 March 2014 Available online 28 March 2014 Keywords: Terrestrial green algae Cryptic species Environmental specialization Photobiont sharing Lasallia pustulata

a b s t r a c t The accurate assessment of species boundaries in symbiotic systems is a prerequisite for the study of speciation, co-evolution and selectivity. Many studies have shown the high genetic diversity of green algae from the genus Trebouxia, the most common photobiont of lichen-forming fungi. However, the phylogenetic relationships, and the amount of cryptic diversity of these algae are still poorly understood, and an adequate species concept for trebouxiophycean algae is still missing. In this study we used a multifaceted approach based on coalescence (GMYC, STEM) and phylogenetic relationships to assess species boundaries in the trebouxioid photobionts of the lichen-forming fungus Lasallia pustulata. We further investigated whether putative species of Trebouxia found in L. pustulata are shared with other lichen-forming fungi. We found that L. pustulata is associated with at least five species of Trebouxia and most of them are shared with other lichen-forming fungi, showing different patterns of species-to-species and species-to-community interactions. We also show that one of the putative Trebouxia species is found exclusively in association with L. pustulata and is restricted to thalli from localities with Mediterranean microclimate. We suggest that the species delimitation method presented in this study is a promising tool to address species boundaries within the heterogeneous genus Trebouxia. Ó 2014 Elsevier Inc. All rights reserved.

1. Introduction Cryptic diversity is a common phenomenon in many groups of organisms (e.g. Bálint et al., 2011; Bickford et al., 2007; Hebert et al., 2004; Lumbsch and Leavitt, 2011). Different lineages of species can display different adaptive responses to environmental changes, thus overlooking cryptic diversity may lead to inefficient conservation practices (Pauls et al., 2013). Furthermore, in symbiotic associations, it is important to assess diversity of symbiotic partners at the species level to understand patterns of co-evolution and co-distribution. A correct assessment of species boundaries may provide information about species-to-species ⇑ Corresponding authors at: Biodiversity and Climate Research Centre (BiK-F), Senckenberg Gesellschaft für Naturforschung, Senckenberganlage 25, D-60325 Frankfurt, Germany. E-mail addresses: [email protected] (A.D. Sadowska-Des´), [email protected] (I. Schmitt). 1 These authors contributed equally to this work. http://dx.doi.org/10.1016/j.ympev.2014.03.020 1055-7903/Ó 2014 Elsevier Inc. All rights reserved.

and species-to-community interactions, as well as co-distribution and co-speciation. For example, it has been proposed that lichenforming fungi form ecological guilds by sharing the same algae. These horizontally linked fungal networks may become evident once the system is studied at the level of species and communities (Rikkinen, 2002). Lichens are symbiotic systems composed of at least one fungal partner (mycobiont) and green algae and/or cyanobacteria (photobiont). Generally, more than one lichen-forming fungus forms associations with a single algal lineage (Friedl and Büdel, 2008). Recent studies using a combination of molecular, chemical and morphological data have shown that several groups of lichenforming fungi may include cryptic species (Crespo and Lumbsch, 2010; Crespo and Perez-Ortega, 2009). Some studies have also highlighted the presence of cryptic diversity in algae associated with lichen-forming fungi, e.g. in the genera Asterochloris and Trebouxia (Casano et al., 2011; Škaloud and Peksa, 2010). The coccoid green algae of the genus Trebouxia are the most common

A.D. Sadowska-Des´ et al. / Molecular Phylogenetics and Evolution 76 (2014) 202–210

and widespread photobionts involved in symbiotic associations with fungi (Friedl and Büdel, 2008; Tschermak-Woess, 1988). Studies exploring the genetic diversity of Trebouxia did usually not go beyond designating phylogenetic clades with letters and numerals (Helms, 2003; Ruprecht et al., 2012; Yahr et al., 2006). Despite the fact that molecular phylogenies are often not entirely congruent with current taxonomy of Trebouxia (Blaha et al., 2006; Romeike et al., 2002), attempts to assign species in a systematic way based on molecular and morphological data are rare. A self-organizing classification tool based on algal sequence information and single strand conformation polymorphism (SSCP) analysis has been proposed (Grube and Muggia, 2010), however, it is not widely employed. At present we lack a clear understanding of species boundaries and species richness within this genus (Škaloud and Peksa, 2010). The traditional way to describe species is based on phenotypical characters (e.g. Gärtner, 1985). When working with groups that show limited morphological characters, this approach may lead to underestimation of the real number of species. To improve the assessment of species boundaries there is need to test morphological characters together with molecular data (e.g. Knowles and Carstens, 2007; Welton et al., 2013). Some species delimitation methods assign samples to groups without a priori information, e.g. the general mixed Yule coalescent model (GMYC, Pons et al., 2006), Gaussian Clustering (Hausdorf and Hennig, 2010), Structurama (Huelsenbeck et al., 2011), or O’Meara’s heuristic method (O’Meara, 2010), while others require prior assignment of samples to putative lineages such as species tree estimation using maximum likelihood for gene trees under coalescence (e.g. STEM, Kubatko et al., 2009). One of the first widely used methods based on molecular data was the genealogical concordance phylogenetic species approach comparing the presence of putative species in single-locus analyses (e.g. Avise and Ball, 1990; Kroken and Taylor, 2001; Taylor et al., 2000). This method accommodates for incomplete lineage sorting of several genes and evaluates concordance among single-gene trees. Although this method is widely used, it has been shown to be flawed when attempting to delimitate closely related species (Carstens et al., 2013). Clades that are present in the majority of single-locus genealogies are likely to represent isolated lineages (Dettman et al., 2003). In the GMYC method the species delimitation is based on branch length differences (Monaghan et al., 2009; Pons et al., 2006). This approach has been widely applied to delimit species in many different groups such as bats (Esselstyn et al., 2012), insects (Hamilton et al., 2011), snails (Puillandre et al., 2012), lizards (Wiens and Penkrot, 2002) and lichenforming fungi (Leavitt et al., 2013, 2012; Parnmen et al., 2012). The assumption of GMYC models is that the independent evolution leads to appearance of genetically distinct clusters, which are separated by long internal branches (Barraclough et al., 2003; Fujisawa and Barraclough, 2013; Queiroz, 2007). Another widely used coalescence model is STEM (Kubatko et al., 2009), which was developed to assess species boundaries in systems with existing subspecies taxonomy (Carstens and Dewey, 2010). This test computes the gene tree probability for all hierarchical permutations of lineage groupings. In this method the species delimitation is not affected by the phylogenetic uncertainty in the species tree, however the correctness of the method depends on the accuracy of the gene tree (Carstens and Dewey, 2010). In the present study we used a multifaceted approach combining phylogenetic and coalescent methods to delimit species boundaries in the trebouxioid photobiont of the lichen Lasallia pustulata (L.) Mérat. Lasallia pustulata is an umbilicate macrolichen reproducing mainly by asexual propagules (isidia). From this mode of reproduction we would expect little variability of the photobiont as a result of predominantly clonal dispersal. However, genetic

203

variability of the photobiont in L. pustulata populations is in fact high (Sadowska-Des´ et al., 2013), an observation that has also been made in other asexually reproducing lichen species (Dal Grande et al., 2012; Nelsen and Gargas, 2009; Ohmura et al., 2006; Opanowicz and Grube, 2004; Piercey-Normore, 2006, 2009; Werth and Sork, 2010; Wornik and Grube, 2010). These studies suggest that multiple photobiont species associate with a single fungal species, and that various photobionts may be commonly available in a given environment. In this study we aimed to answer the following questions: (i) is Lasallia pustulata associated with a single or multiple Trebouxia species?, and (ii) does the range of compatible photobiont partners found in L. pustulata overlap with that of other lichen-forming fungi?

2. Materials and methods 2.1. Taxon sampling We collected specimens of L. pustulata across the species distribution range. Out of a total of 469 thalli we selected 22 samples that had a unique haplotype at either of four photobiont loci. Fresh samples were dried and stored at 20 °C until preparation. Information about sampling locality, haplotypes and GenBank accession numbers are given in Table S1. Specimens are deposited in the herbaria Berlin (B), Frankfurt (FR), and Oslo (O). 2.2. DNA isolation, polymerase chain reaction (PCR) amplification and sequencing Total genomic DNA was extracted from a small part of the thallus using the CTAB method (Cubero and Crespo, 2002). We sequenced the algal symbiont at the following loci: internal transcribed spacer region (nrITS rDNA), chloroplast intergenic spacer (psbJ-L), cytochrome C oxidase II (COX2) and ribulose-bisphosphate carboxylase (rbcL). Primers for PCR amplification were: nrITS: nrITS1T, nrITS4T (Kroken and Taylor, 1990) and nrITSaJOFOR2, nrITSaJOREV2 (Sadowska-Des´ et al., 2013); psbJ-L: psbF and psbR (Werth and Sork, 2008); COX2: Cox2-P2fw-50, Cox2P2rv-30 (Fernandez-Mendoza et al., 2011); rbcL: a-ch-rbcL-203, a-ch-rbcL-991 (Nelsen et al., 2011). Standard PCR amplification (25 ll) contained 0.65 U Ex Taq polymerase (TaKaRa BIO INC.), 1 buffer, 0.2 mM dNTP mixture, 0.5–1.0 lM of each primer, 2–50 ng DNA template, and H2O. Thermal cycling parameters for all loci were: initial denaturation 95 °C for 4 min, followed by 35 cycles of 95 °C for 30 s, 50 °C for 40 s, 72 °C for 1 min, and final elongation 72 °C for 5 min. PCR products were separated on 1% agarose gels stained with ethidium bromide. When multiple bands were present, fragments of the expected length were extracted using the peq-GOLD Gel Extraction Kit (PEQLAB Biotechnologie GmbH). The amplicons were sequenced using Big Dye 3.1 chemistry (Applied Biosystems, Foster City, CA, USA). The following cycle sequencing program was used: initial denaturation for 1 min at 95 °C, followed by 30 cycles of 96 °C for 10 s, 50 °C for 10 s, 60 °C for 2 min. Products were run on an ABI PRISMTM 3730 DNA Analyzer (Applied Biosystems). 2.3. Sequence alignment Sequences of each locus were aligned using the MUSCLE alignment algorithm (Edgar, 2004) as implemented in Geneious v5.4.2 (Drummond et al., 2011). Before the analysis, ambiguous regions from all loci were manually removed. We confirmed sequence identity by using BLAST searches in GenBank.

204

A.D. Sadowska-Des´ et al. / Molecular Phylogenetics and Evolution 76 (2014) 202–210

2.4. Phylogenetic analyses To test the level of compatibility among loci, we applied the Congruence Among Distance Matrices test (CADM, Campbell et al., 2011; Legendre and Lapointe, 2004). The null hypothesis assumes that all tested phylogenetic trees are completely incongruent. We performed maximum likelihood (ML) and Bayesian analyses on the single-locus and the combined four-locus datasets. ML analysis with 1000 bootstrap pseudoreplicates (Felsenstein, 1985) was performed on the concatenated dataset using RAxML v7.4.2 (Stamatakis, 2006) with the GTRGAMMA model implemented. We treated each locus as separate partition, as well as first, second and third codon positions of the protein-coding genes. We conducted Bayesian analyses in MrBayes 3.2.1 (Huelsenbeck and Ronquist, 2001). The best-fitting model was selected with the corrected Akaike Information Criterion as implemented in jModelTest 2.1.1 (model SYM, lnL = 1275.4812 for nrITS; model GTR, lnL = 4089.2736 for psbJ-L; model GTR, lnL = 992.0584 for COX2 and model GTR, lnL = 632.2397 for rbcL; Darriba et al., 2012; Guindon and Gascuel, 2003; Posada, 2008). We ran MrBayes for 10 million generations, and sampled one out of every 1000 trees. The first 40,000 trees were discarded as burn-in (likelihoods below stationary level). No molecular clock was assumed. Sequences of Trebouxia decolorans from the lichen Anaptychia ciliaris were used as outgroup in all analyses. The phylogenetic trees were visualized using FigTree v 1.4.0 (Morariu et al., 2009). All clades with ML equal or above 75% and posterior probabilities (PP) equal or greater than 0.95 were considered as strongly supported (Fig. 1). 2.5. Identification of putative species We applied a GMYC approach (Monaghan et al., 2009; Pons et al., 2006) to identify putative species. This method uses chronograms for comparison of the following models: (i) null model which assumes that all samples derived from one population and (ii) alternative GMYC model which identifies intraspecific and interspecific relationships based on different branch lengths and position of nodes that define putative species. Based on the likelihood ratio test (LRT) the null hypothesis can be rejected, and if the GMYC model is significantly better than null model, the number of putative species within dataset can be estimated. To calculate the number of species we used the chronogram calculated from the Bayesian tree of the combined dataset. Subsequently we applied single and multiple threshold methods by using the GMYC package implemented in SPLITS in R v2.15.3 (Meyer et al., 2011). A chronogram was calculated based on the penalized likelihood method (Sanderson, 2002) and obtained by using the ‘‘chronopol’’ command in the package ape (Meyer et al., 2011). Outgroup sequences were excluded from the dataset using the drop.tip command. To convert the chronogram into a fully dichotomous chronogram we used multidivtime (Thorne and Kishino, 2002). The lineages-through-time plot (Fig. 2) shows the overall pattern of diversification over time. The number of reconstructed lineages is depicted by a line. Time is presented as a proportion of the total time since the first cladogenesis event inferred for the taxon. The two vertical lines show sharp increases in branching rate, which indicate the change from interspecies to intraspecies branching events. We summarized the analysis to obtain indication of the number of recognized lineages. We used BEAST (Heled and Drummond, 2010) implemented in the program BEAST v.1.7.5 (Drummond et al., 2012) to estimate species trees from putative species-level lineages and supported clades. We employed STEM (Kubatko et al., 2009) to generate species trees based on different species delimitation scenarios. To obtain likelihood scores for alternative species delimitation scenarios

we followed the protocol by Carstens and Dewey (2010). We tested the following scenarios: 1-species scenario, 2-species scenario, 4-species scenario based on multiple threshold of GMYC, 5-species scenario, and the 6-species scenario based on a combination of multiple threshold of GMYC and phylogeny. In addition we also used a genealogical concordance approach (Avise and Ball, 1990). This approach recognizes species based on presence of clades in the majority of single-locus genealogies (Dettman et al., 2003, 2008). 2.6. Photobiont sharing analysis We performed BLAST searches with each nrITS haplotype of the photobiont of L. pustulata to find other lichen-forming fungi that are associated with the same algal strains. We set the identity threshold to 98%. A heatmap of the lichen-forming fungi and algal nrITS haplotypes was made using the package pheatmap in R. 3. Results 3.1. Congruence of the data To assess the congruence of the four loci we performed CADM test (Campbell et al., 2011; Legendre and Lapointe, 2004). The null hypothesis, which assumes complete incongruence of the phylogenetic trees, was rejected for all examined loci. Kendall’s coefficient of concordance (W) showed that nrITS and psbJ-L were completely congruent with other loci (W = 0.982, p = 0.001), whereas rbcL and COX2 were partially congruent (W = 0.581, p = 0.03; Table S2). Additionally, we checked the phylogenetic congruence by comparing ML trees of all single loci (Fig. S1). No supported conflicts between single-loci ML phylogenetic trees were found that could influence the compatibility of the combined dataset. According to these tests all datasets of nrITS rDNA, psbJ-L, COX2 and rbcL are congruent and can be analyzed simultaneously in one combined dataset. 3.2. Phylogenetic analysis The alignment length of the concatenated dataset was 2566 base pairs (nrITS rDNA 430 bp; psbJ-L 558 bp; COX2 419 bp, and rbcL 441 bp), including 328 variable positions. Statistical information about sequences used in this study is included in Table 1. The presence of outgroups, especially for psbJ-L, changed the length of the alignment and number of variable characters. The phylogenetic tree (Fig. 1) was divided into two main clades and five subclades. The first subclade comprising of four haplotypes (001, 002, 003, 004) was highly supported (ML = 99%, PP = 1). The second subclade was divided into a single haplotype 009 (ML = 100%, PP = 0.98), sister to a clade including haplotypes 005, 006, 007 and 008 (ML = 81%, PP = 0.98). Haplotypes 010, 011, 012 formed a highly supported third subclade (ML = 100%, PP = 1). The fourth subclade was divided into two groups of haplotypes 013, 014 (ML = 83%, PP = 0.84) and 015, 016 (ML = 84%, PP = 0.47). The fifth subclade (ML = 100%, PP = 1) was split into two groups: haplotypes 017 and 018 (not supported), and haplotypes 019, 020, 021, 022 (ML = 99%, PP = 1). 3.2.1. Identification of putative species based on the GMYC method Species delimitation was estimated using the GMYC method. We used the tree obtained from *BEAST (Heled and Drummond, 2010) to visualize the delimitation of putative species recognized by the single and multiple threshold GMYC methods (Fig. 1). Outgroups were removed from the analysis. The putative delimitations

A.D. Sadowska-Des´ et al. / Molecular Phylogenetics and Evolution 76 (2014) 202–210

205

Table 1 Sequence characteristics of sampled markers used in this study. In psbJ-L all ambiguities in outgroup were considered as gaps (not informative positions). Alignment length (bp) + Outgroup nrITS psbJ-L COX2 rbcL Combined

430 1276 419 441 2566

Variable characters Outgroup 430 558 419 441 1848

+ Outgroup 81 165 77 5 328

Model selected Outgroup 59 60 48 5 172

SYM GTR GTR GTR SYM

Fig. 1. Maximum likelihood tree based on the concatenated dataset including nrITS, psbJ-L, COX and rbcL sequences. Branches in bold represent Bayesian posterior probabilities support greater than 0.95. ML bootstrap support greater than 75% is shown above branches. Column A depicts species recognized by the multiple threshold GMYC model, and column B depicts the most probable 6-species scenario suggested by STEM analyses.

of a single and multiple threshold modes were visualized as ultrametric trees obtained from SPLITS (Meyer et al., 2011). The single threshold method was not preferred over the null model of uniform (coalescence) branching rate (log LGMYC =

44.40021 vs. log L0 = 44.03784, 2DL = 0.7247428, p > 0.05). The confidence intervals ranged from 1 to 6. The model fitted the switch in the branching pattern at 0.714659, resulting in four putative species. Because this model was not significant we did

206

A.D. Sadowska-Des´ et al. / Molecular Phylogenetics and Evolution 76 (2014) 202–210

ML = 81%, PP = 0.98; species 3: ML = 100%, PP = 0.98; species 4: ML = 76%, PP = 1).

Fig. 2. Lineage-through-time (LTT) plot of the multiple threshold analysis of GMYC model. Lines depict numbers of reconstructed lineages for the clades.

not consider it as informative and we excluded it from further analyses. Two switches were present in the multiple threshold method (Fig. 2). This model was preferred over the null model of uniform branching rates (log LGMYC = 49.74622 vs. log L0 = 44.03784, 2DL = 11.41676, p < 0.05). The confidence interval was equal to 3. The model fitted the switch in the branching pattern at 0.8482913 and 0.5998106 and resulted in four putative species. No unclassified individuals were present and all 22 haplotypes were categorized into putative species (Fig. 3). The multiple threshold model was preferred over the single threshold model (log LGMYC-MUL = 49.74622 vs. log LGMYC-SIN = 44.40021, 2DL = 10.69202, p < 0.05). All putative species derived by multiple threshold mode of GMYC were supported by ML and Bayesian analyses (species 1: ML = 99%, PP = 1; species 2:

3.2.2. Putative species tree analyses To select the best species delimitation scenario we used a hierarchical Bayesian model BEAST implemented in the program BEAST (Heled and Drummond, 2010). Species trees were estimated directly from the sequence data, incorporating the coalescent process and nucleotide substitution model parameters. The BEAST topology tree was congruent with the 4-locus ML phylogeny. For the analysis we followed the protocol by Carsten and Dewey (2010) and compared five different delimitation scenarios based on different assumptions of species boundaries (Table 2): 1-species scenario (all haplotypes were treated as one species – GMYC null model), 4-species scenario based on multiple-threshold results from GMYC (Fig. 1, column A), and another two scenarios (2-, 5-species) based on different phylogenetic assumptions. The 2-species scenario consisted of two main clades of haplotypes: 001–009 (ML = 100%, PP = 1) and 010–022 (ML = 76%, PP = 1). The 5-species scenario consisted of 5 subclades: haplotypes 001–004 (ML = 99%, PP = 1), haplotypes 005–009 (ML = 51%, PP = 0.98), haplotypes 010–012 (ML = 100%, PP = 1), haplotypes 013–016 (ML = 82%, PP = 1) and haplotypes 017–022 (ML = 100%, PP = 1). The 6-species scenario was selected based on results obtained from the multiple-threshold of GMYC method and combined ML and Bayesian support values (Fig. 1, column B), i.e. within the putative species 4 proposed by the multiple-threshold method we distinguished three well-supported subclades: 4a (ML = 100%, PP = 1), 4b (ML = 82%, PP = 1) and 4c (ML = 100%, PP = 1). To evaluate different scenarios we used species tree estimation using Maximum likelihood STEM as implemented in v 2.0 (Kubatko et al., 2009). This program infers ML species tree from estimated gene trees under coalescent model. Results are presented in Table 2. The best fitting scenario was the 6-species scenario ( lnL 25.29879, p < 0.001), although the difference in DlnL between 5- and 6-species scenarios was not significant (p = 0.9). The number of specimens composing the 6 putative species scenario was as following: species 1–114 samples; species 2–45 samples; species 3–2 samples; species 4a–194; species 4b–79 samples and species 4c–35 samples (Fig. 1, column B). 3.2.3. Evolution of putative species in a genealogical framework We also used a genealogical concordance approach (Avise and Ball, 1990; Baum and Shaw, 1995) that delimits well-separated lineages using molecular data. Relationships of putative species inferred in the concatenated dataset were evaluated among individual loci (Hudson and Coyne, 2002). Genealogical concordance results based on ML bootstrap values for 4-species and 6-species scenarios are shown in Table 3. Four clades of the 6-species delimitation scenario were present and highly supported in psbJ-L (species 1: ML = 100%, PP = 1; species 4a: ML = 100%, PP = 1; species 4b: ML = 95%, PP = 1; species 4c: ML = 100%, PP = 1). Putative species 1, 2, 3 and 4a were present in nrITS, but only the latter two were

Table 2 Likelihood scores for STEM analysis of species delimitation scenarios based on the concatenated nrITS rDNA, psbJ-L, COX2, rbcL dataset. High log-likelihood means high support for a given scenario (k = number of parameters, MTM = multiple threshold method of GMYC). Scenario Fig. 3. Ultrametric gene genealogy and clusters of haplotypes recognized as a putative species by multiple-threshold of GMYC (numbers in black circles). Nodes of genetic clusters recognized as putative species (best supported STEM scenario) are highlighted in colors. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

6-Species 5-Species 4-Species/MTM 2-Species 1-Species

lnL 25.29879 27.46663 94.3019 108.0971 220.32715

k

DlnL

Bonferroni corrected P

7 6 5 3 3

0 2.1678 69.003 82.798 195.0283

0.9

Integrating coalescent and phylogenetic approaches to delimit species in the lichen photobiont Trebouxia.

The accurate assessment of species boundaries in symbiotic systems is a prerequisite for the study of speciation, co-evolution and selectivity. Many s...
777KB Sizes 0 Downloads 4 Views