
ARTICLE Phylogenetic relationships of species of genus Arachis based on genic sequences

Genome Downloaded from www.nrcresearchpress.com by San Francisco (UCSF) on 12/22/14 For personal use only.

Guohao He, Noelle A. Barkley, Yongli Zhao, Mei Yuan, and C.S. Prakash

Abstract: The genus Arachis (Fabaceae), which originated in South America, consists of 80 species. Based on morphological traits and cross-compatibility among the species, the genus is divided into nine taxonomic sections. Section Arachis is the largest and includes the economically valuable cultivated peanut (A. hypogaea). Seven genic sequences were used to better understand the phylogenetic relationships among species of the genus Arachis. Our study resolved four clades of species of Arachis. Arachis triseminata was genetically isolated from all other species of Arachis studied, and it formed the basal clade with A. retusa and A. dardani from the most ancient sections Extranervosae and Heteranthae, respectively. Species of section Arachis formed a single clade separate from all other species, within which species having B and D genomes clustered in one subgroup and three species with an A genome grouped together in another subgroup. A divergent clade including species from five sections was sister to the clade of section Arachis. Between the sister clades and the basal clade there was a clade containing species from the more advanced sections. Phylogenetic relationships of all the species of Arachis using multiple genic sequences were similar to the phylogenies produced with single-copy genes.

Key words: EST–SSR, RGA, NBS–LRR, peanut.

Résumé : Le genre Arachis (Fabacées), qui est originaire de l'Amérique du Sud, compte 80 espèces. Sur la base des caractères morphologiques et de la compatibilité sexuelle entre les espèces, le genre a été divisé en neuf sections taxonomiques. Arachis est la plus grande de ces sections et inclut l'arachide (A. hypogaea), une espèce cultivée importante sur le plan économique. Sept séquences géniques ont été employées afin de mieux définir les relations phylogénétiques entre espèces du genre Arachis. Cette étude a révélé l'existence de quatre clades regroupant les espèces d'Arachis. Arachis triseminata était génétiquement séparée de toutes les autres espèces étudiées et elle formait un clade basal avec A. retusa et A. dardani, deux espèces provenant des sections les plus anciennes, Extranervosae et Heteranthae respectivement. Les espèces de la section Arachis formaient un clade séparé de toutes les autres espèces au sein duquel les espèces aux génomes B et D constituaient un sous-groupe tandis que trois espèces ayant un génome A étaient groupées au sein d'un autre sous-groupe. Un clade divergent réunissant des espèces provenant de cinq sections était proche du clade de la section Arachis. Entre ces deux clades proches et le clade basal se trouvait un clade réunissant des espèces provenant des sections plus avancées. Les relations phylogénétiques parmi toutes les espèces d'Arachis au moyen de plusieurs séquences géniques sont semblables aux phylogénies obtenues avec des gènes à simple copie. [Traduit par la Rédaction]

Mots-clés : EST–SSR, RGA, NBS–LRR, arachide.

Introduction

All species in the genus Arachis are assigned to one of nine taxonomic sections based on a series of criteria including cross-compatibility, geographical distribution, and morphological character clustering. This convention-based taxonomy also draws on the results of chromosome cytology, chromatographic and antigenic reactions, adaptations of plant form, and annual or perennial habit (Krapovickas and Gregory 2007). The nine taxonomic sections in Arachis comprise 80 species, from the most ancient to the most advanced as described by Krapovickas and Gregory (1994) and Valls and Simpson (2005), providing a diverse genetic resource for phylogenetic study in this genus. Although most species of Arachis are diploid (2n = 2x = 20), there are five tetraploid species (2n = 4x = 40) and four aneuploid species (2n = 2x = 18) (Krapovickas and Gregory 1994; Lavia 1998; Penaloza and Valls 2005). Section Arachis is the largest and the most widely distributed, originating from five countries of the distributional range of the genus (Brazil, Argentina, Bolivia, Paraguay, and Uruguay) (Moretzsohn et al. 2013). This section contains species possessing one of five genome types: A, B, D, F, or K (Smartt et al. 1978; Stalker 1991; Robledo and Seijo 2010). The genetic relationships among species in this section have been intensively studied using molecular markers and DNA sequences because it also contains the domesticated species A. hypogaea (Jung et al. 2003; Milla et al. 2003; Tallury et al. 2005; Cunha et al. 2008; Moretzsohn et al. 2013). The genetic relationships among species from different sections have been analyzed using SRAP markers among six sections (Ren et al. 2010) and SSR markers among seven sections (Koppolu et al. 2010). Phylogenies of species of Arachis from all nine sections were reported by Hoshino et al. (2006) using SSR markers, Friend et al. (2010) using internal transcribed spacer (ITS) and plastid trnT-trnF sequences, Bechara et al. (2010) using ITS and 5.8S rDNA sequences, and Wang et al. (2011) using ITS. These studies have

Received 25 February 2014. Accepted 15 August 2014.
Corresponding Editor: L. Lukens.
G. He, Y. Zhao, and C.S. Prakash. Department of Agricultural and Environmental Sciences, Tuskegee University, Tuskegee, AL 36088, USA.
N.A. Barkley. USDA–ARS, Plant Genetic Resources Conservation Unit, Griffin, GA 30223, USA.
M. Yuan. Shandong Peanut Research Institute, Key Laboratory of Peanut Biology and Genetic Improvement, Ministry of Agriculture, Shandong Provincial Key Laboratory of Peanut, Qingdao 266100, China.
Corresponding author: Guohao He (e-mail: [email protected]).
Genome 57: 327–334 (2014) dx.doi.org/10.1139/gen-2014-0037

Published at www.nrcresearchpress.com/gen on 11 September 2014.


Genome Vol. 57, 2014

Table 1. List of species of Arachis sampled from nine taxonomic sections used in this study.

Section        Species                                                    PI number  Genome  Life cycle
Trierectoides  A. guaranitica (Chodat & Hassl.)                           PI276194   E1      Perennial
Erectoides     A. major (Krapov. & W.C. Greg.)                            PI468172   E2      Perennial
Erectoides     A. paraguariensis (Chodat & Hassl.) subsp. paraguariensis  PI604838   E2      Perennial
Extranervosae  A. retusa (Krapov. & W.C. Greg.)                           Grif7440   Ex      Perennial
Triseminatae   A. triseminata (Krapov. & W.C. Greg.)                      PI338449   T       Perennial
Heteranthae    A. giacomettii (Krapov., W.C. Greg. & C.E. Simpson)        PI331189   He      Annual
Heteranthae    A. dardani (Krapov. & W.C. Greg.)                          PI591364   He      Annual or biannual
Caulorrhizae   A. repens (Handro)                                         PI338258   C       Perennial
Caulorrhizae   A. pintoi (Krapov. & W.C. Greg.)                           PI604803   C       Perennial
Procumbentes   A. matiensis (Krapov., W.C. Greg. & C.E. Simpson)          PI476113   E3      Perennial
Procumbentes   A. rigonii (Krapov. & W.C. Greg.)                          PI262142   E3      Perennial
Rhizomatosae   A. glabrata (Benth.)                                       PI262794   R2      Perennial
Arachis        A. diogoi (Hoehne)                                         PI468354   A       Perennial
Arachis        A. stenosperma (Krapov. & W.C. Greg.)                      PI338280   A       Perennial
Arachis        A. duranensis (Krapov. & W.C. Greg.)                       PI497483   A       Annual
Arachis        A. ipaënsis (Krapov. & W.C. Greg.)                         PI468322   B       Annual
Arachis        A. glandulifera (Stalker)                                  PI468343   D       Annual

shown that the molecular phylogeny of species of Arachis was not always consistent with the conventional sectional classification, as ambiguous or controversial species were often assigned to different groups. For instance, Moretzsohn et al. (2013) suggested, based on their study of single-copy genes, that A. vallsii, classified in section Procumbentes, be moved to section Arachis, and Koppolu et al. (2010) reported that some species belonging to different sections clustered together.

The advent of the genomics age has provided a vast amount of DNA sequence information, enabling more precise studies of species phylogeny. Expressed sequence tag (EST) collections provide a resource for rapidly and inexpensively developing EST–SSR markers. In general, EST–SSRs have been found to be significantly more transferable across taxonomic boundaries than traditional “anonymous” SSRs, not just among species within a genus but in some instances among multiple genera within a family (Ellis and Burke 2007). Recent research has revealed that EST–SSRs have clear potential for use in basic evolutionary studies owing to their transferability across taxa (Jing et al. 2007; Christelova et al. 2011). In the genus Arachis, a large number of EST–SSRs have been identified and deposited in GenBank (Luo et al. 2005; Nagy et al. 2010; Koilkonda et al. 2012). Although EST–SSR markers provide valuable information for assessing genetic diversity, generating linkage maps, and enabling comparative maps between different genomes in other crop species, EST–SSRs exhibited a low level of polymorphism in cultivated peanut owing to its narrow genetic base. However, high variation in morphological descriptors exists in its related wild species.

Therefore, the aims of this study were to gain insight into the molecular nature of EST–SSR repeats and their flanking sequences among species of Arachis, to identify genome affinities between these species using multiple genic sequences by determining the content of molecular variation, and to infer phylogenetic relationships in the genus Arachis.

Materials and methods

Plant materials

All plant material was acquired from the USDA–ARS PGRCU peanut germplasm collection in Griffin, Georgia. Accessions included in this study are listed in Table 1. The first five sections (Trierectoides, Erectoides, Extranervosae, Triseminatae, and Heteranthae) are the most primitive of the genus, the following three sections (Caulorrhizae, Procumbentes, and Rhizomatosae) are more advanced, and the last section (Arachis), to which the cultivated peanut is assigned, is the most advanced (Krapovickas and Gregory 2007). One or two species were selected from each section, and five species having three different genome types (A, B, and D) were selected from the largest section, Arachis.

DNA extraction

Seeds and leaves from clones maintained in the greenhouse were used to extract DNA for this study. Samples extracted from a single seed included A. dardani, A. duranensis, A. glandulifera, and A. ipaënsis. The remaining samples were all extracted from leaf tissue derived from a single plant. DNA was extracted following the instructions of the E.Z.N.A. Plant DNA kit (Omega Bio-Tek, Norcross, Ga.). Approximately 100 mg of seed or leaf tissue was collected and placed in a 2 mL screw cap microcentrifuge tube containing two 3 mm tungsten carbide beads (Qiagen, Valencia, Calif.) and 600 µL of buffer P1 from the DNA extraction kit. Tissue was pulverized with a Retsch Mixer Mill 201 (Leeds, UK) for 3 min at 30 Hz. DNA concentrations were determined with a NanoDrop spectrophotometer. One microlitre of DNA was loaded into a 1% agarose gel and stained with ethidium bromide to evaluate the quality and quantity of each sample. A low mass DNA ladder (Invitrogen, Carlsbad, Calif.) was also loaded into the gel to determine the approximate quantity and size of the extract.

DNA sequence analysis of EST–SSR amplicons

Fifteen EST–SSRs with trinucleotide SSR motifs were selected from a list of EST–SSRs (Nagy et al. 2010). The primer pairs for these 15 EST–SSRs were used to amplify genomic DNA from 17 species of the genus Arachis. Six primer pairs designed for NBS–LRR sequences, which were identified in a previous collaboration with the University of California, Davis (Cook et al. 2009), were also used in this study. All genomic DNAs were amplified using the selected primer pairs. DNA amplification was performed using 37.5 ng of genomic DNA, 1.5 µmol/L of mixed forward and reverse primers, 1× buffer (20 mmol/L Tris-HCl, 10 mmol/L (NH4)2SO4, 10 mmol/L KCl, 2 mmol/L MgSO4, 0.1% Triton X-100), 0.2 mmol/L dNTPs, and 1.5 U Taq polymerase in a total volume of 30 µL. PCR conditions were 95 °C for 5 min for initial denaturation; followed by 35 cycles of 95 °C for 30 s, 58 °C (for EST–SSR) or 55 °C (for NBS–LRR) for 30 s, and 72 °C for 2 min; and a final extension at 72 °C for 5 min on a Dyad Peltier Thermal Cycler (Bio-Rad Laboratories, Hercules, Calif.). PCR product (5 µL) from each sample was resolved on a 0.8% agarose gel to check amplicons. The EST–SSR or RGA primers were not used if they generated multiple fragments from any one of the 17 species of Arachis. Only those targets

Published by NRC Research Press

He et al.


Table 2. Primer sequences of seven genic loci used.

Primer name  SSR motif       BLAST homology                        Primer sequence (5′–3′)
GM2767-F     (TTG)n          MYB transcription factor              CGTAGATTCCAAGCCTTCTCC
GM2767-R     (TTG)n          MYB transcription factor              GACAATCAAGCACCCCACTT
GM2584-F     (ACC)n          AP2 transcription factor              CCGTATACAACGGCGATTTC
GM2584-R     (ACC)n          AP2 transcription factor              ACATATTCAAACGCGCAACA
GM2411-F     (AAG)n          Rac GTPase                            TGAACAAAATTTCCCCTTCG
GM2411-R     (AAG)n          Rac GTPase                            CGATCCCAACATTCCAAAAC
GM2313-F     (CCA)nT(CCG)n   Ring                                  TTCCCCTTTTTACCCCTCTC
GM2313-R     (CCA)nT(CCG)n   Ring                                  ACTCAACTCCTTCCCCTCCT
GM2831-F     (CCA)n          Serine-threonine protein kinase AFC3  GCAAAAGGAAGCAGGTAGGA
GM2831-R     (CCA)n          Serine-threonine protein kinase AFC3  AAGCAGGTTTGCAGAGGTGT
RGA10-F      -               NBS-LRR                               TTGCGACTGCAGTTTTCAAT
RGA10-R      -               NBS-LRR                               GGAGGCCGTTGCAATAGTTA
RGA32-F      -               NBS-LRR                               TGGGGAAGACAACCTTAGCA
RGA32-R      -               NBS-LRR                               CGCACCTCTTGAACAAATCC

amplifying single alleles were evaluated in this study. The remaining 25 µL of PCR product with a single fragment was used for direct sequencing by Beckman Coulter Genomics (Danvers, Mass.). A final set of gene-based markers, comprising five EST–SSR and two NBS–LRR markers, was used for phylogenetic analysis and is listed in Table 2.

Phylogenetic analysis of EST–SSR and NBS–LRR amplicons

The DNA sequences amplified by each EST–SSR or RGA primer pair were aligned using the software MEGA5, and aligned sequences were trimmed to remove primer termini. ClustalX (Thompson et al. 1997) was used to determine the optimal alignment between the sequences derived from the species of Arachis. Sequences from each gene-related locus were aligned separately with ClustalX and were also aligned together using the sequence data from all seven loci. Low gap penalties are appropriate for interspecific data; therefore, gap penalties were set to a gap opening of 10 and a gap extension of 0.10. jModelTest v 0.1.1 (Posada 2008) was used to statistically select a model of nucleotide substitution by comparing a total of 88 possible models for each gene alignment. The nucleotide substitution model chosen varied among the individual gene loci and the combined data set. The Akaike information criterion (AIC), Bayesian information criterion (BIC), and the likelihood score (LnL) were calculated for each model to aid selection of the appropriate nucleotide substitution model. The likelihood ratio test (LRT) was employed to determine which of the best models chosen by the AIC and BIC suited the data, based on the p value obtained. Statistical selection of an appropriate model is important because the chosen model can strongly affect branch lengths and overall tree topology (Bos and Posada 2005). The chosen nucleotide substitution model was written into a PAUP block by jModelTest. PAUP* B version 4.0b10 (Swofford 1998) was employed to construct a phylogeny via maximum likelihood.
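The model-selection criteria named above can be illustrated with a minimal Python sketch. It is not the jModelTest implementation: the log-likelihoods, the alignment length, and the parameter counts (substitution-model parameters only, with branch lengths left out for simplicity) are hypothetical values chosen for illustration, not results from this study.

```python
import math

def aic(lnL: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * lnL

def bic(lnL: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better)."""
    return k * math.log(n) - 2 * lnL

def lrt_statistic(lnL_null: float, lnL_alt: float) -> float:
    """Likelihood ratio statistic for two nested models: 2 (lnL_alt - lnL_null)."""
    return 2 * (lnL_alt - lnL_null)

# Hypothetical scores (assumption): k counts free substitution-model parameters.
candidates = {
    "JC":    {"lnL": -2150.0, "k": 0},
    "HKY":   {"lnL": -2100.0, "k": 4},
    "GTR+G": {"lnL": -2090.0, "k": 9},
}
n_sites = 600  # hypothetical alignment length, used as the sample size in BIC

best_by_aic = min(candidates, key=lambda m: aic(candidates[m]["lnL"], candidates[m]["k"]))
best_by_bic = min(candidates, key=lambda m: bic(candidates[m]["lnL"], candidates[m]["k"], n_sites))

# When AIC and BIC disagree, a likelihood ratio test between the two nested
# candidates can arbitrate; the statistic is compared with a chi-square
# distribution whose degrees of freedom equal the difference in k (here 5).
stat = lrt_statistic(candidates["HKY"]["lnL"], candidates["GTR+G"]["lnL"])
print(best_by_aic, best_by_bic, stat)
```

With these made-up numbers the two criteria pick different models, because BIC penalizes extra parameters more heavily than AIC; this is the kind of disagreement a likelihood ratio test can help resolve.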
Tree searching was conducted using a heuristic search with random stepwise addition and a tree bisection and reconnection (TBR) branch-swapping algorithm. Bootstrapping with 100 replicates was performed to test clade stability. Bootstrap values of 70% or higher are indicated on the branches; this threshold was chosen because it has been demonstrated to correspond to a well-supported clade (Hillis and Bull 1993).
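The bootstrap procedure can be sketched as follows. This is a simplified illustration, not the PAUP* implementation: it shows only the column resampling that produces each pseudo-replicate alignment and the 70% filter on support values. The toy alignment, the taxon names, and the clade counts are hypothetical.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Build one pseudo-replicate by resampling alignment columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {taxon: "".join(seq[c] for c in cols) for taxon, seq in alignment.items()}

def supported_clades(clade_counts, n_replicates, threshold=70.0):
    """Keep clades recovered in at least `threshold` percent of the replicates."""
    support = {c: 100.0 * n / n_replicates for c, n in clade_counts.items()}
    return {c: s for c, s in support.items() if s >= threshold}

# Hypothetical aligned sequences (equal length), for illustration only.
toy_alignment = {
    "taxon_1": "ACGTACGT",
    "taxon_2": "ACGTACGA",
    "taxon_3": "ACCTACGA",
}
rng = random.Random(1)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]

# In a real analysis a tree is estimated from every replicate and each clade
# is counted; the counts below are invented stand-ins for that step.
clade_counts = {"clade_x": 99, "clade_y": 64}
supported = supported_clades(clade_counts, n_replicates=100)
print(supported)
```

Each replicate keeps the original number of taxa and columns; only the sampling of columns changes, which is what lets the proportion of replicates recovering a clade serve as a measure of its stability.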

Results

Variation in SSR motifs and flanking sequences

Direct sequencing of amplicons generated by seven gene-related markers produced a single homogeneous sequence in each of the 17 species selected from all nine sections in the genus Arachis. A similarity search of amplified sequences was conducted by BLAST against molecular databases to verify the identity of each of the five EST–SSR loci. The annotations of the five EST–SSR loci and two RGA loci are listed in Table 2. The alignment of sequences derived from all species revealed variation in both SSR motifs and flanking sequences. In general, ancient species tended to have longer SSR motifs, whereas the shortest SSRs were observed in the more advanced species. For instance, at the AP2 locus, the ancient species A. triseminata possessed the motif (ACC)9, A. matiensis (ACC)7, and A. paraguariensis (ACC)6, whereas in the most advanced section Arachis, A genome species had the motif (ACC)3 and B and D genome species (ACC)4. However, this trend did not hold at all loci evaluated. For instance, at the MYB locus, the most advanced species had longer SSR motifs than the most primitive species (Fig. 1). The length of SSRs in the genus Arachis might also provide information on the interspecific evolutionary process. Homoplasy, in which alleles were identical in state (length) but not in sequence content, was observed among taxa in this study at several loci. Moreover, insertions in the motifs were also observed, such as a 10-base (GAACAAATCC) insertion in the imperfect motif (TTG)n at the MYB locus (Fig. 1) and a one-base (T) insertion between (CCA)n and (CCG)n at the Ring locus. Insertions occurring in SSR motifs in species of all sections could indicate that these SSRs existed before speciation in the genus Arachis. A few ancient species either lacked SSRs or showed a different motif at two gene loci. For instance, A. matiensis did not contain SSRs at the Ring locus, whereas at the AP2 locus A. dardani had an (AC)n motif instead of the (ACC)n motif common in other species. Further study may help reveal whether the SSR was lost or mutated after speciation.
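Comparing repeat numbers such as (ACC)9 and (ACC)3 across species amounts to finding the longest perfect tandem run of a motif in each sequence. The Python sketch below shows one way to do this with a regular expression. The flanking bases are invented; only the repeat counts echo the AP2 example in the text, and the function is an illustrative helper rather than a tool used in the study.

```python
import re

def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of perfect tandem copies of `motif` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)

# Hypothetical sequences: invented flanks around 9 and 3 copies of ACC.
long_allele = "GGTT" + "ACC" * 9 + "TTGA"
short_allele = "GGTT" + "ACC" * 3 + "TTGA"
no_ssr = "GGTTTTGA"

for name, seq in [("long", long_allele), ("short", short_allele), ("none", no_ssr)]:
    print(name, longest_repeat_run(seq, "ACC"))
```

A perfect-match pattern like this ignores imperfect motifs, such as a repeat interrupted by an insertion, so interrupted SSRs would need to be inspected in the alignment itself.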
Besides the length polymorphism of SSRs among the species studied, nucleotide substitutions occurred in SSR motifs mainly in the most primitive species and only rarely in the most advanced species. In the flanking regions, both nucleotide substitutions and insertions/deletions (indels) were present. Some nucleotide substitutions were species specific and some indels existed only in particular genome types. These data could be used to develop markers to identify species or genomes in Arachis. For instance, a one-nucleotide substitution at the MYB locus was associated with species of the ancient sections, whereas a three-base insertion in the same region occurred only in the species with B and D genomes. Mutations and small indels were the major source of variation in the flanking sequences, providing phylogenetically informative signals for the phylogeny of species in the genus Arachis.

Phylogeny of species of Arachis using seven gene-related loci

Phylogenetic analysis was conducted for 17 species using genic sequences at each gene locus separately (Fig. 2). Using maximum


Fig. 1. Sequence variation in SSR motifs and flanking regions at the MYB locus. A 10-base insertion was found in the (TTG)n motifs, and some mutations in the flanking region were species specific.

Fig. 2. Phylogenetic analysis of species of genus Arachis at each of seven gene-based SSR loci.

likelihood (ML), trees showed that A. triseminata was the most isolated from the remaining species at all seven loci, and either A. retusa or A. dardani was most similar to A. triseminata. Arachis giacomettii, which is in the same section as A. dardani, clustered with species from the sections Caulorrhizae, Procumbentes, and Rhizomatosae at some loci, and it grouped with species from the sections Trierectoides and Erectoides at the other loci. At most loci, A. pintoi, A. repens, and A. matiensis were clustered into a group. Species with B and D genomes, A. ipaënsis and A. glandulifera, were grouped together and are more closely related to each other than to


Fig. 3. Phylogeny of 16 taxa in the genus Arachis with concatenation of seven gene-related sequences. (a) Dendrogram, (b) radial tree. Phylogenetic reconstruction was conducted with PAUP* B version 4.0b10 via maximum likelihood. Measures of support for individual clades were conducted using heuristic bootstrap with 100 replicates.

[Figure 3 tree graphic not reproduced: panel (a) dendrogram and panel (b) radial tree of the 16 taxa, with clades labelled I, II, III, and IV and bootstrap values shown on the branches.]

A genome species, A. duranensis, A. diogoi, and A. stenosperma, at seven loci.

To reveal high-resolution phylogenetic relationships of species of Arachis, seven gene-related sequences of 16 species were combined. Arachis paraguariensis was not included in this analysis because no amplicons were produced at the RGA32 locus in this species, which may have been caused by a mutation at the primer binding site. Because A. triseminata was isolated from other species in each phylogenetic tree built with different genic sequences, it was chosen to serve as an outgroup in the consensus tree. The phylogeny of species from all nine sections based on the concatenated sequences showed four clades (Fig. 3). Clade I, supported by 99% of the bootstrap replicates, was a divergent group and consisted of species from five sections: Trierectoides, Erectoides, Heteranthae, Procumbentes, and Rhizomatosae. Clade II contained only species from section Arachis. Three species, A. pintoi, A. repens, and A. matiensis, from sections Caulorrhizae and Procumbentes formed clade III, which was highly supported with 100% of the bootstrap replicates. The remaining species, A. dardani and A. retusa, were grouped together and were sister to A. triseminata in clade IV.
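The concatenation step, including the exclusion of a species that lacks a sequence at one locus, can be sketched in Python as below. This is an illustrative helper under simple assumptions (each locus is already aligned, and taxa missing at any locus are dropped rather than padded with gaps); the species labels and sequences are placeholders, not data from the study.

```python
def concatenate_loci(loci):
    """Join per-locus alignments end to end, keeping only taxa present at every locus."""
    shared = set.intersection(*(set(aln) for aln in loci.values()))
    order = sorted(loci)  # fixed locus order so every taxon is joined the same way
    return {taxon: "".join(loci[name][taxon] for name in order) for taxon in sorted(shared)}

# Hypothetical per-locus alignments: species_3 has no amplicon at RGA32.
loci = {
    "MYB":   {"species_1": "ACGT", "species_2": "ACGA", "species_3": "ACGG"},
    "RGA32": {"species_1": "TT",   "species_2": "TA"},
}

supermatrix = concatenate_loci(loci)
print(supermatrix)
```

Dropping incomplete taxa keeps every row of the combined matrix the same length, at the cost of losing a species, which mirrors the exclusion of A. paraguariensis from the combined analysis described above.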

Discussion

Variation in SSR motifs and flanking sequences

SSRs are ubiquitous across the genomes of eukaryotic organisms, which makes them a useful tool in genetic diversity studies within and between closely related plant species. The length polymorphism derived from changes in the number of repeats in SSR motifs is particularly useful for intraspecific genetic studies. In general, the variation in SSR motifs is believed to arise from polymerase slippage during replication (Levinson and Gutman 1987). Recent studies have revealed that allelic diversity involves indels and base substitutions in both SSR motifs and their flanking regions (Chen et al. 2002). A better understanding of variation in SSR motifs and their flanking regions derived from EST–SSR sequences would allow us to assess the feasibility of phylogenetic inference using such molecular tools. The genus Arachis contains genetically diverse species spanning the ancient to the advanced taxonomic sections, which provides a platform to uncover dynamic changes in DNA sequences. Alignment of EST–SSR sequences from 17 species of all nine taxonomic sections revealed that the ancient species generally possessed longer SSRs than the advanced species, except for one reverse case at the MYB locus (Fig. 1). It is believed that a higher rate of polymorphism occurs in longer SSRs (Temnykh et al. 2001), which implies a higher mutation rate at longer SSRs. Hong et al. (2007) stated that long SSRs in eukaryotic genomes have a mutation bias toward becoming shorter. The results of the present study provide evidence for a dynamic change in SSR length from the ancient to the advanced species. On the other hand, the advanced species having longer SSRs than the ancient species at the MYB locus might be caused by slippage after a 10-base insertion at the locus. Most nucleotide substitutions in SSR motifs occurred in the ancient species and only a few in the advanced species. There were small indels present in the flanking sequences; therefore, homoplasy occurred in some sequences. Because changes in flanking regions occur independently of changes in the SSR motifs (van Zijll de Jong et al. 2011), changes in the flanking region can lead to size homoplasy, a problem for diversity and evolutionary studies because SSR alleles are similar in length but different in descent (Gugerli et al. 2008; Barkley et al. 2009). Analysis of SSR alleles generally assumes that identical alleles contain identical sequence content. Size homoplasy may be derived from


either substitutions in flanking regions or indels in flanking regions that compensate for differences in the repeat number of the SSR motif. Because it has been suggested that length polymorphism owing to size variation is appropriate only for genetic analysis and not for phylogenetic reconstruction (van Zijll de Jong et al. 2011), SSR flanking regions have been used for phylogenetic analysis in several organisms (Rossetto et al. 2002; Asahida et al. 2004; Nishikawa et al. 2005). SSRs with both size variation and size homoplasy are needed to investigate the variation in the flanking region among different species for phylogenetic or evolutionary studies. In this study, although SSR length provided an informative signal for the phylogeny, the number of SSR markers should be optimized to overcome the homoplasy problem when only SSR length polymorphism is used for phylogenetic analysis. Variation in the flanking sequences owing to nucleotide substitutions and indels provided more informative signals than variation in SSR length alone, as nucleotide substitutions or haplotypes were related to specific species at different gene loci. For instance, at the MYB locus, a three-base insertion occurred only in B and D genome species and a 12-base deletion occurred in A genome species. Tallury et al. (2005) also identified a six-base indel in both B and D genome species and a 21-base indel only in A genome species using trnT-trnF sequences. However, nucleotide substitutions occurred more often in “old” species than in “advanced” species. Chen et al. (2002) demonstrated that the frequency of mutation in flanking regions increased as the genetic distance between species increased. Phylogenetic signal is often interpreted as providing information about the evolutionary process or rate (Revell et al. 2008).
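The size homoplasy described above can be made concrete with a small Python sketch. The two alleles are hypothetical: one carries four ACC repeats, the other carries three repeats plus a three-base insertion in the flank, so a fragment-length assay would score them as the same allele even though their sequences differ.

```python
def is_size_homoplasy(allele_a: str, allele_b: str) -> bool:
    """True when two alleles have the same length but different sequence content."""
    return len(allele_a) == len(allele_b) and allele_a != allele_b

# Hypothetical alleles (invented flanks): the 3-base flank insertion "CAT"
# compensates for one fewer ACC repeat, so both alleles are 19 bases long.
allele_1 = "GGT" + "ACC" * 4 + "TTGA"
allele_2 = "GGTCAT" + "ACC" * 3 + "TTGA"

print(len(allele_1), len(allele_2), is_size_homoplasy(allele_1, allele_2))
```

Sequencing the amplicon, as done in this study, distinguishes such alleles, whereas scoring by fragment size alone would not.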
Owing to variable evolutionary rates among lineages, concatenation of multiple gene sequences should give a more robust tree structure and overcome the poor resolution of phylogenies based on single loci (Leigh et al. 2008).

Phylogenetic relationship in the section Arachis

Section Arachis, which consists of 31 species with five genome types, is the largest of the nine sections in the genus. This section has been the most intensively studied for phylogenetic relationships, using both molecular markers and gene sequences, because it includes a major commodity crop worldwide, the tetraploid cultivated peanut (A. hypogaea, 2n = 4x = 40). A better understanding of phylogenetic relationships between wild species and cultivated peanut in this section would allow effective introgression of desirable genes into the genome of the cultivated species, and thus facilitate genetic improvement of the crop. Jung et al. (2003) compared the sequences of fatty acid desaturase among wild and cultivated species, supporting the hypothesis that A. duranensis and A. ipaënsis are the progenitors of the cultivated species because two homoeologous sequences of A. hypogaea were identical to those of A. duranensis and A. ipaënsis. Also, A genome species and B genome species were grouped separately based on these sequences. Similar results were observed by Milla et al. (2003) and Cunha et al. (2008) using AFLP and RAPD markers, respectively, and B genome and D genome species were close to each other on the basis of AFLP markers and the trnT-trnF region (Tallury et al. 2005). In this study, phylogenetic analysis of species within the section Arachis using seven gene-related sequences confirmed that the B and D genome species were more closely related to one another than to A genome species. Section Arachis formed a single clade separated from the other sections and well supported by the bootstrap replicates.
The three A genome species were clustered into one subgroup, and B and D genome species were placed in the other sister subgroup. Five of the seven gene-related sequences used in this study were EST–SSR sequences, which generated “double” informative phylogenetic signals. This contrasts with previous phylogenetic studies that employed either molecular markers or a single gene sequence alone, and the difference may perhaps be explained by the varying mutation rates of SSRs versus flanking regions (van Zijll de Jong et al. 2011).

Phylogeny of species of Arachis of all nine sections

The consensus tree resolved four well-supported clades derived from the analysis of 16 species, albeit excluding A. paraguariensis sequences owing to the lack of an amplicon at the RGA32 locus. Arachis triseminata (sect. Triseminatae) was selected as an outgroup to the remaining species because it has been demonstrated to be genetically isolated and could not produce hybrids in crosses with species from other sections (Krapovickas and Gregory 1994). The sister group to A. triseminata included A. dardani (sect. Heteranthae) and A. retusa (sect. Extranervosae). These three species appeared at the base in a single clade (Fig. 3). Although A. dardani and A. retusa were alternately at the base when different genic sequences were used, the phylogenetic tree based on the combined genic sequences of seven loci generated strong support for the basal clade. Species from section Extranervosae were also depicted as one of two terminal lineages in the studies of Friend et al. (2010) and Wang et al. (2011) because they used several species from section Extranervosae and the same DNA sequences. We used only one species from section Extranervosae, A. retusa, which was absent from their studies, making a slight difference for the basal clade in our study. The next clade following the base was a diverging lineage including A. repens and A. pintoi (sect. Caulorrhizae) and A. matiensis (sect. Procumbentes). The two species of section Caulorrhizae have a very close genetic relationship, show similar genic sequences, and produce hybrids with high pollen stainability, though they have easily distinguishable leaf shape and leaflet size (Krapovickas and Gregory 2007). Arachis matiensis (sect. Procumbentes) had phylogenetic affinity with the section Caulorrhizae.
It has been shown to be taxonomically closely related to A. rigonii (sect. Procumbentes) and A. dardani (sect. Heteranthae), as described by Friend et al. (2010), and some species of section Procumbentes have grouped with species from section Erectoides (Hoshino et al. 2006). Section Procumbentes was separated from section Erectoides as an independent section because of a notable cytological difference: a dot-shaped satellite on one pair of chromosomes is present in species of section Procumbentes but absent in species of section Erectoides (Krapovickas and Gregory 2007). Procumbentes was not well defined in a separate study of phylogenetic relationships based on ITS and 5.8S rDNA sequences (Bechara et al. 2010). Further study including more individuals of this section is needed to determine the genome affinities and phylogenetic placement of species from section Procumbentes.

Species of section Arachis formed a well-supported single clade despite comprising species with three different genomes. The sister clade to the clade of section Arachis contained species from five sections: Procumbentes, Heteranthae, Rhizomatosae, Trierectoides, and Erectoides. This clade is similar to the group that Friend et al. (2010) called Group Erectoides, as sections Procumbentes and Trierectoides were historically included in section Erectoides. Within this clade, A. rigonii (sect. Procumbentes), A. giacomettii (sect. Heteranthae), and A. glabrata (sect. Rhizomatosae) formed a subgroup with strong support, and A. guaranitica (sect. Trierectoides) and A. major (sect. Erectoides) were placed in another subgroup with low support. A study of ITS and 5.8S rDNA sequences also produced a clade containing species from sections Procumbentes, Rhizomatosae, Erectoides, and Trierectoides joined with species from section Arachis (Bechara et al. 2010). The phylogenetic reconstructions from previous studies (Friend et al. 2010; Bechara et al. 2010; Wang et al. 2011) and from this study, all based on DNA sequences, show this divergent clade (group); the detailed genome affinities of the species within it need further investigation using additional genic sequences representing different genome regions to help resolve their relationships.
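
As a minimal illustration of how per-locus alignments can be combined before tree building, the following Python sketch concatenates aligned sequences across loci and drops any species that lacks a sequence at one of them, as happened with A. paraguariensis at the RGA32 locus. It is not the pipeline used in this study: the function name concatenate_loci, the locus label locus_2, and the toy sequences are hypothetical and serve only as an example.

```python
from typing import Dict, List, Tuple

# One locus: species name -> aligned sequence (all the same length).
Alignment = Dict[str, str]


def concatenate_loci(loci: Dict[str, Alignment]) -> Tuple[Dict[str, str], List[str]]:
    """Concatenate per-locus alignments into one supermatrix.

    Only species present at every locus are kept. Returns the concatenated
    matrix and the sorted list of species that were excluded.
    """
    if not loci:
        raise ValueError("at least one locus is required")

    # Every alignment must be rectangular, otherwise columns would shift.
    for name, alignment in loci.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) > 1:
            raise ValueError(f"sequences at locus {name!r} differ in length")

    species_sets = [set(alignment) for alignment in loci.values()]
    shared = set.intersection(*species_sets)
    excluded = sorted(set.union(*species_sets) - shared)

    # Loci are joined in the order in which they were supplied.
    matrix = {
        species: "".join(alignment[species] for alignment in loci.values())
        for species in sorted(shared)
    }
    return matrix, excluded


# Toy data (hypothetical sequences, not taken from the study).
toy_loci = {
    "RGA32": {"A_hypogaea": "ACGT-A", "A_triseminata": "ACGTTA"},
    "locus_2": {
        "A_hypogaea": "GGC",
        "A_triseminata": "GAC",
        "A_paraguariensis": "GGC",
    },
}
toy_matrix, toy_excluded = concatenate_loci(toy_loci)
print(toy_matrix)
print(toy_excluded)
```

Dropping incomplete species keeps the matrix free of missing blocks, at the cost of reducing taxon sampling; coding the absent locus as missing data would be an alternative design.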


The structure of the phylogenetic tree built from the concatenation of seven genic sequences was congruent with the trees of Friend et al. (2010) and Wang et al. (2011), although a few species were placed differently. Both of those trees were constructed from a large number of species representing all nine sections but from only one or two sequences: 48 species with two DNA sequences and 43 species with one DNA sequence, respectively. Our study of the phylogenetic relationships of species of the genus Arachis using seven genic sequences supports the benefit of using an increased number of gene sequences. Nevertheless, all three gene trees generally agree with the species tree (convention-based classification), except for the clade that includes five sections in the gene tree even though these species show distinguishable morphological variation.

In conclusion, the phylogeny of species from all nine sections based on multiple genic sequences presented here provides insight into the phylogenetic relationships and genomic affinities in the genus Arachis. Although only a limited number of species was sampled from this genus, multiple genic sequences were used. The reconstructed phylogenetic relationships of species of Arachis were similar to those of previous studies that used a larger number of species with less sequence data. Multiple genic sequences could help improve classification by clarifying the relationships of species of the genus Arachis.

Acknowledgements

This work was supported by grants from USAID-Zambia under the Feed the Future program, and under the project Improving Groundnut Farmers' Incomes and Nutrition through Innovation and Technology Enhancement (I-FINITE) through a subaward to ICRISAT; and the George Washington Carver Agricultural Experiment Station at Tuskegee University.

References

Asahida, T., Gray, A.K., and Gharrett, A.J. 2004. Use of microsatellite locus flanking regions for phylogenetic analysis? A preliminary study of Sebastes subgenera. Environ. Biol. Fishes, 69: 461–470. doi:10.1023/B:EBFI.0000022884.30278.32.
Barkley, N.A., Krueger, R.R., Federici, C.T., and Roose, M.L. 2009. What phylogeny and gene genealogy analyses reveal about homoplasy in citrus microsatellite alleles. Plant Syst. Evol. 282: 71–86. doi:10.1007/s00606-009-0208-2.
Bechara, M., Moretzsohn, M.C., Palmieri, D.A., Monteiro, J.P., Bacci, M., Martins, J., et al. 2010. Phylogenetic relationships in genus Arachis based on ITS and 5.8S rDNA sequences. BMC Plant Biol. 10: 255. doi:10.1186/1471-2229-10-255. PMID:21092103.
Bos, D.H., and Posada, D. 2005. Using models of nucleotide evolution to build phylogenetic trees. Dev. Comp. Immunol. 29: 211–227. doi:10.1016/j.dci.2004.07.007.
Chen, X., Cho, Y.G., and McCouch, S.R. 2002. Sequence divergence of rice microsatellites in Oryza and other plant species. Mol. Genet. Genomics, 268: 331–343. doi:10.1007/s00438-002-0739-5. PMID:12436255.
Christelova, P., Valarik, M., Hribova, E., Langhe, E.D., and Dolezel, J. 2011. A multi gene sequence-based phylogeny of the Musaceae (banana) family. BMC Evol. Biol. 11: 103. doi:10.1186/1471-2148-11-103. PMID:21496296.
Cook, D.R., Bruening, G., He, G.H., and Town, C. 2009. A bacterial artificial chromosome library for Arachis hypogea (peanut) cultivar Tifrunner, (accession number FI499377-FI503061). Available from http://www.ncbi.nlm.nih.gov. (Accessed 15 November 2013).
Cunha, F.B., Nobile, P.M., Hoshino, A.A., Moretzsohn, M.C., Lopes, C.R., and Gimenes, M.A. 2008. Genetic relationships among Arachis hypogaea L. (AABB) and diploid Arachis species with AA and BB genomes. Genet. Resour. Crop Evol. 55: 15–20. doi:10.1007/s10722-007-9209-6.
Ellis, J.R., and Burke, J.M. 2007. EST–SSRs as a resource for population genetic analyses. Heredity, 99: 125–132. doi:10.1038/sj.hdy.6801001. PMID:17519965.
Friend, S.A., Quandt, D., Tallury, S.P., Stalker, H.T., and Hilu, K.W. 2010. Species, genomes, and section relationships in the genus Arachis (Fabaceae): a molecular phylogeny. Plant Syst. Evol. 290: 185–199. doi:10.1007/s00606-010-0360-8.
Gugerli, F., Brodlbeck, S., and Holderegger, R. 2008. Insertions–deletions in a microsatellite flanking region may be resolved by variation in stuttering patterns. Plant Mol. Biol. Rep. 26: 255–262. doi:10.1007/s11105-008-0034-7.
Hillis, D.M., and Bull, J.J. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst. Biol. 42(2): 182–192. doi:10.1093/sysbio/42.2.182.
Hong, C.P., Piao, Z.Y., Kang, T.W., Batley, J., Yang, J., Hur, Y.K., et al. 2007. Genomic distribution of simple sequence repeats in Brassica rapa. Mol. Cells, 23(3): 349–356. PMID:17646709.
Hoshino, A.A., Bravo, J.P., Angelici, C.M., Barbosa, A.V.G., Lopes, C.R., and Gimenes, M.A. 2006. Heterologous microsatellite primer pairs informative for the whole genus Arachis. Genet. Mol. Biol. 29: 665–675. doi:10.1590/S1415-47572006000400016.
Jing, R.C., Johnson, R., Seres, A., Kiss, G., Ambrose, M.J., Knox, M.R., et al. 2007. Gene-based sequence diversity analysis of field pea (Pisum). Genetics, 177: 2263–2275. doi:10.1534/genetics.107.081323. PMID:18073431.
Jung, S., Tate, P.L., Horn, R., Kochert, G., Moore, K., and Abbott, A.G. 2003. The phylogenetic relationship of possible progenitors of the cultivated peanut. J. Hered. 94(4): 334–340. doi:10.1093/jhered/esg061. PMID:12920105.
Koilkonda, P., Sato, S., Tabata, S., Shirasawa, K., Hirakawa, H., Sakai, H., et al. 2012. Large-scale development of expressed sequence tag-derived simple sequence repeat markers and diversity analysis in Arachis spp. Mol. Breed. 30(1): 125–138. doi:10.1007/s11032-011-9604-8. PMID:22707912.
Koppolu, R., Upadhyaya, H.D., Dwivedi, S.L., Hoisington, D.A., and Varshney, R.K. 2010. Genetic relationships among seven sections of genus Arachis studied by using SSR markers. BMC Plant Biol. 10: 15. doi:10.1186/1471-2229-10-15. PMID:20089171.
Krapovickas, A., and Gregory, W.C. 1994. Taxonomy of the genus Arachis (Leguminosae). Bonplandia, 8: 1–186.
Krapovickas, A., and Gregory, W.C. 2007. Taxonomy of the genus Arachis (Leguminosae). Bonplandia, 16(Suppl.): 1–205.
Lavia, G.I. 1998. Karyotypes of Arachis palustris and A. praecox (section Arachis), two species with basic chromosome number x=9. Cytologia, 63: 177–181. doi:10.1508/cytologia.63.177.
Leigh, J.W., Susko, E., Baumgartner, M., and Roger, A.J. 2008. Testing congruence in phylogenomic analysis. Syst. Biol. 57(1): 104–115. doi:10.1080/10635150801910436. PMID:18288620.
Levinson, G., and Gutman, G. 1987. Slipped-strand mispairing: a major mechanism for DNA sequence evolution. Mol. Biol. Evol. 4: 203–221. PMID:3328815.
Luo, M., Dang, P., Guo, B.Z., He, G.H., Holbrook, C., Bausher, M.G., and Lee, R.D. 2005. Generation of expressed sequenced tags (ESTs) for gene discovery and marker development in cultivated peanut. Crop Sci. 45: 346–353. doi:10.2135/cropsci2005.0346.
Milla, S.R., Isleib, T.G., and Stalker, H.T. 2003. Taxonomic relationships among Arachis sect. Arachis species as revealed by AFLP markers. Genome, 48(1): 1–11. doi:10.1139/g04-089. PMID:15729391.
Moretzsohn, M.C., Gouvea, E.G., Peter, W.I., Leal-Bertioli, C.M.S., Valls, J.F.M., and Bertioli, D.J. 2013. A study of the relationships of cultivated peanut (Arachis hypogaea) and its most closely related wild species using intron sequences and microsatellite markers. Ann. Bot. 111: 113–126. doi:10.1093/aob/mcs237. PMID:23131301.
Nagy, E.D., Chu, Y., Guo, Y., Khanal, S., Tang, S., Li, Y., et al. 2010. Recombination is suppressed in an alien introgression in peanut harboring Rma, a dominant root-knot nematode resistance gene. Mol. Breed. 26: 357–370. doi:10.1007/s11032-010-9430-4.
Nishikawa, T., Vaughan, D.A., and Kadowaki, K. 2005. Phylogenetic analysis of Oryza species, based on simple sequence repeats and their flanking nucleotide sequences from the mitochondrial and chloroplast genomes. Theor. Appl. Genet. 110(4): 696–705. doi:10.1007/s00122-004-1895-2. PMID:15650813.
Penaloza, A.P.S., and Valls, J.F.M. 2005. Chromosome number and satellite chromosome morphology of eleven species of Arachis (Leguminosae). Bonplandia, 15: 65–72.
Posada, D. 2008. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25: 1253–1256. doi:10.1093/molbev/msn083. PMID:18397919.
Ren, X.P., Huang, J.Q., Liao, B.S., Zhang, X.J., and Jiang, H.F. 2010. Genomic affinities of Arachis genus and interspecific hybrids were revealed by SRAP markers. Genet. Resour. Crop Evol. 57: 903–913. doi:10.1007/s10722-010-9532-1.
Revell, L.J., Harmon, L.J., and Collar, D.C. 2008. Phylogenetic signal, evolutionary process, and rate. Syst. Biol. 57(4): 591–601. doi:10.1080/10635150802302427. PMID:18709597.
Robledo, G., and Seijo, G. 2010. Species relationships among the wild B genome of Arachis species (section Arachis) based on FISH mapping arrangement. Theor. Appl. Genet. 121: 1033–1046. doi:10.1007/s00122-010-1369-7. PMID:20552326.
Rossetto, M., McNally, J., and Henry, R.J. 2002. Evaluating the potential of SSR flanking regions for examining taxonomic relationships in the Vitaceae. Theor. Appl. Genet. 104(1): 61–66. doi:10.1007/s001220200007. PMID:12579429.
Smartt, J., Gregory, W., and Gregory, M. 1978. The genomes of Arachis hypogaea. 1. Cytogenetic studies of putative genome donors. Euphytica, 27: 665–675. doi:10.1007/BF00023701.
Stalker, H.T. 1991. A new species in section Arachis of peanuts with a D genome. Am. J. Bot. 78: 630–637. doi:10.2307/2445084.
Swofford, D.L. 1998. PAUP* Phylogenetic Analysis Using Parsimony (* and other methods), Version 4. Sinauer Associates, Sunderland, MA.
Tallury, S.P., Hilu, K.W., Milla, S.R., Friend, S.A., Alsaghir, M., Stalker, H.T., and Quandt, D. 2005. Genomic affinities in Arachis section Arachis (Fabaceae): molecular and cytogenetic evidence. Theor. Appl. Genet. 111: 1229–1237. doi:10.1007/s00122-005-0017-0. PMID:16187123.
Temnykh, S., DeClerck, G., Lukashova, A., Lipovich, L., Cartinhour, S., and McCouch, S. 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): frequency, length variation, transposon association, and genetic marker potential. Genome Res. 11: 1441–1452. doi:10.1101/gr.184001. PMID:11483586.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882. doi:10.1093/nar/25.24.4876. PMID:9396791.
Valls, J.F.M., and Simpson, C.E. 2005. New species of Arachis from Brazil, Paraguay, and Bolivia. Bonplandia, 14: 35–64.
van Zijll de Jong, E., Guthridge, K.M., Spangenberg, G.C., and Forster, J.W. 2011. Sequence analysis of SSR-flanking regions identifies genome affinities between pasture grass fungal endophyte taxa. Int. J. Evol. Biol. 1–11. doi:10.4061/2011/921312. PMID:21350638.
Wang, C.T., Wang, X.Z., Tang, Y.Y., Chen, D.X., Cui, F.G., Zhang, J.C., and Yu, S.L. 2011. Phylogeny of Arachis based on internal transcribed spacer sequences. Genet. Resour. Crop Evol. 58: 311–319. doi:10.1007/s10722-010-9576-2.

Genome Vol. 57, 2014

Published by NRC Research Press
