Research

Evolution of a symbiotic receptor through gene duplications in the legume–rhizobium mutualism

Stephane De Mita1,2, Arend Streng1, Ton Bisseling1 and René Geurts1

1Laboratory of Molecular Biology, Department of Plant Science, Wageningen University, Droevendaalsesteeg 1, 6708PB Wageningen, the Netherlands; 2INRA Nancy-Lorraine, UMR Interactions Arbres/Micro-organismes, 54380 Champenoux, France

Author for correspondence: René Geurts
Tel: +31 644229941
Email: [email protected]

Received: 26 July 2013
Accepted: 16 September 2013

New Phytologist (2013)
doi: 10.1111/nph.12549

Key words: coevolution, duplication, legume–rhizobium symbiosis, lipochitooligosaccharides (LCOs), molecular evolution, neofunctionalization, Nod factors.

Summary

• The symbiosis between legumes and nitrogen-fixing rhizobia co-opted pre-existing endomycorrhizal features. In particular, both symbionts release lipo-chitooligosaccharides (LCOs) that are recognized by LysM-type receptor kinases. We investigated the evolutionary history of rhizobial LCO receptor genes MtLYK3-LjNFR1 to gain insight into the evolutionary origin of the rhizobial symbiosis.

• We performed a phylogenetic analysis integrating gene copies from nonlegumes and legumes, including the non-nodulating, phylogenetically basal legume Cercis chinensis. Signatures of differentiation between copies were investigated through patterns of molecular evolution.

• We show that two rounds of duplication preceded the evolution of the rhizobial symbiosis in legumes. Molecular evolution patterns indicate that the resulting three paralogous gene copies experienced different selective constraints. In particular, one copy maintained the ancestral function, and another specialized into perception of rhizobial LCOs. It has been suggested that legume LCO receptors evolved from a putative ancestral defense-related chitin receptor through the acquisition of two kinase motifs. However, the phylogenetic analysis shows that these domains are actually ancestral, suggesting that this scenario is unlikely.

• Our study underlines the evolutionary significance of gene duplication and subsequent neofunctionalization in MtLYK3-LjNFR1 genes. We hypothesize that their ancestor was more likely a mycorrhizal LCO receptor than a defense-related receptor kinase.

© 2013 The Authors. New Phytologist © 2013 New Phytologist Trust
www.newphytologist.com

Introduction

The symbiosis between plants of the legume family (Fabaceae) and the nitrogen-fixing bacteria referred to as rhizobia appeared c. 60 million yr ago (Sprent, 2008; Doyle, 2011). This event may be one of the reasons for the adaptive radiation that gave rise to c. 19 500 legume species (Lewis et al., 2005; Sprent, 2007; Legume Phylogeny Working Group, 2013). Molecular and genetic studies suggest that rhizobia co-opted signaling and cellular pathways from the more widespread and much older endomycorrhizal symbiosis (Parniske, 2008; Geurts & Vleeshouwers, 2012; Ivanov et al., 2012). In addition, at least some of these features are shared between these symbioses and the actinorhizal symbiosis involving various plants from the Fagales, Rosales and Cucurbitales orders and nitrogen-fixing bacteria of the genus Frankia (Pawlowski & Demchenko, 2012). Recent studies have revealed that mycorrhizal fungi of the genus Glomus and rhizobia secrete very similar lipo-chitooligosaccharide (LCO) signal molecules, which are termed Nod factors in rhizobia and Myc-LCOs in endomycorrhizal fungi (Maillet et al., 2011).

Here we investigate the evolutionary history of a rhizobial Nod factor receptor gene named LjNFR1a and MtLYK3 in the legume model species Lotus japonicus and Medicago truncatula, respectively. In legumes, perception of rhizobial Nod factors requires two genes encoding proteins of the LysM-domain receptor kinase family, named LjNFR1a and LjNFR5 in L. japonicus, and MtLYK3 and MtNFP in M. truncatula (Limpens et al., 2003; Radutoiu et al., 2003; Arrighi et al., 2006). These proteins have a similar structure: an extracellular domain containing three LysM domains, a transmembrane domain and an intracellular kinase domain. Mutations in either of these genes can block Nod factor-induced responses, suggesting that these proteins function in conjunction, a hypothesis that is supported by biochemical studies (Madsen et al., 2011; Broghammer et al., 2012; Pietraszewska-Bogiel et al., 2013).

The finding that Myc-LCOs and Nod factors are structurally very similar reinforces the hypothesis that the rhizobial symbiosis is derived from the endomycorrhizal symbiosis. It is plausible that, at a certain point in evolution, nitrogen-fixing rhizobia gained the capacity to synthesize molecules imitating Myc-LCOs, allowing them to hijack endomycorrhizal infection mechanisms. Interestingly, in both model legumes neither of the Nod factor receptors MtLYK3-LjNFR1a or MtNFP-LjNFR5 is essential for the endomycorrhizal symbiosis (Catoira et al., 2001; Radutoiu et al., 2003). This suggests that other proteins are involved in the perception of Myc-LCOs, allowing legume hosts to discriminate between the two symbionts.

The structural resemblance between Nod factors and Myc-LCOs suggests that Nod factor receptors evolved from ancestral Myc-LCO receptors (Zhang et al., 2007, 2009). Recent studies indicate that this is indeed the case for MtNFP-LjNFR5 (Op den Camp et al., 2011b; Young et al., 2011). Phylogenetic reconstruction has revealed that putative orthologs of MtNFP-LjNFR5 are present in many nonlegume species (Zhu et al., 2006), with the exception of Arabidopsis thaliana, which is unable to establish endomycorrhizal symbiosis (Streng et al., 2011). In legumes, the MtNFP-LjNFR5 gene experienced a duplication event that was possibly driven by a whole genome duplication (WGD; Streng et al., 2011; Young et al., 2011). This WGD occurred early in the history of the Papilionoideae subfamily, but after evolution of the rhizobial symbiosis (Cannon et al., 2010). Transcriptome profiling studies in M. truncatula showed that MtLYR1, a paralog of MtNFP, is induced specifically during mycorrhization (Gomez et al., 2009; Young et al., 2011). In addition, studies revealed that, although not essential for the mycorrhizal symbiosis, MtNFP is still involved in Myc-LCO-induced gene expression (Czaja et al., 2012). Taken together, these results suggest the following model: the gene ancestral to MtNFP acted as a Myc-LCO receptor that was co-opted for Nod factor perception when the rhizobial symbiosis evolved in legumes. When this ancestral gene was duplicated, the two functions were separated by subfunctionalization and MtNFP became a Nod factor receptor. MtLYR1 could then have retained the ancestral function and would thus be a Myc-LCO receptor gene.
Additional evidence supporting this model has emerged from studies on Parasponia (Cannabaceae), the only genus outside the legume family able to form the rhizobial symbiosis. Given that Parasponia is relatively distant from the legume family, the Parasponia–rhizobium symbiosis most likely evolved independently from the legume–rhizobium symbiosis. Furthermore, based on the phylogenetic position of the Parasponia genus as a single nodulating lineage amongst nonnodulating relatives, symbiosis in this genus most likely evolved relatively recently. In Parasponia, the rhizobial symbiosis requires Nod factor signaling as in the vast majority of legume–rhizobium interactions. Studies on the putative ortholog of MtNFP showed that Parasponia andersonii has only a single copy of this gene that is required for both rhizobial and endomycorrhizal interaction (Op den Camp et al., 2011b).

Previous phylogenetic analysis has shown that the MtLYK3-LjNFR1a gene family experienced several duplication events, including a series of tandem duplications (Zhang et al., 2007, 2009; Lohmann et al., 2010). This resulted in small clusters of paralogous genes. Interestingly, A. thaliana also contains a gene similar to MtLYK3-LjNFR1a, called AtCERK1, which encodes a receptor of chitin oligomers mediating defense against fungal pathogens (Zhu et al., 2006; Miya et al., 2007; Wan et al., 2008; Liu et al., 2012). As LCOs are themselves acylated chitin oligomers, this suggests a close functional relationship between AtCERK1 and MtLYK3-LjNFR1a. These observations raise the hypothesis that the ancestor of MtLYK3-LjNFR1a was a defense-related gene rather than a Myc-LCO receptor. This hypothesis is strengthened by the fact that Nod factor application transiently induces the expression of defense-related genes in L. japonicus, a response that is dependent on LjNFR1a (Nakagawa et al., 2011; Serna-Sanz et al., 2011). Recently, the mechanisms of the putative transition from the innate immune receptor function of AtCERK1 to the symbiotic receptor function of LjNFR1 were investigated (Nakagawa et al., 2011). Using chimeric gene constructs, it was shown that the kinase domain of AtCERK1 is unable to complement the L. japonicus LjNFR1a loss-of-function mutant. This is due to divergence of two small domains in the kinase: the activation loop (AL) and a three-amino acid motif (YAQ) in the αEF helix. This led to the hypothesis that these changes mediated the shift from defense to symbiosis (Nakagawa et al., 2011).

The legume family is separated into three subfamilies: the basal and polyphyletic Caesalpinioideae (containing, according to the current taxonomy, c. 2250 species), the Mimosoideae (c. 3270 species) and the Papilionoideae (c. 13 800 species; Legume Phylogeny Working Group, 2013). All three subfamilies contain nodulating lineages, albeit at very different frequencies (Sprent, 2007). The rhizobial symbiosis is assumed to have evolved a few million years after the emergence of legumes either as a single evolutionary event before the split in subfamilies or, alternatively, in parallel events in different legume lineages (Doyle, 2011). Within the Caesalpinioideae subfamily, the tribes Cercideae and Detarieae are consistently placed in a basal position of the legume phylogeny, and therefore are very likely to be external to the evolutionary event of acquisition of the rhizobial symbiosis (Young & Johnston, 1989; Doyle, 1998; Wojciechowski et al., 2004; Lavin et al., 2005; Sprent, 2007; Legume Phylogeny Working Group, 2013). These species therefore constitute the closest outgroup with respect to nitrogen-fixing legumes. By investigating genes similar to MtLYK3-LjNFR1a in the basal legume Cercis chinensis, a representative of the Cercideae tribe, we aimed to determine whether the initial duplication event in this gene cluster predates the emergence of the nitrogen-fixing rhizobial symbiosis in legumes. Additionally, we sought evidence of functional divergence between paralogous copies in order to identify signatures of neofunctionalization and to provide insights into the ancestral function of this class of LysM-domain-containing receptor-like kinases.

Materials and Methods

Bioinformatics analysis

In the rest of this article, the term homologs will be used generically for genes belonging to the same gene family, whether or not they belong to the same species, were generated by duplication or speciation, or are functionally related. The coding sequences of MtLYK3 homologs in three legumes (Medicago truncatula Gaertn., Lotus japonicus (Regel) K. Larsen and Glycine max (L.) Merr.), including PsSYM37, the ortholog of MtLYK3 in Pisum sativum L. (Zhukov et al., 2008), and two nonlegumes (Arabidopsis thaliana (L.) Heynh. and Vitis vinifera L.) were retrieved from GenBank (Limpens et al., 2003; Arrighi et al., 2006; Zhu et al., 2006; Zhang et al., 2007; Lohmann et al., 2010). In addition, we used genome annotation data of the legume Cajanus cajan (L.) Huth (Varshney et al., 2012) and five additional nonlegume species: Malus domestica Borkh. (Velasco et al., 2010), Populus trichocarpa Torr. & A. Gray ex Hook. (Tuskan et al., 2006), Prunus persica (L.) Batsch (International Peach Genome Initiative et al., 2013), Fragaria vesca L. (Shulaev et al., 2011) and Cannabis sativa L. (Van Bakel et al., 2011). All genes and accession numbers used are listed in Table 1. We identified homologous genes using BLASTN from the BLAST+ package (Altschul et al., 1990) against predicted coding sequence databases and selected MtLYK3-LjNFR1a homologs based on preliminary phylogenetic analysis.

Identification of MtLYK3-LjNFR1a homologs in Cercis chinensis

A BAC library of the Cercis chinensis Bunge individual NA63335, obtained from the USDA/ARS U.S. National Arboretum, was constructed. The genome size was determined at 350 Mbp with a ploidy level of 2n = 14 (data not shown). High molecular weight DNA, isolated from nuclei, was partially digested with HindIII and ligated in the pCC1BAC cloning vector (Epicentre, Madison, WI, USA). The average insert size was 165 kb and a total of 36 814 clones was picked, resulting in a 9.5× genome coverage. BAC clone DNA was spotted on two Hybond N+ filters (Amersham Biosciences, Pittsburgh, PA, USA) and the filters were screened for the presence of MtLYK3-LjNFR1a homologs. To this end, two probes were designed and amplified on C. chinensis root cDNA using the following primers: CsProbe_1_F (GGTGCAAATTGCTCTGGATT), CsProbe_1_R (TTGGGTAGTTATCGCCAAGC), CsProbe_2_F (CTGCTCAAGATGGGAAGGTC) and Cs_Probe_2_R (GCAACTTTCTCGCCTCTCAG). BAC clones harboring MtLYK3-LjNFR1a homologs were identified and grouped in contigs based on RFLP analysis.
Two BACs were shotgun sequenced using next-generation sequencing. Clustering and DNA sequencing of the BACs were performed according to the manufacturer's protocols through paired-end sequencing using the Illumina Genome Analyzer II. A total of 8 pmol of DNA was used. The BAC vector sequence and contaminating E. coli sequence tags were removed from the dataset by collecting reads using either the BOWTIE 10.3 short read aligner or NextGENe v1.65 software (Softgenetics, State College, PA, USA). After filtering, the reads were assembled into contigs using the short-read assembler PEassembly from the NextGENe software package.

Molecular phylogeny and molecular evolution analyses

The coding sequences of MtLYK3-LjNFR1a and MtLYK8-LjLYS7 (which was used as outgroup) homologs in all studied species were translated and then aligned manually. The coding sequence alignment was derived from the amino acid alignment. The alignment length was 2793 bp for 42 sequences (including stop codons). We manually removed uninformative regions (regions of poor homology or containing mostly alignment gaps) and used the resulting 1503-bp alignment. Sequences and the unfiltered and filtered alignments are available in Supporting Information Notes S1. Phylogenetic trees were reconstructed using maximum likelihood with PHYML v20130329 (Guindon et al., 2009) using the GTR+Γ6 model of evolution of nucleotide sequences, a BIONJ starting tree and 1000 bootstrap repetitions.

We divided the phylogeny into five unrooted subtrees based on the identified clades (see the Results section) and split the alignment into an extracellular domain (amino acid residues 1 to 191 of the cleaned alignment, corresponding to amino acid residues 24–223 of the MtLYK3 sequence) and an intracellular domain (amino acid residues 192 to 501 of the cleaned alignment, corresponding to amino acid residues 303–619 of the MtLYK3 sequence). Codon evolution models M1a and M2a were fitted using CODEML from the PAML package version 4.6 (Yang, 2007) for both the extracellular and intracellular part of the alignment and for each of the five subtrees. Statistical significance of M2a relative to M1a was evaluated by the likelihood ratio test based on a χ² distribution with two degrees of freedom. In the case where M2a was significant, codon sites likely to fall into the positive selection category were identified using results of the Bayes empirical Bayes procedure (Yang et al., 2005) reported by CODEML.
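The likelihood ratio test comparing M2a with M1a can be sketched as follows. This is a minimal illustration, not the authors' script: the function name and the log-likelihood values are hypothetical, and in practice the log-likelihoods would be read from the CODEML output for each model. It relies on the fact that, for two degrees of freedom, the chi-square survival function reduces to exp(-x/2), so no statistics library is needed.

```python
import math


def lrt_m2a_vs_m1a(lnl_m1a: float, lnl_m2a: float) -> tuple:
    """Likelihood ratio test of M2a against the nested model M1a.

    Returns the test statistic 2 * (lnL_M2a - lnL_M1a) and its p-value
    under a chi-square distribution with two degrees of freedom.
    """
    stat = 2.0 * (lnl_m2a - lnl_m1a)
    # Numerical noise in the optimisation can make the statistic
    # slightly negative; clamp it at zero.
    stat = max(stat, 0.0)
    # With 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only (not values from this study).
stat, p = lrt_m2a_vs_m1a(lnl_m1a=-5230.4, lnl_m2a=-5221.1)
print(f"2*delta(lnL) = {stat:.2f}, p = {p:.2e}")
```

A small p-value indicates that adding the positively selected site class of M2a significantly improves the fit over M1a; only then are the Bayes empirical Bayes site predictions worth inspecting.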

Results

Phylogenetic reconstruction of the MtLYK3-LjNFR1a gene family

In order to decipher the evolutionary history of the MtLYK3-LjNFR1a gene family in relation to the origin of the rhizobial symbiosis in the Fabaceae, we included a caesalpinioid legume species of the tribe Cercideae in our evolutionary analysis. To this end, Cercis chinensis was selected as it has a relatively small genome size of c. 350 Mbp. A C. chinensis bacterial artificial chromosome (BAC) library with a 10× genome coverage was constructed and screened by Southern blotting using MtLYK3-LjNFR1 homologous sequences as probe. The identified BAC clones grouped in two contigs, and a representative clone of each contig was sequenced and subsequently annotated. This revealed that both BAC clones represented heterozygous copies of the same region containing a small gene cluster of LysM-type receptor kinases (Fig. 1). Only a phylogenetic analysis can unambiguously resolve the evolutionary history of gene duplications. To reconstruct the phylogeny of the MtLYK3-LjNFR1a gene family, we included the homologous protein sequences from C. chinensis, five papilionoid legume species (Medicago truncatula, Pisum sativum with only one sequence available, Lotus japonicus, Glycine max and Cajanus cajan), three Rosaceae species (Fragaria vesca, Malus domestica and Prunus persica), one Cannabaceae species (Cannabis sativa), one Salicaceae species (Populus trichocarpa), and two species outside of the Fabidae clade (Arabidopsis thaliana and Vitis vinifera). In addition to MtLYK3-LjNFR1a-like genes, we also included homologs of MtLYK8-LjLYS7, a LysM-domain receptor kinase gene that is closely related (Arrighi et al., 2006). In

Table 1 List of genes included in this study

Family | Species | Gene | Accession | Location | Position/strand | Length | Exons | Reference | Clade
Brassicaceae | Arabidopsis thaliana | AtCERK1 | AB367524 | chr. 3 | 7 615 530/− | 1854 | 1–12 | Zhu et al. (2006) | CERK1
Cannabaceae | Cannabis sativa | CasLYK1 | PK21129 | scaf. 29873 | 5118/+ | 1016 | 5–12* | This study | CERK1
– | – | CasLYK2 | HG426464 | scaf. 23614 | 8827/− | 1800 | 1–12 | This study | Outgroup
– | – | CasLYK3 | PK06694 | scaf. 20385 | 59 827/− | 1743 | 1–12 | This study | Outgroup
Rosaceae | Fragaria vesca | FvLYK1 | gene00531 | chr. 6 | 31 071 242/− | 1863 | 1–12 | This study | CERK1
– | – | FvLYK2 | gene19496 | chr. 3 | 750 197/− | 1866 | 1–11 | This study | Outgroup
– | Malus domestica | MdLYK1 | MDP0000175360 | chr. 17 | 8 605 950/− | 1566 | 1–10* | This study | CERK1
– | – | MdLYK2 | MDP0000136494 | chr. 9 | 7 623 570/− | 1959 | 1–12 | This study | CERK1
– | – | MdLYK3 | HG426465 | chr. 5 | 1 019 451/+ | 1755 | 1–11 | This study | Outgroup
– | Prunus persica | PpLYK1 | ppa019968m | scaf. 3 | 16 367 422/+ | 1827 | 1–12 | This study | CERK1
– | – | PpLYK2 | ppa003023m | scaf. 3 | 16 353 922/+ | 960 | 6–12* | This study | CERK1
– | – | PpLYK3 | ppa017142m | scaf. 4 | 760 712/+ | 1866 | 1–12 | This study | Outgroup
Salicaceae | Populus trichocarpa | PtLYK1 | XM_002301574 | chr. II | 19 188 799/+ | 1845 | 1–12 | This study | CERK1
– | – | PtLYK2 | XM_002321105 | chr. XIV | 8 263 253/− | 1839 | 1–12 | This study | CERK1
– | – | PtLYK3 | XM_002317109 | chr. XI | 735 066/− | 1863 | 1–11 | This study | Outgroup
Vitaceae | Vitis vinifera | VvLYK1 | GSVIVT01030482001 | chr. 12 | 6 054 569/− | 1845 | 1–12 | Zhang et al. (2009) | CERK1
– | – | VvLYK2 | GSVIVT01012662001 | chr. 10 | 416 409/+ | 1872 | 1–12 | Zhang et al. (2009) | Outgroup
– | – | VvLYK3 | GSVIVT01012665001 | chr. 10 | 423 032/+ | 1863 | 1–12 | Zhang et al. (2009) | Outgroup
Fabaceae | Cajanus cajan | CacLYK1 | C.cajan_09999 | chr. 3 | 20 062 965/+ | 1803 | 1–12 | This study | A
– | – | CacLYK2 | C.cajan_15801 | chr. 8 | 4 755 526/+ | 1974 | 1–12 | This study | C
– | – | CacLYK3 | C.cajan_16785 | chr. 8 | 15 783 159/+ | 1883 | 1–12 | This study | Outgroup
– | – | CacLYK4 | HG426463 | scaf. 126590 | 114 028/− | 1923 | 1–12 | This study | B
– | Cercis chinensis | CecLYK1 | HG426462 | Unknown | – | 1614 | 1–12 | This study | B
– | – | CecLYK2 | HG426462 | Unknown | – | 1914 | 1–12 | This study | C
– | Glycine max | GmLYK2 | GmW2098N11.15 | chr. 2 | 48 547 625/+ | 1854 | 1–12 | Zhang et al. (2007) | B
– | – | GmNFR1a | GmW2098N11.16 | chr. 2 | 48 554 799/+ | 1860 | 1–12 | Zhang et al. (2007) | A
– | – | GmNFR1b | GmW2098N15.9 | chr. 14 | 3 509 324/− | 1860 | 1–12 | Zhang et al. (2007) | A
– | – | GmLYK3 | GmW2026N19.18 | chr. 15 | 8 678 938/− | 1884 | 1–12 | Zhang et al. (2007) | Outgroup
– | – | GmLYK2b | Gm0062x00016 | chr. 20 | 16 207 633/− | 1848 | 1–12 | Zhang et al. (2009) | C
– | Lotus japonicus | LjNFR1a | AJ575248 | chr. 2 | 38 634 189/+ | 1866 | 1–12 | Zhu et al. (2006) | A
– | – | LjNFR1b | AJ575249 | chr. 2 | 38 618 061/+ | 1893 | 1–12 | Zhu et al. (2006) | B
– | – | LjNFR1c | AB503681 | chr. 2 | 38 611 968/+ | 1803 | 1–12 | Zhu et al. (2006) | B
– | – | LjLYS6 | AB503687 | chr. 6 | 13 743 066/− | 1863 | 1–13 | Lohmann et al. (2010) | C
– | – | LjLYS7 | AB503688 | chr. 6 | 21 859 958/+ | 1866 | 1–12 | Lohmann et al. (2010) | Outgroup
– | Medicago truncatula | MtLYK1 | AY372401 | chr. 5 | 36 372 823/+ | 1772 | 1–11 | Limpens et al. (2003) | B
– | – | MtLYK2 | AY372420 | chr. 5 | 36 297 139/+ | 1839 | 1–12 | Limpens et al. (2003) | A
– | – | MtLYK3 | AY372406 | chr. 5 | 36 224 960/+ | 1863 | 1–12 | Limpens et al. (2003) | A
– | – | MtLYK6 | AY372404 | chr. 5 | 36 189 086/+ | 1725 | 1–11 | Limpens et al. (2003) | B
– | – | MtLYK7 | AY372405 | chr. 5 | 36 184 066/+ | 1863 | 1–12 | Limpens et al. (2003) | B
– | – | MtLYK8 | MtD06512 | Unknown | – | 1797 | 1–12 | Arrighi et al. (2006) | Outgroup
– | – | MtLYK9 | XM_003601328 | chr. 3 | 25 682 831/+ | 1866 | 1–13 | Arrighi et al. (2006) | C
– | Pisum sativum | PsSYM37 | EU564096 | Unknown | – | 1854 | 1–12 | Zhukov et al. (2008) | A

Location is given as a chromosome or scaffold identifier when available; position and strand are given relative to this chromosome or scaffold (position is the start of the coding sequence). Length is given in base pairs (coding sequence, including the stop codon). Exon numbers are preceded or followed by an asterisk if the sequence is 5′ or 3′ partial, respectively.


Fig. 1 Genomic organization of MtLYK3-LjNFR1a homologs and pseudogenes. All gene names include a two- or three-letter prefix indicating the species of origin. The representation is not to scale. Distances between genes (based on coding sequence) and gaps are given in base pairs (bp) or kilobase pairs (kb). Colors represent groups defined after the phylogenetic analysis (see Fig. 2). Pseudogenes and partial sequences are represented by dashed lines and stripes at gene ends, respectively. Species of origin: At, Arabidopsis thaliana; Cas, Cannabis sativa; Fv, Fragaria vesca; Md, Malus domestica; Pp, Prunus persica; Pt, Populus trichocarpa; Vv, Vitis vinifera; Cac, Cajanus cajan; Cec, Cercis chinensis; Gm, Glycine max; Lj, Lotus japonicus; Mt, Medicago truncatula.

legumes, many pseudogenes can be identified owing to a disrupted reading frame or an incomplete coding sequence. These pseudogenes were not included in the phylogenetic analysis. MtLYK4 is likely a chimeric copy of other paralogs as its extracellular domain is nearly identical to MtLYK3 (Limpens et al., 2003) and its intracellular domain is most similar to other copies (Zhu et al., 2006). A preliminary phylogenetic analysis supported that view (data not shown). Tandem duplications of MtLYK3-LjNFR1a genes can be identified in all legume species and, outside the legume family, in P. persica (taking into account a pseudogene). In total, we retrieved 42 genes from 13 different species with one to seven copies per species (Table 1; Notes S1). These were used to construct a phylogenetic tree based on an alignment of coding sequences. This resulted in a tree with two main clades representing MtLYK8-LjLYS7 homologs and MtLYK3-LjNFR1a homologs (Fig. 2). Except A. thaliana, which has no MtLYK8-LjLYS7 homolog, all included species have at least one gene in both groups. The topology of both groups is consistent with species phylogeny; V. vinifera genes emerge first, followed by P. trichocarpa genes, then genes from the Rosales species forming a clade, and finally legume genes. This pattern strongly suggests that the initial duplication that gave rise to MtLYK8-LjLYS7 and MtLYK3-LjNFR1a is the oldest point of the tree and allows us to deduce the location of the root (Fig. 2).

Two duplications predate the origin of the legume family and of the nitrogen-fixing symbiosis

We identified the nodes that correspond to duplication events through reconciliation of the gene tree with the species phylogeny (Fig. 2). Except for the ancestral duplication that generated the MtLYK8-LjLYS7 and MtLYK3-LjNFR1a clades, duplications that are shared by at least two of the species we examined are restricted to the MtLYK3-LjNFR1a clade in legumes. The legume-specific group within the MtLYK3-LjNFR1a clade is divided into three subclades, all strongly supported by the bootstrap analysis. We named these groups A, B and C, with the Nod factor receptor genes represented by group A. All three groups contain at least one gene copy of the symbiotic papilionoid legumes M. truncatula, L. japonicus, G. max and C. cajan. The single P. sativum sequence clusters with MtLYK3, confirming that the two genes are functional orthologs and that the MtLYK2–MtLYK3 duplication predates the Medicago–Pisum divergence. Groups B and C both also contain a C. chinensis gene, whereas a representative of group A was not identified in this species. To rule out the possibility that an A gene exists in C. chinensis but had been missed in the analysis, a PCR experiment was conducted using degenerate primers and genomic DNA of C. chinensis and a related species, Cercis siliquastrum L. This experiment also failed to identify the A gene, but yielded both B and C gene copies in C. siliquastrum (data not shown). Therefore, it seems that Cercis does not have a homolog of Nod factor receptor genes. As the Cercideae are the most basal legume tribe, and as Cercis has genes belonging to both the B and C groups, we can conclude that the two rounds of duplication preceded the origin of the legume family, giving rise to three copies represented by groups A, B and C (A being the most basal), followed by the loss of the A gene in Cercis species. Therefore the duplications occurred earlier than the legume WGD (which is specific to the Papilionoideae), but also before the evolution of the rhizobial symbiosis.

The legume-specific copies of MtLYK3 might result from tandem duplications

In M. truncatula, L. japonicus and G. max, group A, B and C genes are always organized as two separated loci. The genes from group A, including Nod factor receptors, are downstream of one or more genes from group B. In G. max this locus is duplicated, probably due to a recent WGD event. In one of the paralogous regions the group B gene copy has been lost. However, the pseudogene can still be located (Fig. 1, chr. 14). The cluster of A and B genes has been more widely extended in M. truncatula with two group B genes, two group A genes and a chimeric copy (MtLYK4), all interspersed with multiple pseudogenes. With the exception of MtLYK5, pseudogenes are truncated and seem to belong to group B (data not shown). In all papilionoid legumes investigated, a single C gene has been identified, located at a different, unlinked locus. By contrast, in the caesalpinioid legume C. chinensis, group B and C genes are linked in a single locus. The most parsimonious scenario to explain this divergence between papilionoid and caesalpinioid legumes is that the original duplications which gave birth to the A, B and C groups were initially in tandem. For the B and C genes this organization remained conserved in C. chinensis, whereas in an ancestral papilionoid legume the chromosomal localization of the C gene changed. This could have been caused by a translocation event, by gene conversion, or by a WGD followed by complementary gene loss (loss of the C copy on one homologous chromosome, and loss of A and B copies on the other).

Fig. 2 Maximum-likelihood phylogenetic tree of the MtLYK3-LjNFR1a and MtLYK8-LjLYS7 gene families. All gene names include a two- or three-letter prefix indicating the species of origin. Robustness of the topology was assessed by 1000 bootstrap replications. Portions of the tree poorly supported (< 700 replications) are represented by dotted lines. Duplication nodes are indicated by white circles and were inferred by strict parsimony-based reconciliation of gene and species trees. Most recent common ancestors are represented by black diamonds (ancestor of all legume genes) and black (ancestor of a monophyletic group) and white (ancestor of a paraphyletic group) squares. CERK1 is the group gathering all nonlegume homologs of MtLYK3-LjNFR1a, and A, B and C are legume-specific groups predating the origin of the legume family. Species of origin: At, Arabidopsis thaliana; Cas, Cannabis sativa; Fv, Fragaria vesca; Md, Malus domestica; Pp, Prunus persica; Pt, Populus trichocarpa; Vv, Vitis vinifera; Cac, Cajanus cajan; Cec, Cercis chinensis; Gm, Glycine max; Lj, Lotus japonicus; Mt, Medicago truncatula.

Gain of conserved motifs in kinase domain is not specific for legume MtLYK3-LjNFR1a receptors

Domain swapping experiments between AtCERK1 and LjNFR1a showed that two regions of the kinase domain of LjNFR1a, namely the activation loop and a three amino acid (YAQ) motif, are required for symbiotic signaling. This has led to the hypothesis that adaptation of the activation loop and gain of the YAQ motif mediated the shift of the kinase function from defense to symbiosis (Nakagawa et al., 2011). We aimed to consider this hypothesis from a phylogenetic perspective. We used WebLogo (Crooks et al., 2004) in order to summarize the conservation of the kinase region containing the activation loop and the YAQ motif (corresponding to positions 464–500 of the MtLYK3 sequence). We defined five clades based on the phylogenetic tree (Fig. 2): the MtLYK8-LjLYS7 homologs (outgroup), the nonlegume MtLYK3-LjNFR1 homologs including AtCERK1 (CERK1) and the three legume-specific clades A, B and C. We generated one logo for each of those clades (Fig. 3). Strikingly, the YAQ motif is perfectly conserved not only in group A (containing Nod factor receptors), but also in the MtLYK8-LjLYS7 outgroup, not known to play a role in the rhizobial symbiosis. Furthermore, we found that the YAQ motif is also strongly conserved in the CERK1 clade, except for AtCERK1 itself and for PpLYK2, which has a YAR motif instead. Next we focused on the activation loop domain. In line with previous analysis (Nakagawa et al., 2011), we found that the activation loop domain in Nod factor receptors of group A is well conserved, with a consensus motif xEVGxSTLxTRLV. The CERK1 clade shows a similar consensus motif TEVGSxSLPTRLV, which again is strongly degraded in AtCERK1 (seven substitutions with respect to this consensus) and shows some variation in PtLYK2 (P replaced by A) and PpLYK2 (E replaced by I). Consistently, a similar motif was found in the MtLYK8-LjLYS7 group, xExGSxSLxTRLV, with two exceptions: E replaced by V in VvLYK3 and the first L replaced by I in VvLYK2, the latter likely to be functionally neutral.
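Counting substitutions relative to a clade consensus, as done above for the activation loop and YAQ motifs, can be sketched in a few lines. This is an illustrative helper only: the function name is hypothetical, and the example activation loop sequence is invented (the residue at the variable "x" position is arbitrary), with its P-to-A change mirroring the variation described for PtLYK2. The two consensus strings and the YAR variant are taken from the text.

```python
def substitutions(consensus: str, motif: str, wildcard: str = "x") -> list:
    """List (position, consensus_residue, observed_residue) mismatches.

    Positions are 1-based. Wildcard positions in the consensus are
    variable sites and therefore match any residue.
    """
    if len(consensus) != len(motif):
        raise ValueError("motif and consensus must have the same length")
    return [
        (i, c, m)
        for i, (c, m) in enumerate(zip(consensus, motif), start=1)
        if c != wildcard and c != m
    ]


CERK1_CONSENSUS = "TEVGSxSLPTRLV"  # CERK1 clade consensus reported in the text

# Hypothetical sequence for illustration: the residue at the 'x' position
# is invented; only the P-to-A replacement follows the text.
example_loop = "TEVGSASLATRLV"

print(substitutions(CERK1_CONSENSUS, example_loop))  # [(9, 'P', 'A')]
print(substitutions("YAQ", "YAR"))                   # [(3, 'Q', 'R')]
```

Applied to a real alignment, the length of the returned list gives the number of substitutions with respect to the consensus, which is the quantity quoted for AtCERK1.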
We conclude that the activation loop and YAQ motif are essentially conserved in the Nod factor receptor clade (group A), the CERK1 clade as well as the MtLYK8-LjLYS7 outgroup, except for minor variation in duplicated genes and, more significantly, for AtCERK1. The latter appears to have undergone a lineage-specific degradation of these two motifs. Therefore, it is likely that the regions necessary for symbiotic signaling predated the MtLYK3-LjNFR1a–MtLYK8-LjLYS7 duplication and a fortiori the origin of the Rosids clade of plants, and have been lost secondarily in the A. thaliana lineage rather than gained in the legume family. We also examined the conservation of the activation loop and YAQ motif region in the legume-specific group B and C genes. Both regions are perfectly conserved in group C, and highly similar to those of group A (Fig. 3). By contrast, both regions are strongly degenerated and highly variable in proteins encoded by group B genes. Interestingly, the YAQ motif is never found in this clade while other regions of the kinase are better conserved. This shows that the symbiotic motifs present in the activation loop and YAQ domain were also lost in clade B genes following gene divergence.

Molecular evolution shows that legume paralogs have undergone functional divergence

Changes in protein function, such as those occurring during episodes of neofunctionalization, are likely to modify patterns of selective pressure on protein sequences. Selective constraints can be measured by determining the ratio of the nonsynonymous to synonymous rates of nucleotide substitution, a ratio usually termed ω. On the one hand, stronger constraints on the protein sequence owing to purifying selection reduce the number of allowed amino acid substitutions and consequently lower the value of ω. On the other, fast rates of adaptive nonsynonymous changes (positive selection) can result in increased ω ratios. Positive selection is more confidently detected through the presence of positions with ω > 1 rather than by an elevation of the global ω ratio (Yang & Bielawski, 2000).
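The interpretation of the nonsynonymous-to-synonymous rate ratio described above can be made concrete with a small sketch. The function names, the tolerance and the rate values below are assumptions chosen for illustration; in the actual analysis the ratio is estimated by maximum likelihood under codon models rather than computed from two point estimates.

```python
def rate_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute the ratio")
    return dn / ds


def selective_regime(ratio: float, tol: float = 1e-9) -> str:
    """Classify a dN/dS ratio into the three regimes discussed in the text."""
    if ratio > 1 + tol:
        return "positive selection"
    if ratio < 1 - tol:
        return "purifying selection"
    return "neutral"


# Hypothetical rates, for illustration only (not estimates from this study).
for dn, ds in [(0.02, 0.20), (0.30, 0.30), (0.50, 0.25)]:
    r = rate_ratio(dn, ds)
    print(f"dN={dn:.2f} dS={ds:.2f} ratio={r:.2f} -> {selective_regime(r)}")
```

A gene-wide ratio below 1 does not exclude positive selection at a few sites, which is why site-class models are preferred over a single global ratio.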
Fig. 3 Sequence logos showing amino acid conservation of the activation loop and YAQ regions of the kinase domain of MtLYK3-LjNFR1a and MtLYK8-LjLYS7 (outgroup) homologs within each of the five clades defined after the phylogenetic analysis. The left and right dotted frames indicate the activation loop and YAQ regions, respectively. Letter size is proportional to the conservation of the corresponding residues. Gaps in logos indicate gaps in the alignment.

To investigate possible changes in overall evolutionary constraint in the protein sequences in the different clades of the phylogeny of MtLYK3-LjNFR1a homologs following legume-specific duplications, we fitted two models of Yang et al. (2000): M1a is a two-ratio model allowing constrained (ω < 1) and neutral (ω = 1) positions, and M2a is a three-ratio model allowing constrained (ω < 1), neutral (ω = 1) and positively selected (ω > 1) positions. These models allowed us to evaluate patterns of purifying constraints in the different clades and to test for the occurrence of positive selection. We conducted the analysis separately for the extracellular and intracellular portions of the alignment (Table 2).

First we analyzed the MtLYK8-LjLYS7 clade and the nonlegume CERK1 clade. In both cases, most positions were fairly conserved, with nearly two thirds of the positions in the extracellular domain and over 90% of the positions in the intracellular domain falling into the constrained category. Among the three legume clades (groups A, B and C), proteins of the C clade exhibited the pattern most similar to CERK1, with no positive selection and a high proportion of strongly constrained positions. By contrast, clade B proteins showed a marked relaxation of selective constraints, with up to 54% of extracellular and 31% of intracellular positions classified as neutral (Table 2). The elevation of evolutionary rates in the kinase domain is particularly notable. However, despite this elevation of ω ratios, no position under positive selection was detected. By contrast, for the group A proteins we detected highly significant signatures of positive selection, which were all restricted to the extracellular domain. The estimated proportion of positions under positive selection (4%) corresponds to eight residues. Of these, only three positions were predicted with high statistical confidence to fall in the category where ω > 1 (positions in the MtLYK3 protein sequence: 43Q, 45R and 77G). All three lie within the first LysM chitin oligomer binding motif. Apart from these signatures of positive selection, clade A proteins show patterns of purifying constraints that are similar to CERK1.
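The comparison of M1a and M2a is a test between nested models. A minimal sketch of how a P-value for such a comparison is commonly obtained follows, assuming a likelihood-ratio test with two degrees of freedom (M2a adds one site class with its own ω and proportion). The function name and the log-likelihood values are hypothetical: this section reports neither log-likelihoods nor the exact test settings used by the authors.

```python
import math


def lrt_pvalue_df2(lnl_null: float, lnl_alt: float) -> float:
    """P-value of a likelihood-ratio test with 2 degrees of freedom.

    lnl_null: log-likelihood of the simpler model (here, M1a).
    lnl_alt:  log-likelihood of the richer model (here, M2a).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        # The richer model does not fit better: no evidence against the null.
        return 1.0
    # For 2 degrees of freedom the chi-square survival function
    # has the closed form exp(-x / 2), so no external library is needed.
    return math.exp(-stat / 2.0)


# Hypothetical log-likelihoods, for illustration only.
p_value = lrt_pvalue_df2(lnl_null=-2500.0, lnl_alt=-2485.0)
print(f"{p_value:.2e}")
```

The closed form only holds for two degrees of freedom; a comparison with a different number of extra parameters would need a general chi-square routine.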
The patterns of molecular evolution strongly suggest that the three copies of MtLYK3-LjNFR1a homologs in legumes have diverged functionally. This supports a model of neofunctionalization in which group C proteins fulfill the ancestral function associated with CERK1 proteins in other species (with the notable exception of the A. thaliana gene), while clade A and B proteins have evolved new functions, namely Nod factor perception in the case of clade A. The occurrence of positive selection in the ligand-binding domains of group A proteins strongly points to coevolution with Nod factors.

Discussion

We investigated the evolutionary history of Nod factor receptors from the MtLYK3-LjNFR1a gene family in order to understand the mechanisms that allowed them to gain their legume-specific function. We found that this gene family underwent two duplications in a relatively short time window that predates the origin of the legume family and, a fortiori, of the rhizobial symbiosis. We documented signatures of functional divergence between the resulting three paralogous copies, signatures of adaptive evolution between orthologous genes that encode rhizobial Nod factor receptors, and the loss of this gene copy in the basal and nonnodulating Cercis lineage. Based on these results, we argue that an event of gene duplication and subsequent neofunctionalization in the MtLYK3-LjNFR1a lineage was fundamental for the evolutionary gain of a legume-specific Nod factor receptor.

The phylogeny of LysM-domain receptor kinases in plants has been studied extensively (Arrighi et al., 2006; Zhu et al., 2006; Zhang et al., 2007, 2009; Lohmann et al., 2010). However, none of these studies focused specifically on the MtLYK3-LjNFR1a clade. Instead, they all considered the whole LysM receptor family. By focusing specifically on the Nod factor receptor clade, a more reliable reconstruction of the phylogeny of this clade was achieved. The phylogenetic tree presented here is robust and distinct from those presented in earlier studies. By including a representative species of the most basal legume tribe (the Cercideae), we demonstrated that two rounds of duplication in the MtLYK3-LjNFR1a clade are ancestral to all speciation events in the legume family. This means that these duplications predate the probable origin of the legume–rhizobium symbiosis.

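Table 2 Results of evolutionary analyses based on the ratio of nonsynonymous to synonymous evolutionary rates (ω)

| Clade | P-value | ω0 | ω1 | ω2 | p0 | p1 | p2 | Predicted positions |
|---|---|---|---|---|---|---|---|---|
| Extracellular domain | | | | | | | | |
| Outgroup | > 0.99 | 0.11 | 1.0 | – | 0.66 | 0.34 | – | – |
| CERK1 | 0.17 | 0.15 | 1.0 | – | 0.68 | 0.32 | – | – |
| A | < 10^-6 | 0.13 | 1.0 | 13.25 | 0.61 | 0.35 | 0.04 | 43, 45, 77 |
| B | 0.34 | 0.07 | 1.0 | – | 0.46 | 0.54 | – | – |
| C | > 0.99 | 0.06 | 1.0 | – | 0.77 | 0.23 | – | – |
| Intracellular domain | | | | | | | | |
| Outgroup | > 0.99 | 0.08 | 1.0 | – | 0.92 | 0.08 | – | – |
| CERK1 | > 0.99 | 0.07 | 1.0 | – | 0.96 | 0.04 | – | – |
| A | > 0.99 | 0.06 | 1.0 | – | 0.89 | 0.11 | – | – |
| B | > 0.99 | 0.10 | 1.0 | – | 0.69 | 0.31 | – | – |
| C | > 0.99 | 0.07 | 1.0 | – | 1.00 | 0.00 | – | – |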

The P-value gives the result of the test comparing the three-ratio (M2a) to the two-ratio (M1a) model. If the test is significant (one instance: clade A for the extracellular domain), the results of M2a are provided; otherwise, the results of M1a are provided. ω0, ω1 and ω2, the three rate categories, which are constrained to be < 1, = 1 and > 1, respectively; p0, p1 and p2, their respective frequencies. ω2 and p2 are not defined in M1a. Where M2a is significant, positions with > 0.95 posterior probability of falling into category ω2 are given with respect to the MtLYK3 sequence.
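As an illustration of how the site-class estimates in Table 2 combine, the sketch below computes the proportion-weighted mean ω for each clade in the extracellular domain. The weighted mean is a summary chosen for this example, not a statistic reported by the authors; the helper name `mean_omega` is hypothetical, and the numbers are copied from Table 2.

```python
def mean_omega(site_classes):
    """Proportion-weighted mean of omega over (proportion, omega) site classes."""
    total = sum(p for p, _ in site_classes)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("site-class proportions must sum to 1")
    return sum(p * w for p, w in site_classes)


# (proportion, omega) pairs for the extracellular domain, from Table 2.
# Clade A uses the M2a estimates; the other clades use M1a.
extracellular = {
    "Outgroup": [(0.66, 0.11), (0.34, 1.0)],
    "CERK1": [(0.68, 0.15), (0.32, 1.0)],
    "A": [(0.61, 0.13), (0.35, 1.0), (0.04, 13.25)],
    "B": [(0.46, 0.07), (0.54, 1.0)],
    "C": [(0.77, 0.06), (0.23, 1.0)],
}

for clade, classes in extracellular.items():
    print(f"{clade}: mean omega = {mean_omega(classes):.2f}")
```

For clade A the weighted mean stays just below 1 (about 0.96) even though 4% of positions fall in a class with ω = 13.25, which illustrates why site-level tests are preferred over a global ratio for detecting positive selection. Among the clades without positive selection, clade B has the highest mean (about 0.57), in line with the relaxed constraint described in the text.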


A WGD is known to have occurred in the Papilionoideae subfamily shortly after the origin of the rhizobial symbiosis in this lineage (Cannon et al., 2010; Young et al., 2011). As a result, this WGD did not affect more basal legume lineages of the Caesalpinioideae and Mimosoideae subfamilies, although nodulating lineages exist in both. Several genes involved in the rhizobial symbiosis have been maintained as paralogous gene pairs in papilionoid legumes after the WGD (Op den Camp et al., 2011a; Young et al., 2011; Ivanov et al., 2012). Many of those paralogous gene pairs show signatures of functional redundancy, suggesting that the ancestral gene (before the WGD event) already had a symbiotic function. Mechanisms such as complementary degenerative mutations may be progressively separating the functions of the two paralogs (subfunctionalization), a process that may occur independently in the different legume lineages. As the duplications in the MtLYK3-LjNFR1a clade are ancestral to the WGD event in papilionoid legumes, they may have had time to allow complete functional divergence (neofunctionalization) of the paralogous genes (Lynch & Conery, 2000).

A line of strong evidence supports the hypothesis that the three clades (A, B and C) in the MtLYK3-LjNFR1a lineage underwent functional divergence. Functional analyses in L. japonicus and M. truncatula demonstrated that LjNFR1a and MtLYK3 (group A) have a symbiotic function. It is likely that one or both of the copies GmNFR1a and GmNFR1b generated by a G. max-specific WGD event fulfill the same function. Strikingly, C. chinensis lacks a putative ortholog of this gene. Because this species contains gene copies of the other two clades (B and C), and because the emergence of the A clade predates the emergence of the other two, the Cercis lineage must have lost the ortholog of MtLYK3-LjNFR1a Nod factor receptors.
The lack of large-scale genomic or transcriptomic sequences in these nonmodel species makes it impossible to definitively rule out the presence of an A-clade copy. However, we studied a second species of the same genus (C. siliquastrum) in which we isolated homologs falling in the B and C clades, but not in the Nod factor receptor clade A. The loss of this gene in the Cercis lineage makes sense because these plants do not establish the rhizobial symbiosis.

It is currently unclear whether the rhizobial symbiosis evolved once or several times independently within legumes (Doyle, 2011). The single evolution hypothesis would place the event at the position of the common ancestor of Papilionoideae and Mimosoideae, including most of the Caesalpinioideae but not the Cercideae. The finding that the ancestor of the Cercideae had an A-clade gene supports this scenario. However, our results do not rule out the multiple evolution hypothesis. Both hypotheses clearly imply several events of loss of the rhizobial symbiosis. Furthermore, we cannot rule out less parsimonious models where the evolution of nodulation occurred even earlier, as we have no means of determining whether the Cercis lineage never evolved the rhizobial symbiosis or, alternatively, inherited it from the common ancestor of legumes and lost it subsequently.

One notable feature of the rhizobial symbiosis in legumes is its complex pattern of host–symbiont specificity, exhibiting a wide spectrum ranging from generalists to specialists (Perret et al., 2000). Among other factors, structural characteristics of Nod factors were shown to control specificity at early stages of the symbiosis (Downie, 2010). Therefore, it is likely that the extracellular domain of Nod factor receptors coevolved with the Nod factor structure of the bacterial symbiont. Our phylogenetic analysis singled out a few sites in the first LysM domain that evolve rapidly in MtLYK3-LjNFR1a proteins, a pattern that was not observed in the other two clades. Therefore, we hypothesize that these residues may be instrumental for host specificity. However, the precise role of the first LysM domain in ligand binding remains elusive. Biochemical and structural characterization of the extracellular region of AtCERK1 suggested that the second LysM domain is involved exclusively in ligand binding (Liu et al., 2012). This is supported by studies on the MtNFP-LjNFR5 Nod factor receptor that underlined the importance of the second LysM domain in ligand specificity (Radutoiu et al., 2007; Bensmihen et al., 2011). However, one of the sites we detected (43Q in both MtLYK3 and PsSYM37 sequences) was shown to be linked with within-species variation of sensitivity to Nod factor structure in P. sativum (Li et al., 2011), supporting the hypothesis that the first LysM domain contributes to ligand specificity.

In the remaining two clades, opposed evolutionary constraints were observed. LysM receptors of the B clade (LjNFR1b, LjNFR1c, MtLYK1, MtLYK6 and MtLYK7) displayed a striking relaxation of selective constraints, especially in the kinase domain, resulting in a substantial increase in rates of amino acid evolution. The symbiotic-specific conserved region of the kinase domain (activation loop and YAQ motif) was strongly degenerated in all genes of the B clade. One may argue that these genes might actually include pseudogenes and therefore bias evolutionary rate estimates.
However, several gene copies within the B clade are shared between species that have diverged over tens of millions of years, and divergent gene expression patterns support the functionality of these genes (Limpens et al., 2003; Zhang et al., 2007; Benedito et al., 2008; Lohmann et al., 2010).

Comparative functional analysis of AtCERK1 and LjNFR1a revealed that the activation loop and YAQ motif of the kinase domain of LjNFR1a are required for symbiotic signaling (Nakagawa et al., 2011). It was concluded that these regions appeared during the evolution of legumes and were instrumental for the evolution of the rhizobial symbiosis. However, we present evidence that the amino acid residues constituting symbiotic regions in the activation loop and YAQ motif are actually ancestral to CERK1, and can even be found in the more distal clade of MtLYK8-LjLYS7 homologs. Therefore, we argue that the symbiotic regions in the activation loop and YAQ motif are an ancestral feature, but were lost in the A. thaliana lineage. In parallel, a second loss of the same motifs occurred in the legume-specific B clade. In line with this, we argue that an ancestral function of AtCERK1-like genes in innate defense induction is unlikely, and that this function instead evolved secondarily and specifically in the A. thaliana lineage. In addition, our results suggest that clade C in legumes has taken over the ancestral function of CERK1, which is conserved in most nonlegumes (except A. thaliana). It is reasonable to suggest that this ancestral function is the perception of symbiotic signals from endomycorrhizal fungi.


It is interesting to note that the proteins from the MtLYK8-LjLYS7 clade also contain the activation loop and YAQ motif that are associated with symbiotic signaling. This observation points to an even older origin of symbiotic functions in the LysM receptor kinase gene family. This ancestral function seems to have been inherited by all species we studied, with the exception of A. thaliana. Most of these species are known to be able to form successful endomycorrhizal interactions, including Cercideae species (Alexander, 1989; Wang & Qiu, 2006). Arabidopsis thaliana is a clear exception, and P. trichocarpa might be another, as it is reported to form exclusively ectomycorrhizae (Harley & Harley, 1987). However, many Populus species are able to form both types of mycorrhizal symbioses, and some results suggest that P. trichocarpa can also establish both (Baum & Makeschin, 2000).

Recently it was demonstrated that the establishment of endomycorrhizae requires Myc-LCOs and that a homolog of the other Nod factor receptor, MtNFP-LjNFR5, is necessary for the perception of both rhizobia and endomycorrhizal fungi in Parasponia andersonii (Maillet et al., 2011; Op den Camp et al., 2011b). This strongly suggests that Nod factor perception was derived from Myc-LCO perception. As Nod factor perception most probably requires a heterodimeric receptor complex involving both MtNFP-LjNFR5 and MtLYK3-LjNFR1 (Madsen et al., 2011; Pietraszewska-Bogiel et al., 2013), this suggests that, besides a homolog of MtNFP-LjNFR5, a second receptor is also needed for Myc-LCO perception. Based on the presence of the kinase symbiotic domains, their expression in root tissue and loss of these genes in A. thaliana, we suggest that genes among the MtLYK8-LjLYS7, CERK1 or legume C clades fulfill such a function.
The expression profile of MtLYK8 (probeset ID: Mtr.45170.1.S1_at) in the Medicago Gene Atlas (Benedito et al., 2008) further supports this view, with moderate but consistent induction of expression in endomycorrhizae over several experiments (Gomez et al., 2009; Hogekamp et al., 2011; Ortu et al., 2012). Consistently, MtLYR1 (probeset ID: Mtr.19870.1.S1_at) exhibits the same pattern, whereas MtNFP (probeset ID: Mtr.15789.1.S1_at) does not. This could mean that the ancestral gene of the gene family comprising MtLYK8-LjLYS7 and MtLYK3-LjNFR1a genes was involved in symbiotic perception in endomycorrhizae.

Although none of the species we examined was actinorhizal, their common ancestor predates the common ancestor of all plants establishing the actinorhizal symbiosis with Frankia spp. (Doyle, 2011). The closest outgroup to actinorhizal plants would be P. trichocarpa. We note that, at this point of evolution, the ancestors of MtLYK8-LjLYS7 and MtLYK3-LjNFR1a genes both already contained the specific kinase domain associated with the symbiotic function. This may have allowed the evolution towards the perception of postulated Frankia signaling molecules in actinorhizal species (Hocher et al., 2011; Pawlowski & Demchenko, 2012).

The study presented here provides an example of a major functional innovation that was preceded by duplication events in a key gene. It supports the hypothesis that gene duplications in MtLYK3-LjNFR1a LysM domain receptor kinases have played a key role in the evolution of the rhizobial symbiosis in legumes. Furthermore, our evolutionary analysis provides working hypotheses for the ongoing functional characterization of members of this gene family.

Acknowledgements

We thank Douglas Cook and Margaret Pooler for collaborating on the construction of the Cercis chinensis genomic library. Three anonymous referees improved the quality of the manuscript through constructive suggestions. This research was funded by grants NWO-VIDI-864.06.007, ERC-2011-AdG-294790 and NWO-NSFC-846.11.005. S.D.M.’s laboratory is supported by the Laboratory of Excellence ARBRE (ANR-2011-LABXARBRE01).

References

Alexander IJ. 1989. Systematics and ecology of ectomycorrhizal fungi. In: Stirton CH, Zarucchi JL, eds. Advances in Legume Biology. St Louis, MO, USA: Missouri Botanical Garden, 607–634.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215: 403–410.
Arrighi J, Barre A, Ben Amor B, Bersoult A, Campos Soriano L, Mirabella R, de Carvalho-Niebel F, Journet E, Gherardi M, Huguet T et al. 2006. The Medicago truncatula LysM-receptor kinase gene family includes NFP and new nodule-expressed genes. Plant Physiology 142: 265–279.
Baum C, Makeschin F. 2000. Effects of nitrogen and phosphorus fertilization on mycorrhizal formation of two poplar clones (Populus trichocarpa and P. tremula × tremuloides). Journal of Plant Nutrition and Soil Science 163: 491–497.
Benedito VA, Torres-Jerez I, Murray JD, Andriankaja A, Allen S, Kakar K, Wandrey M, Verdier J, Zuber H, Ott T et al. 2008. A gene expression atlas of the model legume Medicago truncatula. Plant Journal 55: 504–513.
Bensmihen S, De Billy F, Gough C. 2011. Contribution of NFP LysM domains to the recognition of Nod factors during the Medicago truncatula/Sinorhizobium meliloti symbiosis. PLoS One 6: e26114.
Broghammer A, Krusell L, Blaise M, Sauer J, Sullivan JT, Maolanon N, Vinther M, Lorentzen A, Madsen EB, Jensen KJ et al. 2012. Legume receptors perceive the rhizobial lipochitin oligosaccharide signal molecules by direct binding. Proceedings of the National Academy of Sciences, USA 109: 13859–13864.
Cannon SB, Ilut D, Farmer AD, Maki SL, May GD, Singer SR, Doyle JJ. 2010. Polyploidy did not predate the evolution of nodulation in all legumes. PLoS One 5: e11630.
Catoira R, Timmers AC, Maillet F, Galera C, Penmetsa RV, Cook D, Denarie J, Gough C. 2001. The HCL gene of Medicago truncatula controls Rhizobium-induced root hair curling. Development 128: 1507–1518.
Crooks GE, Hon G, Chandonia JM, Brenner SE. 2004. WebLogo: a sequence logo generator. Genome Research 14: 1188–1190.
Czaja LF, Hogekamp C, Lamm P, Maillet F, Martinez EA, Samain E, Denarie J, Kuster H, Hohnjec N. 2012. Transcriptional responses toward diffusible signals from symbiotic microbes reveal MtNFP- and MtDMI3-dependent reprogramming of host gene expression by arbuscular mycorrhizal fungal lipochitooligosaccharides. Plant Physiology 159: 1671–1685.
Downie JA. 2010. The roles of extracellular proteins, polysaccharides and signals in the interactions of rhizobia with legume roots. FEMS Microbiology Reviews 34: 150–170.
Doyle JJ. 1998. Phylogenetic perspectives on nodulation: evolving view of plants and symbiotic bacteria. Trends in Plant Science 3: 473–478.
Doyle JJ. 2011. Phylogenetic perspectives on the origins of nodulation. Molecular Plant-Microbe Interactions 24: 1289–1295.
Geurts R, Vleeshouwers VGAA. 2012. Mycorrhizal symbiosis: ancient signalling mechanisms co-opted. Current Biology 22: R997–R999.


Gomez SK, Javot H, Deewatthanawong P, Torres-Jerez I, Tang YH, Blancaflor EB, Udvardi MK, Harrison MJ. 2009. Medicago truncatula and Glomus intraradices gene expression in cortical cells harboring arbuscules in the arbuscular mycorrhizal symbiosis. BMC Plant Biology 9: 10.
Guindon S, Dufayard JF, Hordijk W, Lefort V, Gascuel O. 2009. PhyML: fast and accurate phylogeny reconstruction by maximum likelihood. Infection, Genetics and Evolution 9: 384–385.
Harley JL, Harley EL. 1987. A check-list of mycorrhiza in the British flora. New Phytologist 105: 1–102.
Hocher V, Alloisio N, Bogusz D, Normand P. 2011. Early signaling in actinorhizal symbioses. Plant Signaling & Behavior 6: 1377–1379.
Hogekamp C, Arndt D, Pereira PA, Becker JD, Hohnjec N, Küster H. 2011. Laser microdissection unravels cell-type-specific transcription in arbuscular mycorrhizal roots, including CAAT-box transcription factor gene expression correlating with fungal contact and spread. Plant Physiology 157: 2023–2043.
International Peach Genome Initiative, Verde I, Abbott AG, Scalabrin S, Jung S, Shu SQ, Marroni F, Zhenbentyayeva T, Dettori MT, Grimwood J et al. 2013. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nature Genetics 45: 487–494.
Ivanov S, Fedorova EE, Limpens E, De Mita S, Genre A, Bonfante P, Bisseling T. 2012. Rhizobium-legume symbiosis shares exocytotic pathway required for arbuscule formation. Proceedings of the National Academy of Sciences, USA 109: 8316–8321.
Lavin M, Herendeen PS, Wojciechowski MF. 2005. Evolutionary rates analysis of Leguminosae implicates a rapid diversification of lineages during the Tertiary. Systematic Biology 54: 575–594.
Legume Phylogeny Working Group. 2013. Legume phylogeny and classification in the 21st century: progress, prospects and lessons for other species-rich clades. Taxon 62: 217–248.
Lewis GP, Schrire BD, Mackinder BA, Lock M. 2005. Legumes of the world. Kew, UK: Royal Botanical Garden.
Li RH, Know MR, Edwards A, Hogg B, Ellis THN, Wei GH, Downie JA. 2011. Natural variation in host-specific nodulation of pea is associated with a haplotype of the SYM37 LysM-type receptor-like kinase. Molecular Plant-Microbe Interactions 24: 1396–1403.
Limpens E, Franken C, Smit P, Willemse J, Bisseling T, Geurts R. 2003. LysM domain receptor kinases regulating rhizobial Nod factor-induced infection. Science 302: 630–633.
Liu B, Li JF, Ao Y, Qu JW, Li ZQ, Su JB, Zhang Y, Liu J, Feng DR, Qi KB et al. 2012. Lysin motif-containing proteins LYP4 and LYP6 play dual roles in peptidoglycan and chitin perception in rice innate immunity. Plant Cell 24: 3406–3419.
Lohmann GV, Shimoda Y, Nielsen MW, Jorgensen FG, Grossmann C, Sandal N, Sorensen K, Thirup S, Madsen LH, Tabata S et al. 2010. Evolution and regulation of the Lotus japonicus LysM receptor gene family. Molecular Plant-Microbe Interactions 23: 510–521.
Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science 290: 1151–1155.
Madsen EB, Antolin-Llovera M, Grossmann C, Ye JY, Vieweg S, Broghammer A, Krusell L, Radutoiu S, Jensen ON, Stougaard J et al. 2011. Autophosphorylation is essential for the in vivo function of the Lotus japonicus Nod factor receptor 1 and receptor-mediated signalling in cooperation with Nod factor receptor 5. Plant Journal 65: 404–417.
Maillet F, Poinsot V, Andre O, Puech-Pages V, Haouy A, Gueunier M, Cromer L, Giraudet D, Formey D, Niebel A et al. 2011. Fungal lipochitooligosaccharide symbiotic signals in arbuscular mycorrhiza. Nature 469: 58–64.
Miya A, Albert P, Shinya T, Desaki Y, Ichimura K, Shirasu K, Narusaka Y, Kawakami N, Kaku H, Shibuya N. 2007. CERK1, a LysM receptor kinase, is essential for chitin elicitor signaling in Arabidopsis. Proceedings of the National Academy of Sciences, USA 104: 19613–19618.
Nakagawa T, Kaku H, Shimoda Y, Sugiyama A, Shimamura M, Takanashi K, Yazaki K, Aoki T, Shibuya N, Kouchi H. 2011. From defense to symbiosis: limited alterations in the kinase domain of LysM receptor-like kinases are crucial for evolution of legume–rhizobium symbiosis. Plant Journal 65: 169–180.

Op den Camp RHM, De Mita S, Lillo A, Cao Q, Limpens E, Bisseling T, Geurts R. 2011a. A phylogenetic strategy based on a legume-specific whole genome duplication yields symbiotic cytokinin type-A response regulators. Plant Physiology 157: 2013–2022.
Op den Camp RHM, Streng A, De Mita S, Cao Q, Polone E, Liu W, Ammiraju J, Kunrdna D, Wing R, Untergasser A et al. 2011b. LysM-type mycorrhizal receptor recruited for rhizobium symbiosis in non-legume Parasponia. Science 331: 909–912.
Ortu G, Balestrini R, Pereira PA, Becker JD, Küster H, Bonfante P. 2012. Plant genes related to gibberellin biosynthesis and signaling are differentially regulated during the early stages of AM fungal interactions. Molecular Plant 5: 951–954.
Parniske M. 2008. Arbuscular mycorrhiza: the mother of plant root endosymbioses. Nature Reviews Microbiology 6: 763–775.
Pawlowski K, Demchenko KN. 2012. The diversity of actinorhizal symbiosis. Protoplasma 249: 967–979.
Perret X, Staehelin C, Broughton WJ. 2000. Molecular basis of symbiotic promiscuity. Microbiology and Molecular Biology Reviews 64: 180–201.
Pietraszewska-Bogiel A, Lefebvre B, Koini MA, Klaus-Heisen D, Takken FLW, Geurts R, Cullimore JV, Gadell TWJ. 2013. Interaction of Medicago truncatula lysin motif receptor-like kinases, NFP and LYK3, produced in Nicotiana benthamiana induces defence-like responses. PLoS One 8: e65055.
Radutoiu S, Madsen LH, Madsen EB, Felle H, Umehara Y, Grønlund M, Sato S, Nakamura Y, Tabata S, Sandal N et al. 2003. Plant recognition of symbiotic bacteria requires two LysM receptor-like kinases. Nature 425: 585–592.
Radutoiu S, Madsen LH, Madsen EB, Jurkevitch A, Fukai E, Quistgaard EMH, Albrektsen AS, James EK, Thirup S, Stougaard J. 2007. LysM domains mediate lipochitin-oligosaccharide recognition and Nfr genes extend the symbiotic host range. EMBO Journal 26: 3923–3935.
Serna-Sanz A, Parniske M, Peck SC. 2011. Phosphoproteome analysis of Lotus japonicus roots reveals shared and distinct components of symbiosis and defense. Molecular Plant-Microbe Interactions 24: 932–937.
Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP et al. 2011. The genome of woodland strawberry (Fragaria vesca). Nature Genetics 43: 109–116.
Sprent JI. 2007. Evolving ideas of legume evolution and diversity: a taxonomic perspective on the occurrence of nodulation. New Phytologist 174: 11–25.
Sprent JI. 2008. 60 Ma of legume nodulation. What’s new? What’s changing? Journal of Experimental Botany 59: 1081–1084.
Streng A, Op den Camp R, Bisseling T, Geurts R. 2011. Evolutionary origin of rhizobium Nod factor signaling. Plant Signaling & Behavior 6: 1510–1514.
Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A et al. 2006. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596–1604.
Van Bakel H, Stout JM, Cote AG, Tallon CM, Sharpe AG, Hughes TR, Page JE. 2011. The draft genome and transcriptome of Cannabis sativa. Genome Biology 12: R102.
Varshney RK, Chen WB, Li YP, Bharti AK, Saxena RK, Schlueter JA, Donoghue MTA, Azam S, Fan GY, Whaley AM et al. 2012. Draft genome sequence of pigeonpea (Cajanus cajan), an orphan legume crop of resource-poor farmers. Nature Biotechnology 30: 83–89.
Velasco R, Zharkikh A, Affourtit J, Dhingra A, Cestaro A, Kalyanaraman A, Fontana P, Bhatnagar SK, Troggio M, Pruss D et al. 2010. The genome of the domesticated apple (Malus × domestica Borkh.). Nature Genetics 42: 833–839.
Wan JR, Zang XC, Neece D, Ramonell KM, Clough S, Kim SY, Stacey MG, Stacey G. 2008. A LysM receptor-like kinase plays a critical role in chitin signaling and fungal resistance in Arabidopsis. Plant Cell 20: 471–481.
Wang B, Qiu YL. 2006. Phylogenetic distribution and evolution of mycorrhizas in land plants. Mycorrhiza 16: 299–363.
Wojciechowski M, Lavin M, Sanderson MJ. 2004. A phylogeny of legumes (Leguminosae) based on analysis of the plastid matK gene resolves many well-supported subclades within the family. American Journal of Botany 91: 1846–1862.

Yang ZH. 2007. PAML 4: Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24: 1586–1591.
Yang ZH, Bielawski JP. 2000. Statistical methods for detecting molecular adaptation. Trends in Ecology & Evolution 15: 496–503.
Yang ZH, Nielsen R, Goldman N, Pedersen AK. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155: 431–449.
Yang ZH, Wong WSW, Nielsen R. 2005. Bayes empirical Bayes inference of amino acid sites under positive selection. Molecular Biology and Evolution 22: 1107–1118.
Young JPW, Johnston AWB. 1989. The evolution of specificity in the legume–rhizobium symbiosis. Trends in Ecology & Evolution 4: 341–349.
Young ND, Debelle F, Oldroyd GED, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KFX, Gouzy J, Schoof H et al. 2011. The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature 480: 520–524.
Zhang XC, Cannon SB, Stacey G. 2009. Evolutionary genomics of LysM genes in land plants. BMC Evolutionary Biology 9: 183.
Zhang XC, Wu XL, Findley S, Wan JR, Libault M, Nguyen HT, Cannon SB, Stacey G. 2007. Molecular evolution of lysin motif-type receptor-like kinases in plants. Plant Physiology 144: 623–636.
Zhu HY, Riely BK, Burns NJ, Ane JM. 2006. Tracing nonlegume orthologs of legume genes required for nodulation and arbuscular mycorrhizal symbioses. Genetics 172: 2491–2499.

Zhukov V, Radutoiu S, Madsen LH, Rychagova T, Ovchinnikova E, Borisov A, Tikhonovich I, Stougaard J. 2008. The pea Sym37 receptor kinase gene controls infection-thread initiation and nodule development. Molecular Plant-Microbe Interactions 21: 1600–1608.

Supporting Information

Additional supporting information may be found in the online version of this article.

Notes S1 Compressed archive in ZIP format containing sequence data and alignments used for phylogenetic analyses (data are provided as FASTA-formatted sequence files, along with a text file describing the data).

Please note: Wiley Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.

