Developmental and Comparative Immunology 49 (2015) 79–95


Journal homepage: www.elsevier.com/locate/dci

Chemokine receptors in Atlantic salmon

Unni Grimholt a,*, Helena Hauge b, Anna Germundsson Hauge b, Jong Leong c, Ben F. Koop c

a Soeren Jaabaeksgate 10B, 0460 Oslo, Norway
b Norwegian Veterinary Institute, P.O. Box 750 Sentrum, 0106 Oslo, Norway
c Centre for Biomedical Research, Department of Biology, University of Victoria, PO Box 3020 STN CSC, Victoria, Canada

ARTICLE INFO

Article history:
Received 8 July 2014
Revised 9 November 2014
Accepted 10 November 2014
Available online 15 November 2014

Keywords:
Atlantic salmon
Chemokine receptors
Whole genome duplication
Inflammation

ABSTRACT

Teleost sequence data have revealed that many immune genes have evolved differently when compared to other vertebrates. Thus, each gene family needs functional studies to define the biological role of individual members within major species groups. Chemokine receptors, being excellent markers for various leukocyte subpopulations, are one such example where studies are needed to decipher individual gene function. The unique salmonid whole genome duplication that occurred approximately 95 million years ago has provided salmonids with many additional duplicates further adding to the complexity and diversity. Here we have performed a systematic study of these receptors in Atlantic salmon with particular focus on potential inflammatory receptors. Using the preliminary salmon genome data we identified 48 chemokine or chemokine-like receptors including orthologues to the ten receptors previously published in trout. We found expressed support for 40 of the bona fide salmon receptors. Eighteen of the chemokine receptors are duplicated, and when tested against a diploid sister group the majority were shown to be remnants of the 4R whole genome duplication with subsequent high sequence identity. The salmon chemokine receptor repertoire of 40 expressed bona fide genes is comparably larger than that found in humans with 23 receptors. Diversification has been a major driving force for these duplicate genes with the main variability residing in ligand binding and signalling domains. © 2014 Elsevier Ltd. All rights reserved.

* Corresponding author. Soeren Jaabaeksgate 10B, 0460 Oslo, Norway. Tel.: +47 92661039.
E-mail address: [email protected] (U. Grimholt).
http://dx.doi.org/10.1016/j.dci.2014.11.009
0145-305X/© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

Chemokine receptors (CRs) and their ligands play an important role in coordination of cell trafficking in many biological processes. They are predominantly expressed on the surface of leukocytes and because they dictate migration of these cells between tissues, they are crucial for an effective immune response. Chemokines and their receptors have traditionally been divided into two main functional categories; homeostatic chemokines and their receptors are involved in basal cell trafficking and homing while inducible chemokines and their receptors are involved in inflammatory responses. There are also a few receptors with dual function in addition to some atypical chemokine receptors (Bonecchi et al., 2010; Cancellieri et al., 2013). CRs belong to the large family of G-protein-coupled seven transmembrane receptors with four extracellular and four intracellular domains. The extracellular N-terminal part of the receptor is responsible for ligand binding while the intracellular domains including the C-terminus are involved in intracellular signalling (Neel et al., 2005; Szpakowska et al., 2012). CRs are named according to the chemokine class they bind. CCRs bind to CC-chemokines, CXCRs bind to CXC chemokines, XCR binds to XC chemokines and CX3CR binds to CX3C chemokines where X is any amino acid and C is cysteine (Allen et al., 2007; Charo and Ransohoff, 2006). In general there are fewer receptors than chemokines, with approximately 20 receptors versus 50 ligands identified in mammals, so most receptors bind more than one ligand. With the exception of atypical CRs, ligand binding causes conformational changes in the receptors that in turn trigger intracellular signals causing cellular events such as directional cellular migration.

Homeostatic chemokines are constitutively expressed and are important for normal cell trafficking and for the development and maintenance of the immune system (Moser and Loetscher, 2001; Proudfoot, 2002). The human homeostatic receptors are CCR7, CCR9, CCR10, CXCR4 and CXCR5, where for instance CCR7 is expressed on cells destined for lymph nodes, CCR9 directs leukocytes to the intestine, CCR10 directs T-cells to skin and intestine and CXCR5 directs B-cells to lymph node follicles (Charo and Ransohoff, 2006). CXCR4 is widely expressed with multiple functions including a role in the central nervous system (Bonecchi et al., 2010; Tran and Miller, 2003).

Expression of inflammatory chemokines is induced by mediators such as tumour necrosis factor, interferon-γ, or microbial products associated with infection or trauma (Charo and Ransohoff,


U. Grimholt et al./Developmental and Comparative Immunology 49 (2015) 79–95

2006). A classic example would be a pathogen recognised by a toll-like receptor which then induces expression of secreted chemokines (Kaisho, 2012). In humans, the inflammatory chemokine receptors are CCR1-3, CCR5, CXCR1, CXCR2, and CX3CR1. The receptors CCR4, CCR6, CCR8, CXCR3, CXCR6, and XCR1 have dual functions participating in both inflammatory as well as homeostatic processes. Most of the atypical or silent CRs also bind chemokines, but this binding does not induce a signalling cascade with subsequent cell migration. Instead, several of these receptors have regulatory roles. Human atypical receptors are CCBP2 (D6), CCRL1 (CCX-CKR), CCRL2, DARC, and CXCR7 (RDC1). CCBP2 has been shown to act as a scavenger for CC-chemokines and can drastically reduce the amount of ligand available for other CRs (Graham et al., 2012). The chemokine-like receptor CMKLR1 does not bind to a chemokine, but may have multiple functions as it can regulate CCRL2 activity through competitive binding to the ligand chemerin (Yoshimura and Oppenheim, 2011). Some atypical receptors have recently been renamed to ACKRs where CCBP2 is now ACKR2, CXCR7 is ACKR3 and CCRL1 is ACKR4 (Bachelerie et al., 2014).

CRs have been identified in many teleost species with the primary focus on teleosts with sequenced genomes i.e. fugu, tetraodon, medaka, stickleback and zebrafish (Bajoghli et al., 2009; Chang et al., 2007; DeVries et al., 2006; Diotel et al., 2010; Huising et al., 2003b; Liu et al., 2009; Nomiyama et al., 2011; Oehlers et al., 2010; Sasado et al., 2008; Verburg-van Kemenade et al., 2013; Xu et al., 2010). Results from these studies show clear-cut teleost orthologues to mammalian homeostatic receptors, but orthology to mammalian inflammatory CRs is less obvious. A few publications also exist on CRs in salmonids i.e. CCR6, CCR7, CCR9/9b, CCR13 (CCR3), IL8R (CXCR1), CXCR2, CXCR3a, CXCR3b and CXCR4 (Daniels et al., 1999; Dixon et al., 2013; Ordas et al., 2012; Xu et al., 2014; Zhang et al., 2002).
There are also a few accepted teleost chemokine receptor ligand pairs such as CCR6 with CCL20-like ligands and CCR7 with CCL19/21-like ligands (Laing and Secombes, 2004). However, the functional role of most fish CRs remains unresolved. Understanding the specific function of individual receptors and identifying their ligands is essential for understanding teleost homeostasis and inflammation. To broaden our understanding of CRs in salmonids, we made use of the salmon genome (Davidson et al., 2010) to identify receptors and study their evolution and potential function. From a disease prevention point of view we paid particular attention to receptors potentially involved in inflammation. Of the 48 receptors identified, several have potential roles in inflammation, being expressed in immunologically important tissues. Most importantly, the salmonid-specific whole genome duplication event approx. 95 million years ago (Macqueen et al., 2014) has had a significant impact on the receptor repertoire.

2. Material and methods

2.1. Bioinformatics

Using available CR sequences from published articles and/or retrieved from GenBank, BLASTN (Altschul et al., 1997) and TBLASTN (Schaffer et al., 2001) searches were initially performed against both expressed and genomic Atlantic salmon resources at NCBI GenBank, cGRASP [cGRASP, Internet 2009; (Rondeau et al., 2014)] and/or the SalmonDB in Chile (Di Genova et al., 2011). Identified genomic sequences from the latest Atlantic salmon genome assembly (GenBank: AGKD00000000.3) or from the Northern pike genome assembly (AZJR00000000.1) were subjected to gene prediction analysis using GenScan (Burge and Karlin, 1997), FGENESH (Solovyev et al., 2006) and/or Augustus (Stanke et al., 2006). Predicted ORFs were tested through alignment with similar sequences from other species and sometimes changed using expressed match in Spidey (Wheelan

et al., 2001). Spidey was also used to define exon intron boundaries. To assess evolutionary relationships and orthology, all identified amino acid sequences were aligned using ClustalX2.0.11 (Larkin et al., 2007). ClustalX was also used to calculate percentage identity. Phylogenetic analyses were performed using the neighbor-joining method (Saitou and Nei, 1987). The percentages of replicate trees in which the sequences clustered together in the bootstrap test (1000 replicates) are shown next to the branches (Felsenstein, 1985). The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the p-distance method (Nei and Kumar, 2000) and are in the units of the number of amino acid differences per site. All ambiguous positions were removed for each sequence pair. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011). Secondary structure, including transmembrane domains, was predicted and visualised using Rhythm (Rose et al., 2009). Prediction of N-linked glycosylation sites was performed using NetNGlyc 1.0 (Gupta and Brunak, 2002) while tyrosine sulfation sites were predicted using Sulfinator (Monigatti et al., 2002). O-linked glycosylation sites were predicted using NetOGlyc 3.1 (Julenius et al., 2005).

2.2. Northern pike cDNA and genomic DNA

Pike (Esox lucius) genome, transcriptome and genetic map data are described fully in Rondeau et al. (2014). In brief, DNA from a single pike individual (Leong et al., 2010) was submitted directly to BGI (http://www.genomics.cn/en/index) for Illumina sequencing; DNA libraries of 180 bp were constructed for paired-end sequencing and libraries of 2 kb and 6 kb fragments were constructed for mate-pair sequencing and assembly. Fragment assembly used ALLPATHS-LG (Gnerre et al., 2011).
The resulting contigs ≥200 bp were screened and trimmed for vector and contamination, which produced 94,267 contigs (N50 = 16,909, bioproject PRJNA221548, accession GenBank:AZJR00000000) and 5688 scaffolds ≥1000 bp with a total genome size of 877,777,613 bp.

2.3. Tissue transcriptomes and analysis

For transcriptome data, tissues were extracted from a single, 1 year old, presmolt juvenile male Atlantic salmon. RNA from 11 tissues (brain, eye, gill, hind gut, head kidney, heart, kidney, liver, muscle, stomach and spleen) was extracted and submitted to BGI for Illumina sequencing. Contig assembly used Trinity (Haas et al., 2013). The resulting set of transcripts was reduced by retaining those that had a significant BLASTX (Altschul et al., 1997) match to the SwissProt or Gene Ontology protein databases (≤10⁻⁵) or a predicted open reading frame ≥300 bp. Further, only those transcripts that mapped to our genome assembly using BLAT (Kent, 2002) were retained. To remove possible alleles from our assembly, we retained a single, longest representative of transcripts that were ≥98% similar over a minimum length of 300 bp, as determined by BLASTN (Altschul et al., 1997). This curated set represents our RNA-seq reference transcriptome. FPKM values were then determined for each transcript for each of the 11 different tissues. Pike chemokine receptors were identified using known salmon genes as queries that were BLASTed against the pike genome and transcriptome (Rondeau et al., 2014). Identified contigs were further examined as earlier.

2.4. RNA extraction

Three healthy Atlantic salmon weighing 70–80 g (AquaGen breed) kept in a freshwater flow system at 12 °C with regular feeding were sacrificed by overexposure to Finquel® (ScanAqua AS) and tissues


were collected in RNAlater® (Invitrogen) and stored at −80 °C until further processing. Total RNA was extracted using RNeasy® Mini Kit (Qiagen) followed by DNase-treatment with RQ1 RNase free DNase (Promega) to remove genomic DNA contamination. All protocols were according to the manufacturer’s instructions. The concentration of total RNA was measured using a NanoDrop™ 2000 spectrophotometer (Thermo Scientific) and the samples were stored in RNase-free water at −80 °C.
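As a rough illustration of the transcript curation and quantification described in Section 2.3, the sketch below applies the two retention criteria (a BLASTX match at or below the 10⁻⁵ threshold, treated here as an E-value, which is an assumption, or a predicted open reading frame of at least 300 bp) and computes FPKM in its standard form, fragments per kilobase of transcript per million mapped fragments. The `Transcript` record, its field names and all example numbers are hypothetical; the pipeline actually used is not given in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    """Hypothetical record for one assembled transcript."""
    name: str
    length_bp: int
    orf_bp: int                      # longest predicted open reading frame
    blastx_evalue: Optional[float]   # None when there is no BLASTX hit
    fragments: int                   # fragments mapped in one tissue


def is_retained(t: Transcript, max_evalue: float = 1e-5, min_orf_bp: int = 300) -> bool:
    """Keep a transcript with a significant BLASTX match or a long enough ORF."""
    has_hit = t.blastx_evalue is not None and t.blastx_evalue <= max_evalue
    return has_hit or t.orf_bp >= min_orf_bp


def fpkm(fragments: int, length_bp: int, total_mapped: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (length_bp * total_mapped)


if __name__ == "__main__":
    transcripts = [
        Transcript("t1", 2000, 900, 1e-30, 500),
        Transcript("t2", 400, 120, None, 40),    # no hit and short ORF: dropped
        Transcript("t3", 1500, 450, 1e-2, 250),  # weak hit, but ORF >= 300 bp
    ]
    total_mapped = 10_000_000  # hypothetical mapped fragments for one tissue
    for t in transcripts:
        if is_retained(t):
            print(t.name, round(fpkm(t.fragments, t.length_bp, total_mapped), 2))
```

The genome-mapping (BLAT) and redundancy-removal steps are left out of this sketch because they depend on alignment output that is not described in enough detail to reproduce.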


2.5. Real-time PCR

To evaluate the levels of CR expression in different tissues, 6 μg of total RNA from three individuals (2 μg each) were pooled and reverse transcribed in 60 μl reaction volume using RevertAid™ H Minus First Strand cDNA Synthesis Kit (Fermentas) with oligo d(T)18 primers according to the manufacturer’s instructions. Pooling of RNA from three individuals was performed to reduce the effect of individual variation. Correct amplification of each CR gene was confirmed by examining the melting curve, analysing product size and fragment sequence. SYBR Green® Real-time PCR assays (QuantiTect SYBR Green PCR Kit, Qiagen) were optimised on 10-fold serial dilutions of cDNA and in 25 μl reaction volume. Two microlitres cDNA and 0.3 μM of each primer (listed in SF5c) were used for all assays, except for 18S where a 1:1000 dilution of cDNA was used. PCR was performed in triplicate in 96-well optical plates on an Mx3005 real-time thermal cycler (Stratagene). The PCR cycling conditions were 95 °C for 15 min, 42 cycles of 94 °C for 50 s, 60 °C for 15 s and 72 °C for 1 min, and finally 95 °C for 1 min, 55 °C for 30 s and 95 °C for 30 s. Validation of assays and data handling were according to the MxPro Manual and baseline and cycle threshold (CT) were set manually. Each assay was tested on different samples in the same plate to ensure optimal reproducibility and the 18S reference gene assay was included in all plates. Real-time PCR efficiencies were calculated from the given slopes in MxPro software. The corresponding real-time PCR efficiency (E) of one cycle in the exponential phase of each gene was calculated according to the equation E = 10^(−1/slope) on 10-fold serial dilutions of cDNA for each assay (Pfaffl, 2001). The relative expression of CCR transcripts was normalised to the geometric mean of the CT value of 18S and then presented as relative expression compared to the corresponding control groups using the Pfaffl method.

2.6. Sequencing

CR PCR-products were sequenced using the ABI Prism Big Dye Terminator Cycle sequencing kit on an ABI 3130xl Genetic Analyser according to the manufacturer’s instructions.

3. Results and discussion

To identify salmon CRs we performed various BLAST searches against GenBank nucleotides, ESTs and ultimately preliminary and final salmon genome scaffolds (GenBank version AGKD01-AGKD03 contigs) using salmonid, zebrafish and human CR sequences as queries. A total of 48 salmon chemokine and chemokine-like receptors were identified (Fig. 1, Appendix S1) with amino acid sequence identity ranging from 19% to 98% (Appendix S2). Forty-five of these sequences appear to be bona fide CRs, while three are most likely pseudogenes (ssCXCR7.2b, ssCCRL1.2b and ssCMKLR2b).

Nomenclature for human CRs relies on ligand binding. Although many teleost CC and CXC ligands have been identified (Alejo and Tafalla, 2011; Laing and Secombes, 2004; Nomiyama et al., 2008; Peatman and Liu, 2007) only a few receptor–ligand pairs have been established (Bajoghli, 2013). Thus, current nomenclature for teleost CRs relies on homology to mammalian receptors and where homology is questionable, different names have been suggested by different authors (DeVries et al., 2006; Dixon et al., 2013; Liu et al., 2009; Nomiyama et al., 2011). Based on our phylogenetic analysis we have adopted some of the nomenclature suggested by others, but also propose some changes to distinguish between evolutionary and most likely functionally distinct sequences (see Table 1).

3.1. Phylogenetic classification

By performing a phylogenetic analysis we identified the CCR6, CCR7, CCR9, CCR9B, CCR13, IL8R (CXCR1/2), CXCR2, CXCR3a, CXCR3b and CXCR4 receptor sequences previously identified in salmonids (Daniels et al., 1999; Dixon et al., 2013; Huising et al., 2003a; Ordas et al., 2012; Xu et al., 2014) (Appendices S1 and S3) in addition to 38 new sequences. We and Xu et al. redefined the Zhang et al. (2002) CXCR1/2 sequence to CXCR1 while the sequence defined as CXCR3a by Xu et al. (2014) clusters with the receptor we have defined as CXCR8 being quite distinct from the other sequences defined as CXCR3. Some Salmo salar (ss) sequences are clear-cut orthologues to human CCR6, CCR7, CCR9, CXCR4, CXCR7, XCR, CCRL1, CCBP2 and CMKLR1 while other sequences, here defined as ssCCR1, ssCCR2, ssCCR3, ssCCR4, ssCCR5, ssCXCR1, ssCXCR2, ssCXCR3, ssCXCR5, ssCXCR6 and ssCXCR8, show less clear-cut orthology to human CR sequences (Fig. 1). All salmon sequences cluster with zebrafish orthologues although some subsets of genes have expanded or contracted differently in the two species.

Based on convincing bootstrap values the salmon CCR2, CCR4 and CCR5 sequences are interesting inflammatory receptor candidates clustering with the human inflammatory receptors CCR1-5 and CX3CR1. We have thus defined this clade as inflammatory-like receptors group A (Fig. 1). Zebrafish orthologues to the ssCCR4 sequence are called either CCR4La/b (Nomiyama et al., 2011) or CCR8 (Liu et al., 2009), where we propose use of CCR4 to distinguish these sequences from the highly divergent zebrafish sequence called CCR4Lc, which is an orthologue to the gene we define as ssCCR2. The salmon sequences defined as ssCCR2 and ssCCR4 form a stable clade together with the human inflammatory receptors CCR1-5 and a similar clustering was seen when including zebrafish CR sequences (Liu et al., 2009).
The salmon ssCCR5 sequences form a clade with zebrafish sequences defined as either drCCR2/5 (Liu et al., 2009), drCCR11 (Nomiyama et al., 2011) or drCX3CR1 (DeVries et al., 2006). These sequences may be functional analogues to the human inflammatory CX3CR1 receptor, but for now we have chosen to use the CCR5 nomenclature. Support for these genes having an important role in inflammation also comes from zebrafish where two CCR5 orthologues (zfCCR2-2 and zfCCR5, Fig. 1) were shown to be uniquely expressed in immunologically important tissues (Liu et al., 2009). Teleost ligands for these receptors are not defined, but several candidates have been suggested (Montero et al., 2009). The remaining two human inflammatory receptors CXCR1 and CXCR2 cluster with the salmon ssCXCR1 and ssCXCR2 sequences. Based on convincing bootstrap values we thus chose to define this cluster as inflammatory-like receptors group B (Fig. 1). A CXCR1 orthologue has previously been published in trout (Zhang et al., 2002) and both ssCXCR1 and ssCXCR2 have orthologues in trout and zebrafish (Nomiyama et al., 2011; Xu et al., 2014). The functional distinction between human CXCR1 and CXCR2 sequences has been debated (Stillie et al., 2009), but recent data suggest they couple with distinct G protein-coupled receptor kinases (GRKs) to mediate and regulate leukocyte function (Raghuwanshi et al., 2012). The IL8 ligand, alias CXCL8, in mammals recruits polymorphonuclear leukocytes (neutrophils, basophiles and eosinophils) to the site of infection and these cells express both CXCR1 and CXCR2 on the surface. In carp, two lineages of CXCL8 have been described, and they both have a crucial role in recruitment of neutrophilic


[Figure 1: phylogenetic tree of salmon (ss), zebrafish (dr) and human (hs) chemokine receptor sequences, with bootstrap percentages at the nodes. Labelled clades: Inflammatory-like group A, Inflammatory-like group B, XCR-like. Scale bar: 0.1.]

Fig. 1. Phylogeny of Atlantic salmon, zebrafish and human CRs. Phylogenetic tree of salmon, zebrafish and human chemokine receptor sequences. Red font indicates salmon CCRs, while some alternative zebrafish names are given in parentheses. Success in percentage per 1000 bootstrap trials is shown on each node. Human XCR and CCRs are shaded according to function i.e. green are inflammatory receptors, pink are dual function receptors, yellow are homeostatic receptors, and those without background colour are atypical receptors. Three clusters of salmon inflammatory-like and XCR1-like receptor sequences of particular interest are shown. Sequences and references are gathered in supplementary file Appendix S1.
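The branch lengths and the 0.1 scale bar in Fig. 1 are in p-distance units, that is, the proportion of aligned amino acid sites at which two sequences differ (Section 2.1). A minimal sketch of that calculation, with ambiguous positions removed for each sequence pair, is given below. The function name, the set of characters treated as ambiguous and the toy aligned fragments are assumptions for illustration; the published distances were computed in MEGA5.

```python
def p_distance(seq_a: str, seq_b: str, ambiguous: str = "-X?") -> float:
    """Proportion of differing sites between two aligned protein sequences.

    Sites where either sequence carries a gap or an ambiguous residue are
    dropped for this pair only (pairwise deletion).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in ambiguous or b in ambiguous:
            continue  # skip this column for this pair
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real receptor sequences.
    s1 = "MDRYLAIVHA"
    s2 = "MDRYVAI-HS"
    print(f"p-distance = {p_distance(s1, s2):.3f}")
```

A matrix of such pairwise values is the kind of input a neighbor-joining implementation would take; tree building and bootstrapping are outside the scope of this sketch.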


Table 1. Atlantic salmon CR references with trout, zebrafish and human orthologues.

Gene name | GenBank mRNA/TSA match | Genomic accession # | Human orthologue
CCR1 | DW581300, TSA | AGKD03059705.1: 5,994-11,468 | n.m.
CCR2a | DW540320, TSA | AGKD03026506.1: 9,208-10,254 | n.m.
CCR2b | n.m., TSA | AGKD03006887.1: 85,547-86,575 | n.m.
CCR3a | DY717613, TSA | AGKD03008339.1: 1,504-2,592 | n.m.
CCR3b | ACN11153, TSA | AGKD03016107.1: 299,769-300,881 | n.m.
CCR4a | EG840655, TSA | AGKD03006887.1: 82,144-83,187 | n.m.
CCR4b | EG775179, n.t. | AGKD03026506.1: 4,575-5,630 | n.m.
CCR5a | CX353926, TSA | AGKD03006887.1: 95,781-96,845 | n.m.
CCR5b | n.m., n.t. | AGKD03026506.1: 18,558-19,616 | n.m.
CCR6.1a | NM001139972, TSA | AGKD03062538.1: 18,246-19,421 | CCR6
CCR6.1b | n.m., TSA | AGKD03009351.1: 25,921-27,111 | n.m.
CCR6.2 | n.m., TSA | AGKD03001847.1: 205,631-206,966 | n.m.
CCR7a | DY719066, TSA | AGKD03010725.1: 286,009-287,822 | CCR7
CCR7b | DY730093, n.t. | AGKD03025083.1: 110,755-112,501 | n.m.
CCR9.1a | NP001133990, TSA | AGKD03018795.1: 65,371-66,767 | CCR9
CCR9.1b | ACI34134, n.t. | AGKD03032792.1: 19,966-23,112 | n.m.
CCR9.2a | n.m., TSA | AGKD03006697.1: 9,923-12,219 | n.m.
CCR9.2b | n.m., TSA | AGKD03004078.1: 128,853-129,911 | n.m.
XCR1a | n.m., TSA | AGKD03010859.1: 38,153-39,160 | XCR1
XCR1b | n.m., n.t. | AGKD03007607.1: 36,321-37,340 | n.m.
XCR2 | n.m., TSA | AGKD03004371.1: 4,805-5,718 | n.m.
CXCR1.1 | DY725174, TSA | AGKD03004513.1: 4,88-1,558 | CXCR1/2
CXCR1.2 | n.m., TSA | AGKD03009118.1: 8,235-9,317 | n.m.
CXCR2.1 | DW566408, n.t. | AGKD03037522.1: 6,239-7,318 | n.m.
CXCR2.2 | n.m., TSA | AGKD03002110.1: 138,592-139,671 | n.m.
CXCR3.1a | NP001133965, TSA | AGKD03039132.1: 5,838-7,407 | n.m.
CXCR3.1b | n.m., n.t. | AGKD03042494.1: 23,419-24,848 | n.m.
CXCR3.2 | DY730916, TSA | AGKD03039134.1: 22,614-23,927 | n.m.
CXCR4.1a | NP001158765, n.t. | AGKD03013215.1: 347,401-348,477 | CXCR4
CXCR4.1b | EG756489.1, TSA | AGKD03005053.1: 292,622-293,620 | CXCR4
CXCR4.2a | CK898894, TSA | AGKD03032216.1: 4028-5086 | n.m.
CXCR4.2b | n.m., TSA | AGKD03003579.1: 94,798-95,853 | n.m.
CXCR5 | n.m., TSA | AGKD03059412.1: 57-1,929 | CXCR5
CXCR6 | n.m., TSA | AGKD03014471.1: 118,094-119,377 | CXCR6
CXCR7.1a | n.m., n.t. | AGKD03000675.1: 40,646-41,776 | CXCR7
CXCR7.1b | n.m., TSA | AGKD03023309.1: 66,937-68,070 | n.m.
CXCR7.2a | n.m., TSA | AGKD03004717.1: 23,331-24,786 | n.m.
CXCR7.2bᴪ | n.m., n.t., pseudogene | AGKD03074418.1: 66-842 | n.m.
CXCR8a | n.m., TSA | AGKD03039134.1: 41,583-42,698 | n.m.
CXCR8b | GE781327, n.t. | AGKD03042494.1: 6,656-7,762 | n.m.
CCRL1.1a | n.m., TSA | AGKD03018928.1: 5,621-6,703 | CCRL1
CCRL1.1b | EG877626, TSA | AGKD03014586.1: 46,887-47,990 | n.m.
CCRL1.2a | n.m., n.t. | AGKD03004035.1: 162,432-163,631 | n.m.
CCRL1.2bᴪ | n.m., n.t., pseudogene | AGKD03037567.1: 309,437-310,606 | n.m.
CCBP2 | DW566026, TSA | AGKD03027389.1: 155,215-156,330 | CCBP2
CMKLR1 | GE772893.3, TSA | AGKD03014115.1: 214,415-215,431 | CMKLR1
CMKLR2a | DY731983.2, TSA | AGKD03004179.1: 202,945-204,021 | n.m.
CMKLR2bᴪ | n.m., TSA, pseudogene | AGKD03030477.1: 28,926-29,958 | n.m.

Zebrafish orthologue (column entries in order): XCR1Ld; CCR4Lc; CCR4Lc; CCR12.3 (CCR3-3); n.m.; CCR4La+b (CCR8-1/2); n.m.; CCR11 (CCR2/5); n.m.; CCR6a; n.m.; CCR6b; CCR7; n.m.; CCR9a; n.m.; CCR9b; n.m.; XCR1+b + c and XCR1Lc; n.m.; n.m.; n.m.; CXCR1; CXCR1c; CXCR3a+b; n.m.; n.m.; CXCR4a+b; n.m.; n.m.; CXCR5b; CCR10; CXCR7a; n.m.; CXCR7b; CXCR3.2; n.m.; CCRL1a; n.m.; CCRL1b; CCBP2; CMKLR1; n.m.; n.m.

Trout ref. (column entries in order): Dixon et al., 2013; Dixon et al., 2013; Ordas et al., 2012; Daniels et al., 1999; Dixon et al., 2013; Zhang et al., 2002; Xu et al., 2014; Xu et al., 2014; Daniels et al., 1999; Xu et al., 2014

Identical coloured genes are linked within contigs. Abbreviation n.m. means no matching EST or orthologue while n.t. defines no matching shotgun transcript. Ref. is reference and ᴪ defines likely pseudogenes. Zebrafish orthologues in parentheses derive from Liu et al., 2009. The trout CCR7 and CXCR8 sequences may be either a or b orthologues.
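The linkage noted in the Table 1 footnote can also be read from the genomic accessions: genes whose coordinates fall on the same AGKD03 contig are linked. The sketch below groups a subset of Table 1 entries by contig and orders them by start position. The accessions are copied from the table; the function and variable names are illustrative only.

```python
from __future__ import annotations

from collections import defaultdict

# Gene name -> "contig: start-end", copied from Table 1 (subset).
TABLE1_SUBSET = {
    "CCR1": "AGKD03059705.1: 5,994-11,468",
    "CCR2a": "AGKD03026506.1: 9,208-10,254",
    "CCR2b": "AGKD03006887.1: 85,547-86,575",
    "CCR4a": "AGKD03006887.1: 82,144-83,187",
    "CCR4b": "AGKD03026506.1: 4,575-5,630",
    "CCR5a": "AGKD03006887.1: 95,781-96,845",
    "CCR5b": "AGKD03026506.1: 18,558-19,616",
    "CXCR3.1b": "AGKD03042494.1: 23,419-24,848",
    "CXCR3.2": "AGKD03039134.1: 22,614-23,927",
    "CXCR8a": "AGKD03039134.1: 41,583-42,698",
    "CXCR8b": "AGKD03042494.1: 6,656-7,762",
}


def parse_accession(entry: str) -> tuple[str, int, int]:
    """Split 'contig: start-end' into contig name and integer coordinates."""
    contig, coords = entry.split(":")
    start, end = coords.strip().replace("–", "-").split("-")
    return contig.strip(), int(start.replace(",", "")), int(end.replace(",", ""))


def linked_genes(table: dict[str, str]) -> dict[str, list[str]]:
    """Return contigs carrying more than one gene, genes ordered by start."""
    by_contig: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for gene, entry in table.items():
        contig, start, _end = parse_accession(entry)
        by_contig[contig].append((start, gene))
    return {
        contig: [gene for _start, gene in sorted(genes)]
        for contig, genes in by_contig.items()
        if len(genes) > 1
    }


if __name__ == "__main__":
    for contig, genes in sorted(linked_genes(TABLE1_SUBSET).items()):
        print(contig, " - ".join(genes))
```

For this subset the grouping recovers two CCR4/CCR2/CCR5 blocks and two CXCR3/CXCR8 pairs, in line with the linked duplicates discussed in the text.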

granulocytes during the early phase of inflammation (van der Aa et al., 2010). At least one of these lineages is present in salmonids and represents a potential ligand for the CXCR1 and CXCR2 receptors (Chen et al., 2013). In trout, this CXCL8 variant was shown to specifically attract a monocyte-like sub-population while the unrelated CC chemokine CK6 specifically attracted a macrophage-like cell sub-population (Montero et al., 2008). The receptor sequences here defined as ssCCR3, but defined as CCR13 by Dixon et al. (2013), cluster with zebrafish sequences defined as CCR3 (Liu et al., 2009) or CCR12 (Nomiyama et al., 2011) (Fig. 1). The ssCCR1 sequence has an orthologue defined as XCR1Ld in zebrafish that together with the CCR3 sequences mentioned earlier form a separate clade alongside salmon, zebrafish and human sequences defined as XCR1. Based on convincing bootstrap values we define this clade as XCR-like receptors (Fig. 1). Evidence supporting these genes as interesting candidates for further studies comes from zebrafish where an orthologue to the salmon CCR3 sequences (zfCCR3-2) was found to be uniquely expressed in spleen,

kidney and gills. The ssXCR2 gene may be an expressed pseudogene with a deleted N-terminal region thus disrupting the 7-transmembrane structure, but this needs to be verified by more thorough studies. Although XCR sequences have been identified in teleosts previously (Crozat et al., 2010; Nomiyama et al., 2011), the XCL ligand has not been agreed upon although some candidate sequences have been suggested (Gilligan et al., 2002; Nomiyama et al., 2008). A salmon orthologue to the human atypical CCBP2 sequence is phylogenetically related to the ssCCR1-5 and XCR sequences. The remaining salmon receptors cluster with dual function, homeostatic or atypical human receptors (Fig. 1). As noted previously in zebrafish, there is clear-cut orthology between teleost CCR6, CCR7, CCR9 and CCRL1 sequences (Liu et al., 2009). Potential salmon ligands for the CCR6, CCR7 and CCR9 receptors have been suggested as CK8, CK10/CK12 and CK9 respectively as these ligands cluster with mammalian CCR6, CCR7 and CCR9 ligands (Laing and Secombes, 2004). However, such assumptions may be misleading as sequence


identity between mammalian and teleost chemokines is low. This is exemplified by trout CK12 that shows a weak phylogenetic clustering with the human CCR7 ligands CCL19/21. But when studied in further detail, CK12 was in fact shown to be a chemokine produced by epithelial cells of mucosal tissues through which these peripheral tissues recruit immature B- and T-like lymphocytes (Montero et al., 2011). Thus, although weakly similar to human CCL19/21 in phylogenetic studies, in functional studies CK12 behaves more like the human CCL6, CCL14, CCL15 chemokines.

Salmon sequences defined as CXCR3, CXCR5 and CXCR8 cluster weakly with human CXCR3, CXCR5 and CCR10 (Fig. 1), but there is no clear-cut orthology between these teleost and human sequences. The salmon CXCR6 sequence variably clusters with either the human CXCR6 or the human CCR10 sequences. The salmon CXCR4 sequence forms a clade with other CXCR4 sequences, clustering either with CXCR7/CMKLR or CXCR3-5 sequences depending on which sequences are included. The CXCR4 ligand CXCL12 is fairly conserved between mammals and teleosts suggesting potentially similar functional roles in for instance organogenesis or in brain function (Diotel et al., 2010; Sasado et al., 2008; Verburg-van Kemenade et al., 2013). Interestingly, in zebrafish the CXCL12 gene is duplicated and the two a and b variants were shown to have acquired different functions primarily due to one amino acid difference (Boldajipour et al., 2011). Additional CXC ligands have been identified in teleosts, but the individual pairing between receptors and ligands remains unclear (Chen et al., 2013).

Salmon orthologues to the atypical or silent chemokine receptors CMKRL, CMKLR, CCRL1, CXCR7 and DARC were also found (Fig. 1). Sequences with convincing identity to human CCRL1 have previously been described in many teleosts (Liu et al., 2009; Nomiyama et al., 2011).

3.2. Gene organisation and regional syntenies

With a few exceptions, mammalian CRs have a typical one exon open reading frame (ORF) gene organisation. This is also true for the majority of salmon genes i.e. ssCCR2, ssCCR3, ssCCR5, ssCCR6, CXCR1, CXCR2, CXCR6, CXCR7, ssXCR1, CCRL1, CMKLR1 and CCBP2 all share this one exon gene structure (Appendix S2 and data not shown). The salmon genes CCR4a/b, CCR9, CXCR4, CXCR8 share an exon intron structure with many other zebrafish and human CRs i.e. one or two smaller exons followed by a larger exon. The two remaining salmon receptors i.e. CCR7 and CXCR3 both have a quite unusual gene structure where both genes have an intron dividing the larger ORF which in ssCCR7a/b is preceded by one smaller exon. The zebrafish CXCR3 orthologue (drCXCR3.b; ENSDARG00000007358) also has this additional intron separating the larger exon, while the zebrafish CCR7 orthologue does not. The salmon CCR1 gene has a more unusual gene structure with four medium sized exons.

Some salmon genes are closely linked such as the duplicate ssCXCR3–ssCXCR8 genes and the duplicate ssCCR4–ssCCR2–ssCCR5 genes (Table 1). Although denoted drCXCR3a/b (ENSDARG00000070699, ENSDARG00000007358) and drCXCR3.2 (ENSDARG00000041041) by Nomiyama et al. (2011), these zebrafish sequences are related to the salmon ssCXCR3 and ssCXCR8 sequences respectively (Fig. 1) and are also closely linked within a 19 kb region on zebrafish chromosome 16. Thus, this linkage existed prior to the split between salmonids and cyprinids more than 250 MYA (Near et al., 2012). In zebrafish, the CCR4Lb (CCR8.2) gene is duplicated on chromosome 16 (ENSDARG00000095789 and ENSDARG00000086616) with an un-annotated orthologue to the salmon ssCCR2 gene located 18 kb downstream (drCCR4Lc; XP_002664844.1). The remaining salmon contigs contain one gene only, but once the salmon genome scaffolds are published, more receptors may be linked.

3.3. Assessing secondary structure

There are many structural features conserved between salmon and other CR sequences, including the G protein-coupled seven-transmembrane signatures. CR sequences typically bear one cysteine residue in each extracellular domain, where the N-terminal and extracellular loop (ECL3) cysteines (C1 and C6, Figs. 2 and 3) form a disulphide bridge in all known CRs except human CXCR5 and CXCR6 (Wu et al., 2010). In salmon, ssCXCR3.1a and the atypical ssCMKLR receptors form the exceptions that lack this bridging potential. The cysteines connecting ECL1 and ECL2 are also present in all salmon CRs (C2 and C3, Figs. 2 and 3) except for ssCMKLR2b, which is most likely a pseudogene having a 6-transmembrane domain structure. Additional cysteines are also found in other domains, but their functional relevance is unclear. The DRY motif known to be important for intracellular signalling in classical CRs is located right after the third transmembrane domain in most salmon CRs (Figs. 2 and 3). The exceptions are ssCCR1, ssXCR1a/b and ssCMKLR1/2, which should then be classified as atypical CRs according to this definition. However, the two salmon XCR sequences share an HRY motif with their human and mouse XCR orthologues, and the salmon ssCMKLRs share a DRC motif with their human orthologues, suggesting unique intracellular signalling and potentially also different functions for these orthologues. The atypical nature of the ssCCR1 receptor remains to be established. The salmon ssCCBP2 sequence, on the other hand, does contain a DRY motif unlike its atypical human CCBP2 counterpart, thus questioning the atypical nature of this salmon molecule. All full-length sequences contain the typical CR seven transmembrane domains with four potential extracellular and four intracellular domains including the N- and C-terminal sequences (Figs. 2 and 3).
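The DRY versus HRY/DRC distinction drawn above can be illustrated with a small sketch. It assumes short windows of 13 aligned residues spanning the end of TM3, transcribed by hand from the Fig. 2 alignment, with the motif at a fixed position in each window. The names `WINDOWS`, `MOTIF_COLUMNS`, `motif_at_dry_position` and `classify` are hypothetical and introduced only for illustration; this is not the authors' annotation procedure.

```python
# Windows of 13 aligned residues around the end of TM3, transcribed by hand
# from the Fig. 2 alignment. Assumption: in every window the motif occupies
# columns 5-7, i.e. Python indices 4..6.
WINDOWS = {
    "ssCCR2a": "LMSVDRYLAIVHA",
    "ssCCR1": "SMTVYRCVIVVAS",
    "ssXCR1a": "IMTIHRYLAVVHP",
    "ssCMKLR1": "LISVDRCVSITFP",
    "ssCCBP2": "CISVDRYLNVVWT",
}

MOTIF_COLUMNS = slice(4, 7)


def motif_at_dry_position(window: str) -> str:
    """Return the tripeptide found at the aligned DRY position."""
    return window[MOTIF_COLUMNS]


def classify(window: str) -> str:
    """Label a window as carrying the canonical DRY motif or a variant."""
    motif = motif_at_dry_position(window)
    return "canonical DRY" if motif == "DRY" else f"variant ({motif})"


if __name__ == "__main__":
    for name, window in WINDOWS.items():
        print(f"{name:10s} {classify(window)}")
```

Reading the motif from a fixed alignment column, rather than searching each sequence for any "xRY"-like tripeptide, avoids picking up unrelated matches elsewhere in the receptor. The price is that the result is only as good as the alignment and the transcription of the windows.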
Most N-terminal salmon CR sequences contain from one to three N-glycosylation sites in addition to several potential tyrosine O-sulfation sites equivalent to those found in human CRs (Bannert et al., 2001; Liu et al., 2008). The exceptions are ssCCR7b, ssCCR9.1a/b and ssCCR9.2b, which completely lack N-linked glycosylation sites, but they may use O-linked glycosylation instead. Several salmon CRs also have predicted N-linked glycosylation sites in intracellular and extracellular domains, where for instance the ECL2 domain displays fifteen such sites, suggesting a functional relevance. Dixon et al. (2013) noted that the CCR6 ECL2 domain is much larger in teleost than in mouse and human sequences, further pointing to a potentially more complex function for this domain in teleosts. The C-terminus of salmon CR sequences contains several motifs known to be important for intracellular signalling, such as the dileucine motif known to interact with adaptor proteins, found in ssCCR7, ssCCR9 and ssCXCR6 sequences [D/E-XXXL-(I/L), (Mattera et al., 2011)]. Some sequences also have predicted C-terminal N- or O-linked glycosylation sites that may participate in intracellular transport or regulation.

3.4. Three R or 4R duplications

Many teleost CR genes have duplicates, where some are seemingly unique to salmon such as CCR2, CCR3 and CCR4 (Fig. 1). Other genes show simple orthology between salmon and zebrafish, such as ssCCR1/drXCR1Ld, ssCXCR6/drCCR10 and ss/drCXCR5 sequences (Fig. 1, Table 1). Some subsets of genes have expanded and/or contracted differently, such as the four CCR11 and XCR genes in zebrafish. There are also examples of old gene duplications that occurred prior to the split between zebrafish and salmonids, such as ssCCR6.1/6.2, ssCCR9.1/9.2, CCRL1.1/1.2 and CXCR7.1/7.2. However, a pronounced difference between the two species is the multiple younger gene duplications observed in Atlantic salmon.
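Duplicate pairs such as those described above are commonly compared by percent identity over aligned positions. The following is a minimal sketch, assuming two rows taken from the same alignment and an identity definition that skips every column containing a gap; other definitions use a different denominator, and the source does not state which one the authors used. The function name `percent_identity` and the short ssCCR2a/ssCCR2b excerpts (transcribed by hand from the Fig. 2 alignment) are illustrative only.

```python
def percent_identity(row_a: str, row_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either row
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Short aligned excerpts only, so the value is not comparable to the
    # whole-sequence identities reported for the duplicate pairs.
    ccr2a = "LMSVDRYLAIVHA-----VTAMRARTL"
    ccr2b = "LMSVDRYLAIVHA-----VAAMRARTL"
    print(f"ssCCR2a vs ssCCR2b (excerpt): {percent_identity(ccr2a, ccr2b):.1f}%")
```

Because the result depends on the alignment and on how gaps are counted, identities computed this way should only be compared with each other when the same alignment and the same definition are used throughout.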
There are 18 genes that occur in duplicate with sequence identities ranging from 82 to 95% (CCR2, CCR3, CCR4, CCR5, CCR6.1, CCR7, CCR9.1,

[Fig. 2, first alignment panel (ruler marks 20–100); the aligned residues and the per-row residue counts at the right margin are not reproduced here. Rows, top to bottom: ssCCR1, ssCCR2a, ssCCR2b, ssCCR3a, ssCCR3b, ssCCR4a, ssCCR4b, ssCCR5a, ssCCR5b, ssCCR6.1a, ssCCR6.1b, ssCCR6.2, ssCCR7a, ssCCR7b, ssCCR9.1a, ssCCR9.1b, ssCCR9.2a, ssCCR9.2b, ssXCR1a, ssXCR1b, ssXCR2, ssCXCR1.1, ssCXCR1.2, ssCXCR2.1, ssCXCR2.2, ssCXCR3.1a, ssCXCR3.1b, ssCXCR3.2, ssCXCR4.1a, ssCXCR4.1b, ssCXCR4.2a, ssCXCR4.2b, ssCXCR5, ssCXCR6, ssCXCR7.1a, ssCXCR7.1b, ssCXCR7.2, ssCXCR8a, ssCXCR8b, ssCCRL1.1a, ssCCRL1.1b, ssCCRL1.2, ssCCBP2, ssCMKLR1, ssCMKRL2a, ssCMKRL2b. Regions annotated below the alignment: N-terminal, C1, TM1.]

Fig. 2. Amino acid sequence alignment of Atlantic salmon CRs. Amino acid alignment of all identified Atlantic salmon chemokine receptor sequences (see Appendix S1 for references). Residues in red font define transmembrane regions, while residues in blue font define the DRY motif known to be involved in CR signalling (Allen et al., 2007). The positions lacking a DRY motif in the CCR1, XCR and CMKLR sequences are boxed. Purple shaded residues are N-linked glycosylation sites, yellow shading shows tyrosine (Y) sulfation sites, green shading shows dileucine motifs important for binding to AP2, while grey shaded residues represent potential O-linked glycosylation (N-terminal) or phosphorylation sites (C-terminal) (Blom et al., 2004; Borroni et al., 2010). Regions and conserved/semi-conserved cysteine residues are numbered and shown below the alignment. Abbreviations: ECL = extracellular loop, ICL = intracellular loop, TM = transmembrane domain. CCRL1.2b and CXCR7.2b are likely pseudogenes with no transcript support and are thus not included.
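Two of the sequence features shaded in Fig. 2 follow simple patterns and can be located with regular expressions. The sketch below assumes the standard N-glycosylation sequon N-X-S/T (X not proline) and the dileucine pattern [D/E]-XXX-L-[L/I] quoted in the text. It is only a pattern scan, not the prediction method behind the figure, which the source does not specify. The helper `find_sites` and the example peptides (transcribed by hand from the alignment, gaps removed) are hypothetical and illustrative.

```python
import re

# N-X-S/T with X != P; the lookahead lets overlapping sequons be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")
# [D/E]-XXX-L-[L/I], anchored on the acidic residue.
DILEUCINE = re.compile(r"[DE](?=.{3}L[LI])")


def find_sites(pattern: re.Pattern, sequence: str) -> list[int]:
    """Return 1-based positions where the pattern starts, ignoring gaps."""
    ungapped = sequence.replace("-", "")
    return [m.start() + 1 for m in pattern.finditer(ungapped)]


if __name__ == "__main__":
    # N-terminal regions up to cysteine C1, as read from the alignment.
    nterm = {
        "ssCCR7a": "MTAVKDIQILVPALLIWTYFETCFSQNENMTTEFTTDYTDYPTDKTDLDYDHWTQQC",
        "ssCCR7b": "MTAVKDIRILVPALLIWTYFETCFSQNEKMTTEFITDYT---MDKTDLDYEYWTQQC",
    }
    for name, seq in nterm.items():
        print(name, "N-glycosylation sequons at", find_sites(SEQUON, seq))

    # Stretch spanning the TM7/C-terminal junction of ssCCR7a.
    cterm_ccr7a = "KFRRDLLKLLKDLGCMSQERFF"
    print("ssCCR7a dileucine motif at", find_sites(DILEUCINE, cterm_ccr7a))
```

A sequon match only indicates a candidate site. Whether an asparagine is actually glycosylated also depends on its topology (extracellular or not) and on cellular context, which is why the scan is best restricted to the relevant domain, as done here for the N-terminal region.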

85

86

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

* 120 * 140 * 160 * 180 * 200 * LVALCRYEG-------LRRVTNLFILNLLFSDLLFTLTLPFWAVYYL--SHWMFGDLACKLLTGAYFTGLYSSIMLLTSMTVYRCVIVVASR----WTAVPRRRL LWVLLKYMK-------LKTMTDICLLNLALSDLLLALSLPLWAYHAQG-HEFE-GDSPCKIMAGVYQVGFYSSILFVTLMSVDRYLAIVHA-----VTAMRARTL LWVLLKYMK-------LKTMTDICLLNLALSDLLLALSLPLWAYHAQG-HEFE-GDSPCKIMAGVYQVGFYSSILFVTLMSVDRYLAIVHA-----VAAMRARTL LCIIYKYEK-------LTSVTNIFLLNLVISDLLFASSLPFLATYYS--SEWIFGPFMCKLVGSMYFIGFYSSILFLTLMTFDRYLAVVHA-----INAAKQRRK LCIIYKYEK-------LTCVTNIFLLNLVISDLLFASSLPFWALYYF--YGWIFGPVMCKLVGSVYFIGFYSSILFLTLMTFDRYLAVVHA-----INAAKRRRK LWVILLGVK-------LCSMTDVCLLNLALADLLLVCTLPFLAHHAT--DQWVFGDIMCKVVLSAYHIGFYSGIFFITLMSVDRYLAIVHA-----VYAMRARTR LWVILLRVR-------LRSMTDVCLLNLALADLLLVCSLPFLAHHAR--HQWVFGDVMCKVVLSAYHVGFYSGIFFITLMSVDRYLAIVHA-----VYAMRARTR VCVLVKFRR-------IRSITDLCLFNLALSDLFFIISLPFWSHYATA-AKWLLGDFMCRLVTGLYMLGFYGSIFFMVILTVDRYVVIVHA-----HTMARPRSV VYVLVKCRR-------TRSMTDLCLLNLALSDLFFVISLPFWSHYATA-AEWLLGDFMCRLVTGLYMLGFYGSIFFMMILTVDRYVVIVHA-----HKMARLRSV IVTYA-FYK------KAKSMTDVYLLNVAIADMLFVVALPLIIYNEQS-D-WAMGTVACKVLRGAYSVNLYSGMLLLACISTDRYIAIVQAR---RSFRLR--SL IVTYA-FYK------KAKSMTDVYLLNVAIADMLFVAALPLIIYNEQS-D-WAMGTVACKILRGAYSINLYSGMLLLACISTDRYIAIVQAR---RSFMLRSFTL IVTYA-FYK------KAKTMTDVYLLNVAVADLLFIVALPLIIYNEQH-D-WSMGSVACKAFRGAYSINLYSGMLLLACISRDRYISIVQAR---RSFGLRSQNL IGTYV-YFN------RLKTGTDVFLLSLSIADLLFAVSLPLWATNSMT-E-WVLGLFICKVMHTIYKVSFYSGMFLLTSISVDRYFAISKAV---SAHRHRSKAV IGTYV-YFN------RLKTGTDVFLLSLSIADLLFAVSLPLWATNSMT-E-WVLGLFICKAMHTIYKVSFYSGMFLLTSISVDRYFAISKAV---SAHRHRSMAV VWIYL-HFRQ-----RLKTMTDVYLLNLAVADLFFLGTLPLWAVEATQ-G-WSFSSGLCKVTSALYKINFFSSMLLLTCISVDRYVVIVQTT---MAQNSKRQRL VWIYL-HFHQ-----RLKTMTDVYLLNLAVADLFFLGTLPFWAVEGNQ-G-WSFGLGLCKITSALYKINFFSSMLLLTCISVDRYVVIVQTT---KAQNSKRQRL VFIFT-TVRH-----RLKTMTDVYLLNLAVADLLFLGTLPFWAADATK-G-WMFGLSLCKLLSAIYKINFFSSMLLLTCISVDRYVAIVQVT---KAHNQKNKRL VFIFT-TVRH-----RLKTMTDVYLLNLAVADLLFLGTLPFWAADATR-G-WVFGLGLCKILSAVYKINFFSSMLLLTCISVDRYVAIVQVT---KAHNLKNKRL 
LVILAKYEN-------LKSLTNIFILNLALSDLLFTFGLPFWAAYHI--WGWTFGWLLCKTVTFVFYAGFYSSVLFLTIMTIHRYLAVVHP-----LSDHGSQRG LVILAKYEN-------LKSLTNIFILNLALSDLVFTFGLPFWAAYHI--WGWTFSRILCKTVTFVFYAGFYSSVLFLTIMTIHRYLAVVHP-----LSDHGSQRG -------------------MTNAFMMNLALSDLVFTCGLPFWVSYHL--SGWSYGDLTCKAVSFLFYAGYYSSGIFLILMTLHRYLAVLRPLSRLVSGPSRSQ-G GLVIG--FSQ-----QSLTPSDVFLFHLTVADGLLALTLPFWAANTLH-G-WIFGDFLCKCLSLVMEASFYTSILFLVCISVDRYLVIVRPAK-----SRKGRRR GLVIA--SSK-----QPLSPSDLYLLHLAVADFLLALTLPFWAASVTV-G-WVFGDVMCKLVSIFQEVSFYASILFLTCISVDRYLVIVRAMEA----SKAARRR IYVVC--CMA-----RGRTTTDIYLMHLAMADLLFSLTLPFWAVYVYS-H-WIFGTFLCKFLSGLQDAAFYSGVFLLACISVDRYLAIVKTTQ------ALAQRR IYVVC--CMA-----RDRTTTDVYLMHLAMADLLFSLTLPFWAVYVYS-H-WIFGTFLCKFLSGLQDAAFYCGVFLLACISVDRYLAIVKATR------ALAQRR LVVLVQKRR-------SWSVTDTFILHLGLADTLLLVTLPLWAVQATG--EWSFGTPLCKITGAIFTINFYCSIFLLACISLDRYLSVVHAVQ---MYSR--RKP LVVLVQKRR-------SWRVTDTFILHLGLADTLLLLTLPLWAVQATG--EWSFGTPLCKITGAMFTINFYCSIFLLACISLDRYLSVVHEVQ---MYSL--RKT LLVLVQRRR-------GWSVTDTFILHLCVADILLVLTLPFWAAQATG--EWSFGTPLCKITGAIFTINFYCGIFLLACISLDRYLSVVHAVQ---MYSR--RKP VTVMGYQKK------VKT-MTDKYRLHLSVADLLLVFTLPFWAVDAAS--SWYFGGFLCTTVHVIYTINLYSSVLILAFISVDRYLAVVHATN---SQTTRKRKL LIVMGYQKK------VKT--TDKYRLHLSVADLLFVLTLPFWAVDAAS--SWYFGGFLCTAVHMIYTINLYSSVLILAFISVDRYLAVVHATN---SQTTRTFLA VFVLGCQRK------ARLSLTDRYRLHLSAADLLFVLALPFWAVDAAL-GDWRVGAVMCVGVHVIYTVNLYGSVLILAFISLDRYLAVVKATV---TSTTHTRQL VFVLGCQRK------ARLSLTDRYRLHLSAADLLFVLALPFWAVDAAL-GDWRFGAVTCVGVHVIYTVNLYGSVLILAFISLDRYLAVVKATD---TSTTHIRQR LTVLLKRRG-------LLRITEIYLLHLGLADLMLLATFPFALAQVSF--GVVFGDVLCKLIGLLNRLNFLCGSLLLACIGFDRYLAIVHAIT---SLQS--RRP IATFVLYRRL-----RLRSMTDIFLFQLALADLLLLLTLPIQAGDTLL-GHWAFGNALCKATHASYAVNTYSGLLLLACISVDRYMVVARTQEVLR---LRSRML LWINIRAQHTTSSS-SPRHETHLYIAHLAAADLCVCVTLPVWVSSLAQHGHWPFSELACKLTHLLFSVNLFSSIFFLACMSVDRYLSVTRPAD---SEDGGRRRK LWVNVRSQRTTSSS-SPRHETHLYIAHLAAADLCVCVTLPVWVSSLAQHGHWPFGEVACKLTHLLFSVNLFSSIFFLACMSVDRYLSVTRPAD---SENGGRRRK 
VWVNLRSERN-------RFETHLYILNLAVADLCVVATLPVWVSSLLQRGHWPFGEAVCKITHLVFSVNLFGSIFFLTCMSVDRYLSVALFGD---GGNS-RRKK LCVLMRYRTSQTGGACSFSLTDTFLLHLAVSDLLLALTLPLFAVQWAH--LWVFGVTACKISGALFSLNRYSGILFLACISFDRYLAIVHAVS---TGWK--RNT LCVLMRYRTSQTGGACSFSLTDTFLLHLAVSDLLLALTLPLFAVQWAR--QWVFGVAACKISGALFSLNRYSGILFLACISFDRYLAIVHAVS---TSWK--RNT LSVYAYHKRL-----RR-TMMDAFLVHLAVADLLLLLTLPFWAADAAR-G-WELGLPLCKLVSACYTINFTCCMLLLACVSMDRYLASIRAEGRNHGRLGRVFTR LAVYAYHTRL-----RR-TMTEAFLAHLAVADLLLLLTLPFWAADAAL-G-WELGLPLCKLVSACYAINFTCCMLLLACVSMDRYLASVRAEGRNQGRLGRVFTR VVVYASPRRL-----R--TLTDVCILNLAVADLLLLFTLPFWAADAVH-G-WWIGVAACKLTSFLYTTNFSCGMLLLACVSVDRYRALAHNAGGRAGSGPR--DR ISTLIKSKHH----------RKTFPMSMAISDMLFALTLPFWAVYAHN--EWIFGNDSCKTVTAIYITTLYSSILFITCISVDRYLNVVWTLS-----SWNHCTP IWISGFKMR -------TSVNTTWYLSLAISDFLFCVCLPFNIVYMVT-SHWPFGLVMCKLTSSTMFLNMFSSVFLLVLISVDRCVSITFPVW-----AQNNRTI IYVTSCRIK--------KTVNSVWFLNLAMADFLFTSFLLLYIINIARGYDWPFGDILCKLNSMVNVLNMFASIFLLAAISLDRCVSTWVVVW-----AHNKCTP IYVTSCRIK--------KTTNSVWFLNLALADFLFTSFLLLYIINMARGYDWPFGDILCKLNSMVTVLNMFASIFLLAAISLDRCLSTWVVVW-----AHNKCTP ICL1 TM2 ECL1 C2 TM3 DRY ICL2 Fig. 2. (continued)

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

157 147 141 158 157 186 154 153 150 147 162 153 181 178 162 162 190 161 143 144 83 159 160 150 150 151 157 166 157 153 164 165 160 164 165 165 160 162 162 154 156 168 150 158 149 124

U. Grimholt et al./Developmental and Comparative Immunology 49 (2015) 79–95

ssCCR1 ssCCR2a ssCCR2b ssCCR3a ssCCR3b ssCCR4a ssCCR4b ssCCR5a ssCCR5b ssCCR6.1a ssCCR6.1b ssCCR6.2 ssCCR7a ssCCR7b ssCCR9.1a ssCCR9.1b ssCCR9.2a ssCCR9.2b ssXCR1a ssXCR1b ssXCR2 ssCXCR1.1 ssCXCR1.2 ssCXCR2.1 ssCXCR2.2 ssCXCR3.1a ssCXCR3.1b ssCXCR3.2 ssCXCR4.1a ssCXCR4.1b ssCXCR4.2a ssCXCR4.2b ssCXCR5 ssCXCR6 ssCXCR7.1a ssCXCR7.1b ssCXCR7.2 ssCXCR8a ssCXCR8b ssCCRL1.1a ssCCRL1.1b ssCCRL1.2 ssCCBP2 ssCMKLR1 ssCMKRL2a ssCMKRL2b

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

220 * 240 * 260 * 280 * 300 * RYALAACTASWVVSLAASLSDVIASQVQEV------------------------------ENGTRIFTCEVLPG-----TTDEELGYYLQVFLLFVLPLIIIILC : RYGTLASIIVWVASISAALPEAIFAAVVRE------------------------------NDENSGTSCQRIYPE-DTEKTWKLLRNLGENGVGLLLCLPIMVFC : RYGTLASIIVWVASISAALPEAIFVAVVRE------------------------------NDESSGTSCQRIYPE-DTEKTWKLLRNFGENGVGLLLCLPIMVFC : IYACVSSAVVWCISLLASVKELVLYNVWKD--------------------------------PQSGHLCEETGFSKDIMDKWELVGYYQQFVIFFLLPLAMVMYC : IYACVSSAVVWCISLLASVNELVLYNVWKD--------------------------------PRVGHLCEETGFSNEIMIKWQLVGYYQQFVIFFLFPLAMVMYC : KYGAIAAVVTWLAGFLASFPEALFLKVEKH---------------------------------NEKENCRPVY-DG---HAWGIFGLFKMNTLGLLIPLVIMGFC : KYGAIAAVVTWLAGFLASFPEALFLKVEKN---------------------------------NEKENCRPVY-DG---HSWGIFALFKRIIFGLLIPLIIMGFC : RVGVTLSLFMWAVSLCASLPTIIFTKVNNE---------------------------------SGLTTCKPEYPEG---SMWRQVSYLEMNILGLLLPLSIMVIC : RLGVTLSLFMWALSLCASLPTIIFTKVNNE---------------------------------SGLTTCKPEYPEG---SMWRQVSYLEMNVLGLLLPLSVMVIC : IYSRIICAAVWNLALLLSVPTFVYYERYVPAHSTFGN-DYDNYDYNNATTPFDLENTIFLE-EENYVVCDFRFPDNATARQMKILVPSTQMAVGFFLPLLVMGFC : LYSRIICATVWSLALLLSVPTFVYYERYVPAHSFYNVSEYGFYDYRNAMTPVGLKNPISSESEEDSVVCKFRFPDNATARQMKVLVPSTQMAVGFFLPLLVMGFC : IYSRLICTAIWALAIALSVPTVIYNER--------------------------VEETILLE--GTITVCQAQFQSNRTARLMKVLVPSLQVAMGFFLPLLAMVIC : FISKVTSVVIWVMALVFSVPEMSYTNIS-------------------------------------NKTCTPYTAGS---DQVRVGIQVSQMVLGFVLPLLIMAFC : FISKVTSVVIWVMALVFSVPEMSYTNIS-------------------------------------NKTCTPYTAGS---DQVRVAIQVSQMVLGFVLPLLIMAFC : SCSKLVCACVWLLAVALALPEFMFANVK---------------------------------ELEGRDYCTMVYWSN-QDNSTKILVLALQICMGFCLPLLVMVFC : SCSKLVCTCVWLLAVVLALPEFMFANVK---------------------------------ELDGRFYCTMVYWSN-QDNRTKILVLVLQICMGFCLPLLVMVFC : SVSKLTCLAVWIISGLLALPELIFAQVKP--------------------------------DHRGNSFCVLVYTNN-LFNRTKILVLVLQICVGFCLPLLVMVLC : FVSKLVCLAVWIISGLLALPEFIFAQVKP--------------------------------DRRGNSFCVLVYPNN-LFNRTKILVLALQVCVGFCLPLLVMVLC : 
CYGVTISLIIWAISFGSAVPALIFSSVQKN------------------------------PHEGDHLHCEYS------VPLWKKVSTYQQN-VFFLAAFAVMAFC : CYGVTVSLVIWVVSFGAAVPALIFSSVQEN-----------------------------PHEEDIHFYCEYW------DPLWKRVGSYQQN-VFFLAAFAVIGFC : TWSAVVSLVVWTVSLLAAMPALIFTKLIITD---------SNDLKDLLDHNNPDGPSDSPAPSGEQRYCEVA------DVSWRLWGVYQQN-ILFIVTLLVVCVC : ACRWYACTFIWALGGALSLPALFN-EAFTPP-------------------------------SGGPTRCVER-FDLGSATHWRLATRGLRHILGFLLPLVIMVAC : EVSWGTCATVWLVGGLLSLPGLFN-HVFLLP-------------------------------GTERMTCTES-YDPGSAEAWRLVIRVLGHTLGFLLPLTVMVVC : HLVGIVCGAVWLGAGLLSLPAVLQREAIQLE------------------------------DLGDQSICYE-NLTASSSNQWRVFVRVLRHTLGFFLPLAVMVVC : HLVGLVCGAVWLGAGLLSLPVALQREAIQPE------------------------------DLEGQIICFE-NLTAASSDRSRVGVRVIRHVLGFFLPLSVMVVC : WMVQASCLSVWLLSILLSIPDWHFLESVRDTRR------------------------------DKQECVHNYPSLSQSGFDWRLASRLLYHTVGFLLPSVMLLFC : WMVQASCLSVWLLSLLLSIPDWHFLESVRDARR------------------------------DKVECVHNYLSLSQSGFDWRLASRLLYHTVGFLLPSAVLLFC : WMVQASCMSVWLLSILLSIPDWHFLESVRDTRR------------------------------DKQECVHNYPSLSQSGFDWRLASRLLYHTVGFLLPSAVLLFC : LADRWIYVAVWLPAAVLTVPDIVFAT------ALD--------------------------SG-SRTICQR-IYPQKTSFYWMAAFRFQHILVGFVLPGLVILTC : --DRVIYVAVWLPAVILTVPDTVFAT------AQN--------------------------RV-SRTICQR-IYPQETSFYWMAGFRFQHILVGFVLPGLVILTC : LARRLVYAGAWLPAGLLAIPDMVFAR------TQE--------------------------AGEGEMVCTR-LYPPENAPLWVSLFHLQTVLVGLVVPGLVLLVC : LARRYVYAGAWLPACLLAIPDMVFAR------TQE--------------------------AGEGEMVCAR-LYPPENAPLWVSLFHLQTVLVGLVVPGLVLLVC : RNVHLTCLALWLVCLALSVPNAVFLS-VGESPI-----------------------------DPTQLSCFF-HSHGLHANNWDLTERLLTHVLCFFLPLGVMTYC : TVGKLASLGVWLTALLLSLPEILFSGVER--------------------------------EQEGEAHCGMNVWV--AESWRVKTATRCAQIAGFCLPFLVMVAC : LIRHSVCMGVWLLALVASLPDTYFLRALRS-------------------------------SQGEVVLCRP-VYPEEHPREWMVGVQLSFILLGFIIPFPIITLA : LIRRSVCVGVWLLALVASLPDTYFLQAVRS-------------------------------SHGEVVLCRP-VYPEEHSREWMVGVQLSFILLGFVLPFPVIALA : 
VVRRVICILVWLLALAASVPDTYFLQAVKS-------------------------------THSDATVCRP-VYPTDNPREWMVGIQLSFIVLGFAIPFPVIAVF : CHAQIACALIWIVCFGLSGVDIAFRQVVKMEVGRS-------------------------GDHQGLLVCQT--VFPHSSLQWEVGMPLVNLVLGFGLPLLVMLYC : CHAQIACALIWTVCLGLSGVDIAFRQ--KMEVGRS-------------------------GDHQGLLVCQT--VFTHSSVQWQVGMPLVNLVLGFGLPLLVMLYC : AHCGKVCLGVWAVALLLGLPDLLFSTVSE---------------------------------TSRRRVCLA-VYPSSLAQEVKACLEMVEVLLGFLVPLLVMAWC : AHCGKVCLGVWAVAFLLGLPDLLFSRVRE---------------------------------TPGRRVCMT-VYPPSLAREVKACLEVVEVLLGFLVPLLVMMWC : RQWILVCAVVWTTAVCLGLPDMVFFTVKN---------------------------------TPHRLACTA-IYPSSMARPAKAALELLEVLLSFLLPFLVMVVC : MENTLVCFVVWSLSILAAAPHWTFVQEQE---------------------------------FHGQKICMYPFGEENHLPLWKILMKFQLNVFGFLTPFLIMLFC : PRASGVVVLVWALSAALTVPSLVHRQIKTHG---------------------------------ADTLCYTD-YQSG-----HKAVALSRFVCGFVIPLLIIVFC : GRAEVICVGIWLASLVCSLPFTIFRQIMHY---------------------------------GNWTMCSY-S--ISHDSSTYRNLVVFRFLLGFLIPFLVIIGS : GRAEA---GTWLSSASC-------------------------------------------------WAFSSHSI----------------------------IGS : TM4 ECL2 C3 TM5 C4

227 221 215 231 230 254 222 222 219 250 267 230 246 243 233 233 262 233 211 213 172 231 232 224 224 226 232 241 228 222 236 237 234 235 238 238 233 240 238 225 227 239 222 224 218 149

U. Grimholt et al./Developmental and Comparative Immunology 49 (2015) 79–95

ssCCR1 ssCCR2a ssCCR2b ssCCR3a ssCCR3b ssCCR4a ssCCR4b ssCCR5a ssCCR5b ssCCR6.1a ssCCR6.1b ssCCR6.2 ssCCR7a ssCCR7b ssCCR9.1a ssCCR9.1b ssCCR9.2a ssCCR9.2b ssXCR1a ssXCR1b ssXCR2 ssCXCR1.1 ssCXCR1.2 ssCXCR2.1 ssCXCR2.2 ssCXCR3.1a ssCXCR3.1b ssCXCR3.2 ssCXCR4.1a ssCXCR4.1b ssCXCR4.2a ssCXCR4.2b ssCXCR5 ssCXCR6 ssCXCR7.1a ssCXCR7.1b ssCXCR7.2 ssCXCR8a ssCXCR8b ssCCRL1.1a ssCCRL1.1b ssCCRL1.2 ssCCBP2 ssCMKLR1 ssCMKRL2a ssCMKRL2b

Fig. 2. (continued)

87

88

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

320 * 340 * 360 * 380 * 400 * 420 YSAILRTVLVTA----T----RRRHRTVLVVFCIVVAFFVCWAPYNLFMFVSSVYTP-----VD-CGVKE-RLHVVLVVCRIVAYAHCFLNPALYMLS-HSFRRH YISILTVLQRLR----N----SKKDRAMKLIFAIVGVFVVSWVPYNVVVFLRTLQMFDIGN--S-CEAST-QVDRAMEVTETIALAHCCVNPVIYAFVGEKFRKC YISILTVLQRLR----N----SKKDRAMKLIFAIVGVFVVSWVPYNVVVFLQTLQMFDIGN--S-CEAST-QLDAAMEVTETIALAHCCVNPVIYAFVGEKFRKC YVRITVRVMSTR----M----REKCRAVKLIFVIVFSFFVCWTPYNIVILLRALQMSTSHSFEP-CSD---VLDYALYVTRNIAYLYCCVSPVFYTFLGKKFQSH YVRITVRIMSTQ----M----RGKCRAVKLIFVIIFTFFVCWTPYNVVILLRALQISTSDDSDP-CFE---VLNYALYVTRNIAYLYCCVSPVFYTFVGKKFQSH YTQIVKRLLSCP----S----SKKQ-TIRLILIVVVVFFCCWTPYNMTAFFKALELSEVYS--S-CESSK-AIRLTLQITEAMAYSHSWLNPILYVFVGQKFRRP YTQIVRRLLSAP----S----SKKQ-AIRLILIVVVVFFCCWTPYNMTAFFKALELSEVYS--S-CESSK-AIRLTLQITEAMAYSHSCLNPILYVFVGQKFRRP YSRIVPMLVTIK----T----TKKHKAIKLIIIIVVVFFCFWTPYNVVILLRYLETQSYFG--D-CTTHT-NIDLAMQCTEVIAFTHCCLNPIIYAFAGQKFMSL YSRIVPMLVNIK----T----TKKHKAIKLIIIIVVVFFCFWTPYNVVIVLRYLEAQSYFG--D-CITHK-NIDLAMQWTEVIAFTHCCLNPIIYAFVGQKFTSL YANIIVTLLRAK----N----FQRHKAVRVVLAVVVVFIICHLPYNAALLYDTINKFK--ILP--CSQVD-ATEVAKTVTETVAYLHCCLNPVLYAFIGVKFRNH YASVIITLLRVK----N----FQRHKAVRVVLAVVVVFIACHLPYNAALLYDTVHMFK--PQL--CGEID-TTQVAKTVTETVAYLHCCLNPVLYAFIGVRFRNH YASILWTLLRAQ----S----TQRHKAVRVVLAVVVVFIVCHLPYNVVLLYHTVALFQ--QRE--CEVEN-IILTTLTITRSLAYLHCCLNPILYAFIGVKFRSR YGAIVKTLCQAR----S----FEKNKAIKVIFAVVAVFLLCQVPYNLVLLLTTLDTAKGGSKD--CIYDN-SLLYASDITQCLAFMRCCLNPFVYAFIGVKFRRD YGAIVKTLCQAR----S----FEKNKAIKVIFTLVAVFLLCQVPYNLVLLLTTLDAAKGGSKD--CIYDN-SLLYASDITQCLAFLRCCLNPFVYAFIGVKFRRD YAGIIRTLLKTR----S----FQKHKALRVILVVVAVFVLSQLPYNTVLVMEATQAANSTETD--CSAAK-RFDVVGQMLKSLAYTHACLNPFLYVFVGVRFRRD YAGIIRTLLKTR----N----FKKHKALRVIMVVVVVFVLSQLPYNSVLVVEATKAVNSTGMD--CDAEK-RFDVVGQVLKSLAYMHASLNPFLYVFVGERFRRD YSVIIRTLLQAK----S----FEKHKALRVIFAVVAVFVLSQLPYNGLLVVNATQAADTTITD--CAVSE-HFDVAGQIAKSLAYTHACINPFLYVFIGVRFQKD YSVIIRTLLQAK----S----FEKHKALRVIFAVVAVFVLSQLPYNGLLVVDATQAANTTITD--CAISG-HFDVAGQIAKSLAYTHACINPFLYVFIGVRFRKD 
YVRILAAIFKSR----S----HMRNRTMNLIFSIVAVFFLGWAPYNVVIFLRLLTDHSVAPFND-CEVSM-KLDYGFYVCRLIAFSHCCLNPVFYAFVGIKFRNH YVRILRTIFKSR----S----HMRNRTVKLIFSIVAVFFLGWAPYNVVIFLRLLHDYTVAPFNT-CEVST-WLDYGFYVCRLIAFSHCCLNPVFYAFVGIKFRNH YSQIVVRLLRPRVRVRRQRSGGDSRSQRTARLVLGLVLVFFVGWAPYNVVIFLRTLVYKSQDGGGVGQCCVILNTMGGVWHQQYVGLLLLCDQAAGVLLLLSQPT YSITVSRLLQ-T----SG---FQKHRAMRVIIAVVFAFLLCWTPFHMTVMADTLMRARLVRFD--CAERN-RVDLALQVTHSLALVHSFVNPVLYAFVGEKFRGN YGVIVARLLR-T----RGG--FQRNRAMRVIVALVLAFLLCWMPYHLAVMADTLFWAKVVGYG--CRERS-AVDTAMFATQSLGLLHSCVNPVLYAFVGEKFRRR YSCTAATMFRGM----RNG--DHKHKAMRVILAVVLAFVMCWLPCNVSVLVDTLMRSGSLGEET-CEFRN-SVSVALYVTKVIAFTHCAVNPVLYAFIGQKFRNQ YSCTAVTLFRGV----RNG--GQKHKAMRVILAVVLAFVACWLPRNISVLVDTLMRSGSLGEET-CEFQN-NVSVALYVTEVMAFTHCAVNPVLYAFIGQKFRNQ YSCILLRLQ-------RGSVGLQKQRAVQVILVLVLVFFLCWTPYNITLMVGTFQGRPGEPVSGSYENGRTALENSLVVTFALACLHACLNPVLHLGLCRNFRRH YSCILLQLQ-------RGSQGLQKQRAVRVILALVLVFFLCWTPYNITLMVDTFQGRPGEPVSVSCENGRTAVEKSLIVTFALACLHACLNPVLHLGLCRNFRRR YSCILLQLQ-------RGSQSLQKQRAVRVILALVLVFFLCWTPYNITLMVDTLYSN-STLVDT-CE-SRKALDISLTATSSLGYLHCSLNPVLYAFVGVKFRHH YCIIIAKLSQG-----AKG-QVLKRKALKTTVILILCFFSCWLPYCVGIFVDTLMLLNVISHN--CALEQ-SLQTWILITEALAYFHCCLNPILYAFLGVKFKKS YCIIIAKLSQG-----SKG-QVLKRKALKTTVILVLCFFSCWLPYCVGIFVDTLMLLNVISHS--CALEQ-SLQTWISITEALAYFHCCLNPILYAFLGVKFKKS YCVIVSRLTRG-----PLGGQRQKRRAVRTTVALVLCFFLCWLPYCIGIAVDALLRLELIPRG--CMLES-GLGVWLAVSEPMAYAHCCLNPLLYAFLGVGFKSS YCVIVSRLTRG-----PLGGQRQKRRAVRTTVALVLCFFLCWLPYCIGITVDALLRLELIPRG--CTLES-GLGLWLAVSEPMAYAHCCLNPLLYAFLGVGFKSS YAAVAITLHHSQ----RGQRSLEKEGAIRLAALVTAVFCLCWLPYNITMLVKTLVDRGLDSGLS-CQ-SRTSLDKALVVTESLGYTHCCLNPLLYAFTGVRFRQD YSLIGRLLCEGR----G-QGGWRRQRTLRLMVVLVAVFLLFQLPYTVVLSLKVAGPG-AARQT--CDQWA-ATLLREYVTCTLAYTRCCLNPLLYALVGVRFRSD YALLAKALSSS---FSSSAVEQERRVSRKVILAYIVVFLGCWGPYHGVLLADALSLLGLVPLS--CGLEN-ALYVALHLTQCLSLLHCCFNPILYNFINRNYRYD YALLAQALSSSS--CSSSAVEQDRRVSRRVILAYTVVFLGCWGPYHGVLLADALSLLGLVPLS--CGLEN-ALYVALHLTQCLSLLHCCFNPILYNFINRNYRYD 
YLLLAGAIGNANPPGSSANSNQERRISRNIILTYIVVFLVCWLPYHGVLLVDTLSLLNVLPFS--CRLEK-FLYVSLHLTQCFSLIHCCINPVIYNFINRNYRYD YIRIFRSLC--------NASRRQKRKSLHLIVSLVSMFVLCWAPYNSFQLVESLKKLGMISGG--CQFGR-TVDIGILVSESMGLSHCALNPLLYGFVGVKFRRE YIRIFRSLC--------NASRRQKRKSLHLIVSLVSMFVLCWAPYNSFQLAESLKKLGVISGG--CQFGR-TVDIGILVSESMGLSHCALNPLLYGFVGVKFRSE YFNVGRVLGRLP----V-ESRGRRLSAIRVLLVVVGVFVVTQLPYNTVKMYRAMDSAYTLVTH--CGVSK-ALDRAAQVTESLALTHCCLNPLLYAFLGSSFRKH YAGVGRVLRRLP----E-ESRGRRRRAIRVLLVVVGLFVVTQLPYNAVKMCRAMDSVYTLVTH--CGVSK-ALDRAAQVTESLALTHCCLNPLLYVFLGSSFRQY YCWVGRALVRIG----AGVRREKRWRALRVLLAVVGVFLFTQLPYNLVKLWRTLDVIYGLVTD--CDLSK-GLDQALQVTESLALTHCCINPMLYAFIGSSFRGY YLRVCCAVAKVK--------VGPRRKSLKLVMIVVVVFFVLWFPYNIVSFLHSLQHLHAIYN---CATSL-HLDFAIQVTEVIAYSHGFVNPIVYAFVNKRVWKG YSVIFVQLRSRP---------MKSTKPVKVMTVLIVSFFVCWVPYHTFVLLEVNLGNHSLE----------MLYTWLKVGSTMAAANSFLNPILYVLMGHDFRQT YIAIWIRARRLQ----R----GTTRRSLRIIVSVVLAFFICWMPFHVLQFLDIMANG--------SPGLNLVVHIVIPLSTSLAYLNSCLNPILYVFMCDEFQKK YIAIWIRAKRLQ----R----GRTCRSLRTIVSVVLAFFICWMPFHVFQFMDIMEED--------NQGLELVVHIGIPLSASLAYLNSCLNPILYVFMCDEFQKK ICL3 TM6 C5 ECL3 C6 TM7 C7-8 Fig. 2. (continued)

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

316 314 308 324 323 346 314 315 312 342 359 322 340 337 327 327 356 327 306 308 277 325 327 321 321 324 330 336 324 318 333 334 333 331 337 338 335 334 332 322 324 337 315 310 307 238

U. Grimholt et al./Developmental and Comparative Immunology 49 (2015) 79–95

ssCCR1 ssCCR2a ssCCR2b ssCCR3a ssCCR3b ssCCR4a ssCCR4b ssCCR5a ssCCR5b ssCCR6.1a ssCCR6.1b ssCCR6.2 ssCCR7a ssCCR7b ssCCR9.1a ssCCR9.1b ssCCR9.2a ssCCR9.2b ssXCR1a ssXCR1b ssXCR2 ssCXCR1.1 ssCXCR1.2 ssCXCR2.1 ssCXCR2.2 ssCXCR3.1a ssCXCR3.1b ssCXCR3.2 ssCXCR4.1a ssCXCR4.1b ssCXCR4.2a ssCXCR4.2b ssCXCR5 ssCXCR6 ssCXCR7.1a ssCXCR7.1b ssCXCR7.2 ssCXCR8a ssCXCR8b ssCCRL1.1a ssCCRL1.1b ssCCRL1.2 ssCCBP2 ssCMKLR1 ssCMKRL2a ssCMKRL2b

: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :

U. Grimholt et al./Developmental and Comparative Immunology 49 (2015) 79–95

Alignment block with ruler marks at 440, 460, 480, 500 and 520. Each row gives the sequence name, the residue number printed at the end of the row, and the aligned residues (trailing gap characters omitted). Region labels below the alignment: C9, C-terminal.

Sequence    End  Aligned residues
ssCCR1      361  LWSLL------CCLMGEERGGQAGGGERSVGYNMHHITPRPKRTSFGVSGP
ssCCR2a     348  LGTALSRY--PLCKKLSKHAMVSSRGSENETSNTPV
ssCCR2b     342  LGTVLSRY--PLCKKLSKHAMVSSRGSENETSNTPV
ssCCR3a     362  FRKLLAKH--IPCLKSYIDTNQSSQSRTTSQKSPHTMYEY
ssCCR3b     371  FRKLLAKR--IPCLKRHIPTSQNSNSRITSQKSPHNTYEYEKGTGLQTRV
ssCCR4a     392  LIRLINKAPRRMCQFMKNYLPWDFRASRTGSVYSQTTSMDERSTAV
ssCCR4b     364  LIRLINKAPCRMCQFMKNYLPRDFRVSRTGSIYSQTTSMDERSTAVGTAT
ssCCR5a     356  VLKLLRKWMP-MCFARPYVCGLSERNISVYSRSSEISSTRLL
ssCCR5b     353  VLKLLRKWMP-FCFARPNVSELPEQKSSVYSRSSEITSTRLL
ssCCR6.1a   392  FRKIVEDVW---CIGKRVMNPRRFSRVTSEMYVSTVRKSMDGSSTDNASSFTM
ssCCR6.1b   409  FKKIVEDVW---CVGKRVMNVRRFTRVKSEIYVSTARRSVDGSSTDNASSFTM
ssCCR6.2    371  FRKILEDLW---CMGRKYIYPSGRSSRMTSDLYIPAHKSSDGSNKNGSSFTM
ssCCR7a     381  LLKLLKDLG---CMSQERFFQYTCGKRRSSAVAMETETTTTFSP
ssCCR7b     377  LLKLLKDLG---CMSQERFFQYTCGKK-SSAAAMETETTTTFSP
ssCCR9.1a   368  ILKLLRIYH---CWPAKGKLSKIQGGPGRSSVMSDTDTTQALSL
ssCCR9.1b   368  ILKLIRIYH---CWPAQGVLSKIQGGPGRSSVMSDTDTTQALSL
ssCCR9.2a   397  LLRLLKLCT---CGLSQGGVSKLQAIPKRPSVMSDTETTCALAL
ssCCR9.2b   368  LLRLLRQYT---CGLNQRGLSKMQAVPKRPSVMSDTETTPALSL
ssXCR1a     336  LKVILQEH----CRRQSTIDSQQIRAIP--SRGSMY
ssXCR1b     340  LKVILLKL----CRRQSTMDTQQIRLPNIYSMGSMY
ssXCR2      305  VLRVRWGQVPEPPEENVEGLLSRCYRCQ
ssCXCR1.1   358  LGALVRKS-RGPERGSSSRFSRSTSQTSEGNGLL
ssCXCR1.2   362  LLQMFQKAGVMEQRASLTRASRYFSQTSEATSTFM
ssCXCR2.1   360  FLLTLHKHELISKRVLAAYRRGSAHSTVSQRSRNTSVSL
ssCXCR2.2   360  LLVVLYKHGLISKRLMVAYRSGSANSTASQRSRNTSVTL
ssCXCR3.1a  376  VLDMMR------CVEGVQNDPKLSLWDSGVVEDSPDLAEEKGTLNPITTMGQVQSTQS
ssCXCR3.1b  391  VLDMVR------CVEGVQDDPKLSLWDSGVVEDSPDQAEEKGTLNPMTTMGQVVEASCSVGLSDAVH
ssCXCR3.2   377  LLDMLRSLG---CKLKSGVRLQTASRRSSMWSESGDTSHTSAIY
ssCXCR4.1a  362  ARNALTVSSRSSHKVLTKKR-GPISSVSTESESSSVLYS
ssCXCR4.1b  356  ARNALTFSSRSSHKILTKKR-GPISSVSTESESSSALSS
ssCXCR4.2a  372  ARRALTLTRTSSLKIVPRRRTGAMTSTTTESESSSLHSS
ssCXCR4.2b  373  ARRALTLTRMSSLKILPRRRTGATTSTTTESESSSLHSS
ssCXCR5     340  LLRLLAH
ssCXCR6     428  VLKLLHGVG-CLCWAVSGPHLESCTSGSPSSLGLTTLSPLPPTSPLLLPPETLAHSIKYQPPTASHLSGPTKVFLFSSRPTLPSDGLLQSTVFKTKPV
ssCXCR7.1a  377  LMKAFIFKYSTRTGLARLIEQTHVSETEYSAVAVENTPQI
ssCXCR7.1b  378  LMKAFIFKYSTRTGLTRLIEQPHVSETEYSAVAVENPPQI
ssCXCR7.2   374  LMKAFIFKYSTKTGLAKLIDASHVSETEYSAVAAVENNV
ssCXCR8a    382  LTRM--------CKGLLGQRFYTGMNGWGGQSRARRTTGSFSSAESENTSHFSVMA
ssCXCR8b    380  LTRM--------CKGLLGQRFYPGMKGWGGQRRTRRPTGSFSSAESENTSHFSVMA
ssCCRL1.1a  361  VLKAAKAFGERTRRR-----EEQPVEMSFNNSQAASQETSAFSI
ssCCRL1.1b  368  VLKAAKAFGERTKRRRGEQREDEGMEMSFNSHNTASQETSTFSI
ssCCRL1.2   400  VLRVAKSLGQRLGGRMRLGGRMRGGRHGNEEPAVEISLNTHNSAGHTHSHSVSEDEDTSTFTI
ssCCBP2     372  FAKM----CGGKCRRRTSDEYVLECSDSTKSMSVQSGVIELQAVQSYLENNTNQPTNTERR
ssCMKLR1    339  LKRSVLWKIENAMAEDGRTGGRNLSKSGSFESKAFTHV
ssCMKRL2a   372  LRQSVLLVFENAFAEDHGMNFVSSTRSLSSHLSRISRKSESLAPGEGGHLRLTGDQSDSKVETEV
ssCMKRL2b   303  LRQSVLLVFENAFAEDHGMNFVSSTRSLSSHLSRISRKSESLAPGEGGHLRLTGDQSDSKVETEV

Fig. 2. (continued)


Fig. 3. Secondary structure of a chemokine receptor. Predicted secondary structure of a salmon seven-transmembrane chemokine receptor using ssCCR3a as a model. Extracellular N-terminal, transmembrane (cylinders), extracellular loop (ECL), intracellular loop (ICL) and intracellular C-terminal regions are shown. Contact font colour codes are red for helix contact while green is membrane contact. Numbered cysteines are boxed in pink and potential cysteine bonds are shown with double red lines. The conserved DRY motif is boxed blue. The enlarged ECL2 domain of teleost CCR6 sequences is shown with a green loop and membrane orientation is shown with IN and OUT.

CCR9.2, XCR1, CXCR3.1, CXCR4.1, CXCR4.2, CXCR7.1, CXCR7.2, CXCR8, CCRL1.1, CCRL1.2 and CMKLR2; Fig. 1, Appendix S2) representing potential remnants of the unique salmonid WGD, often defined as the 4R WGD, that occurred approximately 95 MYA (Macqueen and Johnston, 2014). The range of 82–95% identity between duplicates (Appendix S2) seems surprisingly broad assuming these genes all originated as a result of the 4R WGD. To test if some of these duplications had arisen prior to the 4R WGD, we used cDNA and genomic resources from Northern pike [Esox lucius, Esociformes; (Rondeau et al., 2014)] to identify pike orthologues to salmon CRs. As pike belongs to a diploid sister group of salmonids (Carmona-Antonanzas et al., 2013), any gene that was duplicated prior to the 4R WGD should also appear in the Northern pike data as duplicates. Initially we investigated pike cDNA (Leong et al., 2010), and found orthologues to most salmon CRs with the exception of elCCR5 and elCXCR5. All salmon duplicates appeared as single sequences in Northern pike (Fig. 4), suggesting that the salmon duplicates originated as a result of the 4R WGD. As one could argue that both duplicates may not be expressed in pike, we looked at genomic DNA for three pike genes. We found one variant only for elCCR5 (GenBank accession # AZJR01040242.1), elCCR6 (AZJR01034387.1) and elXCR (AZJR01031223.1), further supporting the 4R origin of the eighteen salmon duplications. To investigate if some of these genes were duplicated after the 4R WGD, such as CCR7 and CXCR8 with sequence identities between 93 and 95%, we looked at trout ESTs. As we found expressed trout orthologues of both CCR7a/b and CXCR8a/b (data not shown), it seems that all the duplications occurred at the same time, but the genes have since evolved at different evolutionary rates. Examples are CCR3 and CCR6, which have sequence identities of 81–82%, as opposed to CCR7 and CXCR8, which have 93–95% sequence identities.
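The identity ranges quoted above can be illustrated with a minimal sketch of pairwise percent identity between two aligned duplicate sequences. The function name `percent_identity`, the choice to exclude columns that contain a gap, and the toy sequences are assumptions made for illustration; the text does not state which convention was used to calculate identity.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where both sequences have a residue.

    Assumption: columns with a gap in either sequence are left out of the
    denominator. Other conventions exist and give slightly different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy alignment, not real salmon sequence data.
dup_a = "ACDE-FGHIK"
dup_b = "ACDQKFGHLK"
print(f"{percent_identity(dup_a, dup_b):.1f}% identity")
```

Under this convention, a duplicate pair at 82% identity differs at roughly one in every five or six compared positions, whereas a pair at 95% differs at about one in twenty.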
Considering the phylogenetic clustering of these receptors, it makes sense that ssCCR3 and ssCCR6, which cluster with human XCR1 and the dual-function CCR6 receptors, evolve faster, potentially through coevolution with pathogens, than ssCCR7 and ssCXCR8, which cluster with the human homeostatic receptors CCR7 and CXCR5. To discriminate between copies originating from the 4R WGD and other duplications, we follow the previously introduced terminology of -a and -b for 4R WGD duplicates (Lukacs et al., 2010; Shiina et al., 2005) as opposed to .1 and .2 for more divergent duplicates.

3.5. Expression patterns

Gene duplications are often followed by silencing or diversification events, leaving the question of how many of the duplicated genes are still functional in Atlantic salmon. To address this, we first performed a thorough search of expressed GenBank resources. We found expressed matches for 24 salmon CR genes, leaving 24 genes as potential pseudogenes (see Table 1). We then analysed salmon CR expression under normal physiological conditions using RNAseq transcriptomes from various tissues. As expected given the sheer number of sequences, we found expression of sixteen additional salmon CR genes, providing expressed support for 40 of the 48 receptors, ignoring the match for CMKLR2b, which is a transcribed pseudogene (Table 2). We did not find an expressed signature for the receptors ssCCR5b, ssXCR1b, ssCXCR3.1b, ssCXCR7.1a, ssCXCR7.2b and ssCCRL1.2a/b, suggesting they are either rarely expressed or silenced pseudogenes.

In teleosts, head kidney (HK) has a role similar to mammalian bone marrow, while the functions of mammalian lymph nodes are performed by teleost spleen, HK and most likely gills (Haugarvoll et al., 2008; Uribe et al., 2011). This is consistent with the fact that gills, HK/kidney and spleen contain not only the most expressed CRs but also the highest number of CR transcripts, dominated by orthologues to the human homing receptors CCR7, CCR9 and CXCR4.


[Fig. 4 image: phylogenetic tree. Tip labels are the salmon (ss) and northern pike (el) chemokine receptor sequences; the numbers at nodes are bootstrap support values. The tree topology is not recoverable from the extracted text.]

Fig. 4. Phylogenetic tree of salmon and northern pike CR sequences. The pike sequences are all cDNA sequences with the exception of elCCR5 (Appendix S1). Salmon genes are shown in red font and pike genes in black font. Unique pike duplicate sequences are shown with green shading. Success in percentage per 1000 bootstrap trials is shown on each node.


Table 2 Expression patterns of chemokine receptors in Atlantic salmon tissue transcriptomes.

Gene       Brain     Eye       Gills     Gut       HK        Kidney    Heart     Liver     Muscle    P. caecum Spleen    Query length
CCR1       0.09      0         0.59      0.09      1.28      0.97      0.07      0         0         0.10      5.46      361
CCR2a      0         0         0.41      0         2.03      0.28      0         0         0         0.11      2.06      348
CCR2b      0.04      0         0.31      0.18      3.62      0.59      0         0         0         0.05      3.48      342
CCR3a      0.04      0.34      0.15      0.19      7.36      3.84      0.11      0         0.33      0.14      1.93      346
CCR3b      0.24      0.27      1.69      0.30      6.09      1.63      0.24      0.11      0.85      0.11      5.24      371
CCR4a      0.12      0.04      1.22      0.86      6.41      2.34      0.04      0.32      0.34      0.41      8.71      392
CCR5a      0.09      0.05      1.50      0.97      23.97     7.30      0.19      0.21      0.21      0.27      22.90     356
CCR6.1a    0.48      0.05      3.10      3.29      0.72      0.75      0.09      0.05      0.26      4.46      2.37      392
CCR6.1b    1.55      0.22      1.39      0.40      1.23      0.59      0.38      0.17      0.31      0.29      0.83      392
CCR6.2     0         0.05      3.09      0.12      0.06      0.23      0         0         0.06      0.04      0.04      392
CCR7a      1.03      0.30      33.03     18.71     93.74     20.78     1.91      1.86      1.72      7.78      110.72    381
CCR9.1a    0.56      0.66      19.92     4.03      62.44     17.18     0.49      0.27      1.92      0.75      43.78     368
CCR9.2a    0         0.14      3.05      0.12      0.25      0         0.87      1.28      0.25      0.22      0         397
CCR9.2b    0         0.19      12.04     11.90     13.05     3.91      0.50      0.52      0.50      3.28      5.76      340
XCR1a      0.04      0         3.26      0.52      2.16      0.52      0.20      0.09      0.27      0.38      1.85      336
XCR2       0.21      0         0.35      0.64      0.49      0         0         0         0         0.28      0.17      305
CXCR1.1    0.08      0         0.16      0.85      11.71     3.84      0         0         0         0.05      1.97      358
CXCR1.2    0         0.06      0.63      0.30      5.69      2.51      0         0         0         0         0.06      362
CXCR2.2    0         0.81      0.88      0.74      2.75      1.89      0         0         0.10      0.17      0.25      360
CXCR3.1a   0         0.57      1.32      1.87      4.89      1.09      0.05      0.32      0.62      0.67      10.16     376
CXCR3.2    0.42      0.08      1.58      0.34      12.71     2.06      0.15      0.26      0.05      0.30      14.76     393
CXCR4.1b   1.56      1.95      19.97     4.15      199.18    79.16     1.64      0.98      3.67      1.21      115.02    362
CXCR4.2a   0         0.44      0.53      0.20      5.06      1.22      0         0.18      0.40      0         4.62      372
CXCR4.2b   0.04      0.67      1.80      0.51      6.13      2.33      0.40      0         0.52      0.10      3.43      372
CXCR5      0         0         0.11      0.05      2.15      0.36      0.05      0         0         0         2.20      179
CXCR6      0.39      0.14      1.21      0.13      0.28      0.36      0.26      0.04      0.19      0.04      0.22      428
CXCR7.1b   13.13     1.59      20.69     2.74      0.88      2.85      8.05      0.37      2.11      1.59      4.10      378
CXCR7.2a   8.08      10.11     11.36     8.86      3.58      4.94      17.67     1.74      6.50      5.98      9.47      378
CXCR8a     0.28      0.13      6.52      0.73      15.41     6.91      0.40      0.06      1.15      1.07      12.38     382
CCRL1.1a   0         0         1.32      0.33      0         0         0         0         0.23      0.20      0.23      361
CCRL1.1b   0         0.45      2.39      0.67      0.28      0.67      0.08      0.04      0.56      0.11      0.12      361
CCBP2      0.51      0.55      1.52      2.12      12.73     7.14      1.19      0.81      1.90      1.09      3.65      372
CMKLR1     0         0.07      0.80      0.37      4.49      2.46      0.19      0.11      0.25      0.16      6.69      339
CMKLR2a    0.58      0.58      1.08      0.48      4.48      1.96      0.21      0.19      0.11      0.36      2.03      372
CMKLR2bψ   0.60      0.19      0.48      0.58      4.82      3.17      0.21      0.15      0.42      0.22      1.95      372

Total # reads per tissue: Brain 58,939,250; Eye 60,380,888; Gills 59,793,962; Gut 59,806,348; HK 59,084,708; Kidney 61,054,936; Heart 58,163,180; Liver 58,784,272; Muscle 61,426,586; P. caecum 61,602,874; Spleen 60,203,316.

Transcriptional values are given in RPKM (reads per kilobase per million mapped reads). Mapping reads back to our unpublished Atlantic salmon reference transcriptome was done with CLC v 5.1.5 software. Reads were mapped with high stringency i.e. greater than 95% identity over more than 90% of the total length of the query read. The transcriptome was based on analysis of tissues of a single 1-year old individual and contained >70,000 non-redundant contigs. RPKM values above 10 are shaded blue. The receptors CCR4b, CCR5b, CCR7b, CCR9.1b, XCR1b, CXCR2.1, CXCR3.1b, CXCR4.1a, CXCR7.1a, CXCR8b and CCRL1.2a had no matching transcripts. CXCR7.2b and CCRL1.2b are likely pseudogenes while CMKLR2b is transcribed, but has an error disrupting the open reading frame making it a transcribed pseudogene.
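As a minimal sketch of the unit used in Table 2, the standard RPKM definition can be written out directly, together with a filter for the values above 10 that the table shades. The function names, the read count and the transcript length are hypothetical; the values in the table were produced with the CLC software, and it is not stated whether the listed query lengths were the lengths used for normalisation.

```python
def rpkm(mapped_reads: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads (standard definition)."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # reads / (length in kb * total reads in millions)
    return mapped_reads * 1e9 / (transcript_length_bp * total_mapped_reads)


def above_threshold(values: dict, threshold: float = 10.0) -> list:
    """Gene names whose RPKM exceeds the threshold, sorted alphabetically."""
    return sorted(gene for gene, value in values.items() if value > threshold)


# Hypothetical read count and transcript length; the total is the Brain
# read count listed in Table 2.
example = rpkm(mapped_reads=500, transcript_length_bp=1000, total_mapped_reads=58_939_250)
print(f"example RPKM: {example:.2f}")

# Head kidney (HK) values copied from Table 2 for three genes.
hk = {"CCR1": 1.28, "CCR7a": 93.74, "CXCR4.1b": 199.18}
print(above_threshold(hk))
```

Because RPKM divides by both transcript length and library size, values can be compared across the eleven tissue libraries even though their total read counts differ by a few percent.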

Non-immunologically important tissues such as brain, eye and heart also express many CR genes, but at lower levels, with the exception of CXCR7. Older duplicates such as CCR6.1b/CCR6.2, CXCR1.1/CXCR1.2 and CXCR4.1b/CXCR4.2a display differences in expression patterns consistent with the time they have had to acquire different functional roles. However, some 4R duplicates, such as ssCCR9.2a/b and CCRL1.1a/b, also have different expression patterns, suggesting diversification of these more recent duplicates as well. Some potential salmon ligands are also duplicated, such as the CK8a/b and CK12a/b chemokines, which potentially interact with the duplicate CCR6 and CCR7 receptors (Laing and Secombes, 2004).

As the tissue transcriptomes all originated from one fish, we decided to investigate expression of some CR genes using real-time RT-PCR. In this study too, ssCCR7 had the overall highest expression, restricted to spleen, HK and gills (Appendix S2). Furthermore, the RT-PCR results showed that ssCCR1 was highly expressed in spleen and gills, as opposed to the transcriptome study where CCR1 had very low expression in these tissues. XCR1 also showed a difference, with only gills as a major organ for transcription using RT-PCR, while the transcriptome study also showed high expression in HK and spleen. The difference between the two studies may be due to immune status and/or genetic background of the included animals. In the RT-PCR study we pooled mRNA from three Norwegian fish, while the transcriptomes originate from one Canadian fish.

When we compared expression patterns between different teleost groups, we also found major differences and some similarities. For instance, zebrafish CCR7 had the highest expression in brain and gills (Liu et al., 2009) while salmon displayed low ssCCR7a expression in brain. A zebrafish analogue to the salmon ssCCR4 sequence (zfCCR8-2) was primarily expressed in the brain with minute expression in other tissues, as opposed to the salmon orthologue, which had highest expression in HK and spleen. In contrast, the suggested zebrafish inflammatory receptors zfCCR2-2/zfCCR5 and zfCCR3-2 were highly expressed in spleen, HK and gills, where the salmon orthologues ssCCR5a and ssCCR3a/b displayed medium expression levels. Equivalents to ssCCR1, ssCCR2 and ssXCR1 were not included in the study by Liu et al. (2009). Without data from more individuals and different physiological conditions it is not possible to evaluate whether the intra- and inter-species differences are true or just a product of small sample size.

3.6. Functional diversification

Six of the 4R duplicates may have been silenced. ssCCR5b, XCR1b, ssCXCR3.1b, ssCXCR7.1a, ssCXCR7.2bψ and ssCCRL1.2a/bψ are not found in GenBank or tissue transcriptomes, but they may still be transcribed in specialised tissues or under specific biological conditions. Other genes seem to be in the process of becoming silenced. CMKLR2b for example is expressed but has a 6-transmembrane


Table 3 Variability distribution of expressed 4R WGD duplicates.

Gene                     CCR3a/b         CCR4a/b*        CCR6.1a/.1b     CCR7a/b*        CCR9.1a/.1b*    CCR9.2a/.2b     Total % Vari
N-term.                  45% (21:47)     68% (41:60)     54% (23:43)     15% (11:75)     13% (7:52)      25% (13:51)     35% (116:328)
TM1                      20% (5:25)      4% (1:25)       0% (0:20)       5% (1:20)       10% (2:21)      5% (1:21)       8% (10:132)
ICL1                     10% (1:10)      25% (2:8)       0% (0:9)        0% (0:8)        9% (1:11)       0% (0:11)       7% (4:57)
TM2                      0% (0:19)       5% (1:20)       5% (1:21)       0% (0:19)       5% (1:22)       0% (0:22)       2% (3:123)
ECL1                     35% (6:17)      19% (3:16)      6% (1:18)       5% (1:20)       31% (5:16)      25% (4:16)      19% (20:103)
TM3                      5% (1:20)       5% (1:20)       5% (1:19)       0% (0:19)       0% (0:20)       5% (1:20)       3% (4:118)
ICL2                     5% (1:20)       0% (0:21)       17% (4:24)      10% (2:20)      4% (1:24)       8% (2:24)       8% (10:133)
TM4                      4% (1:25)       0% (0:17)       10% (2:20)      0% (0:20)       10% (2:20)      10% (2:21)      6% (7:123)
ECL2                     23% (6:26)      7% (2:27)       30% (21:70)     3% (1:30)       12% (3:25)      8% (2:25)       17% (35:203)
TM5                      4% (1:24)       17% (4:23)      15% (3:20)      0% (0:20)       4% (1:26)       8% (2:26)       8% (11:139)
ICL3                     16% (3:19)      15% (3:20)      8% (1:12)       0% (0:16)       12% (2:17)      0% (0:17)       9% (9:101)
TM6                      14% (3:22)      0% (0:23)       5% (1:22)       9% (2:22)       14% (3:22)      0% (0:22)       7% (9:133)
ECL3                     30% (8:27)      0% (0:26)       40% (10:25)     4% (1:26)       27% (7:26)      15% (4:27)      19% (30:157)
TM7                      6% (1:17)       6% (1:18)       0% (0:19)       5% (1:20)       15% (3:20)      0% (0:19)       5% (6:113)
C-term.                  37% (20:54)     14% (8:56)      18% (10:55)     4% (2:46)       7% (3:46)       24% (11:46)     18% (54:303)
# variable/#Total sites  21% (78:372)    18% (67:380)    20% (78:397)    6% (22:381)     11% (41:368)    11% (42:368)    15% (328:2266)

Percent variability calculated as number of variable residues divided by the total number of compared residues within individual domains. The CCR4a/b, CCR9.1a/1b and ssCCR7a/b duplicates marked * were not expressed in duplicate in the transcriptomes, but ESTs for both genes were found in GenBank. The transmembrane regions are shaded grey.
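The note above defines percent variability as a ratio of residue counts within a domain. As a minimal sketch, the (variable:total) counts in the Total column of Table 3 can be pooled in the same way to compare transmembrane (TM) domains with all other domains. The helper name `pooled_variability` and the grouping by the "TM" name prefix are assumptions made for illustration.

```python
# (variable, total) residue counts per domain, copied from the Total column of Table 3.
TOTAL_COLUMN = {
    "N-term": (116, 328), "TM1": (10, 132), "ICL1": (4, 57), "TM2": (3, 123),
    "ECL1": (20, 103), "TM3": (4, 118), "ICL2": (10, 133), "TM4": (7, 123),
    "ECL2": (35, 203), "TM5": (11, 139), "ICL3": (9, 101), "TM6": (9, 133),
    "ECL3": (30, 157), "TM7": (6, 113), "C-term": (54, 303),
}


def pooled_variability(counts, keep):
    """Percent variable residues pooled over the domains selected by `keep`."""
    variable = sum(v for name, (v, _) in counts.items() if keep(name))
    total = sum(t for name, (_, t) in counts.items() if keep(name))
    return 100.0 * variable / total


tm = pooled_variability(TOTAL_COLUMN, lambda name: name.startswith("TM"))
non_tm = pooled_variability(TOTAL_COLUMN, lambda name: not name.startswith("TM"))
print(f"TM: {tm:.1f}%  non-TM: {non_tm:.1f}%")
```

Pooling the seven TM domains gives 50 variable residues out of 881 compared, which rounds to 6%, while the remaining domains pool to 278 out of 1385; the two groups together reproduce the 328:2266 total in the last row of the table.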

structure that most likely disrupts intracellular signalling. But what about the 4R duplicates that have been retained as seemingly bona fide expressed duplicates? Are they functionally identical or have they diversified? To address these questions we investigated the sequence variability distribution of the ssCCR3a/b, ssCCR6.1a/b and ssCCR9.2a/b genes, which showed expression of both duplicates in the transcriptome analysis, in addition to ssCCR4a/b, ssCCR7a/b and ssCCR9.1a/b, where both duplicates have matching GenBank ESTs. Crystal structures of CRs suggest that the N-terminal and ECL domains are involved in specificity and affinity docking of the ligand (Tan et al., 2013; Veldkamp et al., 2008; Wu et al., 2010). When we divided the sequences into transmembrane (TM) and non-transmembrane (non-TM) regions, we found that 6% of all TM residue positions and 15–25% of all non-TM positions were variable (Table 3). The diversity patterns match the classes we have defined for these receptors. The potential inflammatory or dual-function analogues, the CCR3, CCR4 and CCR6 receptors, have the highest variability in the N-terminal domain, ranging from 45 to 68%. The remaining three gene pairs defined as homeostatic receptors, i.e. ssCCR7, ssCCR9.1 and ssCCR9.2, have lower variability in the N-terminal domain, ranging from 13 to 25%.

4. Conclusion

Using the preliminary salmon genome we identified a total of 48 chemokine receptors in Atlantic salmon, including the ten reported previously. Forty of these receptors seem functional with expressed support. The majority of receptors have orthologues in zebrafish, while mainly the homeostatic and atypical receptors have mammalian orthologues. We defined two clades with inflammatory-like salmon receptors and one clade with XCR-like receptors, all potentially important in immune responses towards pathogens. Expression patterns showed that a majority of the receptors are expressed in the immunologically important tissues gills, head kidney and spleen. Many salmon CRs also have roles in non-immune tissues such as brain and eye. Eighteen of the genes exist in duplicate and, when tested against a diploid sister group, were shown to represent remnants of the salmonid 4R WGD event that occurred approximately 95 million years ago. Sequence identity of 82–95% between duplicates suggests that both diversifying and conservative selection have acted upon these genes. Six duplicates may have been silenced while others show evidence of functional diversification. The data significantly increase our knowledge of chemokine receptors in salmonids, and provide a solid foundation for future studies defining their individual biological roles.

Acknowledgement

This study was funded by the Norwegian Research Council grant 206965/S40 from the Havbruk program (UG, HH) and partially by an NSERC (138282-2012) grant (BFK).

Appendix: Supplementary material

Supplementary data to this article can be found online at doi:10.1016/j.dci.2014.11.009.

References

Alejo, A., Tafalla, C., 2011. Chemokines in teleost fish species. Dev. Comp. Immunol. 35, 1215–1222.
Allen, S.J., Crown, S.E., Handel, T.M., 2007. Chemokine: receptor structure, interactions, and antagonism. Annu. Rev. Immunol. 25, 787–820.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Bachelerie, F., Ben-Baruch, A., Burkhardt, A.M., Combadiere, C., Farber, J.M., Graham, G.J., et al., 2014. International union of pharmacology. LXXXIX. Update on the extended family of chemokine receptors and introducing a new nomenclature for atypical chemokine receptors. Pharmacol. Rev. 66, 1–79.
Bajoghli, B., 2013. Evolution and function of chemokine receptors in the immune system of lower vertebrates. Eur. J. Immunol. 43, 1686–1692.
Bajoghli, B., Aghaallaei, N., Hess, I., Rode, I., Netuschil, N., Tay, B.H., et al., 2009. Evolution of genetic networks underlying the emergence of thymopoiesis in vertebrates. Cell 138, 186–197.
Bannert, N., Craig, S., Farzan, M., Sogah, D., Santo, N.V., Choe, H., et al., 2001. Sialylated O-glycans and sulfated tyrosines in the NH2-terminal domain of CC chemokine receptor 5 contribute to high affinity binding of chemokines. J. Exp. Med. 194, 1661–1673.
Blom, N., Sicheritz-Ponten, T., Gupta, R., Gammeltoft, S., Brunak, S., 2004. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4, 1633–1649.
Boldajipour, B., Doitsidou, M., Tarbashevich, K., Laguri, C., Yu, S.R., Ries, J., et al., 2011. Cxcl12 evolution – subfunctionalization of a ligand through altered interaction with the chemokine receptor. Development 138, 2909–2914.
Bonecchi, R., Savino, B., Borroni, E.M., Mantovani, A., Locati, M., 2010. Chemokine decoy receptors: structure-function and biological properties. Curr. Top. Microbiol. Immunol. 341, 15–36.
Borroni, E.M., Mantovani, A., Locati, M., Bonecchi, R., 2010. Chemokine receptors intracellular trafficking. Pharmacol. Ther. 127, 1–8.
Burge, C., Karlin, S., 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94.
Cancellieri, C., Vacchini, A., Locati, M., Bonecchi, R., Borroni, E.M., 2013. Atypical chemokine receptors: from silence to sound. Biochem. Soc. Trans. 41, 231–236.


Carmona-Antonanzas, G., Tocher, D.R., Taggart, J.B., Leaver, M.J., 2013. An evolutionary perspective on Elovl5 fatty acid elongase: comparison of Northern pike and duplicated paralogs from Atlantic salmon. BMC Evol. Biol. 13, 85.
Chang, M.X., Sun, B.J., Nie, P., 2007. The first non-mammalian CXCR3 in a teleost fish: gene and expression in blood cells and central nervous system in the grass carp (Ctenopharyngodon idella). Mol. Immunol. 44, 1123–1134.
Charo, I.F., Ransohoff, R.M., 2006. The many roles of chemokines and chemokine receptors in inflammation. N. Engl. J. Med. 354, 610–621.
Chen, J., Xu, Q., Wang, T., Collet, B., Corripio-Miyar, Y., Bird, S., et al., 2013. Phylogenetic analysis of vertebrate CXC chemokines reveals novel lineage specific groups in teleost fish. Dev. Comp. Immunol. 41, 137–152.
Crozat, K., Guiton, R., Contreras, V., Feuillet, V., Dutertre, C.A., Ventre, E., et al., 2010. The XC chemokine receptor 1 is a conserved selective marker of mammalian cells homologous to mouse CD8alpha+ dendritic cells. J. Exp. Med. 207, 1283–1292.
Daniels, G.D., Zou, J., Charlemagne, J., Partula, S., Cunningham, C., Secombes, C.J., 1999. Cloning of two chemokine receptor homologs (CXC-R4 and CC-R7) in rainbow trout Oncorhynchus mykiss. J. Leukoc. Biol. 65, 684–690.
Davidson, W.S., Koop, B.F., Jones, S.J., Iturra, P., Vidal, R., Maass, A., et al., 2010. Sequencing the genome of the Atlantic salmon (Salmo salar). Genome Biol. 11, 403.
DeVries, M.E., Kelvin, A.A., Xu, L., Ran, L., Robinson, J., Kelvin, D.J., 2006. Defining the origins and evolution of the chemokine/chemokine receptor system. J. Immunol. 176, 401–415.
Di Genova, A., Aravena, A., Zapata, L., Gonzalez, M., Maass, A., Iturra, P., 2011. SalmonDB: a bioinformatics resource for Salmo salar and Oncorhynchus mykiss. Database (Oxford) 2011.
Diotel, N., Vaillant, C., Gueguen, M.M., Mironov, S., Anglade, I., Servili, A., et al., 2010. Cxcr4 and Cxcl12 expression in radial glial cells of the brain of adult zebrafish. J. Comp. Neurol. 518, 4855–4876.
Dixon, B., Luque, A., Abos, B., Castro, R., Gonzalez-Torres, L., Tafalla, C., 2013. Molecular characterization of three novel chemokine receptors in rainbow trout (Oncorhynchus mykiss). Fish Shellfish Immunol. 34, 641–651.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Gilligan, P., Brenner, S., Venkatesh, B., 2002. Fugu and human sequence comparison identifies novel human genes and conserved non-coding sequences. Gene 294, 35–44.
Gnerre, S., Maccallum, I., Przybylski, D., Ribeiro, F.J., Burton, J.N., Walker, B.J., et al., 2011. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc. Natl. Acad. Sci. U.S.A. 108, 1513–1518.
Graham, G.J., Locati, M., Mantovani, A., Rot, A., Thelen, M., 2012. The biochemistry and biology of the atypical chemokine receptors. Immunol. Lett. 145, 30–38.
Gupta, R., Brunak, S., 2002. Prediction of glycosylation across the human proteome and the correlation to protein function. Pac. Symp. Biocomput. 310–322.
Haas, B.J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P.D., Bowden, J., et al., 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512.
Haugarvoll, E., Bjerkas, I., Nowak, B.F., Hordvik, I., Koppang, E.O., 2008. Identification and characterization of a novel intraepithelial lymphoid tissue in the gills of Atlantic salmon. J. Anat. 213, 202–209.
Huising, M.O., Stet, R.J., Kruiswijk, C.P., Savelkoul, H.F., Lidy Verburg-van Kemenade, B.M., 2003a. Molecular evolution of CXC chemokines: extant CXC chemokines originate from the CNS. Trends Immunol. 24, 307–313.
Huising, M.O., Stolte, E., Flik, G., Savelkoul, H.F., Verburg-van Kemenade, B.M., 2003b. CXC chemokines and leukocyte chemotaxis in common carp (Cyprinus carpio L). Dev. Comp. Immunol. 27, 875–888.
Julenius, K., Molgaard, A., Gupta, R., Brunak, S., 2005. Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology 15, 153–164.
Kaisho, T., 2012. Pathogen sensors and chemokine receptors in dendritic cell subsets. Vaccine 30, 7652–7657.
Kent, W.J., 2002. BLAT – the BLAST-like alignment tool. Genome Res. 12, 656–664.
Laing, K.J., Secombes, C.J., 2004. Trout CC chemokines: comparison of their sequences and expression patterns. Mol. Immunol. 41, 793–808.
Larkin, M.A., Blackshields, G., Brown, N.P., Chenna, R., McGettigan, P.A., McWilliam, H., et al., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Leong, J.S., Jantzen, S.G., von Schalburg, K.R., Cooper, G.A., Messmer, A.M., Liao, N.Y., et al., 2010. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome. BMC Genomics 11, 279.
Liu, J., Louie, S., Hsu, W., Yu, K.M., Nicholas, H.B., Jr., Rosenquist, G.L., 2008. Tyrosine sulfation is prevalent in human chemokine receptors important in lung disease. Am. J. Respir. Cell Mol. Biol. 38, 738–743.
Liu, Y., Chang, M.X., Wu, S.G., Nie, P., 2009. Characterization of C-C chemokine receptor subfamily in teleost fish. Mol. Immunol. 46, 498–504.
Lukacs, M.F., Harstad, H., Bakke, H.G., Beetz-Sargent, M., McKinnel, L., Lubieniecki, K.P., et al., 2010. Comprehensive analysis of MHC class I genes from the U-, S-, and Z-lineages in Atlantic salmon. BMC Genomics 11, 154.
Macqueen, D.J., Johnston, I.A., 2014. A well-constrained estimate for the timing of the salmonid whole genome duplication reveals major decoupling from species diversification. Proc. R. Soc. B. 281, 20132881.
Mattera, R., Boehm, M., Chaudhuri, R., Prabhu, Y., Bonifacino, J.S., 2011. Conservation and diversification of dileucine signal recognition by adaptor protein (AP) complex variants. J. Biol. Chem. 286, 2022–2030.
Monigatti, F., Gasteiger, E., Bairoch, A., Jung, E., 2002. The Sulfinator: predicting tyrosine sulfation sites in protein sequences. Bioinformatics 18, 769–770.

Montero, J., Coll, J., Sevilla, N., Cuesta, A., Bols, N.C., Tafalla, C., 2008. Interleukin 8 and CK-6 chemokines specifically attract rainbow trout (Oncorhynchus mykiss) RTS11 monocyte-macrophage cells and have variable effects on their immune functions. Dev. Comp. Immunol. 32, 1374–1384.
Montero, J., Chaves-Pozo, E., Cuesta, A., Tafalla, C., 2009. Chemokine transcription in rainbow trout (Oncorhynchus mykiss) is differently modulated in response to viral hemorrhagic septicaemia virus (VHSV) or infectious pancreatic necrosis virus (IPNV). Fish Shellfish Immunol. 27, 661–669.
Montero, J., Ordas, M.C., Alejo, A., Gonzalez-Torres, L., Sevilla, N., Tafalla, C., 2011. CK12, a rainbow trout chemokine with lymphocyte chemo-attractant capacity associated to mucosal tissues. Mol. Immunol. 48, 1102–1113.
Moser, B., Loetscher, P., 2001. Lymphocyte traffic control by chemokines. Nat. Immunol. 2, 123–128.
Near, T.J., Eytan, R.I., Dornburg, A., Kuhn, K.L., Moore, J.A., Davis, M.P., et al., 2012. Resolution of ray-finned fish phylogeny and timing of diversification. Proc. Natl. Acad. Sci. U.S.A. 109, 13698–13703.
Neel, N.F., Schutyser, E., Sai, J., Fan, G.H., Richmond, A., 2005. Chemokine receptor internalization and intracellular trafficking. Cytokine Growth Factor Rev. 16, 637–658.
Nei, M., Kumar, S., 2000. Molecular Evolution and Phylogenetics. Oxford University Press, New York.
Nomiyama, H., Hieshima, K., Osada, N., Kato-Unoki, Y., Otsuka-Ono, K., Takegawa, S., et al., 2008. Extensive expansion and diversification of the chemokine gene family in zebrafish: identification of a novel chemokine subfamily CX. BMC Genomics 9, 222.
Nomiyama, H., Osada, N., Yoshie, O., 2011. A family tree of vertebrate chemokine receptors for a unified nomenclature. Dev. Comp. Immunol. 35, 705–715.
Oehlers, S.H., Flores, M.V., Hall, C.J., O’Toole, R., Swift, S., Crosier, K.E., et al., 2010. Expression of zebrafish cxcl8 (interleukin-8) and its receptors during development and in response to immune stimulation. Dev. Comp. Immunol. 34, 352–359.
Ordas, M.C., Castro, R., Dixon, B., Sunyer, J.O., Bjork, S., Bartholomew, J., et al., 2012. Identification of a novel CCR7 gene in rainbow trout with differential expression in the context of mucosal or systemic infection. Dev. Comp. Immunol. 38, 302–311.
Peatman, E., Liu, Z., 2007. Evolution of CC chemokines in teleost fish: a case study in gene duplication and implications for immune diversity. Immunogenetics 59, 613–623.
Pfaffl, M.W., 2001. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res. 29, e45.
Proudfoot, A.E., 2002. Chemokine receptors: multifaceted therapeutic targets. Nat. Rev. Immunol. 2, 106–115.
Raghuwanshi, S.K., Su, Y., Singh, V., Haynes, K., Richmond, A., Richardson, R.M., 2012. The chemokine receptors CXCR1 and CXCR2 couple to distinct G protein-coupled receptor kinases to mediate and regulate leukocyte functions. J. Immunol. 189, 2824–2832.
Rondeau, E.B., Minkley, D.R., Leong, J.S., Messmer, A.M., Jantzen, J.R., von Schalburg, K.R., et al., 2014. The genome and linkage map of the northern pike (Esox lucius): conserved synteny revealed between the salmonid sister group and the neoteleostei. PLoS ONE 9 (7), e102089.
Rose, A., Lorenzen, S., Goede, A., Gruening, B., Hildebrand, P.W., 2009. RHYTHM-a server to predict the orientation of transmembrane helices in channels and membrane-coils. Nucleic Acids Res. 37, W575–W580.
Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
Sasado, T., Yasuoka, A., Abe, K., Mitani, H., Furutani-Seiki, M., Tanaka, M., et al., 2008. Distinct contributions of CXCR4b and CXCR7/RDC1 receptor systems in regulation of PGC migration revealed by medaka mutants kazura and yanagi. Dev. Biol. 320, 328–339.
Schaffer, A.A., Aravind, L., Madden, T.L., Shavirin, S., Spouge, J.L., Wolf, Y.I., et al., 2001. Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements. Nucleic Acids Res. 29, 2994–3005.
Shiina, T., Dijkstra, J.M., Shimizu, S., Watanabe, A., Yanagiya, K., Kiryu, I., et al., 2005. Interchromosomal duplication of major histocompatibility complex class I regions in rainbow trout (Oncorhynchus mykiss), a species with a presumably recent tetraploid ancestry. Immunogenetics 56, 878–893.
Solovyev, V., Kosarev, P., Seledsov, I., Vorobyev, D., 2006. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol. 7 Suppl. 1 (S10), 11–12.
Stanke, M., Tzvetkova, A., Morgenstern, B., 2006. AUGUSTUS at EGASP: using EST, protein and genomic alignments for improved gene prediction in the human genome. Genome Biol. 7 Suppl. 1 (S11), 11–18.
Stillie, R., Farooq, S.M., Gordon, J.R., Stadnyk, A.W., 2009. The functional significance behind expressing two IL-8 receptor types on PMN. J. Leukoc. Biol. 86, 529–543.
Szpakowska, M., Fievez, V., Arumugan, K., van Nuland, N., Schmit, J.C., Chevigne, A., 2012. Function, diversity and therapeutic potential of the N-terminal domain of human chemokine receptors. Biochem. Pharmacol. 84, 1366–1380.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Tan, Q., Zhu, Y., Li, J., Chen, Z., Han, G.W., Kufareva, I., et al., 2013. Structure of the CCR5 chemokine receptor-HIV entry inhibitor maraviroc complex. Science 341, 1387–1390.
Tran, P.B., Miller, R.J., 2003. Chemokine receptors: signposts to brain development and disease. Nat. Rev. Neurosci. 4, 444–455.

Uribe, C., Folch, H., Enriquez, R., Moran, G., 2011. Innate and adaptive immunity in teleost fish: a review. Vet. Med. (Praha) 56, 486–503.
van der Aa, L.M., Chadzinska, M., Tijhaar, E., Boudinot, P., Verburg-van Kemenade, B.M., 2010. CXCL8 chemokines in teleost fish: two lineages with distinct expression profiles during early phases of inflammation. PLoS ONE 5, e12384.
Veldkamp, C.T., Seibert, C., Peterson, F.C., De la Cruz, N.B., Haugner, J.C., 3rd, Basnet, H., et al., 2008. Structural basis of CXCR4 sulfotyrosine recognition by the chemokine SDF-1/CXCL12. Sci. Signal. 1, ra4.
Verburg-van Kemenade, B.M., Van der Aa, L.M., Chadzinska, M., 2013. Neuroendocrine-immune interaction: regulation of inflammation via G-protein coupled receptors. Gen. Comp. Endocrinol. 188, 94–101.
Wheelan, S.J., Church, D.M., Ostell, J.M., 2001. Spidey: a tool for mRNA-to-genomic alignments. Genome Res. 11, 1952–1957.
Wu, B., Chien, E.Y., Mol, C.D., Fenalti, G., Liu, W., Katritch, V., et al., 2010. Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists. Science 330, 1066–1071.

Xu, Q., Li, R., Monte, M.M., Jiang, Y., Nie, P., Holland, J.W., et al., 2014. Sequence and expression analysis of rainbow trout CXCR2, CXCR3a and CXCR3b aids interpretation of lineage-specific conversion, loss and expansion of these receptors during vertebrate evolution. Dev. Comp. Immunol. 45, 201–213.
Xu, Q.Q., Chang, M.X., Sun, R.H., Xiao, F.S., Nie, P., 2010. The first non-mammalian CXCR5 in a teleost fish: molecular cloning and expression analysis in grass carp (Ctenopharyngodon idella). BMC Immunol. 11, 25.
Yoshimura, T., Oppenheim, J.J., 2011. Chemokine-like receptor 1 (CMKLR1) and chemokine (C-C motif) receptor-like 2 (CCRL2); two multifunctional receptors with unusual properties. Exp. Cell Res. 317, 674–684.
Zhang, H., Thorgaard, G.H., Ristow, S.S., 2002. Molecular cloning and genomic structure of an interleukin-8 receptor-like gene from homozygous clones of rainbow trout (Oncorhynchus mykiss). Fish Shellfish Immunol. 13, 251–258.
