0161-5890/92 $5.00 + 0.00

Molecular Immunology, Vol. 29, No. 7/8, pp. 829-836, 1992

© 1992 Pergamon Press Ltd

Printed in Great Britain.

ISOLATION AND SEQUENCE OF A cDNA CODING FOR THE IMMUNOGLOBULIN μ CHAIN OF THE SHEEP

SYLVIE PATRI* and FRANÇOIS NAU

CNRS URA 1172, Laboratoire d’Immunologie Moléculaire, Faculté des Sciences de Poitiers, 40 Avenue du Recteur Pineau, 86022 Poitiers Cedex, France

(First received 7 December 1991; accepted in revised form 20 January 1992)

Abstract-A sheep cDNA library was screened with a human Cμ probe, and the complete nucleotide sequence of a 1923 nt cDNA was determined. It contains sequences corresponding to all the exons (VH, D, JH, Cμ1, Cμ2, Cμ3 and Cμ4) characteristic of the immunoglobulin μ heavy chain regions. The deduced amino acid sequence shows a percentage of identical residues in the range 65-45% when compared with the μ chains of various species. The VH region of this clone is clearly related to a group of genes that includes the mouse VH36-60 and VHQ52, the human VH2, VH4 and VH6, and the Xenopus VHII gene families. The constant region shows an unusual distribution of cysteine and proline residues at the beginning of the Cμ2 domain, which may result in a molecule with enhanced stability and reduced flexibility.

INTRODUCTION

As a result of extensive investigation, the structure, organization, rearrangement and expression of immunoglobulin genes are now understood in considerable detail in two mammalian species, man and mouse (for recent reviews, see Honjo et al., 1989; Rathbun et al., 1989; Selsing et al., 1989; Zachau et al., 1989). The rather scattered data obtained to date in other mammalian and non-mammalian species (Litman et al., 1989) show, on the whole, a remarkable conservation of the molecular structure of the immunoglobulins throughout the vertebrate evolutionary tree, leading to the unambiguous definition of the prototype 110 amino acid-long immunoglobulin domain, with its characteristic β-sheet structure and intrachain disulfide bond (Williams, 1987). While some degree of recombinational diversity, using V, D and J segments, seems to be involved in all cases in the generation of the early antibody repertoire, it now appears that this mechanism is by no means the only one, and in some instances possibly not the major one, and that gene conversion and somatic mutation may play an important role in species such as chicken (Reynaud et al., 1989) and rabbit (Becker and Knight, 1990). It is thus possible that a more thorough study of the organization of the immunoglobulin loci in different species may lead to new insights into the various ways in which the extensive diversity of the antibody combining sites is achieved.

Among mammals, the sheep is remarkable in that its immune system has been studied quite extensively (for a review, see for instance Morris and Miyasaka, 1985), although no data about the structure of its immunoglobulins were available until Foley and Beh (1989) described the sequence of a λ chain and the constant region of a γ chain. This is all the more striking since this species has been, for close to a century, a major source of specific antisera or immunoglobulins designed for analytical, clinical or even therapeutic uses. We therefore decided to determine the structure and organization of the immunoglobulin locus of the sheep, and as a first approach toward this aim, we describe here the sequence of a cDNA coding for an immunoglobulin μ chain. The results published recently by Reynaud et al. (1991), showing that somatic mutation is an early event involved in the generation of the initial lambda repertoire rather than a late phenomenon associated with affinity maturation, have strengthened our belief that new and potentially important mechanisms could still be discovered by investigating mammalian species quite close in evolution to the human and murine groups.

*Author to whom correspondence should be addressed.

MATERIALS AND METHODS

Preparation of DNA and RNA from sheep spleen

Fresh sheep spleens were obtained from the Montmorillon slaughterhouse and immediately frozen in liquid nitrogen. DNA was prepared according to Davis et al. (1986) by grinding small tissue fragments to a fine powder in a mortar partially filled with liquid nitrogen, overnight incubation at 37°C in the presence of 1% SDS and 5 mg/ml pronase, extraction with saturated phenol/chloroform/isoamyl alcohol (10:10:1) and ethanol precipitation. Total RNA was prepared from a spleen sample by homogenization (Polytron) of the tissue in 4 M guanidine isothiocyanate followed by centrifugation at 30,000 g for 18 hr through a 5.7 M cesium chloride pad (Chirgwin et al., 1979).

Southern and Northern blotting

Approximately 10 μg of genomic DNA was digested with the appropriate endonuclease and submitted to electrophoresis on a 0.8% agarose gel. Human placental DNA was run in an adjacent lane as a control. The agarose gel was denatured and neutralized, and the DNA was transferred onto a nitrocellulose membrane (Hybond-C extra, Amersham, U.K.) in a high-salt buffer either by capillarity (Southern, 1975) or under vacuum (LKB 2016 VacuGene vacuum blotting system). Total RNA (10-15 μg) was analyzed on a 1% agarose gel containing 5.3% formaldehyde. Human RNA from the Burkitt’s lymphoma cell line Ly 67 (Lenoir et al., 1982) was used as a control. The gel was washed in 10 × SSC to remove formaldehyde, and the RNA was transferred onto nitrocellulose in a 10 × SSC buffer as above. The nucleic acids (DNA or RNA) were fixed onto the nitrocellulose by heating for 2 hr at 80°C prior to hybridization with the radiolabeled probe.

Hybridization of the blots

The probes (see text for description) were labeled with 32P by the random priming technique (Feinberg and Vogelstein, 1983). The blots were hybridized overnight at 65°C with shaking in 6 × SSC, 5 × Denhardt’s, 0.5% SDS and 0.5 mg/ml of sonicated salmon sperm DNA, denatured in the presence of the 32P-labeled probe by heating for 5 min in a boiling water bath. The filters were washed once in 6 × SSC, 0.1% SDS; once in 2 × SSC, 0.1% SDS; and twice in 0.2 × SSC, 0.1% SDS; all the washings were performed at 65°C. Autoradiography was performed at -70°C for about 4 hr, using Fuji RX film and DuPont Cronex intensifying screens.

Preparation of the cDNA library

The cDNA library was constructed according to standard procedures. Briefly, poly(A) mRNA was prepared by affinity chromatography on oligo(dT)-cellulose (Pharmacia, Uppsala, Sweden) and used as a template for synthesizing single-stranded cDNA by extending oligo(dT) primers with reverse transcriptase. Double-stranded cDNA was obtained by adding RNase H and DNA polymerase I, and cloned in the λgt10 vector using EcoRI adaptors (cDNA synthesis system plus and cDNA cloning system, Amersham, U.K.).

Identification and sequencing of recombinant clones

The unamplified cDNA library was screened with the human Cμ3’ probe under the conditions of hybridization and washing described above. The DNA from the positive clones was digested with the restriction endonucleases EcoRI and SacI, subcloned into M13 mp18 or mp19 vectors and sequenced by the dideoxy termination method (Sanger et al., 1977) using T7 DNA polymerase, on an automated laser fluorescent DNA sequencer (ALF, Pharmacia, Uppsala, Sweden). The sequence reported in this paper has been deposited in the EMBL/GenBank database (accession number X59994).

Nucleic acid and protein sequence comparisons

Sequence alignments were performed using either the Align program from Pearson and Lipman (1988) or a program developed in our laboratory that uses a heuristic algorithm for finding the most probable zone of homology (F. Nau, unpublished). The sequences used for comparisons were obtained from the CD-ROM edition of the EMBL Nucleic Acid Database (release 27, May 1991) and the Swiss-Prot Protein Sequence Database (release 18, May 1991).

RESULTS

Choice of probe

The high homology between immunoglobulins of different species has made interspecific DNA hybridization an effective means of identifying the immunoglobulin gene pool of a variety of vertebrate species. This study demonstrates that DNA coding for the human μ chain can indeed be used to identify the homologous isotype in sheep genomic DNA. Sheep genomic DNA was digested with EcoRI, HindIII or BamHI and assayed by Southern blotting for hybridization to three 32P-labeled probes derived from the human genomic Cμ region DNA (Fig. 1, redrawn from Cogné et al., 1988). The central 1.2 kbp EcoRI fragment yielded no hybridization signal under normal stringency conditions. Several fragments were revealed using either the 0.9 kbp SacI-EcoRI fragment (probe Cμ5’) or the 0.9 kbp EcoRI fragment (probe Cμ3’). To check whether the latter two probes could be used for screening the sheep cDNA library, we performed a Northern blot of total spleen RNA. No hybridization was obtained using probe Cμ5’; with probe Cμ3’, one single band was observed, whose length (~2.4 kb) was similar to the length of a human mRNA coding for the secreted form of a μ immunoglobulin chain (not shown). Probe Cμ3’ was thus used in all the following experiments.

[Fig. 1: restriction map of the human Cμ region showing exons CH1, CH2, CH3 and CH4, the SacI and EcoRI sites, and the three probes: 0.9 kbp SacI-EcoRI (Cμ5’), 1.2 kbp EcoRI, and 0.9 kbp EcoRI (Cμ3’).]

Fig. 1. Probes used in Southern and Northern blotting experiments. The 3 kbp genomic DNA segment from which the three probes were derived includes the four human immunoglobulin μ constant region exons, 750 bp of the JH-Cμ intron, and 300 bp of the 3’ untranslated sequence. It does not contain the membrane exons.

Isolation of sheep Cμ cDNA clones

Out of 750,000 phage clones in the cDNA library, more than 100 were found to hybridize with the human Cμ3’ probe under normal stringency conditions. Four of them were plaque-purified. Their DNAs were digested with EcoRI and subjected to Southern blot analysis, which revealed a ~2 kbp fragment hybridizing to the human probe.

Nucleotide sequence of cDNA insert SHμ7

One of the four clones (SHμ7) was subcloned into the EcoRI site of the bacteriophage M13 for sequencing. Further subcloning of HaeIII fragments, and of fragments obtained from the three unique sites (BamHI, SacI and HindIII) revealed by sequencing from both ends, allowed the complete sequence of the insert to be determined. The sequence of SHμ7 consists of 1923 nucleotides including part of a poly(A) tract (Fig. 2). An open reading frame, extending from nucleotide 16 to nucleotide 1791, is followed by two in-frame termination codons, 109 untranslated nucleotides and a polyadenylation signal 12 nucleotides before the poly(A) tract.
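The position of the initiation codon and the size of the reading frame can be checked from the first 60 nucleotides of the insert as given in Fig. 2(A). The short Python sketch below is only an illustration: the variable and function names are ours, and the standard genetic code is assumed.

```python
# First 60 nucleotides of the cDNA insert, copied from Fig. 2(A).
FIRST_60_NT = "CTCATCTGCTCCAAGATGAACCCACTGTGGACCCTCCTCTTTGTGCTCTCAGCCCCCAGA"

# Standard genetic code built from the usual TCAG ordering (an assumption of this sketch).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate codon by codon, ignoring any trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


# 1-based position of the first ATG in the insert.
start_position = FIRST_60_NT.find("ATG") + 1

# Beginning of the signal peptide read from that ATG.
leader_start = translate(FIRST_60_NT[start_position - 1:])

# Reading frame from nucleotide 16 to nucleotide 1791 inclusive (values from the text).
orf_length_nt = 1791 - 16 + 1
orf_codons, remainder = divmod(orf_length_nt, 3)

print(start_position, leader_start, orf_length_nt, orf_codons, remainder)
```

Under these assumptions the first ATG falls at nucleotide 16, and the stated limits of the reading frame correspond to a whole number of codons (1776 nucleotides, 592 codons).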

A:

Leader                  -10
LICSKMNPLWTLLFVLSAPR
CTCATCTGCTCCAAGATGAACCCACTGTGGACCCTCCTCTTTGTGCTCTCAGCCCCCAGA
AG__C_____A_AT______TT_T____TC_CC__G_GG____T-___--

FR1                     10
GVLSQVQLQESGPSLVKPSE
GGGGTCCTGTCCCAGGTGCAACTGCAGGAGTCGGGACCCAGCCTGGTGAAGCCCTCAGAG
T___________________G______________C__AG_A___________T__G___

CDR1                    30
TLSLTCTVSGSSLTVNHVSW
ACCCTCTCCCTCACCTGCACGGTCTCTGGATCCTCATTAACCGTCAATCATGTCTCCTGG
_____G______________T________TGG___CA_C_GTAGTT_CT_CTGGAG____

FR2                     40          CDR2
IRQASGKMPEWLGGVEKGGN
ATCCGCCAGGCTTCAGGAAAGATGCCGGAGTGGCTTGGTGGTGTAGAAAAAGGTGGAAAC
_____G___C_CC____G___GGA_T_______A____GTA_A_CT_TT_CA____G_G.

FR3                     60 70
TYYNPALKSRLSIARDTSKS
ACATACTATAACCCGGCCCTGAAATCCCGGCTCAGCATCGCCAGGGACACCTCCAAGAGC
__CA____C_____CT____C__GAGT__AG___C___AT_AGTA_____G_____._A_

                        80 82 a b c 83 90
QVSLSLSSMAIDDTAVYYCA
CAAGTCTCCCTGTCACTGAGCAGCATGGCAATTGACGACACGGCCGTGTACTACTGTGCG
__GT________AAG______TCTG__A_CGC__CG______________T_________

DH-JH                   100 a b c d
RSAGAYFLADVDIWGRGLLV
AGATCTGCTGGAGCTTATTTTCTTGCAGATGTTGATATCTGGGGTCGAGGACTCCTGGTC
T_____________C_A___CACAA_____ ---GTCT-ATT-A--GGGGA-GC-

                        110
TVSS
ACCGTCTCCTCA
________T___

Fig. 2. Sequence of sheep immunoglobulin μ chain cDNA (clone SHμ7). The deduced amino acid sequence is shown above the nucleotide sequence. The potential glycosylation sites are underlined. For comparison, the sequence of the human VHII clone 58P2 (Schroeder et al., 1988) in (A), and the human Cμ gene (Word et al., 1989) in (B), are shown under the sheep nucleotide sequence. Gaps were imposed so as to maximize identity at the amino acid level. Dashes denote identity. The polyadenylation signal is underlined.

B:

390 400 YFERHLNDTFSARGEASVCSEDW TATTTTGAGAGACACCTCAACGACACCTTCAGCGCCAGGGGCGAGGCCTCAGTCTGCTCGGAGGACTGG ATC_CC_____C____C___T-C___T--__---__GT---GT---T------AGCA-----GA___T______

140 150 VALGCLARDFVPNSVSFSWKF GTGGCCCTGGGCTGCCTGGCCCGGGACTTCGTGCCCAATTCTGTCAGCTTCTCCTGGAAGTTC ______G_T___----_C__A-A---___-C-T__-G-C--CA___CT-__________A_A_AAG___

510 520 DPSAYFVHSILTVTEEDWSKGET GACCCCAGCGCGTACTTCGTGCACAGCATCCTGACAGTGACTGAGGAGGACTGGAGCAAAGGGGAGACC _C___AG_-CG_____---CC______-----____C_._T_C__A_____A__._A__CG______-__ 530 540 YTCVVGHEALPHMVTERTVDKST TACACCTGCGTCGTGGGCCACGAGGCCCTGCCCCACATGGTCACCGAGCGGACCGTGGACAAGTCCACC ------_____G_.._C___T_.__________A___G_____~__~-A____~_~_____________ 550 566 GKPTL(YNVSLVMSDTASTCY*) GGTAAACCCACCCTGTACAACGTGTCCCTGGTCATGTCTGACACGGCCAGCACCTGCTACTGATGCCTG ._-.--______________-____--_-______--_ C_____A__TG_________---__CC_TGC GTCAGAGCCCCCAGG TGACCGTCGCTGTGTGCATGCATG ACTCTAACCATGCTGATGCG TGGCCT----A----CTCGGGGCG-CTG-C----C-T--TG------C-AA------G--TCA-C-G-G

260 270 QATDFSPKQISLSWFRDGKRIVS CAGGCCACTGACTTCAGCCCCAAACAGATCTCCTTGTCCTGGTTTCGTGATGGAAAGCGGATAGTGTCT _____---G_GT----_T__-CGG-____TCAGG_____---C_G__C__G__G____A_G_G_G_~~.

280 290 DISEGQV ETVQSSPTTYRAYS GATATTTCTGAAGGCCAGGTG GAGACTGTGCAGTCCTCACCCACAACATACAGGGCCTACAGC _GCG_CA_CACG-A_-_____CAGGCT___G-CAAAG----TGGG-.--.G.-C-.__A_-TGAC____

300 310 VLTITEREWLSQSAYTCQVEHNK GTGCTGACCATCACGGAGCGAGAATGGCTCAGCCAGAGCGCGTACACCTGCCAGGTGGAGCACAACAAG ACA_----_----AA-__A_C__C_______________AT__T________GC_____T____GGGGC

320 330 CH3 ETFQKNASSSCDATPPSPIGVFT GAAACCTTCCAGAAGAACGCGTCCTCTTCGTGTGATGCTACACCACCGTCTCCCATCGGGGTCTTCACC

350 360 IPPSFADIFLTKSAKLSCLVTNL ATCCCCCCATCCTTTGCCGACATCTTCCTCACGAAGTCAGCCAAGCTTTCCTGTCTGGTCACAAACCTG __________________AG____________C_____ CA_____T_GA____C_________G_--_-

480 490 WLQKGEPVAKSKYVTSSPAPEPQ TGGCTGCAGAAGGGGGAGCCTGTGGCCAAGAGCAAGTACGTGACAAGCAGCCCGGCGCCCGAGCCCCAG ___A______G____C--__CT__T_-CC-GAG_____T_____C_._GC___AAT__-T--_______

230 240 250 VSVFVPPCNSLSGNGNSKSSLIC GTGAGTGTCTTTGTCCCGCCTTGCAACAGCCTCTCTGGTAACGGCAATAGCAAGTCCAGCCTCATCTGC _____C_____C_____A__CC__G__G__T___TC__C___ CCCC_______-_AG_____----

Fig. 2(B) (caption on page 831)

TGAGATGTCGCCTTTTATAAAAATTAGAAATAAAAAGATCCATTCAAAAAAAAAAAAA ____-_T__A_C_----___ _____.________._____..

460 470 SLRESASVTCLVKGFAPADVFVQ AGCCTGCGGGAGTCAGCCTCCGTCACCTGCCTGGTGAAGGGCTTCGCGCCCGCGGACGTGTTCGTGCAG _A____________G---A--A-___G___-_-____C_-_____T_T-----______C_________

210 j-,CHZ QHPKGEDVGHKGVPREVEVLSPV CAGCACCCCAAGGGAGAAGACGTCGGACACAAGGGTGTCCCCAGAGAGGTGGAAGTGTTGTCCCCCGTC -----------C--CA-CA-A-AAAAGA--GT-CC-C-T--AG-TATl-CT--G C--C-T---AAA

340

,,CH4 440 450 SKPKDVAMKPPSVYVLPPTREQL TCCAAGCCCAAAGACGTCGCCATGAAACCGCCGTCCGTGTACGTGCTGCCTCCAACGCGGGAACAGCTG ___CG______G_GT__G___C__C_CAG___CGAT__C___T___--__A___G_C_____G______

190 200 SSQVALHSSSTFQGTDGYLVCEV TCCTCTCAGGTGGCCCTGCACTCCTCAAGCACCTTTCAAGGGACGGATGGCTACCTGGTGTGTGAAGTC A____A-_____CTG____CT__-AAGGA_GT_A_G__G--C__A_.C-AAC__G_-_.___CA_----

430

410 420 ESGEEYTCTVAHLDLPFPEKSAI GAGTCCGGAGAGGAGTACACCTGCACAGTGGCCCACTTGGACCTGCCCTTCCCAGAAAAGAGCGCTATC A_T___--G--_AG__T___G_____C___A_____ACA_______--_CG___CTG___CAGA_C___

160 170 180 NSTVSSERFWTFPEVLRDGLWSA AACAGCACGGTCAGCAGCGAGAGGTTCTGGACCTTCCCCGAAGTGCTGAGGGACGGCTTGTGGTCGGCC _____ATC___C--___A_GG___AA__ACG_A-----TCTGACA--------ACCC--GG-

N AAC

370 380 ASYDGLNISWSHQNGKALETHT GCTTCCTATGATGGCCTGAACATCAGCTGGTCCCATCAGAATGGCAAGGCCCTGGAGACCCACACT A_CA-._____CA__G___C__-_TC----A___GC---_--_--G-A_~TG--A-A-_______CA~C

120 130 ESESHPKVFPLVSCVSSPSDENT GAAAGTGAATCTCACCCGAAAGTCTTCCCCCTGGTGTCCTGTGTGAGCTCCCCGTCTGATGAGAACACG ____C___CGC___A_CCC_T__.___-C--C__C_______A__AT________G___AC__G__GC


Comparison with VH sequences belonging to different subgroups from man, mouse and Xenopus laevis showed that the V region was clearly related to the VHII subgroup in man (Table 1). The sheep sequence was thus aligned with a human immunoglobulin VHII sequence [Fig. 2(A)] and the human μ constant region [Fig. 2(B)]. By comparison of the sequences, we were able to identify all the segments expected in a complete secreted μ chain: a signal sequence 57 nucleotides long, a variable region with characteristic framework, hypervariable, D and JH segments, and a constant region consisting of four domains (Cμ1, Cμ2, Cμ3 and Cμ4) and a 3’ end corresponding to the secreted form of a μ chain.

From the beginning of the Cμ1 to the first stop codon, the overall identity between sheep and human sequences was 72%. The identity ranged from 62% in the first two domains to 72.5% in Cμ3 and 82% in the Cμ4 domain, consistent with the fact that only the human probe containing the Cμ4 sequence yielded a clear signal in both Southern and Northern hybridization experiments under normal stringency conditions. The sheep sequence is thus one more example of the gradient of sequence conservation from Cμ1 to Cμ4 already described among mammalian and non-mammalian immunoglobulin μ constant region sequences.

Table 1. Comparison of the three framework domains of the sheep SHμ7 VH segment with human, mouse and Xenopus VH gene families (in % identical amino acids)

                          FR1 (%)   FR2 (%)   FR3 (%)   TOTAL (%)
Human VH families (a)
  SIE (VH1)                 40        57        38        42
  5-1R1 (VH5) (b)           50        57        47        50
  CE-1 (VH2)                63        64        59        62
  NEWM (VH4)                73        50        63        64
  6-1R1 (VH6) (b)           73        57        59        64
  NIE (VH3)                 43        57        50        49
Mouse VH families
  36-60 (c)                 80        43        56        63
  Q52 (d)                   77        43        66        66
  J558 (e)                  50        43        41        45
  S107 (f)                  43        50        47        46
  J606 (g)                  40        43        44        42
  7183 (h)                  43        29        50        43
  441-4 (i)                 43        57        44        46
  3609 (j)                  -         57        56        -
  GAM3-8 (j)                -         50        38        -
  MRL-DNA4 (k)              47        57        44        47
Xenopus VH families (l)
  I                         37        50        44        42
  II                        67        64        47        58
  III                       57        64        50        54
  IV                        47        50        44        46
  V                         33        43        50        42
  VI                        47        50        44        46
  VII                       53        50        47        50
  VIII                      20        36        38        30
  IX                        23        50        47        38
  X                         -         36        41        -
  XI                        -         -         47        -

(a) Kabat et al. (1987). (b) Berman et al. (1988). (c) Juszczak and Margolies (1983). (d) Reth et al. (1986). (e) Schilling et al. (1980). (f) Gearhart et al. (1981). (g) Johnson et al. (1982). (h) Yancopoulos et al. (1984). (i) Ollo et al. (1981). (j) Winter et al. (1985). (k) Kofler (1988). (l) Haire et al. (1990).

Amino acid sequence of the sheep μ chain

The amino acid sequence of the μ chain of the sheep was deduced from the cDNA nucleotide sequence. In the variable region, all the residue positions that are buried in the β-sheet structure of the immunoglobulin fold are occupied by amino acids identical or similar to those found in mouse or human VH regions. These include the conserved cysteines at positions 22 and 92 (numbering system according to Kabat, 1987) that form the intrachain disulfide bond. The conservation of the overall conformation of this domain is further supported by the analysis of the residues thought to be involved in VH-VL interchain contact (Nos 37, 39, 45, 47, 91 and 93). All of these residues are identical to those present in mouse and human VH, with the notable exception of position 45, which is occupied by a proline in the sheep sequence instead of the quasi-invariant leucine. This substitution may, however, still allow the formation of a functional immunoglobulin, since a proline can be found in position 45 in at least one human VHIII sequence (DOB; Kabat, 1987). The identity of the last nine residues of the VH segment and of the last five residues of the JH segment between sheep and man makes it easy to identify the limits of the D-JH region. However, in the absence of genomic data, we could not ascertain the exact location of the D-J junction. Figure 3 shows the comparison of amino acids 95-113 of sheep with the six human JH sequences. The sheep JH sequence appears to be related most closely to the human JH4, with which it shares 10 amino acids out of a total of 15, and a 78% identity at the nucleotide level. Depending on the exact length of the sheep JH segment, the D region would thus begin at position 95 and terminate at position 100 to 100d. No homology whatsoever could be found between this sequence and published human D regions (Ichihara et al., 1988), either at the amino acid or at the nucleotide level.
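The residue positions quoted above can be traced on the sequence of Fig. 2(A) with a few lines of Python. This is a hedged sketch rather than the procedure used for the paper: the position labels simply follow the insertions marked in the figure (82a-c and 100a-d), the names are ours, and the human JH4 amino acid sequence is not given in this paper and is assumed here from general reference data.

```python
# Sheep VH region (positions 1-113), copied from Fig. 2(A) without the leader.
SHEEP_VH = (
    "QVQLQESGPSLVKPSE"
    "TLSLTCTVSGSSLTVNHVSW"
    "IRQASGKMPEWLGGVEKGGN"
    "TYYNPALKSRLSIARDTSKS"
    "QVSLSLSSMAIDDTAVYYCA"
    "RSAGAYFLADVDIWGRGLLV"
    "TVSS"
)


def position_labels():
    """Labels 1..113 plus the insertions shown in Fig. 2(A): 82a-c and 100a-d."""
    labels = []
    for n in range(1, 114):
        labels.append(str(n))
        if n == 82:
            labels.extend(["82a", "82b", "82c"])
        elif n == 100:
            labels.extend(["100a", "100b", "100c", "100d"])
    return labels


labels = position_labels()
assert len(labels) == len(SHEEP_VH)  # 120 labels for 120 residues
residue_at = dict(zip(labels, SHEEP_VH))

# Cysteines of the intrachain disulfide bond and the interchain contact positions.
conserved_cysteines = [residue_at["22"], residue_at["92"]]
contact_residues = {p: residue_at[p] for p in ("37", "39", "45", "47", "91", "93")}

# Human JH4 amino acid sequence: an assumption taken from outside this paper.
HUMAN_JH4 = "YFDYWGQGTLVTVSS"
sheep_j = SHEEP_VH[-len(HUMAN_JH4):]
shared = sum(a == b for a, b in zip(sheep_j, HUMAN_JH4))

print(conserved_cysteines, contact_residues, sheep_j, shared)
```

With these inputs, cysteines are found at positions 22 and 92, a proline at position 45, and the last 15 residues of the sheep sequence share 10 residues with the assumed human JH4 sequence.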

Sheep D-JH (codons 95-113):
TCT GCT GGA GCT TAT TTT CTT GCA GAT GTT GAT ATC TGG GGT CGA GGA CTC CTG GTC ACC GTC TCC TCA
 S   A   G   A   Y   F   L   A   D   V   D   I   W   G   R   G   L   L   V   T   V   S   S

[Fig. 3: the rows for human J1 to J6 and the consensus are not reproduced.]

Fig. 3. Alignment of sheep D-JH segment with human JH gene sequences (Ravetch et al., 1981). Dashes indicate identity with the sheep nucleotide sequence. The amino acids are indicated only when they differ from the sheep sequence.

The μ constant region of the sheep shows four domains very similar in size and overall organization to those found in other mammalian species. Five potential glycosylation sites [underlined in Fig. 2(B)] are present, four of them at exactly the same location as in human μ chains. However, some possibly significant differences appear upon close examination of the sequence at the level of cysteine, tryptophan and, to a lesser extent, proline residues, which are extremely well conserved, both in number and in position, in all the μ constant regions from higher vertebrates described to date. The sheep μ chain differs from these in containing 13 cysteines instead of the canonical 12. The additional cysteine (residue No. 235) is located close to the beginning of the Cμ2 domain. Another distinctive feature is the presence of nine tryptophan residues instead of the usual seven or eight. The Cμ1 domains of mammalian μ chains invariably contain one tryptophan residue between the two cysteines involved in the intradomain disulfide bond. Three such residues are present in this region in sheep (residues Nos 155, 168 and 179). Finally, there are 32 proline residues in the sheep μ chain, which puts it at the lower end of the observed range of 38 (hamster and house shrew) to 32 (rabbit). Of particular significance might be the number of prolines found clustered at the Cμ1-Cμ2 junction. While eight proline residues are located between the second cysteine of the intrachain disulfide bond of the Cμ1 domain and the first one of the Cμ2 domain in the human μ chains, only five are present in the homologous location in the sheep (as in the rabbit) μ constant region.

DISCUSSION

On the basis of sequence comparisons and cross-hybridization data, Tutter and Riblet (1988) argued that the group III VH genes, whose prototypes are the mouse VH7183 and the human VH3 family sequences, were particularly conserved in all mammalian lineages. This observation extends in fact to non-mammalian species, since the VH genes isolated from chicken and several fishes are clearly related to this family, which may thus be the closest descendant of the primordial VH genes. However, the SHμ7 sequence, as already mentioned, clearly belongs to the group of genes that includes the mouse VH36-60 and VHQ52 and the human VH2, VH4 and VH6 gene families. Table 1 shows that its overall similarity with representatives of these families (mouse 36-60 and human 71-2) is over 64%, whilst it is under 50% with all other families, when calculated at the amino acid level. In addition, the comparison with the 11 Xenopus families described recently by Haire et al. (1990) shows that the best identity is obtained with the VHII family. We have isolated and sequenced eight additional cDNAs, coding for μ or γ chains (unpublished results). The VH regions of all these clones appear to belong to the same family, and none was found that could be classified within the VHIII group.

Table 2. Comparison of the amino acid sequences of μ-coding regions between sheep and six mammalian species

                        Cμ1 (106)        Cμ2              Cμ3 (105)        Cμ4              Total (454)
Sheep/Mouse (a)         48/106 (45.3%)   [illegible]      61/106 (57.5%)   [illegible]      258/455 (56.7%)
Sheep/Rabbit (b)        50/107 (46.7%)   [illegible]      71/108 (65.7%)   [illegible]      284/458 (62%)
Sheep/Dog (c)           48/104 (46.2%)   [illegible]      66/106 (62.3%)   [illegible]      260/450 (57.8%)
Sheep/Hamster (d)       49/105 (46.7%)   [illegible]      66/106 (62.3%)   [illegible]      265/454 (58.4%)
Sheep/Human (e)         57/104 (54.8%)   [illegible]      65/106 (61.9%)   [illegible]      281/453 (62%)
Sheep/S. murinus (f)    53/105 (50.5%)   56/115 (48.7%)   65/103 (63.1%)   94/131 (71.8%)   268/454 (59%)

(a) Kawakami et al. (1980). (b) Bernstein et al. (1984). (c) McCumber and Capra (1979). (d) McGuire et al. (1985). (e) Word et al. (1989). (f) Ishiguro et al. (1989).

Table 2 shows the result of the comparison of the amino acid sequence of the sheep Cμ region, deduced from the nucleotide sequence of clone SHμ7, with those of other mammalian species. On the basis of overall percentage of identity, the sheep appears to be closest to rabbit and human (62% identity), whereas approximately the same degree of identity is found with rodents, dog and house shrew (57-59%). If only the Cμ1 and Cμ2 regions are considered, only the human sequence stands out with 54% identity, while the rabbit sequence is not significantly more similar to sheep than those of other species. This relationship is to be compared with that obtained from comparisons of γ chains (Foley and Beh, 1989; Patri and Nau, unpublished data). As expected, the sheep Cγ sequence is very close to that of bovine (89% identity). It is again closer to human and rabbit (69 and 72%, respectively) than to rodents (64 and 65% with mouse and rat, respectively). In that case, however, the similarity of sequences is higher in the CH1 and CH2 regions than in the CH3 domain. Altogether, these results are in favor of a phylogenetic tree where the rodent lineage diverged prior to the differentiation of the primate, lagomorph and artiodactyl phyla.
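The overall percentages discussed above follow directly from the counts in the "Total" row of Table 2. As a minimal illustration (the dictionary layout and function name are ours, not part of the original analysis), they can be recomputed and ranked as follows:

```python
# Identical residues / aligned length over the whole C-mu region ("Total" row of Table 2).
TOTALS = {
    "mouse": (258, 455),
    "rabbit": (284, 458),
    "dog": (260, 450),
    "hamster": (265, 454),
    "human": (281, 453),
    "S. murinus": (268, 454),
}


def percent_identity(identical, aligned):
    """Percentage of identical residues over the aligned length."""
    return 100.0 * identical / aligned


percentages = {
    species: round(percent_identity(*counts), 1)
    for species, counts in TOTALS.items()
}

# Species ordered from the most to the least similar to sheep.
ranking = sorted(percentages, key=percentages.get, reverse=True)

for species in ranking:
    print(f"sheep/{species}: {percentages[species]}%")
```

Rabbit and human come out together at the top (about 62%), with the remaining species grouped between roughly 57 and 59%, as stated in the text.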
Although some data stemming from the analysis of hemoglobin sequences may indeed be interpreted in the same way, this interpretation is by no means unambiguous, as shown by the different trees obtained from the same set of data by Czelusniak et al. (1989). The differential conservation of the various domains of the immunoglobulin heavy chain constant regions may in fact introduce such a bias that these molecules are not particularly well suited for phylogenetic analysis. The main differences between sheep and other higher vertebrate μ chain constant regions are to be found in the Cμ1-Cμ2 joining region. Since this part of the molecule is generally assumed to play a role similar to the hinge region of other isotypes, it is conceivable that these differences might bear on the flexibility and/or the stability of the immunoglobulin molecule. If, for instance, the additional cysteine residue at position 235 were engaged in a disulfide bond between the two heavy chains, then the “hinge-like” segment might be functionally restricted to the region between cysteine 202 and cysteine 235, which would make it some 15 amino acids shorter than in other species. Interestingly, all five proline residues, thought to be involved in the flexibility of the molecule, are clustered in this very region. Taken together, these observations lead to the prediction that the sheep μ chains should be more tightly bound together and less flexible than the corresponding chains of other mammalian species. Further experiments at the level of the protein molecules will be needed to either support or disprove this model.
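The residue positions on which this model rests can be located on the deduced sequence of Fig. 2(B). The sketch below is illustrative only. It assumes that the rows of the figure covering residues 114-250 are joined in numerical order, that the isolated "N" is residue 158, and that the residues at positions 184 and 194 are glutamines, as their codons (CAG and CAA) indicate; the function name is ours.

```python
# Sheep mu chain residues 114-250, copied row by row from Fig. 2(B).
# Assumption: rows joined in numerical order, the isolated "N" being residue 158.
FIRST_POSITION = 114
SEGMENT = "".join([
    "ESESHPKVFPLVSCVSSPSDENT",  # 114-136
    "VALGCLARDFVPNSVSFSWKF",    # 137-157
    "N",                        # 158
    "NSTVSSERFWTFPEVLRDGLWSA",  # 159-181
    "SSQVALHSSSTFQGTDGYLVCEV",  # 182-204
    "QHPKGEDVGHKGVPREVEVLSPV",  # 205-227
    "VSVFVPPCNSLSGNGNSKSSLIC",  # 228-250
])


def positions_of(residue, after=None, before=None):
    """Sequence positions of `residue`, optionally restricted to after < position < before."""
    hits = []
    for offset, amino_acid in enumerate(SEGMENT):
        position = FIRST_POSITION + offset
        if amino_acid != residue:
            continue
        if after is not None and position <= after:
            continue
        if before is not None and position >= before:
            continue
        hits.append(position)
    return hits


cysteines = positions_of("C")
# Tryptophans between the two cysteines of the first constant domain (141 and 202).
tryptophans = positions_of("W", after=141, before=202)
# Prolines of the hinge-like segment between cysteine 202 and cysteine 235.
hinge_prolines = positions_of("P", after=202, before=235)

print(cysteines, tryptophans, hinge_prolines)
```

Under these assumptions the three tryptophans fall at positions 155, 168 and 179, and five prolines lie between cysteine 202 and cysteine 235, in agreement with the positions quoted in the text.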

Acknowledgements-We thank Dr M. Cogné, whose kind and competent help has been invaluable throughout the early parts of this work.

REFERENCES

Becker R. S. and Knight K. L. (1990) Somatic diversification of immunoglobulin heavy chain VDJ genes: evidence for somatic gene conversion in rabbits. Cell 63, 987-997.
Berman J. E., Mellis S. J., Pollock R., Smith C. L., Suh H., Heinke B., Kowal C., Surti U., Chess L., Cantor C. R. and Alt F. W. (1988) Content and organization of the human Ig VH locus: definition of three new VH families and linkage to the Ig CH locus. EMBO J. 7, 727-738.
Bernstein K. E., Alexander C. B., Reddy E. P. and Mage R. G. (1984) Complete sequence of a cloned cDNA encoding rabbit secreted μ-chain of VHa2 allotype: comparison with VHa1 and membrane μ sequences. J. Immun. 132, 490-495.
Chirgwin J. M., Przybyla A. E., MacDonald R. J. and Rutter W. J. (1979) Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18, 5294-5299.
Cogné M., Mounir S., Preud’homme J. L., Nau F. and Guglielmi P. (1988) Burkitt’s lymphoma cell lines producing truncated μ immunoglobulin heavy chains lacking part of the variable region. Eur. J. Immun. 18, 1485-1489.
Czelusniak J., Goodman M., Moncrief N. D. and Kehoe S. M. (1989) Maximum parsimony approach to construction of evolutionary tree from aligned homologous sequences. In Methods in Enzymology (Edited by Doolittle R. F.), Vol. 183, pp. 601-615.
Davis L. G., Dibner M. D. and Battey J. F. (1986) In Basic Methods in Molecular Biology, pp. 47-50. Elsevier, Amsterdam.
Feinberg A. P. and Vogelstein B. (1983) A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Analyt. Biochem. 132, 6-13.
Foley R. C. and Beh K. J. (1989) Isolation and sequence of sheep Ig H and L chain cDNA. J. Immun. 142, 708-711.
Gearhart P. J., Johnson N. D., Douglas R. and Hood L. (1981) IgG antibodies to phosphorylcholine exhibit more diversity than their IgM counterparts. Nature 291, 29-34.
Haire R. N., Amemiya C. T., Suzuki D. and Litman G. W. (1990) Eleven distinct VH gene families and additional patterns of sequence variation suggest a high degree of immunoglobulin gene complexity in a lower vertebrate, Xenopus laevis. J. exp. Med. 171, 1721-1737.
Honjo T., Shimizu A. and Yaoita Y. (1989) Constant-region genes of the immunoglobulin heavy chain and the molecular mechanism of class switching. In Immunoglobulin Genes (Edited by Honjo T., Alt F. W. and Rabbitts T. H.), pp. 123-149. Academic Press, London.
Ichihara Y., Matsuoka H. and Kurosawa Y. (1988) Organization of human immunoglobulin heavy chain diversity gene loci. EMBO J. 7, 4141-4150.
Ishiguro H., Ichihara Y., Namikawa T., Nagatsu T. and Kurosawa Y. (1989) Nucleotide sequence of Suncus murinus immunoglobulin μ gene and comparison with mouse and human μ genes. FEBS Lett. 247, 317-322.
Johnson N., Slankard J., Paul L. and Hood L. (1982) The complete V domain amino acid sequence of two myeloma inulin-binding proteins. J. Immun. 128, 302-308.
Juszczak E. C. and Margolies M. N. (1983) Amino acid sequence of the heavy chain variable region from the A/J mouse anti-arsonate monoclonal antibody 36-60 bearing a minor idiotype. Biochemistry 22, 4291-4296.
Kabat E. A., Wu T. T., Reid-Miller M., Perry H. M. and Gottesman K. (1987) In Sequences of Proteins of Immunological Interest. U.S. Dept Health and Human Services, Washington, DC.
Kawakami T., Takahashi N. and Honjo T. (1980) Complete nucleotide sequence of mouse immunoglobulin μ gene and comparison with other immunoglobulin heavy chain genes. Nucleic Acids Res. 8, 3933-3945.
Kofler R. (1988) A new murine Ig VH gene family. J. Immun. 140, 4031-4034.
Lenoir G. M., Preud’homme J. L., Bernheim A. and Berger R. (1982) Correlation between immunoglobulin light chain expression and variant translocation in Burkitt’s lymphoma. Nature 298, 474-476.
Litman G. W., Hinds K. and Kokubu F. (1989) The structure and organization of immunoglobulin genes in lower vertebrates. In Immunoglobulin Genes (Edited by Honjo T., Alt F. W. and Rabbitts T. H.), pp. 163-180. Academic Press, London.
McCumber L. J. and Capra J. D. (1979) The complete amino-acid sequence of a canine μ chain. Molec. Immun. 16, 565-570.
McGuire K. L., Duncan W. R. and Tucker P. W. (1985) Phylogenetic conservation of immunoglobulin heavy chain: direct comparison of hamster and mouse Cμ gene. Nucleic Acids Res. 13, 5611-5628.
Morris B. and Miyasaka M. (1985) In Immunology of the Sheep. Editiones “Roche”, Basle, Switzerland.
Ollo R., Auffray C., Sikorav J. L. and Rougeon F. (1981) Mouse heavy chain variable regions: nucleotide sequence of a germ-line VH gene segment. Nucleic Acids Res. 9, 4099-4109.
Pearson W. R. and Lipman D. J. (1988) Improved tools for biological sequence analysis. Proc. natn. Acad. Sci. U.S.A. 85, 2444-2448.
Rathbun G., Berman J., Yancopoulos G. and Alt F. W. (1989) Organization and expression of the mammalian heavy-chain variable-region locus. In Immunoglobulin Genes (Edited by Honjo T., Alt F. W. and Rabbitts T. H.), pp. 63-90. Academic Press, London.
Ravetch J. V., Siebenlist U., Korsmeyer S., Waldmann T. and Leder P. (1981) Structure of the human immunoglobulin μ locus: characterization of embryonic and rearranged J and D genes. Cell 27, 583-591.
Reth M. G., Jackson S. and Alt F. W. (1986) V(H)DJ(H) formation and DJ(H) replacement during pre-B differentiation: non-random usage of gene segments. EMBO J. 5, 2131-2138.
Reynaud C. A., Dahan A., Anquez V. and Weill J. C. (1989) Development of the chicken antibody repertoire. In Immunoglobulin Genes (Edited by Honjo T., Alt F. W. and Rabbitts T. H.), pp. 151-162. Academic Press, London.
Reynaud C. A., Mackay C. R., Müller R. G. and Weill J. C. (1991) Somatic generation of diversity in a mammalian primary lymphoid organ: the sheep ileal Peyer’s patches. Cell 64, 995-1005.
Sanger F., Nicklen S. and Coulson A. R. (1977) DNA sequencing with chain-terminating inhibitors. Proc. natn. Acad. Sci. U.S.A. 74, 5463-5467.
Schilling J., Clevinger B., Davie J. M. and Hood L. (1980) Amino acid sequence of homogeneous antibodies to dextran and DNA rearrangements in heavy chain V-region gene segments. Nature 283, 35-40.
Schroeder H. W. Jr, Walter M. A., Hofker M. H., Ebens A., Dijk K. van, Liao L. C., Cox D. W., Milner E. and Perlmutter R. M. (1988) Physical linkage of a human immunoglobulin heavy chain variable region gene segment to diversity and joining region elements. Proc. natn. Acad. Sci. U.S.A. 85, 8196-8200.
Selsing E., Durdik J., Moore M. W. and Persiani D. M. (1989) Immunoglobulin lambda genes. In Immunoglobulin Genes (Edited by Honjo T., Alt F. W. and Rabbitts T. H.), pp. 111-122. Academic Press, London.
Southern E. M. (1975) Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. mole
