J. Mol. Biol. (1991) 222, 835-841

COMMUNICATIONS

Cysteine Residue Periodicity is a Conserved Structural Feature of Variable Surface Proteins from Paramecium tetraurelia

Erik Nielsen, Yun You and James Forney†

Department of Biochemistry, Purdue University, West Lafayette, IN 47907, U.S.A.

(Received 21 May 1991; accepted 28 August 1991)

The DNA sequences of the entire coding regions of the A and C type variable surface protein genes from Paramecium tetraurelia, stock 51, have been determined. The 8151 nucleotide open reading frame of the A gene contains several tandem repeats of 210 nucleotides within the central portion of the molecule as well as a periodic structure defined by cysteine residues. The 6699 nucleotide open reading frame of the C gene does not contain any identifiable tandem repeats or internal similarity but maintains a periodicity based on the cysteine residue spacing. The deduced amino acid sequences encoded by the two genes are most similar within the 600 amino-terminal and 600 carboxyl-terminal amino acid residues; the central portions show only limited sequence similarity. We conclude that internal repeats are not a conserved feature of variable surface proteins in Paramecium and discuss the possible importance of the regular pattern of cysteine residues.

Keywords: surface protein; protozoa; paramecium; cysteine-rich protein; cysteine periodicity

Variable surface proteins are a common feature of parasitic protozoans as well as free-living protists such as Paramecium and Tetrahymena. Expression of surface proteins in both types of organisms is mutually exclusive: only a single type is found on one cell at any given time. The parasitic trypanosomes rely on antigenically diverse surface proteins to evade the host immune response (Cross, 1990). The function of variable surface proteins in Paramecium is not yet known; nevertheless, they are present on a ciliated protozoan where this interesting example of mutually exclusive surface protein expression can be studied using a combination of molecular biology and genetics (for reviews, see Caron & Meyer, 1989; Preer, 1986). A single cell line of Paramecium tetraurelia, stock 51, can express at least 11 different types of surface proteins. Cultures expressing a common surface protein define a serotype and are identified by their reaction with antiserum produced against a pure culture. Each serotype is identified by a letter, A, B, C, etc., which also defines a surface protein. Treatment with homologous antiserum immobilizes and kills the cells, hence variable surface proteins in

Paramecium are sometimes referred to as immobilization antigens. The surface proteins are single polypeptides that range in size from 251 to 308 kDa, and cover the entire surface of the cell as well as the ciliary membrane. Further research has indicated that they are attached to the plasma membrane by a glycosylphosphatidylinositol anchor, but the mature C-terminal amino acid residue that is attached to the anchor has not been identified (Capdeville et al., 1987). Immunological studies have been used to group surface proteins of Paramecium tetraurelia based on antibody cross-reactivity. Using this criterion, A, B, G and Q form one related group, D, J and M another, and C, E and H are unrelated (Preer, 1959). Several surface protein genes have been cloned and characterized. These include the A, C, D and H genes in Paramecium tetraurelia stock 51, and the G gene from both stock 168 and stock 156 of Paramecium primaurelia (168G and 156G). Partial DNA sequence for the 5' upstream regions and random internal regions has been reported for the A, C and H genes from Paramecium tetraurelia (Preer et al., 1987; Godiska, 1987). The complete DNA sequences of both 168G and 156G have been determined (Prat et al., 1986; Prat, 1990). The most striking feature of the sequence for the G surface protein gene is the

† Author to whom all correspondence should be addressed.

0022-2836/91/240835-07 $03.00/0 © 1991 Academic Press Limited


presence of substantial internal homology, including five nearly identical tandem repeats of 222 nucleotides in the center of the gene. In order to identify common sequence motifs and conserved functional domains in different variable surface proteins we have determined the entire coding sequence of the A and C genes of Paramecium tetraurelia, stock 51 (51A, 51C). Both of these genes are commonly and stably expressed, yet previous immunological studies have shown they are not closely related (Preer, 1959).

DNA sequence of the A and C surface protein genes

The complete nucleotide sequence for the A and C gene coding regions has been determined and deposited in GenBank under the accession numbers M65163 (51A) and M65164 (51C). The A gene has an open reading frame of 8151 nucleotides encoding a primary translation product of 2717 amino acids. The C gene has an open reading frame of 6699 nucleotides encoding a polypeptide of 2233 amino acids. Neither gene is interrupted by stop codons or a shift in the reading frame, which is consistent with earlier S1 protection experiments that indicated a lack of introns (Forney et al., 1983). The deduced amino acid sequence of each gene was translated using the ciliate genetic code in which TGA (UGA) is used as the only stop codon, and TAA and TAG (UAA, UAG) code for the amino acid glutamine (Caron & Meyer, 1985; Preer et al., 1985; Kink et al., 1990). The calculated molecular weight for the A polypeptide is 280,014 and for the C polypeptide 237,076. These values are within 8% of the experimentally determined molecular weights (A, 301,000 and C, 258,000 daltons); the slightly smaller calculated values do not include the carbohydrate portion of these glycoproteins (Merkel et al., 1981). The previously determined amino acid compositions of stock 51 A, B and D surface proteins have indicated a remarkably high cysteine (approximately 11 mol %) and serine plus threonine (20 to 26 mol %) content (Reisner et al., 1969). The deduced amino acid sequence of the A and C polypeptides also indicates a high cysteine content (about 11 mol %) as well as threonine plus serine (26 mol % for A and 21 mol % for C). A chi-square analysis comparing the experimentally determined amino acid composition of 51A to the deduced composition results in a P value close to 0.5, indicating close agreement (data not shown). Kyte-Doolittle hydropathy plots for A and C indicate conserved hydrophobic regions within the first 20 amino acids and the carboxyl-terminal 20 amino acids (data not shown).
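The translation rule described above can be illustrated with a minimal Python sketch. It assumes only what the text states (TGA is the sole stop codon; TAA and TAG encode glutamine) and otherwise uses the standard genetic code. The function names `ciliate_codon_table` and `translate_ciliate` are hypothetical and are not taken from the software used in this work.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AA = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def ciliate_codon_table():
    """Return a codon table in which TAA and TAG encode glutamine (Q)."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, STANDARD_AA))
    table["TAA"] = "Q"  # reassigned from stop to glutamine
    table["TAG"] = "Q"  # reassigned from stop to glutamine
    return table        # TGA remains the only stop codon ("*")


def translate_ciliate(dna):
    """Translate a DNA reading frame, stopping at the first TGA codon."""
    table = ciliate_codon_table()
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Toy reading frame (not from 51A or 51C): ATG TAA TAG TGT TGA
print(translate_ciliate("ATGTAATAGTGTTGA"))
# Codon counts implied by the reported open reading frame lengths.
print(8151 // 3, 6699 // 3)
```

Under the standard code the toy frame would terminate at its second codon; with the ciliate reassignment it reads through to the TGA. The last line simply checks that the reported open reading frame lengths correspond to 2717 and 2233 codons.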
No other highly conserved hydrophobic or hydrophilic regions were identified. A dot matrix analysis of the nucleotide or amino acid sequence of the A gene indicates a number of regions that contain internally repeated sequences. Figure 1(a) shows a dot matrix analysis of the deduced amino acid sequence for the A surface protein. This analysis compared each overlapping group of 30 amino acids to all other contiguous groups of 30 amino acids. If a minimum of 25 of the 30 are identical (83%) a dot is placed at that position. The unbroken line across the diagonal in Figure 1(a) represents the comparison of each 30 amino acid block with itself and the additional diagonal lines represent regions of internal similarity. Most striking is the presence of seven diagonal lines in the center of the plot. This indicates the presence of eight direct repeats approximately 70 amino acids in length. Additional internal similarity is found near the amino terminus of the protein (from amino acid 300 through 500) though none is detected within the first 300 amino acids at the amino terminus or the 300 carboxyl-terminal residues. Using the same parameters for analysis of the deduced amino acid sequence from the C gene indicates an almost complete lack of internal similarity; only a small segment near the carboxyl terminus is repeated in an adjacent region. Additional homology plots using the same window size (30 amino acids) but lower stringencies (between 83% and 56%) were unable to detect any significant internal repeats (data not shown).

Comparisons between different surface proteins

The amino acid sequence for C is compared with A in Figure 1(c) at the same stringency shown in Figure 1(a) and (b). As indicated, the central regions of the proteins are less similar than the amino-terminal and carboxyl-terminal regions. In fact, despite an overall identity of 42% between the two sequences, the region between amino acids 600 and 1500 shows only 35% identity (data not shown). The first 600 residues of the amino terminus are 48% identical and the final 750 residues of the carboxyl terminus are 52% identical. Alignment of the two sequences indicates that the smaller amino acid sequence of C is a consequence of fewer amino acid residues within the central region of the protein. This can be visualized by the shift of the diagonal line in the dot plot matrix (Fig. 1(c)). There are straight and relatively constant lines at the amino terminus and carboxyl terminus of the plot, but if extended these lines will not intersect. Shifts in the diagonal indicate additions or deletions of amino acids relative to the other sequence, thus a more gradual shift in the line would indicate more insertions and deletions of sequence in the amino or carboxyl terminus. When the entire length of 51A, C and 156G are compared with each other the amino and carboxyl-terminal regions can be aligned with few extensive gaps (the amino-terminal region is shown in Fig. 2(a)). In contrast, the central region of 51C from the same comparison does not result in any meaningful alignment (Fig. 2(b)). The similar ends of the molecules force the alignment program to introduce gaps in the smaller C polypeptide, but these gaps do not maintain the cysteine periods discussed below. Unlike the limited similarity between A and C, the comparison of 51A to the previously published 156G sequence indicates extensive similarity


Figure 1. Analysis of 51A and 51C deduced amino acid sequence for internal repeats and comparison with other genes. Dot matrix analysis was performed on the deduced amino acid sequence of 51A and 51C. A window of 30 amino acids and a stringency of 25 matches were used for all panels. (a) 51A versus 51A; (b) 51C versus 51C; (c) 51A versus 51C; (d) 51A versus 156G. HindIII and EcoRI restriction fragments of the previously cloned 51A and 51C genes (Forney et al., 1983) were subcloned into pUC118 or pUC119. Exonuclease III was used to construct a nested set of deletions according to the method of Henikoff (1987). The resulting plasmids were transformed into E. coli strain JM101 or DH5αF' and single-strand DNA was produced according to standard protocols (Maniatis et al., 1989). Sequencing reactions were performed using the Sequenase version 2.0 DNA sequencing kit (U.S. Biochemicals, Cleveland, OH). DNA sequence was determined from both strands of all regions that have not been published previously and at least 1 strand in previously determined regions. DNA and protein sequences were analyzed using the programs of the University of Wisconsin GCG Sequence analysis software package Version 6.2, Copyright (c) 1989 John Devereaux (Devereaux et al., 1984). The nucleotide sequence data reported in this paper will appear in the EMBL, GenBank, and DDBJ Nucleotide Sequence Databases under the accession numbers M65163 (51A) and M65164 (51C).

throughout the molecule, including the tandem repeats (Fig. 1(d)). Overall, there is 80% identity between the 51A and 156G amino acid sequences. Alignment of the sequences of three tandem repeats in the central region indicates that the local identity between A and G is 58% (Fig. 2(b)); thus the repeat region has a lower identity than the other regions. The difference in size between the repeats in 51A and 156G is the result of four consecutive amino acids missing in the 51A repeat that are present in the 156G repeat. Interestingly, the 168G polypeptide is also missing four amino acids in the corresponding region of its central repeat (Prat, 1990).
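The dot matrix comparisons used in this section (a window of 30 amino acids, with a dot placed when at least 25 positions are identical) can be sketched as follows. This is an illustrative reimplementation under those stated parameters, not the GCG program that produced Figure 1; the function name `dot_matrix` and the toy sequence are assumptions made for the example.

```python
def dot_matrix(seq_a, seq_b, window=30, min_matches=25):
    """Return (i, j) window start positions where at least `min_matches`
    of the `window` aligned residues are identical."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count identical residues at matching window positions.
            matches = sum(a == b for a, b in zip(win_a, win_b))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Toy self-comparison: a 10-residue unit repeated twice, with a short
# window so the off-diagonal repeat signal is visible.
toy = "ACDEFGHIKL" * 2
dots = dot_matrix(toy, toy, window=5, min_matches=5)
print((0, 0) in dots, (0, 10) in dots, (0, 1) in dots)
```

In a self-comparison the main diagonal is always populated, and a tandem repeat appears as additional diagonals offset by multiples of the repeat length. This direct version scales with the product of the two sequence lengths and the window size, so for sequences of more than 2000 residues a running count along each diagonal would be a more practical design.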

Periodicity of cysteine residues

Previous analysis of the 156G and 168G alleles of Paramecium primaurelia has indicated a periodic structure of the surface protein based on the spacing of the cysteine residues (Prat et al., 1986; Prat, 1990). These proteins can be divided into 37 periods, each containing eight cysteine residues except for four half periods containing four cysteines each. The 100 amino-terminal and 150 carboxyl-terminal residues do not share the periodicity and are not included in the 37 periods. Not surprisingly, the 51A protein can also be arranged into a similar pattern of 37 periods. As previously noted for the 156G and

Figure 2. Alignment of 51A, 51C and 156G amino acid sequences. The Clustal multiple alignment program (Higgins & Sharp, 198?) was used to align the entire amino acid sequence of all 3 molecules. The resulting data are shown for the N terminus and the central region of the 3 sequences. One tandem repeat of 70 amino acids is overlined in the central region. Numbers correspond to those found in Fig. 1. Stars indicate identity of all 3 sequences, dots indicate spaces inserted to optimize alignments. (a) N-terminal sequences; (b) central region.


Figure 3. Periodicity of 51A and 51C. The amino acid sequence of each polypeptide is displayed to illustrate its periodic structure. Periods are numbered starting from the N terminus. Each period contains 8 cysteine residues except for 4 half periods in both sequences and a 10 cysteine period in 51A that contains a 16 amino acid insert (bold face type) as compared with the 156G sequence. (a) 51A; (b) 51C.

168G protein, the tryptophan residues are preferentially found in column 7, which usually begins CXW where X represents any amino acid residue; this is true for 26 out of the 37 periods. Within the region comprising the 37 periods only four tryptophan residues are located outside of column seven. It was pointed out previously (Preer et al., 1987) that the A gene contains 48 nucleotides (16 amino acids) not found in 156G. This inserted sequence is located in period 10 of the A protein, and is highlighted in boldface type in Figure 3(a). The sequence includes two cysteine residues, which technically disrupts the pattern of eight cysteines per period. It is interesting to note that independent of its position in a period the inserted nucleotide sequence begins and

ends within cysteine codons, thus it comprises exactly two cysteine units (data not shown). The corresponding pattern of cysteine residues for the C protein is shown in Figure 3(b). Despite the lower identity between A and C as compared with A and G (42% and 80%, respectively) the pattern of cysteine repeats shows remarkable similarity. The sequence can be arranged as 30 periods, each including eight cysteines except for four half periods each including four cysteines. The tryptophan residues in C are also maintained in the number seven column at roughly the same frequency as A and G (20 out of 30 columns). Interestingly, many of the exceptionally long cysteine segments in A and G are maintained in C. For example, period 10


segment 1 is the longest cysteine segment in column 1 for all the surface proteins (22 residues in A and G, 24 in C). Segment 6 in the third from the last period (35 in A, 28 in C) is maintained as an unusually long segment; also compare Figure 3(a) periods 27 and 28, column 3 to Figure 3(b) periods 20 and 21, column 3. The only disruption of the periodic structure in the C protein is the presence of seven cysteine residues instead of eight in the last period. Alignment of the C sequence with A and G indicates the substitution of a serine residue in place of a cysteine at amino acid position 2048, which results in the loss of segment 4 in the last period. A total of four Paramecium surface protein genes have now been completely sequenced: 156G, 168G, 51A and 51C. Even though these data do not directly suggest a function for these proteins, certain important features have emerged. The surface proteins are cysteine, serine, and threonine rich and the cysteine residues are maintained in a non-random periodicity. The basic period consists of eight cysteine residues, but there is considerable variation in the number of residues within one cysteine segment. Even the number of cysteines within a period is not completely restricted since half periods of four cysteines exist in all the surface proteins, as well as one ten-cysteine residue period in the A protein. The presence of tandem sequence repeats is common but apparently not required of surface proteins since the C protein does not contain any detectable repeats. The presence of tandem repeats may reflect the evolutionary history of the gene rather than any specific functional importance. It is difficult to speculate on precisely why some surface antigen genes contain internal tandem repeats and others do not. Presumably 51A, 156G and 168G represent genes that have undergone internal duplications more recently than 51C.
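The periodic arrangement summarized above can be approximated with a short Python sketch. It assumes that each cysteine segment begins at a cysteine (as implied by column 7 usually beginning CXW) and that consecutive segments are grouped eight to a period. Half periods, the ten-cysteine period in 51A and the excluded terminal regions are deliberately not modelled, so this is a simplification rather than the procedure used to draw Figure 3. All function names are hypothetical.

```python
def cysteine_segments(protein):
    """Split a protein into segments that each start with a cysteine.
    Residues before the first cysteine are dropped."""
    starts = [i for i, aa in enumerate(protein) if aa == "C"]
    ends = starts[1:] + [len(protein)]
    return [protein[s:e] for s, e in zip(starts, ends)]


def group_into_periods(segments, per_period=8):
    """Group consecutive cysteine segments into periods (rows of columns)."""
    return [segments[i:i + per_period]
            for i in range(0, len(segments), per_period)]


def cxw_count(periods, column=7):
    """Count periods whose given column begins C-X-W.
    Returns (hits, number of periods long enough to have that column)."""
    with_column = [p for p in periods if len(p) >= column]
    hits = sum(1 for p in with_column
               if len(p[column - 1]) >= 3 and p[column - 1][2] == "W")
    return hits, len(with_column)


# Toy sequence (not from any surface protein): nine cysteine segments.
toy = "MKCAACGCDCECFCGCSWCHCA"
periods = group_into_periods(cysteine_segments(toy))
print([len(p) for p in periods])                 # segments per period
print([len(seg) for seg in periods[0]])          # segment lengths, period 1
print(cxw_count(periods))
```

Printing the segment lengths per column is a simple way to spot unusually long segments of the kind discussed above, and `cxw_count` mirrors the tally of periods in which column 7 begins CXW.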
The G genes and 51A are located in the macronuclear genome within 5 and 8 kb† of telomeres, respectively (Meyer et al., 1985; Prat, 1990; Forney & Blackburn, 1988), yet the C gene is more than 20 kb from a telomere (J. Forney, unpublished results). It will be interesting to determine if there is a correlation between genomic location and presence of tandem central repeats. The carboxyl-terminal and amino-terminal regions are more heavily conserved than the central portions of the molecule. This is supported by our comparisons between A and G, A and C, as well as comparison between the two G alleles of P. primaurelia (Prat, 1990). Not only is the sequence most divergent in the central region but the smaller size of the C protein is the result of fewer amino acids, thus fewer cysteine periods, in the central portion relative to A or G. Variation in the number of central tandem repeats occurs in the surface proteins of other organisms. The rickettsia Anaplasma marginale, causative agent of anaplasmosis, shows strain to strain differences in the number of tandem repeats within the major surface protein (Allred et al., 1990). We have determined that some of the most conserved regions between A and C are located within the cysteine periods (e.g. 51A amino acids 100 to 600 and 1700 to 2550), yet the central periods contain the most divergent regions. It is not immediately clear why some cysteine periods are more conserved than others since none of the periods are strikingly different from each other. It suggests that not all periods are functionally equivalent. One possible explanation is that the more conserved distal cysteine periods are constrained by their involvement in protein-protein interactions and the less conserved central periods are exposed to the surface. Comparison of the A and C protein sequences to the database (GenBank and EMBL) revealed no strong identity to proteins other than Paramecium variable surface proteins (156G, 168G and 51H). Nevertheless, as pointed out (Prat, 1990), the regular patterns of cysteine residues found in Paramecium surface proteins are common to other classes of proteins including epidermal growth factor (EGF) related proteins such as Drosophila Notch, Delta and crumbs and the Caenorhabditis elegans lin-12 and glp-1 gene products (reviewed by Massague, 1990). All of these membrane-anchored glycoproteins contain multiple EGF-like repeats in their extracellular domains, each repeat containing six cysteines characteristically spaced in a region of 35 to 40 amino acids. Other proteins containing cysteine periodicity include the family of integrins, which are involved in cell adhesion (Hynes, 1990). These molecules include periods of eight cysteines spaced over about 40 amino acids. All these examples, and numerous others, are related by their known or suspected ability to form extracellular protein-protein complexes.

† Abbreviations used: kb, 10^3 bases or base-pairs; EGF, epidermal growth factor.
Rather than proposing a receptor function for Paramecium surface proteins we suggest that the protective surface protein coat of Paramecium involves important intermolecular protein interactions that maintain a cell barrier, and at least some of the conserved cysteine residues may be involved in intermolecular disulfide bonds. In this regard it should be noted that the variable surface proteins (VSGs) of trypanosomes exist as dimers in solution and these may be disulfide-linked (Auffret & Turner, 1981; Cross, 1977). This is consistent with the conservation of cysteine residues in most VSG sequences in Trypanosoma brucei (Cross, 1990). If protein-protein interactions maintain the integrity of the cell surface, then these interactions could be significant in the control of mutual exclusion. Although there is evidence for transcriptional as well as post-transcriptional control of surface protein gene expression (Gilley et al., 1990), there is no information on the ultimate source of this control. Genetic analysis of the allelic exclusion between 156G and 168G indicates that expression is controlled by the surface protein gene or adjacent DNA sequences (Capdeville et al., 1978). This could


be interpreted as evidence for transcriptional promoter sequences as the controlling elements, but could equally support a direct role of the gene product in regulation of mutual exclusion. Recent work has indicated that under some conditions more than one surface protein gene is transcribed yet only a single type is found on the cell surface (Gilley et al., 1990). Detectable steady state levels of more than one surface protein mRNA were found, but it was not determined if both RNAs were translated into protein. The role of surface proteins in the control of mutual exclusion can now be approached through the construction and in vivo expression of chimeric genes that contain portions of two surface protein genes that are not normally co-expressed. Although we anticipate regulatory roles for 5' upstream promoters will be detected, additional roles for the structural gene may be found by substitution of coding portions of surface protein genes. In fact, chimeric genes may define new serotypes that show mutual exclusion from both parent molecules.

This work was supported by National Institutes of Health grants GM43357 and AI27713, as well as a Junior Faculty Award from the American Cancer Society to J.F. Y.Y. was supported by an American Cancer Society Institutional Grant IN-17-29. This is journal paper no. 12897 from the Purdue Ag. Exp. Station.

References

Allred, D. R., McGuire, T. C., Palmer, G. H., Leib, S. R., Harkins, T. M., McElwain, T. F. & Barbet, A. F. (1990). Molecular basis for surface antigen size polymorphisms and conservation of a neutralization-sensitive epitope in Anaplasma marginale. Proc. Nat. Acad. Sci., U.S.A. 87, 3220-3224.

Auffret, C. A. & Turner, M. J. (1981). Variant specific antigens of Trypanosoma brucei exist in solution as glycoprotein dimers. Biochem. J. 193, 647-650.

Capdeville, Y., Vierny, C. & Keller, A.-M. (1978). Regulation of surface antigens expression in Paramecium primaurelia. Mol. Gen. Genet. 161, 23-29.

Capdeville, Y., Cardoso de Almeida, M. L. & Deregnaucourt, C. (1987). The membrane-anchor of Paramecium temperature specific surface antigens is a glycosylinositol phospholipid. Biochem. Biophys. Res. Commun. 147, 1219-1225.

Caron, F. & Meyer, E. (1985). Does Paramecium primaurelia use a different genetic code? Nature (London), 314, 185-188.

Caron, F. & Meyer, E. (1989). Molecular basis of surface antigen variation in paramecia. Annu. Rev. Microbiol. 43, 23-42.

Cross, G. A. M. (1977). Isolation, structure and function of variant-specific surface antigens. Ann. Soc. Belge Med. Trop. 57, 389-399.

Cross, G. A. M. (1990). Cellular and genetic aspects of antigenic variation in trypanosomes. Annu. Rev. Immunol. 8, 83-110.

Devereaux.
