Molecular and Biochemical Parasitology, 42 (1990) 143-152 Elsevier


MOLBIO 01382

A retroposon in the 5' flank of a Trypanosoma brucei VSG gene lacks insertional terminal repeats

Bob L. Smiley, Robert F. Aline Jr., Peter J. Myler and Kenneth Stuart

Seattle Biomedical Research Institute, Seattle, WA, U.S.A.

(Received 20 September 1989; accepted 18 April 1990)

A retroposon-like repeated sequence, ingi, occurs in high copy number in the genome of Trypanosoma brucei brucei. An ingi is present in the 5' flank of the 5C gene, an intrachromosomal IsTat 1.5 variant surface glycoprotein (VSG) gene family member. The 5' end of the ingi is located 22 bp upstream of the putative VSG start codon and the ingi open reading frame is in the opposite orientation to that of the VSG gene. The termini of the ingi are not flanked by a short repeat sequence and there are no sequences upstream of the ingi insertion which are homologous to the 5' flanking sequence of other 5 VSG gene family members. Thus, it appears that recombination and/or gene conversion between two ingi sequences may have eliminated the original 5C gene flanking sequence. Similar events may also have occurred with all but one previously reported ingi.

Key words: Retroposon; Antigenic variation; Recombination; Trypanosome repeated sequence; Ingi

Introduction

The genome of Trypanosoma brucei is rich in repeated sequences. These include the (TAA)n containing v-sequences (or 70-76-bp repeats) located 5' to variant surface glycoprotein (VSG) genes [1-4] and sequences near the 3' end of VSG genes [5], which are involved in gene conversions associated with antigenic variation. In addition, there are telomeric CCCTAA repeats [6,7], subtelomeric conserved sequences [8], and satellite DNA containing 177-bp repeats [9,10]. Retroposon-like sequences, generally containing 3' poly(A) sequences, an open reading frame with homology to reverse transcriptase and a short (4-9-bp) chromosomal sequence duplicated at either end, have also been found in multiple copy numbers in the T. brucei genome. These include the 7-kb spliced leader associated conserved sequence (SLACS) [11] and the 5.2-kb ingi [12] (or trypanosome repeat sequence; ref. 13). The 494-bp ribosomal mobile element (RIME) [14,15] sequence matches the terminal sequence of the ingi, the first 249 bp matching the 5' end of the ingi and the last 245 bp and the variable length poly(A) tail matching the 3' end. The RIME lacks the central 4.7 kb of ingi.

Correspondence address: Kenneth Stuart, Seattle Biomedical Research Institute, 4 Nickerson Street, Seattle, WA 98109-1651, U.S.A.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ database with the accession numbers M33483/M33484/M33485/M33486/M33487.

Abbreviations: DTT, dithiothreitol; ELC, expression linked copy; kb, kilobase pairs; PFGE, pulse field gel electrophoresis; SDS, sodium dodecyl sulfate; SL, spliced leader; VAT, variant antigenic type; VSG, variant surface glycoprotein.

The 5 VSG gene family consists of 3 related intrachromosomal VSG genes (5B, 5C and 5D) and one telomeric VSG gene (5A), which is located on a minichromosome [2,16-20]. In variant antigenic type (VAT) 5, an expression linked copy (5 ELC) produced by duplication of the minichromosomal 5A gene is transcribed from a telomeric expression site, T5 [16,18]. Restriction enzyme, hybridization, and DNA sequence analyses indicate that while all VSG family members share homology within the coding regions, the 5' flanking sequences present in 5 ELC, 5A and 5B are absent from the 5C gene copy [2]. We have found that the 5' flank of the 5C gene contains an ingi in an orientation opposite to that

0166-6851/90/$03.50 © Elsevier Science Publishers B.V. (Biomedical Division)


of the VSG gene. Sequence analysis indicates that the ingi does not contain a repeated chromosomal sequence adjacent to the poly(A) tail, and that the interrupted 5' flank sequence of the 5 VSG family is not present beyond the ingi, as would be expected from a simple insertion. These results suggest that a recombination event involving the ingi and sequences elsewhere in the genome has led to the loss of the original sequences 5' to the 5C gene. Analysis of the repeated sequences flanking other reported ingis indicates that such recombinations may be frequent events.
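The expectation for a simple insertion can be made concrete with a toy model. The following Python sketch is illustrative only: the function name `simulate_insertion`, the sequences, and the 4-bp duplication length are hypothetical choices and not data from this study. It shows that a simple insertion leaves the same short target sequence on both sides of the element and leaves the interrupted flank intact beyond it, which is the pattern that the 5C ingi does not show.

```python
def simulate_insertion(target: str, pos: int, element: str, tsd_len: int) -> str:
    """Insert `element` at `pos`, duplicating the next `tsd_len` target bases.

    The duplicated target site ends up immediately 5' and immediately 3'
    of the inserted element (a direct repeat).
    """
    if not 0 <= pos <= len(target) - tsd_len:
        raise ValueError("target site does not fit inside the target sequence")
    site = target[pos:pos + tsd_len]  # bases that will flank both ends
    return target[:pos] + site + element + target[pos:]


# Hypothetical toy sequences, not taken from the paper.
target = "AAACCGTTTGGG"
element = "GGATCCGCAGAAAAAAAA"  # stand-in element ending in a poly(A) run
product = simulate_insertion(target, pos=3, element=element, tsd_len=4)

start = product.index(element)
left_flank = product[:start]                   # sequence 5' of the element
right_flank = product[start + len(element):]   # sequence 3' of the poly(A)
print(left_flank, right_flank)
```

Removing the element together with one copy of the duplicated site restores the original target, so the flank on the far side of the element should still be recognisable after a simple insertion.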


Materials and Methods

Organisms and recombinant DNA. Production of the IsTat 1 serodeme of T. brucei and cloning of the VATs utilized in this study have been described previously [18,21-23]. The VAT 5 cDNA clones have been described [17]; clone pTb1.5-c2 contains the spliced leader (SL) sequence [24,25]. The genomic DNA clone λ46 contains the 5C VSG gene [2]. Restriction fragments of the λ46 insert were subcloned as diagrammed in Fig. 1 and grown in DH5α cells (BRL). pL46B1 contains the 1.6-kb BamHI fragment and pL46ΔRv contains the λ46 sequences from EcoRV to BamHI (2.6 kb), produced by a deletion of the ingi sequence


Fig. 1. Restriction maps of the 5 ELC, 5B and 5C family members. The 5B and 5C genes were mapped as λ1059 clones and all three were additionally mapped by genomic Southern hybridizations. Only those enzymes for which complete maps of all three family members exist are shown in the genome maps; additional restriction sites were presented in [2,17]. Dashes at the end of maps indicate additional sequence not shown; filled circles indicate telomere ends. The solid bars show the VSG sequences and the hatched box below the 5C map shows the ingi sequence. The position of the 5' flank probe used in ref. 2 is indicated below the 5B map. The cDNA clone pTb1.5-c2 is oriented below the ELC map with the box indicating the location of the SL sequence. The subclones utilized in the sequencing strategy are shown below and oriented to their maps. Restriction enzyme abbreviations: B, BamHI; Ec, EcoRV; Hc, HincII; Hd, HindIII; K, KpnI; P, PstI; Pu, PvuII; S, SalI; Sc, ScaI; Sm, SmaI; Sp, SphI.


in a 4-kb BamHI fragment clone; both are in pUC19 (BRL) [26]. Clone pL46SH1 contains the 3.1-kb HindIII-SphI fragment and pL46HRv contains the 2.1-kb HindIII-EcoRV fragment of λ46; both are in pBS (Stratagene). Plasmid pBS35 contains a 3.5-kb BamHI-SalI fragment from the 5B VSG gene family member as previously described [2]. Subclones of pBS35 used for sequencing are pBS35SS, which contains a 1.0-kb SalI-SphI fragment in pBS, and pBS35ΔHRv, which is pBS35SS from which the SphI-EcoRV fragment has been deleted (not shown).

Southern hybridization and sequencing. Genomic DNA was prepared as previously described [21], digested with restriction endonucleases according to the supplier's directions, and separated by electrophoresis in either 1.0% or 0.7% agarose gels. Pulse field gel electrophoresis (PFGE) [27] was performed as described [28] with the following modifications: DNA was electrophoresed through 0.4% agarose at 400 V for 18 h; the field switching varied from 3 to 6 s with ramping using a Sound Scientific Model H Intervalometer. DNA was transferred to Nytran membranes according to the manufacturer's instructions (Schleicher and Schuell) and hybridized as previously described [29]. Subclones in pUC19 or pBS were sequenced directly by the double-stranded plasmid sequencing procedure [30] using the dideoxy chain termination method [31].

Results

The region in the 5' flank of the 5C VSG gene which we previously showed to contain a sequence that is repeated in the genome [2] was sequenced (Fig. 2) and found to be related to the ingi [12] (or TRS [13]) and RIME [14] sequences. The 1.6-kb BamHI fragment that is contained in pL46B1 originates from the 5' flank of the 5C VSG gene. This sequence has 97.6% homology to a BamHI fragment from the internal region of the ingi (Figs. 1 and 2B). Regions corresponding to the 5' (Fig. 2A) and 3' (Fig. 2C) ends of the ingi were sequenced and show that an entire 5.2-kb ingi is present in the flank of the 5C gene. This is evident from the presence of the ATG of the ingi open reading frame and the poly(A) tail

that is characteristic of the ingi retroposon. The ends of this ingi have 98% sequence homology to other ingi sequences. The ingi and the 5C VSG gene have opposite orientations, as indicated by the arrows; their ATG initiation codons are separated by 28 bp (Fig. 2A). The sequence is numbered beginning with the ATG of the 5C VSG gene and consequently the ATG of the ingi open reading frame is at -29 and the ingi numbering is negative. V5' and V3' are used to indicate the 5' and 3' ends, respectively, of the VSG gene and I5' and I3' for the ingi. In contrast to other reports [12-14], the chromosomal DNA sequences flanking the ingi adjacent to the 5C VSG gene were not duplicated. A 4-9-bp sequence immediately flanking the poly(A) sequence at the 3' junction of other ingis is duplicated at their 5' junction (see Fig. 3). However, the 4-bp sequence (TGAT) immediately flanking the poly(A) sequence in pL46 is not duplicated until 176 bp upstream of the ATG of the ingi. While the 3' boundary of the ingi is evident from homology to other ingis and the poly(A) sequence (Figs. 2C and 3), the 5' boundary is less well defined. Homology of the ingi to other ingis extends 11 bp upstream of the ATG of the ingi (Fig. 3). This homology ends in the ACTC sequence that is underlined in Fig. 2A. Homology of the 5C VSG gene to other 5 VSG gene family members extends 21 bp upstream of the ATG of the 5C VSG gene (Fig. 4) and ends in the same ACTC sequence (shown as the reverse complement GAGT, underlined in Fig. 4). Thus, the exact 5' limit of the ingi is ambiguous and the homology to other ingis begins in a sequence that is also homologous among 5 VSG gene family members. The sequence of the 5C VSG gene shows 85% homology to the 5B VSG gene (clone pBS35) and 5A gene (from cDNA clone pTb1.5-c2) for at least 307 bp V3' to the ingi junction (Fig. 4).
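The homology percentages quoted in this section are given without a stated method of calculation. As a hedged illustration, one common convention is the percentage of identical positions in a pairwise alignment of equal length, with gap columns counted as mismatches. The function name `percent_identity` and the toy sequences below are assumptions for illustration and are not taken from the sequence data.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns holding the same base in both sequences.

    Both strings must already be aligned to the same length; a gap
    character ("-") in either sequence counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    pairs = zip(a.upper(), b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(a)


# Hypothetical 10-column alignment with a single mismatch.
print(percent_identity("ACGTACGTAC", "ACGTACGTTC"))
```

Under this convention, a figure such as 97.6% over a 1.6-kb fragment corresponds to a few dozen differing positions, although the exact count depends on how gaps were treated.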
Comparison of the 5B VSG gene sequence with that of the 5 cDNA clone shows homology for another 6 bp V5' to the ingi junction, up to the splice site for addition of the spliced leader (SL) sequence found at the 5' end of mature trypanosomatid mRNAs [32-36]. The 6 bp conserved between the 5 cDNA and 5B sequences is not found in the 202 bp at the I3' end of the ingi in pL46. Nor is there any homology to additional sequences 5' to the 5B VSG


[Fig. 2 sequence panels not reproduced here: (A) pL46 5' region aligned with RIME; (B) pL46B1 BamHI fragment; (C) pL46 poly(A) end and 3' junction aligned with RIME. The sequences are available under GenBank accession numbers M33483, M33484 and M33485.]

[Fig. 3 alignment not reproduced here. Columns: 5' junction, ingi, 3' junction following the poly(A). Rows: TRS-1.6 [13], TRS-1.4 [13], ingi1 [12], ingi2 [12], ingi3 [12], pL46 [this paper], RIME-middle [14], RIME-dimer [14], RIME [15].]

Fig. 3. Comparison of sequences flanking the ingi ends. The RIME-dimer sequence represents the junctions of the RIME dimer demonstrating the rRNA sequence duplication. The RIME-middle sequence represents the 3' junction of the RIME A monomer and is the remnant of an earlier insertional event. The putative 5' and 3' boundaries of the ingi (or RIME) insertions are indicated by vertical lines.
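The junction comparison summarised in Fig. 3 amounts to asking whether the bases immediately 5' of an element are repeated immediately 3' of its poly(A) tail. A minimal Python sketch of such a check follows. The function names, the 4-9-bp search window as default arguments, and the flanking sequences are hypothetical and chosen for illustration; only the TGAT motif is taken from the text.

```python
from typing import Optional


def shared_junction_repeat(five_flank: str, three_flank: str,
                           min_len: int = 4, max_len: int = 9) -> Optional[str]:
    """Return the longest direct repeat that ends the 5' flank and starts
    the 3' flank, or None if no repeat of min_len..max_len bases exists."""
    for k in range(max_len, min_len - 1, -1):
        if len(five_flank) < k or len(three_flank) < k:
            continue
        if five_flank[-k:] == three_flank[:k]:
            return five_flank[-k:]
    return None


def distance_to_upstream_copy(five_flank: str, motif: str) -> Optional[int]:
    """Bases between the nearest upstream copy of `motif` and the 5' junction
    (0 means the copy abuts the element); None if the motif is absent."""
    idx = five_flank.rfind(motif)
    if idx == -1:
        return None
    return len(five_flank) - (idx + len(motif))


# Hypothetical flanks: the first pair carries a 6-bp duplication, the second
# has TGAT at the 3' junction but only a displaced copy upstream.
print(shared_junction_repeat("GGGACGTCA", "ACGTCATTT"))
print(shared_junction_repeat("CCTGATGGGGGGGGGG", "TGATAAGTT"))
print(distance_to_upstream_copy("CCTGATGGGGGGGGGG", "TGAT"))
```

A result of `None` from the first function together with a non-zero distance from the second corresponds to the situation described for pL46, where the flanking 4-bp sequence recurs only at a distance from the 5' junction.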

gene, although a probe from that region hybridizes to the flanking sequences of all other 5 VSG family members [2]. The HhaI to SphI fragment of pL46SH1 was subcloned to remove ingi/RIME sequences and hybridized to SphI + KpnI digested genomic DNA. Plasmid dilutions were utilized as copy number controls and a single fragment at single copy intensity was observed (data not shown), indicating that the 202 bp at the I3' end of pL46 is unrelated to any other sequences in the genome. Although ingis appear to be retroposons, there are termination codons interrupting the long open reading frames which show homology to reverse transcriptases [12,13]. It is unknown whether ingis are still active as retroposons. Studies looking for ingi-specific transcripts [12,13] were inconclusive since results could be attributed to read-through from a nearby promoter. A change in the chromosomal location of an ingi during trypanosome growth, however, could indicate active movement of an ingi. Since all of the larger chromosomes (the unresolved L chromosomes) as well as the

megabase size chromosomes (M1-M4 and I) of the IsTat 1 serodeme [18] (unresolved in Fig. 5) hybridized with the pL46B1 ingi probe, analysis of movement between those chromosomes was impossible. However, only three of the 20-30 minichromosome bands (50-150 kb) hybridized to this probe (Fig. 5), in contrast to other T. b. brucei serodemes in which a greater proportion of the minichromosomes hybridized to an ingi probe [12,13]. Since the number and hybridization intensities of the minichromosomes containing the ingi were constant, and the sizes varied no more than that expected for telomere growth and collapse in the VATs studied (Fig. 5), there is no indication of ingi movement to or from a minichromosome during the 166 days of infection represented by the serodeme [21], nor during the subsequent generations produced by the VAT cloning and clonal expansion. In order to test for the presence of possible ingi particles similar to the Ty particles of yeast [37] and LR1 particles in L. b. guyanensis [38],

Fig. 2. Analysis of portions of the pL46 ingi sequence. The pL46 sequence is presented in the order and orientation as found in the ingi, opposite to the genomic map in Fig. 1. Deviations of previously published sequences from the pL46 sequence are shown below the pL46 sequences if more than one sequence differs from pL46. The pL46 sequence is in capitals if at least one published sequence agrees. The published consensus is in lower case when only two sequences match. Restriction sites mentioned in the text or in Fig. 1 are shown above the sequence. (A) Comparison of the ingi 5' region of clone pL46 (GenBank accession number M33483) with other ingi/RIME [12-14] sequences. The underlined bases share homology with both the 5 VSG gene family and the RIME. The arrows indicate the first ATG codon and the open reading frame orientation for the VSG gene and the ingi (V and I, respectively). Base +1 is the first base of the VSG gene ATG codon. The numbering is negative to show that it is in the opposite orientation from the VSG coding sequence. The G insert, relative to the RIME dimer, at position -49 was seen in all ingi sequences examined. (B) Comparison of the pL46B1 sequence (GenBank accession number M33484) to Ingi3 [12], DRI [12] and TRS-1.4 [13]. (C) Comparison of the poly(A) end of the ingi in pL46 with published ingi and RIME 3' ends, and the pL46 flanking sequence to the SphI site (GenBank accession number M33485). The poly(A) tail and the 3' junction region are specified, as are the SphI and HhaI sites which generate the 5C flank probe (SH sequence).

[Fig. 4 alignment not reproduced here: 5' coding regions of pL46 (5C), pBS35 (5B) and pTb1.5-c2 (VAT 5 cDNA), numbered from the VSG ATG codon, with the amino acid sequence derived from pTb1.5-c2 shown below.]

Fig. 4. Partial sequence of the 5 VSG family member 5' coding regions. The sequence is numbered from the ATG codon of the VSG gene, and the first ATG codons for the VSG and ingi (V and I, respectively) are marked as in Fig. 2A. The SL sequence of pTb1.5-c2 is in parentheses and the entire sequence of that clone is presented. Restriction enzyme sites listed in Fig. 1 are underlined, as is the 4-base homology between the ingi/RIME and 5 VSG family sequences. The amino acid sequence derived from pTb1.5-c2 is shown below the sequence. GenBank accession numbers: pL46 (VSG 5C), M33483; pTb1.5-c2 (VSG VAT 5), M33486; pBS35 (VSG 5B), M33487.
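Because the ingi and the VSG gene are read from opposite strands, the shared 4-base sequence appears as ACTC in the ingi orientation and as GAGT in the VSG orientation. The short Python sketch below shows the conversion; the helper name `reverse_complement` is an illustrative choice.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the sequence of the opposite strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# The 4-base sequence shared by the ingi and the 5 VSG family members.
print(reverse_complement("ACTC"))
```

The same helper can be used to check that a restriction site such as EcoRV (GATATC) is palindromic and therefore reads identically in both orientations.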

live trypanosomes were lysed with Triton X-100 [39], the nuclei removed by centrifugation, and the cytoplasm fractionated on a 15-45% sucrose gradient. Both Southern and Northern blots of the gradient fractions were hybridized to the ingi probe, but only those fractions containing contaminating chromosomal DNA showed hybridization (data not shown). This suggests that there is no high-copy-number ingi particle in the cytoplasm, although a low-copy-number particle might not have been detected due to background from chromosomal DNA.

Discussion

We have found an ingi retroposon-like sequence immediately 5' to the 5C gene, one of four members of the 5 VSG gene family. The ingi is in the opposite orientation to the VSG gene, between the predicted start of the coding region and a SL

attachment site of the 5 VSG gene. If the ingi sequence were inserted into the 5' flanking sequence of the 5C gene by a normal retroposon insertion mechanism, then the 3' flank of the ingi sequence would contain homology to the 5' flank of the 5C VSG gene. Any proposed duplication sequence V3' to the 4-bp ingi/VSG homology would have to include the remainder of the conserved VSG family member sequences between that position and the 4-bp sequence at the ingi poly(A) tail junction. However, the 202-bp fragment from the ingi poly(A) tail to the SphI site (SH sequences, Fig. 2C) does not have homology to the 5' flank of the 5A and 5B genes (see Fig. 4), nor to the conserved region of all three family members V3' to the ingi insertion. In addition, this fragment is present as a single copy within the genome. Furthermore, the termini of the 5C ingi contain no duplications, unlike other ingis and RIMEs where 4-9-bp duplications have been reported [12-15].

[Fig. 5 gel images not reproduced here: panels (A) and (B), lanes 1-7, with size markers at 97 and 48.5 kb.]

Fig. 5. Minichromosomes of variant antigenic types in the IsTat 1 serodeme probed with the ingi sequence. The minichromosome region of T. b. brucei was separated by pulse field gel electrophoresis, stained with ethidium bromide (A), transferred to Nytran membranes and hybridized with radiolabelled ingi sequence, excluding the RIME sequences (B). Lane 1, λ multimeric markers; lanes 2-7, VATs A, 1, 3, 7, 11 and 44, respectively. The size of the λ multimers is given in kb.

Our results suggest that a recombinational event occurred subsequent to the original ingi insertion. The recombination could have involved either an intrachromosomal or interchromosomal event. An intrachromosomal recombination between two ingis, one inserted in the original SH sequence and the other in the original 5C flank, would have produced the current linkage between the 5C VSG gene and the current SH sequence. This would delete the intervening sequence, which includes both the original 5C 5' flanking sequence and the original ingi insertion duplications at the SH ingi 5' end and the 5C ingi 3' end, leaving the observed non-related ingi junctions. Alternatively, the intrachromosomal recombination could have occurred between two non-ingi homologous sequences, one found at the 5' end of an ingi inserted at the SH sequence and the other in the 5C gene flank, part of which may be represented by the four-base overlap observed between the ingi and the 5 VSG family members. This second alternative differs from the first only in that an ingi insertion is not required at the 5C gene locus and in that the homology block must be small enough to have excluded the insertional duplication sequence at the ingi 5' end. An interchromosomal recombination could have occurred in two ways. The first would involve a reciprocal recombination event starting with a cross-over between two ingis, one in the SH sequence and the other in the 5C gene flank,

and ending somewhere beyond the SphI site. This would require subsequent loss of the recombination product containing the original 5C 5' flank. The second interchromosomal recombination mechanism would involve a gene conversion event in which the SH sequence and its ingi act as a donor to gene convert the 5C ingi and the 5C 5' flanking sequence. This would require subsequent loss of the donor sequence. Analysis of the 5' and 3' sequence flanking other reported ingis (see Fig. 3) reveals that in some cases they may also have been involved in recombination subsequent to the original ingi insertion. We have defined the 5' ingi junction to be the site where homology to the RIME and other ingis stops. The sequences duplicated 5' and 3' to TRS-1.4, ingi1, ingi2 and ingi3 (see Fig. 3) do not occur immediately proximal to the 5' ingi junction, suggesting that recombinations may have occurred after insertion of these ingis. No sequence-specific target signal for ingi/RIME insertion has been proposed, and an analysis of the junction sequences shows no conservation of sequence at either end except for an invariant T at the first base position following the poly(A). Ingi sequences are under-represented in minichromosomes both in the IsTat 1 serodeme and in other serodemes [12,13] based on computations of ingi copy number and the total mass of DNA in the minichromosome region. Only three minichromosomal bands in the IsTat 1 serodeme

have ingi sequences. The stability in our serodeme of the three minichromosome size classes which contain ingis argues against a rapid turn-over of at least some minichromosomes. Those minichromosomes do not vary in size more than expected from telomere growth and collapse, based on the number of generations represented in the serodeme. Additionally, an unrelated probe also hybridizes to the three ingi-containing minichromosomes (data not shown), confirming that the changes in their size are consistent with telomere changes and not with possible movements of ingis between minichromosomes. That two unrelated sequences are on the same subset of minichromosomes, and not other minichromosomes, suggests that there may be something unique about the origin or stability of these minichromosomes.

If ingis function as recombinational hot spots, their presence in multiple copies in the genome may have profound effects on genomic organization. There appears to be a high probability (as high as 6/8, 75%) that recombination may occur among ingis, leading to rearrangements of the flanking sequences. In addition, ingis may act as homology blocks for gene conversion. Such recombinations could affect the expression of various genes. Although it is unknown if ingis are currently active retroposons, they might be activated by either mutation or superinfection with another retroposon providing the enzymes necessary for excision and insertion in trans. Genes could be inactivated either by insertion directly into the coding region or by an insertion that separates controlling regions from the coding sequence, as has been seen for VSG gene expression [40]. This might be significant in antigenic variation where inactivation would allow for divergence of the VSG gene while leaving the gene available for expression as a hybrid or mosaic VSG gene as has been reported for T. brucei [41] and T. equiperdum [42]. Thus, ingis may have been involved in broadening the repertoire of VSG genes in trypanosomes.

Acknowledgements

We would like to thank Andrea Perrollaz for excellent technical assistance. This work was supported by NIH grant AI 17375 and the Murdoch Charitable Trust. K.S. is the recipient of a special fellowship from the Burroughs Wellcome Fund.

References

1 Liu, A.Y.C., Van der Ploeg, L.H.T., Rijsewijk, F.A.M. and Borst, P. (1983) The transposition unit of variant surface glycoprotein gene 118 of Trypanosoma brucei: presence of repeated elements at its border and absence of promoter-associated sequences. J. Mol. Biol. 167, 57-75.
2 Aline Jr., R., MacDonald, G., Brown, E., Allison, J., Myler, P., Rothwell, V. and Stuart, K. (1985) (TAA)n within sequences flanking several intrachromosomal variant surface glycoprotein genes in Trypanosoma brucei. Nucleic Acids Res. 13, 3161-3177.
3 Campbell, D.A., van Bree, M.P. and Boothroyd, J.C. (1984) The 5'-limit of transposition and upstream barren region of a trypanosome VSG gene: tandem 76 base-pair repeats flanking (TAA)90. Nucleic Acids Res. 12, 2759-2774.
4 Shah, J.S., Young, J.R., Kimmel, B.E., Iams, K.P. and Williams, R.O. (1987) The 5' flanking sequence of a Trypanosoma brucei variable surface glycoprotein gene. Mol. Biochem. Parasitol. 24, 163-174.
5 Borst, P. and Cross, G.A.M. (1982) Molecular basis for Trypanosome antigenic variation. Cell 29, 291-303.
6 Bernards, A., Michels, P.A.M., Lincke, C.R. and Borst, P. (1983) Growth of chromosome ends in multiplying trypanosomes. Nature 303, 592-597.
7 Van der Ploeg, L.H.T., Liu, A.Y.C. and Borst, P. (1984) Structure of the growing telomeres of trypanosomes. Cell 36, 459-468.
8 Aline Jr., R.F. and Stuart, K. (1989) Trypanosoma brucei: conserved sequence organization 3' to telomeric variant surface glycoprotein genes. Exp. Parasitol. 68, 57-66.
9 Sloof, P., Bos, J.L., Konings, A.F.J.M., Menke, H.H., Borst, P., Gutteridge, W.E. and Leon, W. (1983) Characterization of satellite DNA in Trypanosoma brucei and Trypanosoma cruzi. J. Mol. Biol. 167, 1-21.
10 Sloof, P., Menke, H.H., Caspers, M.P.M. and Borst, P. (1983) Size fractionation of Trypanosoma brucei DNA: localization of the 177-bp repeat satellite DNA and a variant surface glycoprotein gene in a mini-chromosomal DNA fraction. Nucleic Acids Res. 11, 3889-3901.
11 Aksoy, S., Lalor, T.M., Martin, J., Van der Ploeg, L.H.T. and Richards, F.F. (1987) Multiple copies of a retroposon interrupt spliced leader RNA genes in the African trypanosome, Trypanosoma gambiense. EMBO J. 6, 3819-3826.
12 Kimmel, B.E., Ole-MoiYoi, O.K. and Young, J.R. (1987) Ingi, a 5.2-kb dispersed sequence element from Trypanosoma brucei that carries half of a smaller mobile element at either end and has homology with mammalian LINEs. Mol. Cell. Biol. 7, 1465-1475.
13 Murphy, N.B., Pays, A., Tebabi, P., Coquelet, H., Guyaux, M., Steinert, M. and Pays, E. (1987) Trypanosoma brucei repeated element with unusual structural and transcriptional properties. J. Mol. Biol. 195, 855-871.
14 Hasan, G., Turner, M.J. and Cordingley, J.S. (1984) Complete nucleotide sequence of an unusual mobile element from Trypanosoma brucei. Cell 37, 333-341.
15 Pays, E., Tebabi, P., Pays, A., Coquelet, H., Revelard, P., Salmon, D. and Steinert, M. (1989) The genes and transcripts of an antigen gene expression site from T. brucei. Cell 57, 835-845.
16 Aline Jr., R.F. and Stuart, K. (1985) The two mechanisms for antigenic variation in Trypanosoma brucei are independent processes. Mol. Biochem. Parasitol. 16, 11-20.
17 Parsons, M., Nelson, R.G., Newport, G., Milhausen, M., Stuart, K. and Agabian, N. (1983) Genomic organization of Trypanosoma brucei variant antigen gene families in sequential parasitemias. Mol. Biochem. Parasitol. 9, 255-269.
18 Myler, P.J., Aline Jr., R.F., Scholler, J.K. and Stuart, K.D. (1988) Multiple events associated with antigenic switching in Trypanosoma brucei. Mol. Biochem. Parasitol. 29, 227-241.
19 Aline Jr., R.F., Myler, P.J. and Stuart, K.D. (1989) Trypanosoma brucei: frequent loss of a telomeric variant surface glycoprotein gene. Exp. Parasitol. 68, 8-16.
20 Rothwell, V., Aline Jr., R.F., Parsons, M., Agabian, N. and Stuart, K. (1985) Expression of a minichromosomal variant surface glycoprotein gene in Trypanosoma brucei. Nature 313, 595-597.
21 Milhausen, M., Nelson, R.G., Parsons, M., Newport, G., Stuart, K. and Agabian, N. (1983) Molecular characterization of initial variants from the IsTat 1 serodeme of Trypanosoma brucei. Mol. Biochem. Parasitol. 9, 241-254.
22 Myler, P., Nelson, R.G., Agabian, N. and Stuart, K. (1984) Two mechanisms of expression of a predominant variant antigen gene of Trypanosoma brucei. Nature 309, 282-284.
23 Stuart, K., Gobright, E., Jenni, L., Milhausen, M., Thomashow, L. and Agabian, N. (1984) The IsTat 1 serodeme of Trypanosoma brucei: development of a new serodeme. J. Parasitol. 70, 747-754.
24 De Lange, T., Liu, A.Y.C., Van der Ploeg, L.H.T., Borst, P., Tromp, M.C. and Van Boom, J.H. (1983) Tandem repetition of the 5' mini-exon of variant surface glycoprotein genes: a multiple promoter for VSG gene transcription. Cell 34, 891-900.
25 Nelson, R.G., Parsons, M., Barr, P.J., Stuart, K., Selkirk, M. and Agabian, N. (1983) Sequences homologous to the variant antigen mRNA spliced leader are located in tandem repeats and variable orphons in Trypanosoma brucei. Cell 34, 901-909.
26 Yanisch-Perron, C., Vieira, J. and Messing, J. (1985) New M13 host strains and the complete sequences of M13mp and pUC vectors. Gene 33, 103-119.
27 Schwartz, D.C. and Cantor, C.R. (1984) Separation of yeast chromosomal-sized DNAs by pulse field gel electrophoresis. Cell 37, 67-75.
28 Scholler, J.K., Reed, S.G. and Stuart, K. (1986) Molecular karyotype of species and subspecies of Leishmania. Mol. Biochem. Parasitol. 20, 279-293.
29 Myler, P., Allison, J., Agabian, N. and Stuart, K. (1984) Antigenic variation in African trypanosomes by gene replacement or activation of alternate telomeres. Cell 39, 203-211.
30 Hattori, M. and Sakaki, Y. (1986) Dideoxy sequencing method using denatured plasmid templates. Anal. Biochem. 152, 232-238.
31 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
32 Boothroyd, J.C. and Cross, G.A.M. (1982) Transcripts coding for different variant surface glycoproteins in Trypanosoma brucei have a short identical exon at their 5' end. Gene 20, 279-287.
33 Van der Ploeg, L.H.T., Liu, A.Y.C., Michels, P.A.M., De Lange, T., Borst, P., Majumder, K., Weber, H. and Veeneman, G.H. (1982) RNA splicing is required to make the messenger RNA for a variant surface antigen in trypanosomes. Nucleic Acids Res. 10, 3591-3604.
34 De Lange, T., Michels, P.A.M., Veerman, H.J.G., Cornelissen, A.W.C.A. and Borst, P. (1984) Many trypanosome messenger RNAs share a common 5' terminal sequence. Nucleic Acids Res. 12, 3777-3789.
35 Dorfman, D. and Donelson, J. (1984) Characterization of the 1.35 kb DNA repeat unit containing the conserved 35 nucleotides at the 5'-termini of VSG mRNAs in Trypanosoma brucei. Nucleic Acids Res. 12, 4907-4920.
36 Parsons, M., Nelson, R.G., Watkins, K.P. and Agabian, N. (1984) Trypanosome mRNAs share a common 5' spliced leader sequence. Cell 38, 309-316.
37 Mellor, J., Malim, M.H., Gull, K., Tuite, M.F., McCready, S., Dibbayawan, T., Kingsman, S.M. and Kingsman, A.J. (1985) Reverse transcriptase activity and Ty RNA are associated with virus-like particles in yeast. Nature 318, 583-586.
38 Tarr, P.I., Aline Jr., R.F., Smiley, B.L., Scholler, J., Keithly, J. and Stuart, K. (1988) LR1: A candidate RNA virus of Leishmania. Proc. Natl. Acad. Sci. USA 85, 9572-9575.
39 Shapiro, S.Z. and Young, J.R. (1981) An immunochemical method for mRNA purification: Application to messenger RNA encoding trypanosome variable surface antigen. J. Biol. Chem. 256, 1495-1498.
40 Cornelissen, A.W.C.A, Johnson, P.J., Kooter, J.M., Van der Ploeg, L.H.T. and Borst, P.
(1985) Two simultaneously actwe VSG gene transcription units in a single Trypanosoma brucei variant. Cell 41, 825-832. 41 Pays, E., Houard, S., Pays, A., Van Assel, S., Dupont, F., Aerts, D., Huet-Duwllier, G., Gomes, V., Richet, C., Degrand, P., Van Meirvenne, N. and Steinert, M. (1985) Trypanosoma brucei: the extent of conversion in antigen genes may be related to the DNA coding specificity. Cell 42, 821-829. 42 Longacre, S. and Eisen, H. (1986) Expression of whole and hybrid genes in Trypanosoma equiperdum antigenic variation. EMBO J. 5, 1057-1063.
