Molecular and Biochemical Parasitology, 55 (1992) 207-216 © 1992 Elsevier Science Publishers B.V. All rights reserved. / 0166-6851/92/$05.00


MOLBIO 01825

The actin genes of Onchocerca volvulus

Wenlin Zeng and John E. Donelson

Department of Biochemistry, University of Iowa and Howard Hughes Medical Institute, Iowa City, IA, USA

(Received 31 January 1992; accepted 16 July 1992)

The genome of Onchocerca volvulus was found to contain 2 actin gene classes (called 1 and 2) of 2 genes each. The 4 genes are located in 2 clusters (called A and B), each containing one member of each gene class. Five short introns of 122-207 bp occur within each gene. The sequences of the fourth intron and the 5' and 3' untranslated regions of the 2 gene classes are completely different even though their coding regions share 95% identity. Mature transcripts from the actin genes have the nematode spliced leader (SL) at their 5' ends. One actin cDNA was found to be derived from an actin pre-mRNA which lacks both the 5 introns and the 5' SL, suggesting that in at least some transcripts cis-splicing is completed before trans-splicing occurs.

Key words: Spliced leader; Gene cluster; Intron; Primer extension; Precursor RNA

Introduction

This work evolved from a search of an Onchocerca volvulus cDNA library for cDNA clones that possess the spliced leader (SL) of 22 nucleotides first shown in Caenorhabditis elegans to be at the 5' ends of some nematode mRNAs [1,2]. When the O. volvulus cDNA library was initially screened with a SL oligonucleotide, many cDNA clones were detected but all of the cDNAs examined possessed the SL at internal sites rather than at their 5' ends [3]. The significance of these internal SLs remains unknown but they are not trans-spliced to their RNAs as are the SLs at the 5' ends of C. elegans mRNAs; rather, they are encoded by their corresponding genes [3,4].

Correspondence address: John E. Donelson, Dept. of Biochemistry, University of Iowa, Iowa City, IA 52242, U.S.A. Tel. 319 335 7889; Fax: 319 335 6764.

Abbreviations: SL, nematode spliced leader; PCR, polymerase chain reaction.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with the accession numbers M84915 (actin-2B gene) and M84916 (actin-1A gene).

Reasoning that partial length cDNAs of 5' SL-containing mRNAs would not be detected by this screening procedure, we decided to examine specific cDNAs of O. volvulus whose mRNAs had previously been shown in C. elegans to have a 5' SL. We selected actin mRNA because (i) it is an abundant transcript with a highly conserved sequence and (ii) 3 of the 4 actin mRNA species in C. elegans have a 5' SL while the fourth species does not [1]. To identify the number of different actin mRNA species in O. volvulus and to establish which of the actin transcripts contain a 5' SL, it was necessary to determine the organization of the actin genes in the O. volvulus genome. Thus, we constructed a genomic DNA library of O. volvulus and examined representative clones from the library that contain actin genes. We present here the results of this study and demonstrate that the mature transcripts of O. volvulus actin genes do contain the 5' SL. In contrast to C. elegans, the mRNAs from all 4 O. volvulus actin genes appear to possess a 5' SL.
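The distinction drawn above between a SL at the 5' end of a cDNA and a SL at an internal site can be sketched in a few lines of Python. This is only an illustration: the 22-nucleotide sequence used below is the SL1 sequence as commonly reported for C. elegans and is an assumption here (the text does not spell it out), and `classify_sl` is a hypothetical helper, not part of the screening procedure used in this work.

```python
# 22-nt nematode spliced leader (assumed canonical SL1 sequence, DNA alphabet).
SL = "GGTTTAATTACCCAAGTTTGAG"


def classify_sl(cdna: str, sl: str = SL) -> str:
    """Report whether the SL is absent, at the 5' end, or at an internal site."""
    position = cdna.upper().find(sl)
    if position == -1:
        return "absent"
    if position == 0:
        return "5' terminal"
    return "internal"


if __name__ == "__main__":
    # Toy inputs: the SL followed by the first actin codons, and the SL embedded
    # inside a longer made-up sequence.
    print(classify_sl(SL + "ATGTGTGACGAAGAA"))
    print(classify_sl("ACGTACGT" + SL + "ATGTGTGACG"))
```

A partial-length cDNA that begins downstream of the SL would be reported as "absent" by such a check, which is the reason a SL oligonucleotide screen can miss 5' SL-containing mRNAs.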


Materials and Methods

Genomic DNA library construction and library screening. Genomic DNA (10 µg) isolated from adult O. volvulus collected in Mali, West Africa [5] was partially digested with Sau3A and the resulting fragments were dephosphorylated with bacterial alkaline phosphatase [6]. The DNA fragments were ligated to BamHI-cleaved λEMBL3 DNA arms (Stratagene) and the ligation products were packaged with λ packaging extracts (GigaPack Gold, Stratagene) according to protocols provided by Stratagene. About 100 000 recombinant phage were screened under low stringency conditions with a human β-actin cDNA probe kindly provided by Judy Wang and Peter Rubenstein of the Biochemistry Department at the University of Iowa.
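Whether about 100 000 recombinant phage are likely to represent the whole genome can be estimated with the standard Clarke-Carbon relation, P = 1 - (1 - f)^N, where f is the insert size divided by the genome size and N is the number of clones. The sketch below is a rough estimate under stated assumptions: the 15-kb mean insert size is an assumed, typical value for a λ replacement vector (it is not given in the text), the haploid genome size of 1.5 × 10^8 bp is the value reported in [5], and the function names are illustrative.

```python
import math


def library_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Probability that a given single-copy sequence is present in the library."""
    fraction = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction) ** n_clones


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of clones required to reach the requested probability."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))


if __name__ == "__main__":
    GENOME_BP = 1.5e8   # haploid genome size reported in [5]
    INSERT_BP = 15_000  # assumed mean insert size, not stated in the text
    print(library_coverage(100_000, INSERT_BP, GENOME_BP))
    print(clones_needed(0.99, INSERT_BP, GENOME_BP))
```

Under these assumptions the screened library would be expected to contain any given single-copy sequence with a probability very close to 1, which is consistent with the recovery of many actin-positive clones.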

Primer extensions. Oligonucleotide 1 (Fig. 3) was labeled at its 5' end using polynucleotide kinase and [γ-32P]ATP [6]. The primer extension reaction was conducted in a 20 µl reaction mixture containing 1 pmol labeled oligonucleotide/0.5 µg O. volvulus poly(A)+ RNA/50 mM Tris-HCl, pH 8.3/40 mM KCl/6 mM MgCl2/1 mM dithiothreitol/0.5 mM each of the 4 deoxynucleoside triphosphates/0.1 mg ml⁻¹ bovine serum albumin/25 units of RNasin (Promega)/5 units AMV reverse transcriptase (Bethesda Research Laboratories). After incubation at 37°C for 1 h, the RNA was hydrolyzed by addition of NaOH to a final concentration of 0.2 M and incubation at 50°C for 30 min. The mixture was neutralized with HCl and the extension products were extracted twice with phenol/chloroform (1:1), once with chloroform and precipitated with ethanol. After resuspension in 10 µl H2O, the products were subjected to electrophoresis on an 8% denaturing polyacrylamide DNA sequencing gel which then was exposed to Kodak XAR-5 X-ray film for 24 h at -80°C with an intensifying screen.

General methods. Restriction enzyme incubations were conducted using the reaction conditions recommended by the vendor (New England Biolabs). Restriction fragments were subcloned from recombinant λ phage DNA into plasmid pIBI30 or pIBI31 (International Biotechnologies, Inc.) using standard protocols [6]. DNA sequence determinations of subcloned fragments were conducted by both the Sanger dideoxynucleotide method [7] and the Maxam/Gilbert chemical method [8]. PCR amplifications were conducted as described [3]. Oligonucleotides used for sequence determinations and PCR amplifications were synthesized on a DNA Synthesizer (ABI, model 391). Southern blots, Northern blots and phage library screenings were conducted as described [3,9] and probed with fragments labeled with 32P using a DNA random priming kit (Boehringer Mannheim).

Results and Discussion

Identification of two classes of O. volvulus actin mRNAs. When a previously described cDNA library of poly(A)+ RNA from adult O. volvulus collected near Touboro, Cameroon in West Africa [5], was screened with a human β-actin cDNA, many cDNA clones were detected, as expected because actin mRNAs are abundant in most metabolically active cells. Twelve clones were selected at random and purified to homogeneity by consecutive platings and rescreenings [9]. The sizes of the cDNA inserts in the 12 clones were determined by EcoRI digestions of small scale phage DNA preparations. The 3 largest cDNA inserts of the 12 clones were subcloned into a plasmid and analyzed by digestions with several restriction enzymes as summarized in Fig. 1A. On the basis of differences in the cleavage patterns the actin cDNAs could be grouped into 2 classes, called Actin-1 and Actin-2. For example, members of the Actin-1 class contain a ClaI site immediately downstream of what was subsequently shown to be the termination codon, while the Actin-2 class has a HindIII site at a similar location. Another difference, indicated in Fig. 1A, is the presence of a BglII site within the coding region of the Actin-2 class that does not occur in the Actin-1 class.
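The grouping of cDNAs by restriction pattern described above can be mimicked on sequence data. The following is a minimal sketch, not the procedure used here (the cDNAs were classified by gel analysis of digests): it searches for the standard recognition sequences of ClaI, HindIII and BglII and applies the two diagnostic differences named in the text. The function names are hypothetical, and splitting the input into a coding part and a part downstream of the termination codon is an assumption made for illustration.

```python
# Standard recognition sequences of the enzymes named in the text.
SITES = {"ClaI": "ATCGAT", "HindIII": "AAGCTT", "BglII": "AGATCT"}


def find_sites(seq: str, site: str) -> list[int]:
    """Return the 0-based start positions of every occurrence of a site."""
    seq = seq.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)
    return hits


def classify_actin_cdna(coding: str, downstream: str) -> str:
    """Assign a cDNA to Actin-1 or Actin-2 from the diagnostic sites."""
    has_bgl = bool(find_sites(coding, SITES["BglII"]))
    has_cla = bool(find_sites(downstream, SITES["ClaI"]))
    has_hind = bool(find_sites(downstream, SITES["HindIII"]))
    if has_bgl and has_hind and not has_cla:
        return "Actin-2"
    if has_cla and not has_bgl:
        return "Actin-1"
    return "unclassified"


if __name__ == "__main__":
    # Made-up toy sequences that merely carry the diagnostic sites.
    print(classify_actin_cdna("ATGGAGAAAATTTGGCATTAA", "TTATCGATGG"))
    print(classify_actin_cdna("ATGGAGAAGATCTGGCATTAA", "CCAAGCTTGG"))
```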


Fig. 1. (A) Restriction maps of 3 O. volvulus actin cDNAs, representing the 2 classes of actin mRNAs, whose sequences were determined. The 2 Actin-1 cDNAs have identical sequences; the Actin-2 cDNA differs at positions shown in Fig. 3. Thick lines indicate coding sequences and open rectangles indicate probes 1, 2 and 3 used to distinguish between the 2 classes of actin genes in Southern blots. The circled BglII (B) site occurs in the coding region of the Actin-2 cDNA but is not in the Actin-1 cDNAs. An indicates the presence of a 3' poly(A) tail. (B) Organization of actin gene clusters A and B, each containing a member of the 2 actin gene classes. Thick and thin black regions indicate the 6 exons and 5 introns in each actin gene, respectively. Numbers at the top denote sizes in kb of EcoRI restriction fragments discussed in the text. Cross-hatched regions below the Actin-1A and Actin-2B genes indicate the 2 genomic sequences that are compared in Fig. 3. Arrows show the direction of transcription. The boxed BglII (B) site occurs in intron 1 of the Actin-1A gene, but not in intron 1 of the Actin-1B gene, and was used to distinguish between these 2 genes. Restriction sites are shown for XhoI (X), BglII (B), ScaI (S), PvuII (P), HindIII (H), ClaI (C) and EcoRI (R). Uncertainty in the placement of an intergenic EcoRI site relative to a BglII site is indicated by 'or'.


The complete sequences of the 3 cDNAs shown in Fig. 1A were determined, except for small segments within the coding region of the longer of the 2 Actin-1 cDNAs. These sequences are shown in Fig. 3 as components of their corresponding genomic DNA sequences. The shorter Actin-1 cDNA (1262 bp) contains the entire actin coding region and a 5' untranslated region of 25 nucleotides that does not contain any portion of the 22 nucleotide nematode SL. The longer Actin-1 cDNA (1770 bp) has a total 5' untranslated length of 514 nucleotides, but it also does not contain the SL. The coding sequences of these 2 Actin-1 cDNAs are identical within the sequences determined. The Actin-2 cDNA (1282 bp) lacks the first 49 codons of the coding region, but contains a complete 3' untranslated region with a poly(A) tail. A putative polyadenylation signal, AATAAA, lies upstream of the poly(A) addition site. A third Actin-1 cDNA, not shown in Fig. 1A, on which partial sequence information was obtained, was found to have a 3' poly(A) tail, indicating a poly(A) addition site for the Actin-1 class of mRNAs (see Fig. 3).

Fig. 2. Southern blot of genomic DNAs (5 µg/lane) probed with a mixture of Actin-2 cDNA and the smaller Actin-1 cDNA (Fig. 1A). The genomic DNAs were extracted from adult O. volvulus collected in Touboro, Cameroon (T), or in Guatemala (G), from C. elegans (C) or from cultured human cells (H), and digested with BglII (B), EcoRI (E) or HindIII (H). Lanes λ and S contain 32P-labeled fragments of HindIII-digested λ DNA and a 1-kb ladder, respectively.

A comparison of the coding sequences in the Actin-1 cDNA class and the Actin-2 cDNA class reveals that 95% of their nucleotides and 98% of their encoded amino acids are identical. However, their 5' and 3' untranslated regions have little or no similarity, indicating that the corresponding mRNAs are products of different genes. None of the 12 actin cDNAs contains the SL sequence, as deduced by either sequence determinations or by Southern blots probed with a SL oligonucleotide. The lack of a SL at the 5' ends of these cloned cDNAs appears to be because they are partial length cDNAs, or derived from partially processed pre-mRNAs, since the 5' SL was detected on the actin mRNAs (described below).
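The identity figures quoted for the two cDNA classes are simple position-by-position comparisons of aligned sequences. A minimal sketch of that calculation is given below; it assumes the two sequences have already been aligned to equal length with "-" marking gaps, and `percent_identity` is an illustrative name rather than a tool used in this study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Positions where either sequence has a gap ('-') are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not compared:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


if __name__ == "__main__":
    # Toy example: one mismatch in ten aligned positions.
    print(percent_identity("ATGTGTGACG", "ATGTGCGACG"))
```

Because synonymous substitutions change the nucleotide sequence without changing the protein, the amino acid identity of two coding regions is generally at least as high as their nucleotide identity, as observed here.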

Characterization of the O. volvulus actin gene family. To examine the genomic organization of the actin genes in O. volvulus, extensive Southern blot analysis was conducted on genomic DNAs from adult O. volvulus collected at 4 different locations [5]. A representative blot with 2 of these genomic DNAs is shown in Fig. 2. In this example, a mixture of Actin-1 and Actin-2 cDNAs hybridizes strongly to EcoRI fragments of 2.5 kb and 1.9 kb, and weakly to an EcoRI fragment of 1.0 kb. No EcoRI site occurs within the cDNAs, suggesting either that O. volvulus contains multiple actin genes or that the gene(s) have introns with EcoRI sites. Both possibilities turned out to be correct. The cDNAs also hybridize to multiple HindIII genomic fragments, as expected because the actin coding region contains an internal HindIII site. They hybridize to an even greater number of BglII fragments, which is also consistent with heterogeneity among several gene copies. These BglII sites were subsequently used to distinguish among the genes in recombinant clones (see below). No binding occurs to C. elegans or human DNA, showing that the hybridization stringency is high enough to prevent cross-hybridization to the actin genes of these genomes.

Reprobing this blot with sequences unique to the 2 actin cDNA classes (indicated as probes 1, 2 and 3 in Fig. 1A) demonstrated that different restriction fragments contained unique genes (not shown). From the sum of several such blots, 2 observations were made. First, no evidence for restriction polymorphism among the genomic DNAs of the 4 O. volvulus isolates was detected. Second, it was apparent that indeed 2 classes of actin genes exist in the O. volvulus genome and that members of each gene class are clustered adjacent to each other.

Fig. 3. Nucleotide sequences of the genes for Actin-1A (top strand) and Actin-2B (bottom strand). Coding sequences are shown in capital letters. Sequences of the 5' and 3' untranslated regions and the 5 introns are shown in lower case letters. Dots indicate sequence identity. Translation initiation and termination codons are boxed. The 5' end of the largest Actin-1 cDNA (Fig. 1A) is indicated by the vertical arrow. SL indicates the sites of spliced leader addition. Black triangles and upstream lines show polyadenylation sites and the consensus polyadenylation sequence, AATAAA, respectively. Short overlines or underlines with an associated letter show restriction sites indicated in Fig. 1A and B. Oligonucleotides 1 and 2, used for primer extensions and PCR amplifications, respectively, are denoted by horizontal arrows and have sequences complementary to the DNA strand shown.

To confirm and extend these observations a genomic DNA library was constructed using DNA isolated from O. volvulus collected in Mali, West Africa [5]. Parasites from this source were used, rather than parasites from Touboro, Cameroon, from which the cDNA library was made, because more of them were available. When this 'Mali' genomic library was screened with the 'Touboro' O. volvulus actin cDNAs, many clones were obtained. Preliminary restriction digestions of several of these cloned genomic DNAs, followed by Southern blots, demonstrated that most clones contained at least one member of each actin cDNA class. Furthermore, differences in the patterns of BglII digestions indicated that 2 different genomic regions, or clusters, contain actin genes (Fig. 1B). Representative genomic clones of each of the 2 regions, called Cluster A and Cluster B, were chosen for more extensive analyses. Comparisons of Southern blots of EcoRI digests of total genomic DNA (Fig. 2) and of these 2 cloned genomic DNAs revealed that the Actin-1 gene class is located on adjacent EcoRI fragments of 2.5 kb and 1.0 kb and that the Actin-2 class is on 2 adjacent 1.9-kb EcoRI fragments. The internal EcoRI site, in each case, was subsequently shown to occur within the last intron. These EcoRI fragments were subcloned from the phage clones into plasmids and analyzed in detail. Restriction maps of these 2 different actin gene clusters are presented in Fig. 1B. A BglII site that occurs in an intron of the Actin-1A gene but not in the Actin-1B gene was used to distinguish between these 2 clusters. Likewise, another BglII site present in both Actin-2 genes but not the Actin-1 genes readily distinguished between the 2 gene classes.

The complete sequences of a member of each gene class in the subcloned EcoRI fragments were determined and are shown in Fig. 3. The Actin-1 gene that was sequenced comes from Cluster A and the sequenced Actin-2 gene is from Cluster B. Both genes were found to have 6 exons and 5 introns in the same relative locations. Consistent with previous observations about nematode introns [11, 12], the 5 introns are small, ranging in size from 122 bp to 207 bp. Interestingly, the intron sequences of the 2 gene classes are nearly identical except for the fourth intron. This intron is about the same length in the 2 gene classes, differing in size by only 2 bp, but its sequence in the 2 classes is completely different.
Likewise, exons 4 and 5 that surround intron 4 have more differences between the 2 gene classes than do the other exons, although these flanking exon differences are not nearly as dramatic as the intron 4 differences. The significance of the intron 4 differences is not clear. In addition, the sequences of these 2 cloned genes confirmed the observation from the cDNA sequences that the 5' and 3' untranslated regions of the 2 actin gene classes are completely different.

When the coding regions of the 2 sequenced genes (from 'Mali' parasites) were compared with the sequences of the 2 cDNA classes (from 'Touboro' parasites), only 1 bp difference was found. This difference occurs in codon 205 of the Actin-2 class where the Actin-2B gene encodes alanine and the cDNA encodes valine. This discrepancy could be because (i) the cDNA and genomic libraries were constructed from different parasites [5] or (ii) the Actin-2 cDNA is derived from the Actin-2A gene whose sequence was not determined. The deduced actin amino acid sequences encoded by the Actin-1A and Actin-2B genes differ at only 4 positions as shown in Fig. 4.

The above results demonstrate unequivocally that the genome has 2 actin gene clusters containing 2 genes each. However, the data do not indicate with complete certainty how many gene copies are actually in the haploid genome. For example, there could be a total of 8 genes derived from 2 identical copies of clusters A and B. This alternative was examined in gene titration experiments based on the previously reported O. volvulus haploid genome size of 1.5 × 10^8 bp [5]. In these Southern blot experiments, the signal intensities of an actin cDNA probe hybridizing to EcoRI fragments in total digested genomic DNA were compared with the intensities obtained with decreasing amounts of the cloned EcoRI fragments in adjacent lanes of the same gel. The results are consistent with a total of 4 genes in the haploid genome [10]. Another possibility was that clusters A and B actually represent the 2 actin gene alleles in the diploid genome of O. volvulus. Again, this possibility cannot be completely eliminated, but the above titration experiments and the fact that no restriction polymorphisms in the 2 gene clusters among 4 different O. volvulus genomic DNAs were detected on Southern blots argue against it. Attempts were made to identify genomic clones in the genomic DNA library that contained at least parts of both actin gene clusters but, unfortunately, none were found. Likewise, attempts to use DNA restricted with enzymes that have 8-bp recognition sites in Southern blots of pulsed field electrophoresis gels to identify restriction fragments containing both clusters were not successful. Thus, it is not known whether the 2 clusters are linked in the genome.
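The arithmetic behind a gene titration (copy-number reconstruction) of the kind described above can be sketched as follows. The mass of cloned fragment that corresponds to one copy per haploid genome is the mass of genomic DNA in the lane multiplied by the ratio of fragment length to genome size. The 5 µg of genomic DNA per lane is taken from the blot in Fig. 2 and is only an assumption for the titration gels, whose loading is not stated; the function name is illustrative.

```python
def copy_equivalent_pg(
    genomic_ug: float, fragment_bp: float, genome_bp: float, copies: int = 1
) -> float:
    """Mass (pg) of a cloned fragment equal to `copies` per haploid genome."""
    genomic_pg = genomic_ug * 1e6  # 1 microgram = 1e6 picograms
    return genomic_pg * copies * fragment_bp / genome_bp


if __name__ == "__main__":
    GENOME_BP = 1.5e8  # haploid genome size reported in [5]
    LANE_UG = 5.0      # assumed loading, as in the blot of Fig. 2
    for copies in (1, 2, 4):
        mass = copy_equivalent_pg(LANE_UG, 2500, GENOME_BP, copies)
        print(f"{copies} copies of a 2.5-kb fragment: {mass:.1f} pg")
```

Comparing the genomic signal with lanes loaded at such 1-, 2- and 4-copy equivalents is what allows a total of 4 genes to be distinguished from 8.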

Detection of the nematode spliced leader at the 5' ends of the actin mRNAs. Since none of the actin cDNAs contain a 5' SL, it was necessary to look for trans-splicing of the actin transcripts using alternative approaches. The most direct way to demonstrate the presence of a 5' SL is to sequence the actin mRNAs directly [1]. Our limited amounts of O. volvulus RNA prevented us from using this approach. Instead, we conducted RNA polymerase chain reactions in which first strand cDNA, synthesized by reverse transcriptase and oligo dT, served as the template for PCR using a SL oligonucleotide and oligonucleotide 2 shown in Fig. 3. The major amplification product of these coupled reactions contained about 240 bp [3]. This product was blunt-end ligated into the SmaI site of plasmid pIBI30 and bacterial clones harboring the desired DNA inserts were identified by hybridization to labeled O. volvulus actin cDNA. DNA inserts from 8 positive clones were completely sequenced and 3 different SL boundaries were identified. As shown in Fig. 5, 2 of the 3 sequences contain the 5' untranslated region of the Actin-1 gene class and the third contains the 5' untranslated region of the Actin-2 gene class.

MCDEEVAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVMVGMGQKDSYVGDEAQ   60
SKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHPVLLTEAPLNPKANREKM  120
TQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVLDSGDGVTHTVPIYEGYALPHAILRLD  180
LAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCYVALDFEQEMATAASSSSLEKS  240
YELPDGQVITVGNERFRCPEALFQPSFLGMESAGIHE~TYNSIMKCDIDIRKDLYAN~VL  300
SGG~TMYPGIADRMQKE~TALAPSTMKIKIIAPPERKYSVWIGGSILASLSTFQQMWISK  360
QEYDESGPSIVHRKCF*                                             376

Fig. 4. Amino acid sequence of Actin-1 deduced from the Actin-1A nucleotide sequence (Fig. 3). The deduced sequence of Actin-2 from the Actin-2B gene is identical except at the 4 boxed locations (marked ~) where the lower amino acid occurs in Actin-2.

Fig. 5. Sequence comparisons of the 2 actin gene classes and the 5' ends of actin cDNAs generated by RNA PCR using a SL primer and oligonucleotide 2 (Fig. 3). Boxed sequences are the SLs. N-terminal amino acids are shown above their codons. SL addition sites are indicated by arrowheads.

Several conclusions may be drawn from this information. Two different splice sites, located 24 and 27 nucleotides upstream of the start codon, are utilized in transferring the SL to the Actin-1 transcripts. Only one splice site, 7 nucleotides upstream of the start codon, was identified as a site for SL addition to the Actin-2 transcript. The consensus splice acceptor dinucleotide, AG, is present at all 3 trans-splice sites, as it is at the 3' cis-splice sites of the 5 introns. Five of the 8 cDNA inserts are derived from Actin-1 RNA and 3 from Actin-2 RNA, suggesting that the 2 classes of actin transcripts are about equally abundant in vivo. Of the 5 Actin-1 cDNAs, 4 have the SL attached 27 nucleotides upstream of the start codon and one has it attached 24 nucleotides upstream. A similar utilization of alternative splice sites for the trans-splicing reaction has also been found in trypanosome mRNAs [13].

To prove that the SL is indeed at the very 5' end of the actin mRNAs, primer extension experiments were performed. Oligonucleotide 1, whose sequence is complementary to the 5' untranslated region immediately upstream of the Actin-1 start codon (Fig. 3), was used as the extension primer as described in Materials and Methods. The extension products were analyzed on a denaturing acrylamide gel followed by autoradiography as shown in Fig. 6. The extension lengths of the 2 products indicated by the arrows are 22 and 25 nucleotides, and correspond exactly to the extension product sizes predicted by the sequences of the Actin-1 RNA PCR products shown in Fig. 5. The other 2 bands in lane 1 of Fig. 6 are of uncertain origin but are also faintly present in lane 3 (and are more apparent on the actual autoradiograms). Since lane 3 contains the products of an extension reaction in which no template RNA was present, these 2 bands may arise from self priming and extension of oligonucleotide 1, which was not purified following its synthesis. The absence of additional primer extension products suggests that these 2 splice sites (arrows, Fig. 5) are the major, and probably only, sites used in the trans-splicing of the Actin-1 transcripts. Primer extensions were not conducted with an oligonucleotide specific for the Actin-2 transcripts but, by analogy with the above experiment, it is likely that the major splice site in these transcripts is the one that occurs 7 nucleotides upstream of the start codon.

Since the 2 Actin-1 genes share the same 5' upstream sequence, as do the 2 Actin-2 genes, it is very likely that transcripts of all 4 genes undergo trans-splicing to receive a 5' SL. This conclusion contrasts with the finding that the transcripts of only 3 of the 4 C. elegans actin genes are trans-spliced. In C. elegans, however, the 3 genes whose transcripts are trans-spliced form a single cluster while the fourth gene is separate.

A final point concerns the Actin-1 cDNA shown in Fig. 1A that contains 514 nucleotides of 5' untranslated sequence which exactly matches the 5' upstream sequence of the Actin-1A gene (vertical arrow in Fig. 3). The absence of introns in this cloned sequence indicates that it is not a genomic DNA fragment. Thus, it must be a cDNA derived from an actin pre-mRNA in which the introns have been removed and the SL has yet to be added. The existence of this pre-mRNA indicates that transcription of the corresponding Actin-1 gene is initiated more than 500 nucleotides upstream of the start codon, a conclusion supported by a Northern blot probed with probe 1 shown in Fig. 1A (ref. 10 and data not shown). Its presence also suggests that, at least some of the time, the 5 introns are removed from the actin pre-mRNA in 5 cis-splicing reactions before the SL is added in the one trans-splicing reaction, a temporal aspect of the RNA processing events that may be important in nematode transcripts that must undergo both cis- and trans-splicing [11, 14].
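The agreement between the primer extension products (22 and 25 nucleotides) and the two Actin-1 trans-splice sites (24 and 27 nucleotides upstream of the start codon) can be checked with simple arithmetic. The sketch below rests on an assumption that is inferred from these numbers and not stated in the text: that oligonucleotide 1 anneals to the 24 nucleotides immediately upstream of the start codon, so that reverse transcriptase adds any remaining untranslated nucleotides plus the 22-nucleotide SL. The names are illustrative.

```python
SL_LENGTH = 22  # length of the nematode spliced leader, from the text


def extension_length(
    splice_site_upstream: int, primer_span: int, sl_length: int = SL_LENGTH
) -> int:
    """Nucleotides added to the primer up to the 5' end of a trans-spliced RNA.

    splice_site_upstream: distance of the SL addition site from the start codon.
    primer_span: nucleotides upstream of the start codon covered by the primer.
    """
    if splice_site_upstream < primer_span:
        raise ValueError("primer would anneal across the splice site")
    return (splice_site_upstream - primer_span) + sl_length


if __name__ == "__main__":
    PRIMER_SPAN = 24  # assumed coverage of oligonucleotide 1 (see lead-in)
    for site in (24, 27):
        print(site, extension_length(site, PRIMER_SPAN))
```

Whatever the exact position of the primer, the two products should differ by the 3 nucleotides that separate the two splice sites, which is what the 22- and 25-nucleotide extensions show.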

Acknowledgements This research was supported in part by a grant from The Edna McConnell Clark Foundation and by N.I.H. grant DK25295.

Fig. 6. Primer extension analysis of O. volvulus actin mRNAs. The extension products of oligonucleotide 1 (Fig. 3) are shown when the reaction is conducted in the presence of O. volvulus poly(A)+ RNA and reverse transcriptase (lane 1), in the absence of reverse transcriptase (lane 2) and in the absence of template RNA (lane 3). Lanes A and T contain the products from the A and T cleavage reactions of the Maxam and Gilbert sequencing procedure. Arrows indicate the products containing extensions of 22 and 25 nucleotides.

References

1 Krause, M. and Hirsh, D. (1987) A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 49, 753-761.
2 Bektesh, S.L., Van Doren, K. and Hirsh, D. (1988) Presence of the Caenorhabditis elegans spliced leader on different mRNAs and in different genera of nematodes. Genes Dev. 2, 1277-1283.
3 Zeng, W., Alarcon, C.M. and Donelson, J.E. (1990) Many transcribed regions of the Onchocerca volvulus genome contain the spliced leader sequence of Caenorhabditis elegans. Mol. Cell. Biol. 10, 2765-2773.
4 Bektesh, S.L. and Hirsh, D. (1988) C. elegans mRNAs acquire a spliced leader through a trans-splicing mechanism. Nucleic Acids Res. 16, 5692.
5 Donelson, J.E., Duke, B.O.L., Moser, D., Zeng, W., Erondu, N., Lucius, R., Renz, A., Karam, M. and Flores, G.Z. (1988) Construction of Onchocerca volvulus cDNA libraries and partial characterization of the cDNA for a major antigen. Mol. Biochem. Parasitol. 31, 241-250.
6 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
7 Hattori, M. and Sakaki, Y. (1986) Dideoxy sequencing method using denatured plasmid templates. Anal. Biochem. 152, 232-238.
8 Maxam, A.M. and Gilbert, W. (1977) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65, 499-560.
9 Huynh, T.V., Young, R.A. and Davis, R.W. (1985) Construction and screening cDNA libraries in λgt10 and λgt11. In: DNA Cloning. A Practical Approach (Glover, D.M., ed.), Vol. 1, pp. 49-78. IRL Press, Oxford.
10 Zeng, W. (1990) RNA trans-splicing in the parasitic nematode Onchocerca volvulus. Ph.D. Thesis, University of Iowa, Iowa City, IA.
11 Blumenthal, T. and Thomas, J. (1988) Cis and trans mRNA splicing in C. elegans. Trends Genet. 4, 305-308.
12 Erondu, N.E. and Donelson, J.E. (1990) Characterization of a myosin-like antigen from Onchocerca volvulus. Mol. Biochem. Parasitol. 40, 213-224.
13 Layden, R.E. and Eisen, H. (1988) Alternate trans splicing in Trypanosoma equiperdum: implications for splice site selection. Mol. Cell. Biol. 8, 1352-1360.
14 Zeng, W. and Donelson, J.E. (1990) A comparison of trans-splicing in trypanosomes and nematodes. Parasitol. Today 6, 327-334.
