Molecular and Biochemical Parasitology, 42 (1990) 69-82 Elsevier


MOLBIO 01369

The gene family encoding eggshell proteins of Schistosoma japonicum

Kimberly J. Henkle¹, George A. Cook², Lewis A. Foster³, David M. Engman¹, Libuse A. Bobek⁵, George D. Cain³ and John E. Donelson¹,⁴

¹Genetics Ph.D. Program, Departments of ²Biochemistry and ³Biology, and ⁴Howard Hughes Medical Institute, University of Iowa, Iowa City, IA, U.S.A. and ⁵Department of Oral Biology, School of Dentistry, State University of New York, Buffalo, NY, U.S.A.

(Received 15 December 1989; accepted 22 March 1990)

The four closely related genes encoding eggshell proteins in the human parasite Schistosoma japonicum are described. A cDNA and a genomic DNA library were constructed and members of the eggshell protein gene family isolated. The four genes in this family do not contain introns, and differ in organization and nucleotide sequence from the related set of genes in Schistosoma mansoni and Schistosoma haematobium. The coding sequences of two of the S. japonicum genes and their flanking regions were determined. Transcription start sites for these genes were shown by primer extension analysis to occur 47 and 50 nucleotides in front of the start codon. A female-specific component in nuclear extracts binds to a DNA fragment containing conserved sequences upstream of the transcription start sites. The deduced protein sequences of 207 and 212 amino acids are composed of 50% glycine with continuous glycine regions as long as 11 residues. In vitro translations of male and female RNAs revealed female-specific translation products, the sizes of which were consistent with the eggshell proteins.

Key words: Eggshell protein; Gene family; Transcription; Nuclear extract; Schistosome

Introduction

The eggs produced by adult female schistosomes in the bloodstream play a predominant role in the pathology of schistosomiasis [1,2]. Thus, one strategy for control of the disease is to interfere with the capacity of the female worm to produce eggs. This approach requires a better understanding of the egg proteins and the molecular mechanisms that are responsible for their synthesis. We describe here the genes encoding a component of the eggshell in the Asian parasite Schistosoma japonicum. This schistosome species is of particular interest because it produces ten times more eggs per female worm than the other two major human-infecting schistosomes, Schistosoma mansoni and Schistosoma haematobium [3].

Adult male and female schistosomes live in copula in the mesenteric veins of the host for as long as ten years. The presence of the male worm is necessary for full growth and complete sexual maturation of the female [3,4]. Maturation of the female worm includes the ability to transcribe the genes encoding egg components, including the eggshell proteins (ESPs). The nature of the male stimulus responsible for female growth and maturation is unknown, but it has been shown to be independent of sperm transfer [5,6]. It could, for example, be a factor transferred from the male that acts directly on the genome of the female or one that activates another internal factor in the female. This report describes the structure and expression of a family of four eggshell genes (ESGs) encoding the glycine-rich ESPs in S. japonicum. This characterization represents a first step toward understanding the nature of the link between the transcriptional control of these genes and the male-female interaction required for female schistosome development.

Correspondence address: John E. Donelson, Genetics Ph.D. Program, University of Iowa, Iowa City, IA 52242, U.S.A.

Abbreviations: DEPC, diethyl pyrocarbonate; DTT, dithiothreitol; ESG, eggshell gene; ESP, eggshell protein; SDS, sodium dodecyl sulfate.

Note: Nucleotide sequence data reported in this paper have been submitted to the GenBank™ data base with accession numbers M32280 and M32281.

0166-6851/90/$03.50 © Elsevier Science Publishers B.V. (Biomedical Division)

Materials and Methods

Schistosomes and nucleic acids. Swiss outbred mice infected with S. japonicum cercariae (Philippine strain) were obtained from the Center for Tropical Diseases, University of Lowell (Lowell, MA). Adult worms were obtained by perfusion of the hepatic portal system [7]. Individual male and female worms were separated with fine forceps under a dissecting microscope, rinsed three times with saline and either processed immediately or stored at -80°C until use. Adult worms in liquid nitrogen were crushed into a powder with a chilled mortar and pestle. Genomic DNA and total RNA were purified by the guanidinium isothiocyanate procedure [8,9]. Poly(A)+ RNA was isolated by oligo(dT) chromatography [9], using 0.5 M KCl rather than NaCl for the binding.

Southern blot analysis. Genomic DNA (4 µg/lane) was digested with various restriction enzymes (New England Biolabs) and the fragments transblotted from 0.8% agarose gels onto nitrocellulose paper [10]. Cloned restriction fragments or cDNAs used as hybridization probes were labeled by nick translation [11]. High-stringency hybridizations were conducted at 65°C for 24 h in 4× SET (1× SET = 150 mM NaCl/30 mM Tris-HCl, pH 8.0/2 mM EDTA)/10× Denhardt's solution/0.1% Na-pyrophosphate/0.1% SDS and 50 µg ml⁻¹ boiled salmon sperm DNA [10] using 1 × 10⁴ cpm of probe per cm² of filter. The filters were washed twice for 1 h each at 65°C in 0.1× SET and 0.1% SDS. Low-stringency hybridizations were conducted at 50°C in the same hybridization solution as above. Two low-stringency washes were conducted for 1 h each at 50°C in 1× SET and 0.1% SDS. The filters were blotted dry and exposed to Kodak XAR-5 film for 16-48 h at -80°C with intensifier screens.

Genomic DNA library. S. japonicum genomic DNA was partially digested with Sau3A to produce fragments of 15-20 kb. These fragments were treated with calf intestinal alkaline phosphatase and ligated onto BamHI-digested λEMBL3 arms (Vector Cloning Systems) [12]. The ligated products were packaged into infectious phage using packaging extracts (Gigapack, Vector Cloning Systems). Escherichia coli host strains LE392 and P2392 were used for plating and screening the library.
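The SET buffer strengths quoted in the Southern blot protocol follow directly from the 1× definition given there. The short sketch below simply scales that recipe; the function name `set_buffer` is an illustrative choice, not something from the paper.

```python
from fractions import Fraction

# Composition of 1x SET as defined in the text (all values in mM).
SET_1X_MM = {"NaCl": 150, "Tris-HCl": 30, "EDTA": 2}

def set_buffer(strength):
    """Return component concentrations (mM) for an n-fold SET buffer.

    `strength` is passed through str() into Fraction so that values such as
    "0.1" are scaled exactly instead of picking up binary float noise.
    """
    factor = Fraction(str(strength))
    return {name: float(factor * conc) for name, conc in SET_1X_MM.items()}

hybridization = set_buffer(4)    # 4x SET, hybridization solution
high_wash = set_buffer("0.1")    # 0.1x SET, high-stringency wash at 65 C
low_wash = set_buffer(1)         # 1x SET, low-stringency wash at 50 C

for label, buf in [("4x", hybridization), ("0.1x", high_wash), ("1x", low_wash)]:
    print(label, buf)
```

The forty-fold drop in salt between the hybridization solution and the 0.1× wash, together with the 65°C temperature, is what makes the high-stringency wash the more selective of the two conditions.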

cDNA library. First and second strand cDNA were synthesized with AMV reverse transcriptase (Promega Biotec), followed by RNase H and E. coli DNA polymerase I (Amersham) [13]. The ends were repaired with T4 DNA polymerase, and the cDNAs were methylated using EcoRI methylase. EcoRI linkers were then ligated to the cDNA fragments and, after EcoRI digestion, free linkers were separated from the larger cDNA fragments using a Biogel-A50m column. EcoRI-cleaved, dephosphorylated λgt11 DNA (Vector Cloning Systems) was ligated with the linker-containing cDNA and the reaction products packaged into phage.

Identification and subcloning of cDNA and genomic DNA inserts. The S. japonicum cDNA library was screened with a 5'-labeled oligonucleotide, 3'-GTGTGGAGAGTTAAGTTGAGTGACATT-5', that is complementary to a sequence near the 3' end of a S. mansoni eggshell cDNA [14,15]. Initially, a 166-bp cDNA, ESG-3' cDNA, was identified with this probe and its nucleotide sequence determined [16]. This cDNA was used as a probe to identify larger cDNAs in the λgt11 library and genomic DNA segments in the λEMBL3 genomic DNA library. The cDNA inserts were excised by EcoRI digestion and subcloned into the EcoRI site of the plasmid pUC18 for subsequent isolation [9,17]. The λEMBL3 genomic clones were double-digested with SalI, which releases the entire insert, and EcoRI. The genomic EcoRI fragments were then subcloned into the EcoRI site of pUC18 and transformed into E. coli HB101 [9]. Transformants were screened for subclones which hybridized with the ESG-3' cDNA [18] and several were selected for further characterization. Two genomic EcoRI fragments of 2.5 kb and 1.9 kb containing eggshell genes, ESG-1 and ESG-2A, respectively, were derived from different phage clones. The 2.5-kb EcoRI fragment was cleaved with HpaII, resulting in a 1.4-kb fragment that was subcloned for DNA sequence determination.
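Because the screening oligonucleotide is printed 3'→5', it can help to write out the strand it is expected to anneal to. The sketch below derives that target and, as an illustration only, compares it with a 27-nucleotide stretch of the S. japonicum 3' non-translated region as it appears in the Fig. 3 sequence of this copy; that stretch was transcribed by hand from the extracted text and should be treated as an assumption rather than verified sequence data.

```python
# Probe exactly as printed in the text, written 3'->5'.
PROBE_3_TO_5 = "GTGTGGAGAGTTAAGTTGAGTGACATT"

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(seq):
    """Base-by-base complement, keeping the written order of the input."""
    return seq.translate(COMPLEMENT)

# Reading the probe 5'->3' just reverses the printed string.
probe_5_to_3 = PROBE_3_TO_5[::-1]

# The strand the probe anneals to, read 5'->3', is the complement of the
# probe as printed 3'->5' (i.e. the reverse complement of probe_5_to_3).
target_5_to_3 = complement(PROBE_3_TO_5)

# ASSUMPTION: S. japonicum segment copied from the extracted Fig. 3 text;
# it may carry transcription errors and is used for illustration only.
S_JAPONICUM_SEGMENT = "CACACCACTCAATTCATCTCACTGTAA"

# 1-based positions at which the S. mansoni-derived target and the
# S. japonicum segment differ.
mismatches = [
    i + 1
    for i, (a, b) in enumerate(zip(target_5_to_3, S_JAPONICUM_SEGMENT))
    if a != b
]

print(target_5_to_3)
print(mismatches)
```

Under that assumption only a couple of positions differ over the length of the probe, which would be consistent with the probe, designed against S. mansoni, still detecting S. japonicum clones.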

M13 cloning and DNA sequencing. The 1.4-kb EcoRI-HpaII genomic DNA restriction fragment from ESG-1 was ligated to an EcoRI-ClaI digested pBR322 plasmid. The plasmid was linearized by digestion with either EcoRI or HindIII, depending upon the orientation of the insert, and digested with BAL 31. The ends were repaired with T4 DNA polymerase and the plasmid digested with either EcoRI or HindIII to release the BAL 31-digested insert. These products were separated on a preparative 6% polyacrylamide gel and the various size classes of BAL 31-digested insert were electroeluted [19] for subcloning into either M13 or pUC19. The standard dideoxynucleotide reactions [20,21] were conducted using [35S]dATP (New England Biolabs).

Maxam-Gilbert DNA sequencing. The DNA sequences of part of ESG-1 and all of ESG-2A were determined by the Maxam-Gilbert base modification reactions [16]. Restriction fragments were end-labeled with [α-32P]dNTPs using E. coli DNA polymerase I. The end-labeled fragments were digested with a second restriction enzyme or subjected to electrophoresis on denaturing acrylamide gels to obtain separated DNA strands. The DNA sequences were analyzed using the PCS programs for the IBM personal computer [22].

Primer extension. The oligonucleotides shown in Fig. 3 were labeled at their 5' ends using polynucleotide kinase and [γ-32P]dATP. The labeled primers (1 pmol) and female total RNA (20 µg) were resuspended in the hybridization buffer (80% deionized formamide, 0.4 M NaCl, 1 mM EDTA, 1 mM DTT, 25 units RNasin (Promega Biotec), 10 mM Pipes, pH 6.4) in a total volume of 100 µl and incubated for 16 h at 42°C. The DNA/RNA hybrids were ethanol-precipitated and redissolved in 200 µl of primer extension buffer containing 50 mM Tris-HCl, pH 8.3, 60 mM NaCl, 6 mM Mg acetate, 10 mM DTT, 5 mM each of the dNTPs, 25 units of RNasin and 20 units of AMV reverse transcriptase. After incubation at 41°C for 3 h, the RNA was degraded with 0.1 M NaOH for 30 min at 50°C and the reaction neutralized with HCl. The extended products were extracted twice with phenol and precipitated with ethanol. The oligonucleotides alone, the extended products and DNA sequencing reactions used as size markers were separated by electrophoresis through an 8% polyacrylamide sequencing gel containing 6 M urea. The gel was exposed to Kodak XAR-5 film at -80°C with an intensifier screen.
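The readout of a primer extension experiment reduces to simple arithmetic: the length of an extended product equals the distance from the 5' end of the RNA to the 5' end of the annealed primer. The sketch below checks that the figures reported in the Results (products of 81 and 84 nucleotides, start sites 47 and 50 nucleotides in front of the start codon) are mutually consistent. It assumes both distances are counted in the same way; the implied primer offset is a derived quantity, not a number stated in the paper.

```python
# Values reported in the Results for oligonucleotide 2.
PRODUCT_LENGTHS = (81, 84)   # extended products, in nucleotides
START_TO_ATG = (47, 50)      # start sites, nucleotides in front of the start codon

def primer_offset(product_len, start_to_atg):
    """Portion of the product lying between the start codon and the primer 5' end.

    ASSUMPTION: product length = (start site to start codon) + (start codon
    to primer 5' end), with both terms counted with the same convention.
    """
    return product_len - start_to_atg

offsets = [primer_offset(p, s) for p, s in zip(PRODUCT_LENGTHS, START_TO_ATG)]
spacing = PRODUCT_LENGTHS[1] - PRODUCT_LENGTHS[0]

print(offsets)   # both products should imply the same primer position
print(spacing)   # separation of the two start sites
```

Both products imply the same primer position, so the three-nucleotide difference between them is attributable entirely to the 5' ends of the RNAs.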

In vitro translation of RNA. Samples of female or male total RNA were translated in vitro in a rabbit reticulocyte lysate (Promega) with [35S]methionine [23]. The in vitro synthesized polypeptides were separated by electrophoresis on 12% polyacrylamide-SDS gels [24]. The gels were fixed in methanol, treated with a fluorographic reagent (Amplify™, Amersham), dried and exposed to Kodak XAR-5 film at -80°C.

Mobility shift assay. Six-week post-infection, live S. japonicum, separated by sex, were disrupted at 8000-12000 p.s.i. in a French press. Nuclei isolated from the lysed cells were used to prepare nuclear extracts [25] that were stored at -80°C until use. The template DNA in the binding assay was an end-labeled, 346-bp HincII/DdeI fragment isolated from the conserved 5' region of ESG-1. The 25-µl binding assay contained 10 µg of nuclear extract protein, approximately 1 ng of the labeled DNA template and 2.5 µg of poly(dI-dC) in binding buffer (50 mM NaCl/5% glycerol/1 mM DTT/1 mM EDTA in 10 mM Tris-HCl, pH 7.5) [26]. The assay mixture was incubated at 25°C for 30 min and the potential protein-DNA complexes resolved in a 4% polyacrylamide gel containing 0.38 M glycerol and 2 mM EDTA in 50 mM Tris-HCl, pH 8.5. Electrophoresis was performed at 10 V cm⁻¹ for 1.5-2 h at room temperature. The gel was dried before autoradiographic exposure of X-ray film at -80°C.

Results

Identification of the eggshell protein cDNAs and genomic DNA clones. Approximately 4% of the phage in the S. japonicum cDNA library hybridized to the oligonucleotide described above (Materials and Methods). The 166-bp insert in one cDNA clone, ESG-3' cDNA, was purified and used to probe the unamplified cDNA library for larger cDNA inserts. About 6% of the clones in the cDNA library were found to hybridize to the ESG-3' cDNA. Several of these positive phage were purified and five independent cDNA inserts were subcloned into pUC19. Two larger inserts (836 bp and 655 bp) were chosen for further characterization. The λEMBL3 library of S. japonicum genomic DNA was also screened with the ESG-3' cDNA. Approximately 0.027% of the plaques hybridized to this probe and each of the twenty clones analysed contained putative ESGs.

Fig. 1. Autoradiogram (left) of a Southern blot of S. japonicum genomic DNA digested with EcoRI (R) or HindIII (H) and probed with ESG-3' cDNA. Lane M contains labeled HindIII/EcoRI digested λ DNA. Numbers indicate fragment sizes in kb. Summary (right) of λEMBL clones containing the four genomic EcoRI fragments that have ESGs. The thickest lines indicate the protein coding regions within these EcoRI fragments. Restriction sites are EcoRI (R) and HpaII (Hp).

Genomic clones    Hybridizes with ESG probe    Gene
EMBL 8, 18        2.5 kb EcoRI fragment        ESG-1
EMBL 20           1.9 kb EcoRI fragment        ESG-2A & 2B
EMBL 14, 16       1.0 kb EcoRI fragment        ESG-3

Isolation of the S. japonicum ESG family. Several recombinant λEMBL3 phage containing S. japonicum genomic DNA inserts that hybridized to ESG-3' cDNA were digested with SalI, which releases the insert, or with a double digest of SalI and EcoRI. The ESGs were identified on the basis of which EcoRI fragments hybridized to the ESG cDNA as summarized in Fig. 1. Genomic clones EMBL8 and EMBL18 contained a 2.5-kb EcoRI fragment that was subsequently shown to have an ESG defined as ESG-1. The EMBL20 clone contained two EcoRI fragments of 1.9 kb, each of which possessed a gene defined as ESG-2A or ESG-2B. Finally, EMBL14 and EMBL16 contained the gene ESG-3 on the 1.0-kb EcoRI fragment.

Fig. 1 also shows a Southern blot of S. japonicum genomic DNA digested with EcoRI or HindIII and probed with ESG-3' cDNA. Three bands were observed in the EcoRI digest of genomic DNA, corresponding to EcoRI fragments of 2.5 kb, 1.9 kb and 1.0 kb when the blot was hybridized and washed under high stringency conditions. The 1.9-kb band was twice the intensity of the other two as determined by densitometric scanning (data not shown). Likewise, three HindIII bands were observed, the largest of which was twice the intensity of the other two. These results suggested that there are four related genes in this family, a result confirmed by nucleotide sequence determination. A Southern blot was also performed using HindIII and EcoRI digested genomic DNAs obtained from separated male and female worms (data not shown). The ESG-3' cDNA probe hybridized to the same restriction fragments in both the male and female DNAs, indicating that the four members of the ESG family were present in both sexes.

The various genomic DNA clones were examined by cross-hybridization and restriction map patterns for possible overlap among their 15-20 kb DNA inserts. Aside from the cross-hybridization of the EcoRI fragments containing the genes themselves, no evidence for an overlap was observed. Therefore, except for the genes on the two 1.9-kb EcoRI fragments within the genomic region cloned in EMBL20, these genes are not closely linked in the schistosome genome.

Fig. 2. Autoradiogram of a Southern blot of genomic DNA from S. haematobium (H), S. japonicum (J) and S. mansoni (M) digested with HindIII (Hd) and DdeI (Dd), and probed with a 700 bp DdeI/EcoRI fragment containing the coding region for S. japonicum ESG-1. Lane S contains labeled HindIII fragments of λ DNA.
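The gene count inferred from the EcoRI band intensities of the Fig. 1 Southern blot can be restated as a small calculation. This is a minimal sketch under two assumptions that the text does not spell out: each gene copy contributes equally to the hybridization signal, and the weakest band represents a single copy. The relative intensities below encode only the statement that the 1.9-kb band was twice as intense as the other two; they are not densitometry readings.

```python
# Relative EcoRI band intensities, keyed by fragment size in kb.
# ASSUMPTION: values encode the "twice the intensity" statement in the text.
ECORI_BANDS = {2.5: 1.0, 1.9: 2.0, 1.0: 1.0}

def copies_per_band(intensities):
    """Estimate gene copies per band, normalizing to the weakest band."""
    weakest = min(intensities.values())
    return {size: round(value / weakest) for size, value in intensities.items()}

copies = copies_per_band(ECORI_BANDS)
total_genes = sum(copies.values())

print(copies)
print(total_genes)
```

The same reasoning applies to the HindIII digest, where the largest of the three bands was the doubly intense one.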

Comparison of the ESG family in S. japonicum, S. mansoni and S. haematobium. Potential cross-hybridization of this ESG family with similar gene families in S. mansoni and S. haematobium was examined by Southern blot analysis (Fig. 2). Genomic DNAs of S. haematobium, S. japonicum and S. mansoni were digested with HindIII and DdeI and the blotted DNA was probed with a nick-translated, 700 bp DdeI/EcoRI fragment from ESG-1 under low stringency conditions. In spite of the reduced stringency conditions, the intensity of the hybridization to the DNA from S. haematobium and S. mansoni was much less than it was to the S. japonicum DNA. Based on the intensity of hybridization and the fragment sizes observed, the organization and the nucleotide sequences of the related set of genes in the three species appeared to be quite different (see below).

Nucleotide sequence analysis of two eggshell genes and several eggshell protein cDNAs. The S. japonicum ESGs on the cloned 2.5-kb EcoRI fragment and one of the cloned 1.9-kb fragments were chosen for further characterization. The 2.5-kb fragment containing ESG-1 was cleaved with HpaII to generate a 1.4-kb HpaII-EcoRI genomic fragment whose complete sequence of 1446 bp was determined. This HpaII-EcoRI sequence contained the entire ESG-1 coding region, 135 bp of 3' non-translated region, and 675 nucleotides of upstream sequence (Fig. 3). The 1.9-kb fragment was found to contain the entire ESG-2A coding region, 135 bp of 3' non-translated region and 1165 nucleotides of upstream sequence. In addition, the nucleotide sequence of the upstream flanking region of ESG-2B on the other cloned 1.9-kb fragment was also determined and is shown in Fig. 5A.

The complete nucleotide sequences of ESG-1 and ESG-2A are presented in Fig. 3. The 146 bp immediately upstream of the coding regions are nearly identical in the two genes. Further upstream the similarity between the two sequences decreases over about 16 bp and then disappears entirely. Upstream of the ESG-2A coding region are several small repeated sequences indicated by broken overlines. At position -811, the sequence CAAT is repeated seven times, followed by eleven copies of AAT. At position -600, TAG occurs six times in a row, followed by five copies of TTG. The latter repeats are not found upstream of ESG-1. The significance and function of these small repeats, if any, are unknown.

The sequences of the 655-bp and 836-bp ESG cDNAs were determined (data not shown). Both cDNAs contained the entire ESG translated region. A comparison of these cDNA sequences with those of the genomic clones revealed that the 655-bp cDNA was derived from ESG-1 mRNA and the 836-bp cDNA from ESG-2A mRNA. These ESGs do not contain introns. Additionally, several smaller cDNAs from the 3' ends of the ESGs were sequenced. The precise genomic origin of the 166-bp ESG-3' cDNA, which was identified first, could not be determined. Its sequence terminates at an internal EcoRI site and is identical to that of both ESG-1 and ESG-2A, and the other small 3' cDNAs, in a region that is perfectly conserved among all the clones.

Fig. 3. Nucleotide sequences of the ESG-1 and ESG-2A regions. The sequence of the ESG-2A region begins at nucleotide -1118 and continues as the bottom sequence. The ESG-1 region begins at -629 (arrow) and continues as the top sequence. The sequence in lower case letters was derived from the 3' ends of cDNAs which extend beyond an EcoRI site in the genomic fragments. Short repeats are indicated by broken overlines or underlines. The putative TATA box is overlined with a thick line. The transcribed sequence is indented beginning at nucleotide 1 and continues to the poly(A)+ tail. The initiator methionine and termination codons are indicated with met and term, respectively. Dots in the ESG-2A sequence show nucleotide identity to the ESG-1 sequence; the asterisks indicate a deletion. The overlines labeled Oligonucleotide 1 and 2 indicate complementary oligonucleotide sequences used in primer extension experiments. The diamonds indicate the 5' ends of the extended RNAs shown in Fig. 4.

The transcription start sites. To determine the 5' ends of ESG transcripts, primer extension experiments were performed using two different oligonucleotide primers on total female RNA. Fig. 3 shows the sequences to which these oligonucleotides hybridize as well as the putative TATA box and the 5' ends of the RNAs determined in this experiment. Oligonucleotide 1 is located upstream of the TATA box, while oligonucleotide 2 is within the coding region. Oligonucleotide 1 did not produce any extended products (Fig. 4, lane 3), while oligonucleotide 2 produced two extended fragments differing in length by three nucleotides. These extended fragments are 81 and 84 nucleotides long and correspond to the locations in the sequence marked with diamonds (Fig. 3). This result indicated that transcription starts 30 nucleotides downstream from the first T of a putative TATA box, or 47 nucleotides from the start codon. It is likely that transcription actually begins from two places because there are two overlapping TATA box sequences. Another possibility is that these transcripts are from two different genes that contain a 3-bp difference in the length of the sequence from the oligonucleotide position to the transcription start point.

Conserved sequences in the ESG promoter region. A comparison of the regions upstream of ESG-1, ESG-2A and ESG-2B is shown in Fig. 5A, along with the equivalent regions of a S. haematobium ESG [27] and a S. mansoni ESG [28,29].
Only a single representative member of the ESG family in these other Schistosoma species is shown since, within a species, the individual ESGs are very similar. In S. japonicum, the regions approximately 100 bp before the transcription start sites (146 bp before the start codon) are almost identical in these three genes. An even larger region of similarity occurs upstream of the two S. haematobium and S. mansoni ESGs (not shown) [27-29]. When sequences from the three schistosome species are compared, however, only segments of the 100-bp region are conserved among the three. Within this region are the TATA box, represented by TATATAAAT, and the CAT box represented by CAAAT. The CAT box is located 16 bp in front of the TATA box within the largest area of sequence homology among the five genes. In addition, several other small sequences are an exact match in all of these genes. One of these regions is the sequence TGCACT, which is known to act as an enhancer in both Drosophila and silkmoth chorion (eggshell) genes [30,31]. This sequence was initially identified in the opposite orientation, TCACGT, in Drosophila and shown to act bidirectionally. In addition, small complementary inverted repeats flank this sequence, which are indicated by broken arrows in Fig. 5A. Similar inverted repeats also flank the Drosophila enhancer. The transcription start site is also homologous to the cap site sequence ATCAT [29]. Aside from a single base substitution in ESG-2B and one in the S. haematobium ESG, these sequences in the five genes are identical.

Within the 3' non-translated regions of the ESGs (Fig. 5B), the two S. japonicum sequences are identical, but quite different from the very similar S. haematobium and S. mansoni ESGs in the first 90 residues after the termination codon. The sequences of all three species are then very similar beyond these 90 nucleotides until the site of polyadenylation is reached. This conserved segment includes the consensus polyadenylation signal AATAAA and the EcoRI site. In summary, both the 5' and 3' flanking regions of the S. haematobium and S. mansoni sequences are much more similar to each other than to S. japonicum, suggesting that they are more closely related evolutionarily, consistent with their partial geographical overlap.

Fig. 4. Primer extension analysis of S. japonicum ESG RNAs. Maxam-Gilbert sequencing reactions specific for A (lane 1) and G (lane 2) are the size standards. The elongation reactions contained adult female RNA and oligonucleotide 1 (lane 3) or oligonucleotide 2 (lane 4). The unextended oligonucleotides 1 and 2 are shown in lanes 5 and 6, respectively.

Fig. 5. (A) Comparison of the 5' nucleotide sequences surrounding the transcription start sites of three S. japonicum ESGs, a S. haematobium ESG [27] and a S. mansoni ESG [29]. Dots indicate nucleotide identities. The Drosophila-like enhancer sequence, CAAT box, TATA box and cap site are boxed, respectively, from left to right. Broken arrows near -76 indicate inverted complementary sequences. Asterisks indicate spaces introduced to maximize homology. The initiation codon is indicated by met. (B) Comparison of the nucleotide sequences in the 3' non-translated regions of S. japonicum ESG-1 and ESG-2A, and the S. haematobium and S. mansoni ESGs. The EcoRI site and polyadenylation signal sequence are boxed and the termination codon indicated by term. Small letters are the same as in Fig. 3. (C) Comparison of the amino acid sequences of S. japonicum ESP-1 and ESP-2A, and the S. haematobium and S. mansoni ESPs. Dashes indicate amino acid deletions. The putative S. japonicum signal peptide is overlined. Cysteine and tyrosine residues are indicated by diamonds and arrowheads, respectively.
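The promoter elements named in the text (TGCACT, CAAAT and TATATAAAT) are short exact motifs, so their positions and spacing in any upstream sequence can be located with a simple scan. The sketch below runs on a made-up toy sequence built only to place the three motifs in the order described; it is not the ESG promoter, and reading "16 bp in front of" as a 16-nucleotide gap between the two boxes is an interpretation.

```python
import re

# Motifs as quoted in the text.
MOTIFS = {
    "enhancer-like": "TGCACT",
    "CAT box": "CAAAT",
    "TATA box": "TATATAAAT",
}

# HYPOTHETICAL toy upstream sequence (filler bases are arbitrary).
TOY_UPSTREAM = (
    "GG" + "TGCACT" + "ACGT" + "CAAAT" + "GGCCGGCCGGCCGGCC" + "TATATAAAT" + "GC"
)

def find_motif(seq, motif):
    """Return 0-based start positions of every exact (possibly overlapping) match."""
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]

hits = {name: find_motif(TOY_UPSTREAM, motif) for name, motif in MOTIFS.items()}

# Number of bases between the end of the CAT box and the start of the TATA box.
gap = hits["TATA box"][0] - (hits["CAT box"][0] + len(MOTIFS["CAT box"]))

# The Drosophila form quoted in the text is the same string written backwards.
opposite_orientation = MOTIFS["enhancer-like"][::-1]

print(hits, gap, opposite_orientation)
```

Applied to the sequences deposited under the accession numbers given in the paper, the same scan would report where each element lies relative to the transcription start sites.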

Codon usage and structure of the ESPs. The sequences of the 212-amino-acid ESP-1 and the 207-amino-acid ESP-2A are very similar (Fig. 5C). There are eight amino acid changes and one deletion of five amino acids in ESP-2A relative to ESP-1. The corresponding genes contain many third-position codon differences that are not evident at the amino acid level. These differences are mainly in the glycine codons, GGX, where X represents any nucleotide. Both genes, however, show a definite glycine codon preference. In ESG-1, GGT accounts for 57% of the glycine codons, while GGA, GGC and GGG account for 23%, 18% and 2%, respectively. In ESG-2A, GGT accounts for 54% of the glycine codons, while GGA, GGC and GGG account for 20%, 24% and 2%, respectively. The codon usage for the other amino acids is similar in the two genes.

The first 27 amino acids of both ESP-1 and ESP-2A (Fig. 5C) are hydrophobic and have the characteristics of a consensus signal peptide [32]. Each has a lysine at position 2 and 10 hydrophobic residues out of 13 (8 in a row), followed by two or three prolines. The last proline is five residues before the serine at position 27, a potential signal peptide cleavage site. This serine is the last residue before an extremely glycine-rich region that is likely to be part of the mature ESP.

The deduced amino acid compositions of ESP-1 and ESP-2A are compared in Table I, along with the deduced amino acid compositions of the products of two S. mansoni ESGs [14,28,33] and one S. haematobium ESG [27]. These deduced amino acid compositions are compared with the directly determined composition of native S. japonicum and S. mansoni ESPs [34]. The D. melanogaster chorion (eggshell) protein is also included for comparison [30].
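The glycine codon preference reported above is a simple in-frame tally, which can be sketched as follows. This is a minimal illustration, not the analysis software used for this work: the function name is hypothetical and `TOY_CDS` is an invented coding sequence, not ESG-1 or ESG-2A. The `REPORTED` percentages are copied from the text and are checked only for summing to 100.

```python
from collections import Counter

GLYCINE_CODONS = ("GGA", "GGC", "GGG", "GGT")


def glycine_codon_usage(cds):
    """Percentage of each glycine codon among all in-frame glycine codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in GLYCINE_CODONS)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 100.0 * counts[codon] / total for codon in GLYCINE_CODONS}


# Invented toy reading frame: Met, Gly x4, Tyr, Gly, stop (NOT an ESG sequence).
TOY_CDS = "ATG" + "GGT" + "GGT" + "GGA" + "GGC" + "TAT" + "GGT" + "TAA"
toy_usage = glycine_codon_usage(TOY_CDS)

# Percentages as reported in the text for the two S. japonicum genes.
REPORTED = {
    "ESG-1": {"GGT": 57, "GGA": 23, "GGC": 18, "GGG": 2},
    "ESG-2A": {"GGT": 54, "GGA": 20, "GGC": 24, "GGG": 2},
}
reported_totals = {gene: sum(usage.values()) for gene, usage in REPORTED.items()}

print(toy_usage)
print(reported_totals)
```

Restricting the tally to the four GGX codons, rather than to any triplet beginning with GG, keeps ambiguous base calls out of the percentages.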

In vitro translation products of ESP RNAs. The calculated molecular weights of ESP-1 and ESP-2A are 18917 and 18505, respectively. To detect these proteins directly, total RNA from female and male worms was translated separately in a rabbit reticulocyte lysate system containing [35S]methionine. In Fig. 6, the lane containing the female translation products has an intense band in the 19-kDa range that is not present in the lane containing male translation products. This band probably contains the ESP precursors produced from the abundant female eggshell mRNAs since it is female-specific and corresponds to the sizes expected for ESP-1 and ESP-2A.
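A calculated molecular weight of the kind quoted above is conventionally obtained by summing average residue masses and adding one water molecule for the free termini. The sketch below assumes that convention and uses standard average residue masses; it is not the program used for this work, the function name is hypothetical, and the dipeptide is a toy input. The final lines use only the reported weights and lengths to derive a mean residue mass.

```python
# Standard average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def peptide_mass(protein):
    """Average molecular mass of an unmodified linear peptide (one-letter codes)."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER


# Toy input: glycylglycine (NOT an ESP sequence).
gly_gly_mass = peptide_mass("GG")

# Mean residue mass implied by the reported values (weight, length) for each ESP.
REPORTED = {"ESP-1": (18917, 212), "ESP-2A": (18505, 207)}
mean_residue_mass = {
    name: (weight - WATER) / length for name, (weight, length) in REPORTED.items()
}

print(round(gly_gly_mass, 3))
print({name: round(value, 1) for name, value in mean_residue_mass.items()})
```

The implied mean residue mass of roughly 89 Da for both proteins is low for a polypeptide, as would be expected when about half of the residues are glycine.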

Mobility shift of the 5' untranslated region. The presence of the Drosophila and silkworm enhancer sequence and other conserved sequences upstream of the S. japonicum ESGs suggested that these regions contain cis-acting elements that participate in the control of ESG transcription. A 346-bp HincII/DdeI fragment (positions -270 to +76) containing this upstream conserved region was isolated for use as a DNA template in a mobility shift assay of schistosome nuclear extracts (Fig. 7). A retardation in the electrophoretic mobility of the 346-bp fragment would provide evidence for the presence of sequence-specific binding component(s) in the nuclear extract of female and/or male worms [25,26]. The extracts were adjusted to contain the same amount of protein per unit volume, and their proteins analyzed on an SDS-polyacrylamide gel. Several sex-specific proteins were detected in the extracts of both females and males (data not shown). The presence of the female nuclear extract retarded the migration of nearly all of the labeled 346-bp DNA fragment (Fig. 7, lanes F), while the equivalent male nuclear extract did not (lanes M). Competition experiments (data not shown) demonstrated that the presence of a 100-fold excess of the same, unlabeled template fragment reduced the amount of the labeled fragment retarded by the female extract, while several adjacent genomic restriction fragments of about the same size did not. In addition, when these other fragments were labeled and used as a template in the mobility shift assay, their migration was unaffected by either the female or male extracts. These results are consistent with the presence of a conserved sequence in the 346-bp fragment that is recognized by component(s) in the female extract but not in the male extract. Due to our limited supply of S. japonicum parasites we were unable to prepare large amounts of the female extracts for conducting more sensitive binding assays such as DNA footprinting [35].

TABLE I
Comparison of eggshell protein amino acid compositions

      Deduced from nucleotide sequence                       Native eggshell and chorion proteins
      S. japonicum      S. mansoni              S. haemat.
      ESG-1    ESG-2A   pSMf(a)   H-R Frag(c)   SH.E2-1(b)   S. japonicum(d)   S. mansoni(d)   D. melanogaster(e)
Ala   1.9      1.9      2.8       2.9           3.2          2.64              4.02            15.5
Arg   0.5      1.0      0.0       0.0           0.9          2.00              2.02            7.2
Asn   6.1      6.3      5.1       5.2           4.5          16.58(f)          15.40(f)        4.1
Asp   5.2      4.8      5.6       5.8           3.6                                            1.0
Cys   2.8      2.9      5.6       5.8           2.7          2.16              2.16            0.0
Gln   0.0      0.0      0.6       0.6           0.9          4.27(f)           4.70(f)         3.1
Glu   0.9      1.4      0.6       0.6           3.2                                            5.2
Gly   50.5     49.8     44.1      43.7          50.5         44.84             36.67           21.6
His   0.5      0.5      1.7       1.7           1.8          2.29              5.20            1.0
Ile   0.9      1.0      0.6       0.6           0.9          0.81              1.22            2.1
Leu   1.9      2.9      1.7       1.7           1.4          1.62              1.77            2.1
Lys   5.7      4.8      5.6       5.8           4.1          6.39              9.44            1.0
Met   0.5      0.5      0.6       0.6           0.5          0.57              0.95            0.0
Phe   0.9      1.0      2.3       2.3           1.4          1.13              1.92            0.0
Pro   2.8      2.4      1.7       1.7           2.7          3.82              3.37            7.2
Ser   5.7      5.8      6.2       6.4           3.2          6.87              6.63            8.2
Thr   0.9      1.0      3.4       3.5           2.3          2.29              3.05            1.0
Trp   0.5      0.5      0.0       0.0           0.0          0.05              0.01            0.0
Tyr   10.8     10.1     10.7      10.4          11.4         0.58(g)           0.78(g)         14.4
Val   0.9      1.4      1.1       1.2           0.9          1.09              0.59            5.2

All values expressed as percent of total amino acid content.
(a) Ref. 14, 29.
(b) Ref. 27, S. haematobium.
(c) Ref. 28, HindIII-EcoRI fragment.
(d) Ref. 34.
(e) Ref. 30, Drosophila melanogaster.
(f) Combined numbers for Asp and Asn, Glu and Gln.
(g) Tyr residues involved in protein cross-links are not detectable by the acid hydrolysis composition analysis used by Byram and Senft [34].
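The deduced compositions in Table I are percentages of residue counts over the full translated sequence. A minimal Python sketch of that calculation follows, together with the deduced-versus-native difference for glycine and tyrosine in S. japonicum. The function name and the toy peptide are illustrative assumptions; the numerical values in the two dictionaries are copied from Table I.

```python
from collections import Counter


def composition_percent(protein):
    """Percent of total residues contributed by each amino acid (one-letter codes)."""
    protein = protein.upper()
    counts = Counter(protein)
    return {aa: 100.0 * n / len(protein) for aa, n in counts.items()}


# Toy peptide, invented for illustration (NOT an ESP sequence).
toy_composition = composition_percent("GGYG")

# Values copied from Table I (percent of total amino acid content).
DEDUCED_ESG1 = {"Gly": 50.5, "Tyr": 10.8}
NATIVE_S_JAPONICUM = {"Gly": 44.84, "Tyr": 0.58}

# Deduced minus native, in percentage points.
difference = {
    aa: round(DEDUCED_ESG1[aa] - NATIVE_S_JAPONICUM[aa], 2) for aa in DEDUCED_ESG1
}

print(toy_composition)
print(difference)
```

The large tyrosine difference relative to glycine reflects footnote (g): cross-linked tyrosines are not recovered by the acid hydrolysis analysis of native eggshell.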

Fig. 6. Autoradiogram of the in vitro translation products from 1 µg S. japonicum female or male RNA. Lane M contains protein molecular weight markers in kilodaltons (97, 68, 43, 25.7, 18.4 and 14.3).

Fig. 7. Autoradiogram of a DNA binding assay using a labeled 346-bp HincII (Hc)/DdeI (D) fragment from ESG-1. This fragment contains the Drosophila-like enhancer (EN), the TATA box (TATA) and the transcription start site (+1). The thickest line indicates part of the coding region. Lanes indicated by dashes contain the labeled fragment without nuclear extract. The F lanes contain a mixture of the female nuclear extract and the DNA fragment. The left F lane was loaded with one-third of the amount of binding assay mixture compared to the right F lane. The M lanes contain the male nuclear extract and the fragment.

Discussion

The first schistosome ESG cDNA to be isolated was from S. mansoni [14,28,36]. Using an oligonucleotide based on a sequence in the 3' untranslated region of this ESG [14], we identified ESG cDNAs of S. japonicum. The corresponding genes were subsequently subcloned from a genomic DNA library and characterized. This analysis revealed that both similarities and differences occur among the ESPs of S. japonicum, S. mansoni and S. haematobium. Among the similarities, all of the ESPs are about 10% tyrosine and 44-50% glycine. The S. japonicum and S. haematobium ESPs contain runs of 10-15 glycines while the S. mansoni ESP has shorter glycine stretches, the longest being three residues. As a result, the region of greatest amino acid conservation among the three species is actually the first fifteen residues of the signal peptide, which contain only a single glycine (Fig. 5C). In addition, the positions of cysteines, tyrosines and charged amino acids are not particularly conserved among the ESPs of the three species, suggesting that the structures of these proteins may be different despite their high tyrosine and glycine content.

Within a species, however, the sequences of the ESP family members are highly conserved. The S. mansoni and S. haematobium ESPs are more similar to each other than to the S. japonicum ESPs. For example, many positions throughout the first 150 residues of the S. mansoni and S. haematobium ESPs are the same, but different from the S. japonicum ESPs. In addition, the two longest stretches of glycine in the S. japonicum ESPs (11 and 9 residues) occur near the beginning of the protein, while the longest glycine run in the S. haematobium ESP (15 residues) occurs about two-thirds of the way through the protein. In the ESPs of all three species, a common sequence motif is two or three glycines followed by a tyrosine, serine, asparagine or cysteine.

The amino acid compositions of the ESPs deduced from the cDNA sequences correspond well with the percentage of each amino acid determined on the native ESPs, except for tyrosine (Table I). This discrepancy may be explained by the fact that tyrosine residues involved in protein cross-links are not readily detected by amino acid analysis [34].

The composition of the D. melanogaster chorion protein is similar to that of the ESPs except that alanine is frequently used instead of glycine.

An interesting feature of the S. japonicum ESPs is that their tyrosines, cysteines and lysines are non-randomly positioned. Many of the tyrosines (Fig. 5C) are spaced four or five residues apart, which may align certain portions of the protein for cross-linking. For example, tyrosines occur at positions 69, 73, 77, 81, 86, 91 and 95, and at positions 131, 135, 139, 144 and 148. This tyrosine periodicity is much less prominent in the ESPs of the other two species. The S. japonicum ESPs have three pairs of cysteines (Fig. 5C) in which the cysteines of each pair are separated by four residues. The S. mansoni ESP contains four cysteine pairs separated by four amino acids, but at different locations, while the S. haematobium ESP has only two similar cysteine pairs. Finally, the distribution of the lysines in the ESPs is unequal. The N-terminal halves of the S. japonicum ESPs have no lysines, while the C-terminal halves have 11 for ESP-1 and 9 for ESP-2A. Arginine, tryptophan and glutamine are the rarest amino acids in the ESPs.

The sequences flanking the ESGs suggest that schistosome gene expression occurs by mechanisms that are similar to those of higher eukaryotes. The S. japonicum ESG-1 and ESG-2A are both flanked by consensus TATA boxes and polyadenylation signal sequences. The 5' region may contain elements, such as enhancers and other cis-acting sequences, that are involved in the regulation of gene expression. The mobility shift experiments (Fig. 7) provide preliminary evidence that potential regulatory proteins bind to a specific sequence upstream of the ESGs. The fragment from ESG-1 used in these assays contains the transcription start site, the TATA box and the Drosophila-like enhancer sequence, all of which are likely candidates for elements of transcriptional control.
The distinct absence of any mobility shift with the male nuclear extract strengthens this finding since all of the evidence indicates that the expression of the ESGs is female-specific (Figs. 4 and 6).

The fecundity of S. japonicum is ten times greater than that of the other schistosome species [3]. Thus, ten times more ESP must be produced from the S. japonicum ESGs than from the other species' ESGs. Two possible explanations of the increased ESP production are that the ESG transcription or translation initiation rate is higher in S. japonicum, or that the ESG mRNA is more stable. No obvious feature of the 5' sequence comparisons of Fig. 5A suggests a basis for potential increased initiation rates, although, of course, unknown enhancer sequences could be present either within or upstream of the S. japonicum sequences shown. Alternatively, the 3' non-translated sequence could contribute to ESG mRNA stability. For example, the first 90 nucleotides after the ESG termination codons in S. japonicum are quite different from the equivalent regions in the other species (Fig. 5B). Following this segment the 3' non-translated sequences become very similar and are in fact more highly conserved than most of the coding regions (not shown). Although it was beyond the scope of this work to examine the relative ESG mRNA stabilities of the different species, these 3' non-translated sequences are intriguing since this region has been proposed to influence the stability of mRNAs in other systems [37].

The ESGs are not the only example of developmentally regulated, female-specific genes in schistosomes. The expression of another female-specific gene has also been described but its protein product has not been identified [38]. The DNA sequence of this gene is not similar to the ESGs. If it is involved in some aspect of egg production, however, the combination of the ESGs and this newly discovered gene provides two components important in egg development. Further characterization of these genes should lead to an understanding of the kinds of signals that regulate developmentally specific transcription in schistosomes in general.

Finally, the ESGs have a potential link to the unusual male-female interaction necessary for female maturation and normal egg production. In contrast to most trematodes, schistosomes are dioecious and the female does not mature to an egg-producing adult without a male. The components that the male schistosome provides for the maturation of the female are unknown [3,4,39-41]. Nevertheless, the ESGs may serve as DNA substrates for identifying the factors that are involved in this maturation process.
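As a brief numerical check of the tyrosine periodicity noted earlier in this Discussion, the spacing between successive tyrosines can be computed from the two sets of positions listed there. The sketch below is illustrative only; the function and variable names are hypothetical, and the positions are those quoted in the text for the S. japonicum ESPs.

```python
def spacings(positions):
    """Differences between consecutive positions in an ordered list."""
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


# Tyrosine positions quoted in the Discussion for the S. japonicum ESPs.
TYR_CLUSTER_1 = [69, 73, 77, 81, 86, 91, 95]
TYR_CLUSTER_2 = [131, 135, 139, 144, 148]

cluster_1_spacings = spacings(TYR_CLUSTER_1)
cluster_2_spacings = spacings(TYR_CLUSTER_2)

# True when every consecutive pair of tyrosines is four or five residues apart.
periodic = all(gap in (4, 5) for gap in cluster_1_spacings + cluster_2_spacings)

print(cluster_1_spacings, cluster_2_spacings, periodic)
```

Every interval in both clusters is four or five residues, in agreement with the spacing described for these proteins.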


Acknowledgements

This work was supported in part by an award from the Burroughs Wellcome Foundation.

References

1 Schmidt, G.D. and Roberts, L.S. (1981) Foundations of Parasitology, 2nd ed. pp. 281-296. Mosby.
2 Warren, K.S. (1982) Schistosomiasis: host-pathogen biology. Rev. Infect. Dis. 4, 771-775.
3 Moore, D.V. and Sandground, J.H. (1956) The relative egg producing capacity of Schistosoma mansoni and Schistosoma japonicum. Am. J. Trop. Med. Hyg. 5, 831-840.
4 Vogel, H. (1947) Hermaphrodites of Schistosoma mansoni. Ann. Trop. Med. Parasitol. 41, 266-277.
5 Shaw, J.R. (1977) Schistosoma mansoni: pairing in vitro and development of females from single sex infections. Exp. Parasitol. 41, 54-65.
6 Basch, P.F. and Basch, N. (1984) Intergenic reproductive stimulation and parthenogenesis in Schistosoma mansoni. Parasitology 89, 369-371.
7 Duvall, R.H. and DeWitt, W.B. (1967) An improved perfusion technique for recovering adult schistosomes from laboratory animals. Am. J. Trop. Med. Hyg. 16, 483-486.
8 Chirgwin, J.M., Przybyla, A.E., MacDonald, R.J. and Rutter, W.J. (1979) Isolation of biologically active ribonucleic acid from sources enriched in ribonuclease. Biochemistry 18, 5294-5299.
9 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning. A Laboratory Manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.
10 Southern, E.M. (1975) Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98, 503-517.
11 Rigby, P.W.J., Dieckmann, M., Rhodes, C. and Berg, P. (1977) Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation. J. Mol. Biol. 113, 237-251.
12 Frischauf, A.-M., Lehrach, H., Poustka, A. and Murray, N. (1983) Lambda replacement vectors carrying polylinker sequences. J. Mol. Biol. 170, 827-842.
13 Gubler, U. and Hoffman, B.J. (1983) A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.
14 Bobek, L., Rekosh, D.M., van Keulen, H. and LoVerde, P.T. (1986) Characterization of a female-specific cDNA derived from a developmentally regulated mRNA in the human blood fluke Schistosoma mansoni. Proc. Natl. Acad. Sci. USA 83, 5544-5548.
15 Benton, D.W. and Davis, R.W. (1977) Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196, 180-182.
16 Maxam, A.M. and Gilbert, W. (1980) Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65, 499-560.
17 Summerton, J., Atkins, T. and Bestwick, R. (1983) A rapid method for preparation of bacterial plasmids. Anal. Biochem. 133, 79-84.
18 Grunstein, M. and Hogness, D.S. (1975) Colony hybridization: a method for the isolation of cloned DNAs that contain a specific gene. Proc. Natl. Acad. Sci. USA 72, 3961-3965.
19 Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. and Surrey, S. (1982) 'Nonrandom' DNA sequence analysis in bacteriophage M13 by the dideoxy chain-termination method. Proc. Natl. Acad. Sci. USA 79, 4298-4302.
20 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
21 Zagursky, R.J., Baumeister, K., Lomax, N. and Berman, M.L. (1985) Rapid and easy sequencing of large linear double-stranded DNA and supercoiled plasmid DNA. Gene Anal. Techn. 2, 89-94.
22 Lagrimini, L.M., Brentano, S.T. and Donelson, J. (1984) A DNA sequence analysis package for the IBM personal computer. Nucleic Acids Res. 12, 605-614.
23 Pelham, H.R.B. and Jackson, R.S. (1976) An efficient mRNA-dependent translation system from reticulocyte lysates. Eur. J. Biochem. 67, 247-256.
24 Laemmli, U.K. (1970) Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-684.
25 Dignam, J.D., Lebovitz, R.M. and Roeder, R.G. (1983) Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11, 1475-1489.
26 Hromas, R. and Van Ness, B. (1986) Nuclear factors bind to regulatory regions of the mouse kappa immunoglobulin gene. Nucleic Acids Res. 14, 4837-4848.
27 Bobek, L.A., LoVerde, P.T. and Rekosh, D.M. (1989) Schistosoma haematobium: Analysis of eggshell protein genes and their expression. Exp. Parasitol. 68, 17-30.
28 Kunz, W., Opatz, K., Finken, M. and Symmons, P. (1987) Sequences of two genomic fragments containing an identical coding region for a putative eggshell precursor protein of Schistosoma mansoni. Nucleic Acids Res. 15, 5894.
29 Bobek, L.A., Rekosh, D.M. and LoVerde, P.T. (1988) Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni. Mol. Cell. Biol. 8, 3008-3016.
30 Wong, Y.C., Pustell, J., Spoerel, N. and Kafatos, F.C. (1985) Coding and potential regulatory sequences of a cluster of chorion genes in Drosophila melanogaster. Chromosoma 92, 124-135.
31 Mitsialis, S.A. and Kafatos, F.C. (1985) Regulatory elements controlling chorion gene expression are conserved between flies and moths. Nature 317, 453-456.
32 Watson, M.E.E. (1984) Compilation of published signal sequences. Nucleic Acids Res. 12, 5145-5164.
33 Cordingley, J.S. (1987) Trematode eggshells: novel protein biopolymers. Parasitol. Today 3, 341-344.
34 Byram, J.E. and Senft, A.W. (1979) Structure of the schistosome eggshell: Amino acid analysis and incorporation of labelled amino acids. Am. J. Trop. Med. Hyg. 28, 539-547.
35 Galas, D.J. and Schmitz, A. (1978) DNAase footprinting: a simple method for the detection of protein-DNA binding specificity. Nucleic Acids Res. 5, 3157-3170.
36 Johnson, K.S., Taylor, D.W. and Cordingley, J.S. (1987) Possible eggshell protein gene from Schistosoma mansoni. Mol. Biochem. Parasitol. 22, 89-100.
37 Shaw, G. and Kamen, R. (1986) A conserved AU sequence from the 3' untranslated region of GM-CSF mRNA mediates selective mRNA degradation. Cell 46, 659-667.
38 Reis, M.G., Kuhns, J., Blanton, R. and Davis, A.H. (1989) Localization and pattern of expression of a female specific mRNA in Schistosoma mansoni. Mol. Biochem. Parasitol. 32, 113-120.
39 Armstrong, J.C. (1965) Mating behavior and development of schistosomes in the mouse. J. Parasitol. 51, 605-616.
40 Atkinson, K.H. and Atkinson, B.G. (1980) Biochemical basis for the continuous copulation of female Schistosoma mansoni. Nature 283, 478-479.
41 Popiel, I. and Basch, P.F. (1984) Putative polypeptide transfer from male to female Schistosoma mansoni. Mol. Biochem. Parasitol. 11, 179-188.
