MOLECULAR AND CELLULAR BIOLOGY, Sept. 1991, p. 4651-4659 0270-7306/91/094651-09$02.00/0 Copyright © 1991, American Society for Microbiology

Vol. 11, No. 9

elt-1, an Embryonically Expressed Caenorhabditis elegans Gene Homologous to the GATA Transcription Factor Family

JOHN SPIETH, YHONG HEE SHIM, KRISTI LEA, RICHARD CONRAD, AND THOMAS BLUMENTHAL*

Program in Molecular, Cellular and Developmental Biology and Department of Biology, Indiana University, Bloomington, Indiana 47405

Received 22 April 1991/Accepted 24 June 1991

The short, asymmetrical DNA sequence to which the vertebrate GATA family of transcription factors binds is present in some Caenorhabditis elegans gene regulatory regions: it is required for activation of the vitellogenin genes and is also found just 5' of the TATA boxes of tra-2 and the msp genes. In vertebrates GATA-1 is specific to erythroid lineages, whereas GATA-2 and GATA-3 are present in multiple tissues. In an effort to identify the trans-acting factors that may recognize this sequence element in C. elegans, we used a degenerate oligonucleotide to clone a C. elegans homolog of the GATA family. We call this gene elt-1 (erythrocytelike transcription factor). It is single copy and specifies a 1.75-kb mRNA that is present predominantly, if not exclusively, in embryos. The region of elt-1 encoding two zinc fingers is remarkably similar to the DNA-binding domain of the vertebrate GATA-binding proteins. However, outside of the DNA-binding domains the amino acid sequences are quite divergent. Nevertheless, introns are located at identical or nearly identical positions in elt-1 and the mouse GATA-1 gene. In addition, elt-1 mRNA is trans-spliced to the 22-base untranslated leader, SL1. The DNA upstream of the elt-1 TATA box contains eight copies of the GATA recognition sequence within the first 300 bp, suggesting that elt-1 may be autogenously regulated. Our results suggest that the specialized role of GATA-1 in erythroid gene expression was derived after separation of the nematodes and the line that led to the vertebrates, since C. elegans lacks an erythroid lineage.
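The GATA recognition sequence referred to above is written in the text as the hexamer (A/T)GATA(A/G). As an illustration only, the following minimal Python sketch shows one way to count such sites on both strands of a DNA sequence. The function names and the demonstration fragment are hypothetical and are not taken from the elt-1 sequence; this is not the procedure used by the authors.

```python
import re

# Consensus hexamer as written in the text: (A/T)GATA(A/G).
# A lookahead is used so that overlapping sites are all reported.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_gata_sites(seq):
    """Return sorted (start, strand, hexamer) tuples.

    start is 0-based and always refers to the leftmost base of the
    site on the strand that was passed in.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in GATA_SITE.finditer(rc):
        # Convert the coordinate on the reverse strand back to the input strand.
        start = len(seq) - (m.start() + 6)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Hypothetical demonstration fragment (not from elt-1).
demo = "CCTGATAAGGTTATCTCC"
for start, strand, site in find_gata_sites(demo):
    print(start, strand, site)
```

Scanning both strands matters here because the site is asymmetric, so a site on the opposite strand reads as (C/T)TATC(A/T) on the strand being examined.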

Metazoan development is characterized by a progressive temporal and spatial patterning of gene expression that leads to the diversity of cell types and complexity found in higher eukaryotes. It is becoming more apparent that this patterning is dependent on the establishment of networks of regulatory factors, which bind in a sequence-specific manner to enhancers and promoters (reviewed in references 42 and 49). A model system to study developmentally specific and tissue-specific regulatory factors has been the globin gene family, in which switches in expression of different members occur throughout development and cell differentiation. Functional cis-acting DNA elements have been shown to be present 5' and 3' to many of the globin genes as well as in some of their introns (2, 3, 8, 13, 20, 59). Footprinting indicates the presence of a number of distinct, multiply repeated binding sites in the globin enhancers (47). One of these, (A/T)GATA(A/G), has also been identified in the promoters of a number of globin, as well as nonglobin, erythroid-specific genes (16, 38, 41, 48, 58-60). A DNA-binding activity that specifically recognizes the consensus sequence (A/T)GATA(A/G) has been identified in nuclear extracts of erythroid cells (16, 38, 58). This factor, GATA-1 (44) (previously called GF-1, Eryf1, or NF-E1), has been detected in human, mouse, and chicken erythroid cells and is present during all stages of erythroid development. GATA-1 is a major regulator of erythroid-specific gene expression (15). Disruption of GATA-1 in mice has been shown to prevent development of the erythroid lineage (46). The recent discovery of other related GATA-binding proteins that may coexist in the same cells with GATA-1 complicates a simple model for GATA-1 function (63). The primary structure of GATA-1, GATA-2, and GATA-3 (15, 23, 32, 55, 56, 63, 64) has been determined through

sequencing of cDNA clones. Each GATA protein contains two highly conserved zinc finger motifs of the form Cys-X2-Cys-X17-Cys-X2-Cys. A finger motif has been associated with numerous DNA-binding regulatory proteins (31), and this domain of GATA-1 has been shown to comprise the DNA-binding region of the protein (37). Interestingly, related, single-finger-motif regulatory proteins involved in nitrogen metabolism have been found in Aspergillus nidulans and Neurospora crassa (18). The single zinc finger of these proteins is very similar to the carboxyl finger of the vertebrate GATA proteins and binds a similar core (GATA) consensus sequence. Thus, these vertebrate and fungal proteins appear to be members of a family of DNA-binding regulatory proteins with a highly conserved zinc finger motif that all recognize a very similar DNA sequence.

In Caenorhabditis elegans, the vitellogenin genes are expressed at very high levels in the intestinal cells of the adult hermaphrodite (4). In the promoters of the vitellogenin genes there are two highly conserved, repeated, heptameric sequence elements (51), which are important in their high-level, stage-, sex-, and tissue-specific expression (35a, 52, 65). One of these elements, VPE2, has a consensus sequence (CTGATAA) that matches the recognition sequence (A/T)GATA(A/G) for the GATA proteins described above. A very similar sequence is also present just upstream of the TATA box in the promoters of the C. elegans msp genes (27), which, in contrast to the vitellogenin genes, are expressed at high levels in the spermatheca of males and hermaphrodites (28, 29), as well as upstream of the tra-2 gene (34a), which is involved in sex determination (24, 30). Because of the similarity between the recognition sequence of the GATA proteins and VPE2 and the observation of high conservation from fungi to mammals in the DNA-binding domain of the GATA proteins, we decided to seek a homolog of the GATA-binding proteins in C. elegans as a way of isolating a trans-acting factor involved in the regulation of

* Corresponding author.


FIG. 1. Restriction map of genomic clone YH9 and the gene structure of elt-1. Exons are indicated by rectangles, and introns are indicated by lines, with the size in base pairs above each exon and below each intron. The boundaries of the introns and exons were confirmed by sequencing a cDNA clone. The lengths of the 819-bp last exon and the 91-base outron (the RNA between the cap site and exon 1 [9]) are enclosed in parentheses because the exact sizes are unknown. The size given for the 3' exon is defined by the 3' end of the cDNA clone, which does not end in poly(A). The size of the 91-base outron is from a site 31 bp downstream of a good match to a TATA box to the SL1 splice acceptor. The 1,248-bp open reading frame is indicated by shading. YH9 was restriction mapped with SstI (T), XhoI (O), EcoRI (R), and BamHI (B). The two exons that contain the zinc fingers are underlined. The scale at the bottom is in kilobases.

vitellogenin gene expression. We have identified such a gene and have named it elt-1. We show here that it is single copy and surprisingly closely related to the vertebrate GATA genes. However, the presence of its mRNA predominantly or entirely in embryos suggests that the protein product of elt-1 is probably not the trans-acting factor involved in the regulation of the C. elegans vit, tra-2, or msp genes. On the other hand, the sequence of the promoter region suggests that the elt-1 protein is likely to regulate its own expression.

MATERIALS AND METHODS

Worms. Maintenance of and methods for handling C. elegans have been described previously (7, 53). Large-scale synchronous cultures were obtained by alkaline hypochlorite digestion of adults and hatching of the resulting embryos without food as described by Wood (62).

Selection of clones. Genomic clones were selected from an amplified lambda 1059 library (SG25) (a gift of John Karn) containing partial Sau3AI digests of C. elegans N2 genomic DNA. The library was screened by hybridization with a mixture of synthetic oligonucleotides as described by Duby et al. (11). The hybridizations were done in 6× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate)-5× Denhardt's solution-0.05% Na4P2O7-0.2% sodium dodecyl sulfate (SDS)-100 µg of yeast tRNA per ml at 37°C for 60 h, and the mixtures were washed in 6× SSC-0.05% Na4P2O7 at 60°C. The oligonucleotides were radioactively labeled by using a kinase reaction as described by Maniatis et al. (36). The synthetic oligonucleotide mixture contained the following sequences: 5'-GGAGA(C/T)CCAGT(C/T)TG(C/T)AA(C/T)GC(C/T)TG(C/T)GGACT(C/T)TAC-3'. At the degenerate positions, equal amounts of the two nucleotides were included in the synthesis reaction. Ten hybridizing clones were selected and analyzed. Seven of those hybridized intensely to the probe, and all of these contained inserts carrying portions of the region shown in Fig. 1.
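To make the composition of the degenerate probe concrete, the following Python sketch expands the oligonucleotide mixture given above into its individual sequences and translates each one. Two points are assumptions made here for illustration and are not stated in this section: that the probe is written as the coding strand, and that its first base is the first position of a codon. The helper names are hypothetical.

```python
from itertools import product

# Degenerate probe exactly as written in the text.
PROBE = "GGAGA(C/T)CCAGT(C/T)TG(C/T)AA(C/T)GC(C/T)TG(C/T)GGACT(C/T)TAC"


def parse_degenerate(spec):
    """Split 'AC(C/T)G'-style notation into per-position tuples of bases."""
    positions = []
    i = 0
    while i < len(spec):
        if spec[i] == "(":
            close = spec.index(")", i)
            positions.append(tuple(spec[i + 1:close].split("/")))
            i = close + 1
        else:
            positions.append((spec[i],))
            i += 1
    return positions


def expand(spec):
    """Return every sequence present in the degenerate mixture."""
    return ["".join(bases) for bases in product(*parse_degenerate(spec))]


# Standard genetic code, built from the usual compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of dna, reading from the first base."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


positions = parse_degenerate(PROBE)
variants = expand(PROBE)
print(len(positions), "positions;", len(variants), "sequences in the mixture")
print(sorted({translate(v) for v in variants}))
```

Under these assumptions every degenerate position falls on a third codon base, so all members of the mixture encode the same peptide, which contains a Cys-X2-Cys pair.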
cDNA clones were isolated from a Lambda Zap library (gift of Robert Barstead) containing cDNA inserts made from RNA from a mixed population of C. elegans N2. The library was screened by hybridization to a fragment isolated from the genomic clone YH9 extending from the XhoI site at coordinate 8.0 (Fig. 1) to an SphI site 785 bp downstream (not shown). Hybridizations were done in 50% formamide-1× SSC-2× Denhardt's solution-0.05 M NaH2PO4 (pH 6.5)-0.1% SDS-100 µg of sheared salmon sperm DNA per ml at 42°C for 15 h, and the mixtures were washed in 0.1% SSC-0.1% SDS at 55°C. The restriction fragment was radioactively labeled by using a Boehringer Mannheim Random Primed DNA Labeling Kit. Only a single hybridizing clone (RB1A) was obtained.

PCR amplification of the 5' end of the transcript. In three separate reactions, 2 µg of each of the three oligonucleotides ELT-2, ELT-3, and ELT-5 was used to prime 40 µg of total RNA from a mixed population of C. elegans N2. Annealing was performed in 20 µl of 50 mM Tris-HCl (pH 8.5)-60 mM NaCl-10 mM dithiothreitol-4 mM each deoxynucleoside triphosphate, heated to 95°C for 10 min, and then allowed to incubate at 42°C for 50 min. After annealing, each reaction mixture was split into a 5-µl and a 15-µl portion. For the 15-µl portion, primer extension was performed by addition of 3 µl of avian myeloblastosis virus reverse transcriptase (Boehringer Mannheim), at a concentration of 5 U/µl, in 50 mM Tris-HCl (pH 8.5)-60 mM NaCl-10 mM DTT-30 mM Mg(CH3COO)2 followed by a 45-min incubation at 42°C. Control incubations were performed in parallel on the 5-µl portion by addition of 1 µl of the same solution minus reverse transcriptase. After the second 42°C incubation, samples were frozen at -20°C. For each mixture, two 3-µl aliquots were removed for analysis by the polymerase chain reaction (PCR). To one aliquot, 0.25 µg of an oligonucleotide equivalent to the 22-nucleotide C. elegans SL1 sequence (33) was added, and to the other 0.25 µg of an oligonucleotide equivalent to the 22-nucleotide C. elegans SL2 sequence (25) was added. Total volumes were brought up to 100 µl in a PCR solution by using the Perkin-Elmer Cetus recipe (50). No additional downstream primer was added.
Each PCR cycle was as follows: 90 s at 92°C, 90 s at 50°C, and 90 s at 72°C, with ramps between each step taking about 40 s. Forty total cycles were performed. Products of these reactions were chloroform extracted to remove the mineral oil overlay and then ethanol precipitated. After an ethanol rinse and drying, each pellet was redissolved in 10 µl of formamide-bromphenol blue-xylene cyanol. After heating at 95°C for 2 min, 2 µl of the solutions, along with radioactive markers, was loaded onto a 6% polyacrylamide-7 M urea gel with a Tris-borate-EDTA (TBE) buffer system and electrophoresed at 40 V/cm until the xylene cyanol band was at the bottom of the gel. The gel was subsequently electroblotted onto Hybond-N nylon membrane (Amersham) and probed with radioactively labeled ELT-4. The sequences of the four elt-1 oligonucleotides used above and positions to which they hybridize in the GenBank sequence are as follows: ELT-2, 5'-CGTTGGATAGAAGTAACTCGG-3', 1551 to 1571; ELT-3, 5'-GCTGTCGTGGCAGCGGCGAG-3', 1602 to 1621; ELT-4, 5'-CAACGGGTTTTCCTTCG-3', 717 to 733; ELT-5, 5'-GTTGAGGAACATlTG G-3', 1034 to 1048.

Sequencing PCR products. A second PCR amplification was performed on 2 µl from the remainder of the previous product from the SL1/ELT-2 reaction above. The reaction was performed with 1 µg of the ELT-5 oligonucleotide and 1 µg of the SL1 oligonucleotide under the same conditions as above. After chloroform extraction and ethanol precipitation, pellets were dissolved in 20 µl of distilled water. A 4-µl sample of this solution was then denatured by adding 4 µl of H2O and 2 µl of 2 M NaOH and heating to 65°C for 10 min. DNA was then precipitated by addition of 7 µl of H2O, 3 µl of 3 M NaCH3COO (pH 5.5), and 60 µl of ethanol and incubating at -70°C for 15 min before centrifugation. The resulting pellet was washed in 70% ethanol, dried, and dissolved in 8 µl of H2O-2 µl of 5× Sequenase buffer (United States Biochemical Co.)-1 µl of 0.1 M dithiothreitol-2.5 µl (10 ng) of the radioactively labeled primer (ELT-7). The Sequenase kit protocol (United States Biochemical Co.) for dideoxynucleotide sequencing was followed from this point, except that the termination reactions were started immediately after the labeling reactions were mixed, and water was substituted for the radioactively labeled deoxynucleoside triphosphate. Samples were electrophoresed on an 8% polyacrylamide (total acrylamide-to-bisacrylamide ratio, 30:1)-8 M urea gel with a TBE buffer system. The gel was dried to filter paper and autoradiographed. The sequence of ELT-7 and the position in the GenBank sequence to which it hybridizes, respectively, are 5'-GCCTCCAGAAGAAGTGCCG-3' and 744 to 762.

Southern blot analysis. Genomic DNA was isolated from a population of mixed stages of C. elegans N2 as described by Spieth et al. (52). For Southern analyses, agarose gels were blotted onto Hybond-N nylon membrane (Amersham) as recommended by the manufacturer and hybridized to restriction fragments isolated from YH9. Hybridizations were done in 50% formamide-1× SSC-2× Denhardt's solution-0.05 M NaH2PO4 (pH 6.5)-0.1% SDS-100 µg of sheared salmon sperm DNA per ml at 42°C for 15 h and washed in 0.1% SSC-0.1% SDS at 55°C. Low-stringency hybridizations were done in 6× SSC-5× Denhardt's solution-0.05% Na4P2O7-0.2% SDS-100 µg of sheared salmon sperm DNA per ml at 37°C. Washes were done in 6× SSC-0.1% SDS at 37°C.

Northern blot analysis. RNA was isolated from liquid cultures of mixed stages of worms as described by Conrad et al. (9). Poly(A)+ RNA was isolated by using a Pharmacia mRNA Purification Kit, electrophoresed on 1% agarose-6% formaldehyde gels made in 20 mM NaH2PO4 (pH 7.0), and transferred to a Hybond-N nylon (Amersham) membrane as described by the manufacturer. Running buffer contained 10 mM NaH2PO4 (pH 7.0) and 3% formaldehyde. Before electrophoresis, the RNA was heated to 60°C for 15 min in 60% formamide-30 mM EDTA-60 mM NaH2PO4-3% formaldehyde. The 353-bp PCR product used as probe for the


Northern (RNA) analysis was made as described above, except that 2 µg of the cDNA plasmid RB1A and 1 µg of each of the primers were used. The two primers were ELT-8, which is complementary to 18 bp in exon 4 beginning 637 bp downstream of the initiator AUG, and ELT-9, which is complementary to 18 bp in exon 6 beginning 973 bp from the AUG. This product was purified from an agarose gel and radioactively labeled by using a Boehringer Mannheim Random Primed DNA Labeling Kit. The cDNA clone of the ribosomal protein gene rp21, pPD33.24 (16a), was digested with BamHI and EcoRI. The cDNA insert was purified from an agarose gel and radioactively labeled by using a Boehringer Mannheim Random Primed DNA Labeling Kit. Hybridizations and washes were as described for Southern blots. Primer extension reactions with total RNA were performed as described by Conrad et al. (9). The sequences of the two elt-1 oligonucleotides used above and the positions in the GenBank sequence to which they hybridize are ELT-8, 5'-GGTACCGAAGATCGTGAGTGTGTC-3', 1970 to 1987; ELT-9, 5'-GGTACCTCTTCGCGATCCCTTTGC-3', 4082 to 4099.

Sequencing. For sequencing, restriction fragments were cloned into pTZ18U or pTZ19U (39). Sequencing was done by using a Sequenase DNA Sequencing Kit (United States Biochemical Corp.) and primers complementary to sequences that flanked the multiple cloning site of the vectors as well as primers complementary to the elt-1 gene sequences as they were determined.

Nucleotide sequence accession number. The sequence of the genomic clone YH9 from 708 bp upstream of the elt-1 initiator AUG to 690 bp downstream of the first in-frame stop codon is available from the GenBank and EMBL data bases (accession number, X57834).

RESULTS

Selection of genomic clones. Several independent clones were isolated by screening a C. elegans genomic library at low stringency with a synthetic oligonucleotide made to a highly conserved region of GATA-1 that spans the carboxy cysteine pair of the carboxy finger. The sequence of the oligonucleotide was adjusted for the codon preference of C. elegans (12) and was twofold degenerate at 7 of 33 positions. Seven selected clones were determined to be either identical or overlapping clones by restriction fragment patterns and hybridization to the selecting oligonucleotide. The restriction map of one of the genomic clones, YH9, is shown in Fig. 1. This clone was localized by contig mapping to a position on the right arm of chromosome IV just to the left of a cluster of msp genes (9a). The gene contained on YH9 was shown to be present in a single copy by hybridization of a restriction fragment from YH9 to a Southern blot of genomic DNA cut with several restriction enzymes (Fig. 2). With each enzyme only a single fragment was seen, indicating that there is only a single copy in the genome. For enzymes that were mapped, the size of the band in the genomic Southern corresponded to the size predicted by the YH9 restriction map. Hybridizations to similar blots at lower stringency with restriction fragments from YH9 failed to show additional bands hybridizing to elt-1 probes, suggesting that elt-1 may be the sole member of the GATA family in C. elegans. On the other hand, we cannot exclude the possibility that the C. elegans genome contains distantly related GATA genes, which were not detected under the stringency conditions used.

Gene structure. A 5,050-bp portion of the sequence from


FIG. 2. Southern blot analysis. Samples (10 µg) of N2 DNA were digested with the indicated restriction enzymes, electrophoresed on a 1.0% agarose gel, and blotted onto a Hybond membrane. The blot was hybridized to a 1.3-kb EcoRI fragment from the genomic clone YH9. The arrows indicate the position of the ethidium-stained, HindIII-digested λ phage DNA markers electrophoresed on the same gel. The size of the markers is in kilobases.

the genomic clone surrounding the region that hybridized to the oligonucleotide was determined and is available from the EMBL and GenBank data bases. A 1,248-bp open reading frame, contained in six exons (Fig. 1), with amino acid homology to the zinc finger region of the vertebrate GATA proteins was predicted from the genomic sequence. Introns were predicted on the basis of highly conserved boundary sequences (5) and high A and U content within the proposed introns. An additional exon was found to be trans-spliced onto the 5' end of the mRNA (see below). This trans-spliced exon is called the SL1 exon, and the exons found in the genomic clone are exons 1 through 6. The intron-exon structure was confirmed by sequencing a single, 1,727-bp cDNA clone (RB1A) that extends from the A of the initiator methionine codon to 479 bp 3' of the translation stop site. The size of the cDNA is very close to the size of the mRNA detected in a Northern blot (see below). The cDNA clone ends in the middle of an acceptable polyadenylation signal, AUUAAA (61), but does not contain a poly(A) tail. The presence of SL1 at the 5' end of the mRNA was initially suggested by primer extension of total RNA extracted from adult worms. A primer (ELT-2) complementary to 21 bases of exon 3 beginning 291 bases from the initiator AUG gave a single extension product of approximately 322 bases (Fig. 3A). This located the 5' end of the gene 31 bases upstream of the AUG. However, a good match to the C. elegans splice acceptor site consensus UUUCAG/C was present 9 bases upstream of the AUG, suggesting that this gene may be trans-spliced to one of the two 22-base spliced leaders known in C. elegans (25, 33). To determine whether trans-splicing occurred, we amplified the 5' end of the mRNA by PCR with 5' primers of the same sequence as either SL1 or SL2. Three different 3' primers were used to generate the initial cDNA and in the subsequent PCRs. One of the primers was ELT-2. 
The other two primers were ELT-5, which was complementary to 15 bases in exon 2 beginning 171 bases from the AUG, and ELT-3, which was complementary to 20 bases of exon 3 beginning 341 bases from the AUG. The PCR products were detected on a

FIG. 3. Location of the 5' end of the elt-1 gene and trans-splicing of SL1. (A) A primer extension reaction containing 36 µg of total RNA extracted from adult worms and 50 µg of the synthetic oligonucleotide ELT-2 labeled at the 3' end was electrophoresed on a 6% polyacrylamide-7 M urea gel. (B) PCR amplification of cDNA. cDNA was made by using three different primers: ELT-2, ELT-3, and ELT-5. This cDNA was PCR amplified by using an oligonucleotide equivalent to the 22-nucleotide C. elegans SL1 sequence and the above primers. The PCR products were electrophoresed on a 6% polyacrylamide-7 M urea gel, electroblotted onto a nylon membrane, and hybridized to the radioactively labeled oligonucleotide ELT-4. The arrows indicate the positions of stained markers run on the same gel. The sizes of the markers are given in base pairs. (C) Sequence of trans-splice junction. The product of the ELT-2 SL1 PCR was reamplified and sequenced. The sequence of the trans-splice junction is shown at the right, with the complement of the sequence of the 5' end of elt-1 exon 1 shown in capital letters and the 3' end of SL1 shown in lowercase letters.
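The product sizes expected for trans-splicing to SL1 follow from simple arithmetic on the numbers given in the text. The Python sketch below assumes that each primer position ("beginning N bases from the AUG") is the distance from the A of the initiator AUG to the 5' end of the primer, so that N already includes the primer itself. That reading is an interpretation, although it is consistent with the reported ELT-2 extension product of approximately 322 bases. The constant and function names are hypothetical.

```python
# Values stated in the text.
SL1_LENGTH = 22          # length of the SL1 spliced leader, in bases
SPLICE_SITE_TO_AUG = 9   # the splice acceptor lies 9 bases upstream of the AUG

# Assumed meaning: distance from the A of the AUG to the 5' end of each primer.
PRIMER_OFFSETS = {"ELT-5": 171, "ELT-2": 291, "ELT-3": 341}


def sl1_product_length(offset):
    """Length of a product running from the 5' end of SL1 to the primer's 5' end."""
    return SL1_LENGTH + SPLICE_SITE_TO_AUG + offset


for name, offset in sorted(PRIMER_OFFSETS.items(), key=lambda item: item[1]):
    print(name, sl1_product_length(offset))
```

With these assumptions the same sum accounts for the 5' end mapped by primer extension: 22 bases of SL1 plus the 9 bases between the splice site and the AUG give the 31 bases found upstream of the AUG.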

Southern blot with an oligonucleotide (ELT-4) complementary to 20 bases of exon 1. No PCR products were detected with the SL2 primer (data not shown), but with SL1, PCR products of the sizes predicted for trans-splicing at the site 9 bases upstream of the AUG were found (Fig. 3B), indicating that the gene encoded by YH9 is trans-spliced to SL1. The precise location of the trans-splice site and the presence of SL1 on the 5' end of the mRNA were confirmed by sequencing of the PCR product with the ELT-2 primer (Fig. 3C).

mRNA size and pattern of expression. To determine the size of the mRNA encoded by the gene on YH9, a Northern blot of poly(A)+ RNA isolated from staged populations of worms was hybridized to a 353-bp PCR product made from the cDNA clone that spans the conserved finger region. A single 1.75-kb band was detected (Fig. 4). The mRNA is present predominantly, if not exclusively, in embryos. The blot was also hybridized to a ribosomal protein (rp21) cDNA clone as a loading control. A small but detectable amount of the elt-1 mRNA can be seen in all lanes on long exposures of the autoradiogram (data not shown) that could possibly be accounted for by the presence of a few unhatched embryos in larval populations and the few embryos within the young adults.

Homology with vertebrate GATA family of transcription factors. Translation of the 1.75-kb cDNA clone from the initial methionine codon to the first in-frame stop codon predicts a 416-amino-acid protein (Mr 43,000) (Fig. 5), with several features found in other DNA-binding regulatory proteins. The protein contains two repeats of a zinc finger motif of the form Cys-X2-Cys-X17-Cys-X2-Cys (amino acids 217 to 241 and 272 to 296) followed by a basic region. As


FIG. 4. Northern blot analysis. Samples (2 µg) of poly(A)+ RNA isolated from synchronous populations of worms were electrophoresed on a 1% agarose-formaldehyde gel, blotted onto a nylon membrane, and hybridized to a radioactively labeled PCR product made from the region of the cDNA clone RB1A that spans the conserved zinc fingers of elt-1. The blot was also hybridized to a radioactively labeled restriction fragment from a cDNA clone (pPD33.24) of the ribosomal protein gene rp21 (16a). The positions of the ethidium-stained 3.5- and 1.7-kb rRNAs were determined from a lane of total RNA run on the same gel.
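The transcript sizes reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; treating the 479 bp as everything that follows the 1,248-bp open reading frame in the cDNA is an assumption made here, and the variable names are hypothetical.

```python
ORF_BP = 1248            # open reading frame, from the text
PROTEIN_AA = 416         # predicted protein length, from the text
THREE_PRIME_BP = 479     # cDNA sequence 3' of the translation stop site
CDNA_BP = 1727           # length of cDNA clone RB1A
SL1_LENGTH = 22          # trans-spliced leader
SPLICE_SITE_TO_AUG = 9   # bases between the splice site and the AUG

# The open reading frame is an exact multiple of three and matches the protein length.
codons, remainder = divmod(ORF_BP, 3)
print(codons, remainder)

# The cDNA starts at the A of the AUG, so its length is ORF plus 3' sequence.
print(ORF_BP + THREE_PRIME_BP)

# Minimum spliced mRNA length before any poly(A) tail is added.
print(SL1_LENGTH + SPLICE_SITE_TO_AUG + CDNA_BP)
```

Under this reading the spliced transcript is about 1.76 kb before polyadenylation, in line with the single band of roughly 1.75 kb seen on the Northern blot at the resolution of such gels.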

shown in Fig. 5, the peptide sequence encompassing the two zinc fingers is highly conserved between the C. elegans polypeptide and mouse, human, and chicken GATA-1. This conservation extends well beyond each of the zinc fingers in a region rich in basic residues: 78 of 107 amino acids (73%) from Glu-213 to Arg-320 are identical in C. elegans with the three known vertebrate sequences. This region has been shown to be highly conserved in GATA-2 and GATA-3 as well (23, 32, 63). This remarkably high level of conservation leads us to conclude that this gene is the C. elegans homolog of the GATA family of transcription factors. We have called this gene elt-1 for erythroidlike transcription factor since this family of transcription factors in vertebrates was originally identified by its importance in the regulation of erythroid-specific genes. This highly conserved region has been shown to be the DNA-binding domain in the mouse GATA-1 gene (37). Outside of the conserved DNA-binding domain, the elt-1 protein is rich in Ser, Thr, and Asn but does not closely resemble any of the GATA proteins or any other known transcription factor. The lengths of the C. elegans elt-1-encoded polypeptide chains on both the N-terminal and C-terminal sides of the DNA-binding domain are very similar to those of human and mouse GATA-1 as well as GATA-2 and GATA-3 from chickens. Some residues fall in identical positions in the vertebrate and nematode proteins, suggesting the alignment shown in Fig. 5. elt-1 shows somewhat less similarity to GATA-2 and GATA-3, such that we were unable to make a convincing alignment between elt-1 and these two proteins outside the DNA-binding domains (data not shown). It is also worth noting that the N-terminal half of elt-1 is very low in charged residues. Strikingly, four of the five introns in elt-1 are in nearly or exactly the same locations as introns in the vertebrate genes, which supports the alignment in Fig. 5 (12a, 43a). GATA-1 contains an intron in the 5' untranslated region, whereas most of the 5' untranslated region of elt-1 is composed of the trans-spliced SL1.

elt-1 may be autogenously regulated. Although the start site of transcription is unknown because the elt-1 mRNA is trans-spliced to SL1, an examination of the genomic sequence upstream of the AUG reveals several characteristics of a promoter region (Fig. 6). There is a good match to the consensus TATA box at -121 upstream of the trans-splice site and multiple repeated sequence elements that may be involved in the control of elt-1 expression. We have identified six different sequence elements that are repeated between two and eight times in the 510 bp upstream of the trans-splice site (Fig. 6). Some are direct repeats, others are inverted repeats, and some exist as both. The most numerous, and perhaps the most interesting, is the hexameric GATA-binding site, (A/T)GATA(A/G). Perfect matches to this consensus are present eight times in 220 bp. If elt-1 binds GATA sites as is expected on the basis of its highly conserved DNA-binding domain, this finding suggests the interesting possibility that elt-1 is autogenously regulated, as has been proposed for the mouse and chicken GATA-1 genes (21, 57). There are other GATA-binding sites in the 5,050 bp of genomic sequence obtained, but none are clustered as they are in the presumptive promoter region (data not shown). The base composition of the entire presumptive promoter region is highly asymmetric. The DNA strand shown in Fig. 6 is extremely rich in pyrimidines. This type of asymmetry is not typical of other C. elegans promoters that have been analyzed. However, its significance is unknown.

DISCUSSION

In this study we have shown that in C. elegans there is an embryonically expressed gene homologous to the vertebrate genes encoding the GATA family of transcription factors. This relatedness is based on very high amino acid identity in the DNA-binding domain. GATA-1 is primarily an erythroid-specific transcription factor that binds the core consensus sequence, GATA, within regulatory regions of the globin gene family (10, 16, 19, 38, 45, 58, 59), as well as the erythroid-specific chicken H5 gene (55) and the human porphobilinogen deaminase gene (41, 48). Like GATA-1, GATA-2 and GATA-3 bind the core consensus sequence and are also expressed in erythroid cells. In addition, they are present in nonerythroid tissues (23, 32, 63). Since this C. elegans gene is related to the GATA family of proteins, which were originally identified because of the important role of GATA-1 in erythroid-specific transcription, we have named it elt-1, for erythroidlike transcription factor.

We selected elt-1 by its homology to the vertebrate GATA-1 genes over a small region that encodes part of a highly conserved DNA-binding domain (37). A zinc finger motif has been identified in numerous DNA-binding regulatory proteins (26, 31), ranging from mammalian hormone receptors (14) to yeast gene regulatory proteins (40), but the zinc finger regions of the GATA family constitute a unique subclass. The GATA-1 DNA-binding domain has a two-zinc-finger motif of the form Cys-X2-Cys-X17-Cys-X2-Cys. The DNA-binding domain of GATA-1 consists of two highly similar repeated sequences, each containing, but not limited to, one of the zinc fingers (37). Single zinc finger DNA-binding regulatory proteins from N. crassa (18) and A. nidulans (34) are also quite similar to the second (carboxy) finger of the vertebrate proteins, but they are not as closely related as is elt-1. These fungal DNA-binding proteins recognize the same core consensus sequence as the vertebrate proteins (17). Thus, all of these proteins appear to be members of a family of DNA-binding proteins with a highly conserved zinc finger motif that recognizes and binds a very similar or identical DNA sequence. The elt-1 protein also has two highly similar repeated sequences, each containing one zinc finger. The entire conserved DNA-binding domain has 73% (78 of 107) amino acid identity with all the known vertebrate GATA proteins. Given the demonstration that the DNA-binding specificity resides in the zinc finger domain (37), as well as the high conservation of the finger domains in


FIG. 5. Alignment of the elt-J protein with human, mouse, and chicken GATA-1. The sequences of elt-J and human, mouse, and chicken GATA-1 (15, 55, 56), using the single-letter amino acid code, are aligned. The numbers at the right indicate the residue numbers. The downward-pointing arrowheads indicate the positions of introns in elt-J, whereas the upward-pointing arrowheads indicate the positions of introns in mouse GATA-1. The dark line underneath the sequences shows the highly conserved DNA-binding domain. The lines above the sequences show the positions of the zinc fingers with each Cys-X-X-Cys pair indicated by brackets. Identities between nematode and vertebrate proteins are marked by shaded boxes.
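The zinc finger spacing given in the text, Cys-X2-Cys-X17-Cys-X2-Cys, can be expressed as a simple pattern search. The following Python fragment is a minimal sketch under stated assumptions: the function name is hypothetical, and the test peptide is a synthetic string built only to satisfy the spacing, not a sequence from elt-1 or any GATA protein.

```python
import re

# Cys-X2-Cys-X17-Cys-X2-Cys, where X is any residue (single-letter code).
ZINC_FINGER = re.compile(r"C.{2}C.{17}C.{2}C")


def find_zinc_fingers(protein):
    """Return (start, matched_text) for each non-overlapping motif match.

    `protein` is a single-letter amino acid string; positions are 0-based.
    """
    return [(m.start(), m.group(0)) for m in ZINC_FINGER.finditer(protein.upper())]


# Synthetic peptide (an assumption for illustration only).
toy_peptide = "MK" + "CAAC" + "G" * 17 + "CAAC" + "LL"
fingers = find_zinc_fingers(toy_peptide)
```

A search of this kind finds only the cysteine spacing; it says nothing about the conservation of the residues between the cysteines, which is what distinguishes the GATA subclass in the alignment.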

proteins from fungi to mammals, all of which recognize the same core sequence, it seems highly likely that the elt-1 protein is also a DNA-binding regulatory protein that will recognize the same core sequence. The presumptive promoter region of elt-1 contains several short, repeated sequences that may be involved in its regulation. The most conspicuous of these are the GATA elements, which are present eight times in the 294 bp upstream of the TATA box. Three of the GATA elements are of the double or overlapping variety, resembling the GATA motif in the promoter of the mouse GATA-1 gene that is required for full promoter activity (57). If elt-1 binds GATA elements


CATCATTGATGCTCTTCATATTTCACAACCAATTGAGGTACAATATTATGTTCTCTGCAA -607
CAAGTCTTATTCTAGACTTCTCTATCCCACTCTCAGCCAGCAACGTCCGCCTCCATGGCA -547
AGTCGATTGGACGATGGGTGACAATTAGCTGGATGGGAAGACGAGAAGGGAGGAGGCCGG -487
CCCCGCCTCTCGTCTCGCCTTGCCGTGTGTCTGATTCAGTCCACATTGAAAAGGACCGTA -427
TCTAGAAACCC TA CC tGAT CTTGAAAACTACCACAACTACTCAGGG -367
TGGCCCTGAAGGATGCAAAAGGAAAACCCCCGATCTGCAATTTCGATCTGTAACTTGATC -307
TAATTTTCTTACiGGTTGGTCCCCTCTCTCATTTAATCTCAACTTTCGATGACTTG -247
CT TTATCTCAAGCCGCAGATG ATTAC U:W4 ,TATGGC -187
GCCGTGCCCTTTTCGTTCCCACGTTCCCTTTCCTCTCATATCTCTGCACACTCCTCTATT -127
TCTTCATATAATTCTGCTATTCGGGTTCCTGTGCTTGTGCTCTTAAGCCTTTTTCTGTTT -67
CACTTCTCGTCCTTCCTTTACTCTACTCTAATCCATCAACTCTCTTCAATAACTTTTTCT -7
TTTTCA CAGGAAAACATG

FIG. 6. Sequence of the elt-1 5'-flanking DNA. The numbering at the right indicates the position upstream of the trans-splice site, which is shown by the single bracket 9 bases upstream of the AUG initiator methionine codon. The AUG and the TATA box are doubly underlined. The (A/T)GATA(A/G) elements are enclosed in boxes. Those that are also VPE2 elements are shaded, with the shading extending outside the boxed region to include the base that forms part of the VPE2. Other repeated elements are underlined with different styles of lines for different sequences.
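Elements of the kind boxed in Fig. 6 could be located computationally along the following lines. This Python fragment is a minimal sketch, not the procedure used for the figure: it scans a sequence for the (A/T)GATA(A/G) consensus on the given strand and for its reverse complement, (C/T)TATC(A/T), and uses a lookahead so that overlapping elements are all reported. The function name and the short test sequence are assumptions for illustration; the test sequence is not taken from the elt-1 5'-flanking DNA.

```python
import re

# (A/T)GATA(A/G) on the given strand; the lookahead allows overlapping hits.
FORWARD = re.compile(r"(?=([AT]GATA[AG]))")
# Reverse complement of the same consensus, i.e. a site on the opposite strand.
REVERSE = re.compile(r"(?=([CT]TATC[AT]))")


def find_gata_sites(seq):
    """Return sorted (position, strand, matched_text) tuples, 0-based positions."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in FORWARD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in REVERSE.finditer(seq)]
    return sorted(hits)


# Hypothetical test sequence with one site on each strand.
sites = find_gata_sites("CCTGATAAGGTTATCTCC")
# sites == [(2, "+", "TGATAA"), (10, "-", "TTATCT")]
```

Positions are reported relative to the start of the input string; converting them to coordinates relative to the trans-splice site, as in the figure, would require the offset of the input within the flanking sequence.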

as expected on the basis of its highly conserved DNA-binding domain, it is likely that elt-1 also regulates its own expression, as has been proposed for the mouse and chicken GATA-1 genes (21, 57). Autoregulation may be a common feature of DNA-binding regulatory proteins. It has been suggested that positive autoregulation of the segmentation gene fushi tarazu amplifies the expression of its own gene product (22). Autoregulation of the homeotic gene deformed, which specifies the mandibular and maxillary segments, provides for its own continued expression (35). The c-jun gene (1) and the myogenic determination genes MyoD1 and myogenin (6, 54) have also been shown to activate their own promoters.

We isolated elt-1 with the expectation that it might be involved in the highly regulated expression of the vitellogenin genes. Comparative sequence analysis of the vit promoter regions revealed the presence of the VPE2 element in a limited region (-90 to -150) of all 11 genes analyzed (51, 65). Subsequent mutational analysis of the two VPE elements in the vit-2 promoter demonstrated that at least one of these is required for high-level expression of vit-2 in transgenic worms (35a). The VPE2 consensus, CTGATAA, has a GATA core and is consistent at every position with the GATA-binding sequence (A/T)GATA(A/G). Thus, it seems reasonable that the elt-1 protein is capable of binding to the VPE2 sequence. However, it now seems less likely that the elt-1 protein binds VPE2 in vivo and participates in the regulation of vitellogenin expression, since elt-1 mRNA is present predominantly or exclusively in embryos. Likewise, the temporal pattern of expression of elt-1 suggests that its protein product is not involved in the regulation of either tra-2 (43) or the msp genes (28), both of which also have VPE2 (GATA) elements upstream of their TATA boxes. Perhaps a different protein recognizes the VPE2 sequences in the vit promoters and upstream of tra-2 and the msp genes, while the elt-1 product regulates gene expression during embryogenesis. Although it appears that elt-1 is the only C. elegans gene related to the GATA family of transcription factors, additional GATA family members not revealed by our low-stringency genomic Southern blots may nevertheless exist. Furthermore, it is possible that the GATA element in these genes interacts with regulatory proteins unrelated to the GATA family. In either case, elt-1 appears to perform quite a different function from the


primary function of GATA-1, since there is no lineage in C. elegans with obvious homology to the vertebrate erythroid lineage.
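The statement in the discussion that the VPE2 consensus, CTGATAA, is consistent at every position with the GATA-binding sequence (A/T)GATA(A/G) can be checked by enumerating the concrete sequences that the degenerate site allows. The following Python fragment is a minimal sketch of that check; the variable and function names are hypothetical, and only the two consensus sequences are taken from the text.

```python
from itertools import product

# One string of allowed bases per position of (A/T)GATA(A/G).
GATA_SITE = ["AT", "G", "A", "T", "A", "AG"]
VPE2 = "CTGATAA"


def expand(positions):
    """List every concrete sequence permitted by a degenerate site."""
    return ["".join(bases) for bases in product(*positions)]


variants = expand(GATA_SITE)
# variants == ["AGATAA", "AGATAG", "TGATAA", "TGATAG"]
embedded = [v for v in variants if v in VPE2]
# embedded == ["TGATAA"]
```

Under this enumeration, exactly one of the four permitted hexamers, TGATAA, occurs within VPE2, which is the sense in which the two consensus sequences agree.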


ACKNOWLEDGMENTS

We thank Andy Fire for the clone of the ribosomal protein gene rp21, Bob Barstead for the Lambda Zap cDNA library, John Karn for the genomic library, Patricia Kuwabara and Judith Kimble for sharing their observation of the presence of a VPE2 element upstream of tra-2, Doug Engel and Stuart Orkin for their data on the locations of introns in the vertebrate GATA genes, and Lawrence Washington of the Institute for Molecular and Cellular Biology at Indiana University for synthesizing oligonucleotides. Finally, we thank all the members of our laboratory for their helpful suggestions and stimulating discussions concerning this work. This research was supported by U.S. Public Health Service grant GM30870 from the NIGMS.

REFERENCES

1. Angel, P., K. Hattori, T. Smeal, and M. Karin. 1988. The jun proto-oncogene is positively autoregulated by its product, Jun/AP-1. Cell 55:875-885.
2. Behringer, R. R., R. E. Hammer, R. L. Brinster, R. D. Palmiter, and T. M. Townes. 1987. Two 3' sequences direct adult erythroid-specific expression of human β-globin genes in transgenic mice. Proc. Natl. Acad. Sci. USA 84:7056-7060.
3. Behringer, R. R., T. M. Ryan, R. D. Palmiter, R. L. Brinster, and T. M. Townes. 1990. Human γ- to β-globin gene switching in transgenic mice. Genes Dev. 4:380-389.
4. Blumenthal, T., M. Squire, S. Kirtland, J. Cane, M. Donegan, J. Spieth, and W. Sharrock. 1984. Cloning of a yolk protein gene family from Caenorhabditis elegans. J. Mol. Biol. 174:1-18.
5. Blumenthal, T., and J. Thomas. 1988. cis and trans mRNA splicing in C. elegans. Trends Genet. 4:305-308.
6. Braun, T., E. Bober, G. Buschhausen-Denkar, S. Kotz, K.-H. Grzeschik, and H. H. Arnold. 1989. Differential expression of myogenic determination genes in muscle cells: possible autoactivation by the Myf gene. EMBO J. 8:3617-3625.
7. Brenner, S. 1974. The genetics of Caenorhabditis elegans. Genetics 77:71-94.
8. Choi, O.-R., and J. D. Engel. 1986. A 3' enhancer is required for temporal and tissue-specific transcriptional activation of the chicken adult β-globin gene. Nature (London) 323:731-734.
9. Conrad, R., J. Thomas, J. Spieth, and T. Blumenthal. 1991. Insertion of part of an intron into the 5' untranslated region of a Caenorhabditis elegans gene converts it into a trans-spliced gene. Mol. Cell. Biol. 11:1921-1926.
9a. Coulson, A. R., and J. E. Sulston. Personal communication.
10. deBoer, E., M. Antoniou, V. Mignotte, L. Wall, and F. Grosveld. 1988. The human β-globin promoter: nuclear protein factors and erythroid specific induction of transcription. EMBO J. 7:4203-4212.
11. Duby, A., K. A. Jacobs, and A. Celeste. 1988. Using synthetic oligonucleotides as probes, p. 6.4.1-6.4.10. In F. M. Ausubel, R. Brent, R. E. Kingston, D. D. Moore, J. A. Smith, J. G. Seidman, and K. Struhl (ed.), Current protocols in molecular biology. John Wiley & Sons, Inc., New York.
12. Emmons, S. 1988. The genome, p. 47-79. In W. Wood (ed.), The nematode Caenorhabditis elegans. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
12a. Engel, J. D. Personal communication.
13. Enver, T., N. Raich, A. J. Ebens, T. Papayannopoulou, F. Costantini, and G. Stamatoyannopoulos. 1990. Developmental regulation of human fetal-to-adult globin gene switching in transgenic mice. Nature (London) 344:309-313.
14. Evans, R. M. 1988. The steroid and thyroid hormone receptor superfamily. Science 240:889-895.
15. Evans, T., and G. Felsenfeld. 1989. The erythroid-specific transcription factor Eryf1: a new finger protein. Cell 58:877-885.
16. Evans, T., M. Reitman, and G. Felsenfeld. 1988. An erythrocyte-specific DNA-binding factor recognizes a regulatory sequence common to all chicken globin genes. Proc. Natl. Acad. Sci. USA 85:5976-5980.
16a. Fire, A. Personal communication.
17. Fu, Y.-H., and G. A. Marzluf. 1990. nit-2, the major positive-acting nitrogen regulatory gene of Neurospora crassa, encodes a sequence-specific DNA-binding protein. Proc. Natl. Acad. Sci. USA 87:5331-5335.
18. Fu, Y.-H., and G. A. Marzluf. 1990. nit-2, the major nitrogen regulatory gene of Neurospora crassa, encodes a protein with a putative zinc finger DNA-binding domain. Mol. Cell. Biol. 10:1056-1065.
19. Galson, D. L., and D. E. Housman. 1988. Detection of two tissue-specific DNA-binding proteins with affinity for sites in the mouse β-globin intervening sequence 2. Mol. Cell. Biol. 8:381-392.
20. Grosveld, F., G. Blom van Assendelft, D. R. Greaves, and G. Kollias. 1987. Position-independent, high-level expression of the human β-globin gene in transgenic mice. Cell 51:975-985.
21. Hannon, R., T. Evans, G. Felsenfeld, and H. Gould. 1991. Structure and promoter activity of the gene for the erythroid transcription factor GATA-1. Proc. Natl. Acad. Sci. USA 88:3004-3008.
22. Hiromi, Y., and W. J. Gehring. 1987. Regulation and function of the Drosophila segmentation gene fushi tarazu. Cell 50:963-974.
23. Ho, I.-C., P. Vorhees, N. Marin, B. K. Oakley, S.-F. Tsai, S. H. Orkin, and J. M. Leiden. 1991. Human GATA-3: a lineage-restricted transcription factor that regulates the expression of the T cell receptor α gene. EMBO J. 10:1187-1192.
24. Hodgkin, J. A., and S. Brenner. 1977. Mutations causing transformation of sexual phenotype in the nematode Caenorhabditis elegans. Genetics 86:275-287.
25. Huang, X.-Y., and D. Hirsh. 1989. A second trans-spliced RNA leader sequence in the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA 86:8640-8644.
26. Johnson, P. F., and S. L. McKnight. 1989. Eukaryotic transcriptional regulatory proteins. Annu. Rev. Biochem. 58:799-839.
27. Klass, M., D. Ammons, and S. Ward. 1988. Conservation of the 5' flanking sequences of transcribed members of the Caenorhabditis elegans major sperm protein gene family. J. Mol. Biol. 199:15-22.
28. Klass, M., B. Dow, and M. Herndon. 1982. Cell-specific transcriptional regulation of the major sperm protein in Caenorhabditis elegans. Dev. Biol. 93:152-164.
29. Klass, M. R., and D. Hirsh. 1981. Sperm isolation and biochemical analysis of the major sperm protein from Caenorhabditis elegans. Dev. Biol. 84:299-312.
30. Klass, M., N. Wolf, and D. Hirsh. 1976. Development of the male reproductive system and sexual transformation in the nematode Caenorhabditis elegans. Dev. Biol. 52:1-18.
31. Klug, A., and D. Rhodes. 1987. "Zinc fingers": a novel protein motif for nucleic acid recognition. Trends Biochem. Sci. 12:464-469.
32. Ko, L. J., M. Yamamoto, M. W. Leonard, K. M. George, P. Ting, and J. D. Engel. 1991. Murine and human T-lymphocyte GATA-3 factors mediate transcription through cis-regulatory elements within the human T-cell receptor δ gene enhancer. Mol. Cell. Biol. 11:2778-2784.
33. Krause, M., and D. Hirsh. 1987. A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 49:753-761.
34. Kudla, B., M. X. Caddick, T. Langdon, N. M. Martinez-Rossi, C. F. Bennett, S. Sibley, R. W. Davies, and H. N. Arst, Jr. 1990. The regulatory gene areA mediating nitrogen metabolite repression in Aspergillus nidulans: mutations affecting specificity of gene activation alter a loop residue of a putative zinc finger. EMBO J. 9:1355-1364.
34a. Kuwabara, P., and J. Kimble. Personal communication.
35. Kuziora, M. A., and W. McGinnis. 1988. Autoregulation of a Drosophila homeotic selector gene. Cell 55:477-485.
35a. MacMorris, M., S. Greenspoon, S. Broverman, T. Blumenthal, and J. Spieth. Unpublished data.
36. Maniatis, T., E. F. Fritsch, and J. Sambrook. 1982. Molecular cloning: a laboratory manual, p. 122-126. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
37. Martin, D. I. K., and S. H. Orkin. 1990. Transcriptional activation and DNA binding by the erythroid factor GF-1/NF-E1/Eryf 1. Genes Dev. 4:1886-1898.
38. Martin, D. I. K., S.-F. Tsai, and S. H. Orkin. 1989. Increased γ-globin expression in a nondeletion HPFH mediated by an erythroid-specific DNA-binding factor. Nature (London) 338:435-438.
39. Mead, D. A., E. Szczesna-Skorupa, and B. Kemper. 1986. Single-stranded DNA 'blue' T7 promoter plasmids: a versatile tandem promoter system for cloning and protein engineering. Protein Eng. 1:67-74.
40. Messenguy, F., E. Dubois, and F. Descamps. 1986. Nucleotide sequence of the ARGRII regulatory gene and amino acid sequence homologies between ARGRII, PPRI and GAL4 regulatory proteins. Eur. J. Biochem. 157:77-81.
41. Mignotte, V., L. Wall, E. deBoer, F. Grosveld, and P.-H. Romeo. 1989. Two tissue-specific factors bind the erythroid promoter of the human porphobilinogen deaminase gene. Nucleic Acids Res. 17:37-54.
42. Mitchell, P., and R. Tjian. 1989. Transcriptional regulation in mammalian cells by sequence-specific DNA binding proteins. Science 245:371-378.
43. Okkema, P. G., and J. Kimble. 1991. Molecular analysis of tra-2, a sex determining gene in C. elegans. EMBO J. 10:171-176.
43a. Orkin, S. H. Personal communication.
44. Orkin, S. H. 1990. Globin gene regulation and switching: circa 1990. Cell 63:665-672.
45. Perkins, N. D., R. H. Nicolas, M. A. Plumb, and G. H. Goodwin. 1989. The purification of an erythroid protein which binds to enhancer and promoter elements of haemoglobin genes. Nucleic Acids Res. 17:1299-1314.
46. Pevny, L., M. C. Simon, E. Robertson, W. H. Klein, S.-F. Tsai, V. D'Agati, S. H. Orkin, and F. Costantini. 1991. Erythroid differentiation in chimaeric mice blocked by a targeted mutation in the gene for transcription factor GATA-1. Nature (London) 349:257-260.
47. Philipsen, S., D. Talbot, P. Fraser, and F. Grosveld. 1990. The β-globin dominant control region: hypersensitive site 2. EMBO J. 9:2159-2167.
48. Plumb, M., J. Frampton, H. Wainwright, M. Walker, K. Macleod, G. Goodwin, and P. Harrison. 1989. GATAAG; a cis-control region binding an erythroid-specific nuclear factor with a role in globin and non-globin gene expression. Nucleic Acids Res. 17:73-91.
49. Ptashne, M. 1988. How eukaryotic transcriptional activators work. Nature (London) 335:683-689.
50. Saiki, R. K., D. H. Gelfand, S. Stoffel, S. J. Scharf, R. Higuchi, G. T. Horn, K. B. Mullis, and H. A. Erlich. 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-489.
51. Spieth, J., K. Denison, S. Kirtland, J. Cane, and T. Blumenthal. 1985. The C. elegans vitellogenin genes: short sequence repeats in the promoter regions and homology to the vertebrate genes. Nucleic Acids Res. 13:5283-5295.
52. Spieth, J., M. MacMorris, S. Broverman, S. Greenspoon, and T. Blumenthal. 1988. Regulated expression of a vitellogenin fusion gene in transgenic nematodes. Dev. Biol. 130:285-293.
53. Sulston, J. E., and S. Brenner. 1974. The DNA of Caenorhabditis elegans. Genetics 77:95-104.
54. Thayer, M. J., S. J. Tapscott, R. L. Davis, W. E. Wright, A. B. Lassar, and H. Weintraub. 1989. Positive autoregulation of the myogenic determination gene MyoD1. Cell 58:241-248.
55. Trainor, C. D., T. Evans, G. Felsenfeld, and M. S. Boguski. 1990. Structure and evolution of a human erythroid transcription factor. Nature (London) 343:92-96.
56. Tsai, S.-F., D. I. K. Martin, L. I. Zon, A. D. D'Andrea, G. G. Wong, and S. H. Orkin. 1989. Cloning of cDNA for the major DNA-binding protein of the erythroid lineage through expression in mammalian cells. Nature (London) 339:446-451.
57. Tsai, S.-F., E. Strauss, and S. H. Orkin. 1991. Functional analysis and in vivo footprinting implicate the erythroid transcription factor GATA-1 as a positive regulator of its own promoter. Genes Dev. 5:919-931.
58. Wall, L., E. deBoer, and F. Grosveld. 1988. The human β-globin gene 3' enhancer contains multiple binding sites for an erythroid-specific protein. Genes Dev. 2:1089-1100.
59. Watt, P., P. Lamb, L. Squire, and N. Proudfoot. 1990. A factor binding GATAAG confers tissue specificity on the promoter of the human ζ-globin gene. Nucleic Acids Res. 18:1339-1350.
60. Wilson, D. B., D. M. Dorfman, and S. H. Orkin. 1990. A nonerythroid GATA-binding protein is required for function of the human preproendothelin-1 promoter in endothelial cells. Mol. Cell. Biol. 10:4854-4862.
61. Wilusz, J., S. M. Pettine, and T. Shenk. 1989. Functional analysis of point mutations in the AAUAAA motif of the SV40 late polyadenylation signal. Nucleic Acids Res. 17:3899-3908.
62. Wood, W. B. 1988. The nematode Caenorhabditis elegans, p. 603-604. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
63. Yamamoto, M., L. J. Ko, M. W. Leonard, H. Beug, S. H. Orkin, and J. D. Engel. 1990. Activity and tissue-specific expression of the transcription factor NF-E1 multigene family. Genes Dev. 4:1650-1662.
64. Zon, L. I., S.-F. Tsai, S. Burgess, P. Matsudaira, G. A. P. Bruns, and S. H. Orkin. 1990. The major human erythroid DNA-binding protein (GF-1): primary sequence and localization of the gene to the X chromosome. Proc. Natl. Acad. Sci. USA 87:668-672.
65. Zucker-Aprison, E., and T. Blumenthal. 1989. Potential regulatory elements of nematode vitellogenin genes revealed by interspecies sequence comparison. J. Mol. Evol. 28:487-496.
