MOLECULAR AND CELLULAR BIOLOGY, Jan. 1992, p. 38-44

Vol. 12, No. 1

0270-7306/92/010038-07$02.00/0 Copyright © 1992, American Society for Microbiology

Cloning of a Negative Transcription Factor That Binds to the Upstream Conserved Region of Moloney Murine Leukemia Virus

JAMES R. FLANAGAN,1,2 KEVIN G. BECKER,2 DAVID L. ENNIST,2 SHANNON L. GLEASON,2 PAUL H. DRIGGERS,2 BEN-ZION LEVI,2 ETTORE APPELLA,3 AND KEIKO OZATO2*

Department of Internal Medicine, University of Iowa, Iowa City, Iowa 52242,1 and Laboratory of Developmental and Molecular Immunity, National Institute of Child Health and Human Development,2 and Laboratory of Cell Biology, National Cancer Institute,3 Bethesda, Maryland 20892

Received 2 July 1991/Accepted 26 September 1991

The long terminal repeat of Moloney murine leukemia virus (MuLV) contains the upstream conserved region (UCR). The UCR core sequence, CGCCATTTT, binds a ubiquitous nuclear factor and mediates negative regulation of MuLV promoter activity. We have isolated murine cDNA clones encoding a protein, referred to as UCRBP, that binds specifically to the UCR core sequence. Gel mobility shift assays demonstrate that the UCRBP fusion protein expressed in bacteria binds the UCR core with specificity identical to that of the UCR-binding factor in the nucleus of murine and human cells. Analysis of full-length UCRBP cDNA reveals that it has a putative zinc finger domain composed of four C2H2 zinc fingers of the GLI subgroup and an N-terminal region containing alternating charges, including a stretch of 12 histidine residues. The 2.4-kb UCRBP message is expressed in all cell lines examined (teratocarcinoma, B- and T-cell, macrophage, fibroblast, and myocyte), consistent with the ubiquitous expression of the UCR-binding factor. Transient transfection of an expressible UCRBP cDNA into fibroblasts results in down-regulation of MuLV promoter activity, in agreement with previous functional analysis of the UCR. Recently three groups have independently isolated human and mouse UCRBP. These studies show that UCRBP binds to various target motifs that are distinct from the UCR motif: the adeno-associated virus P5 promoter and elements in the immunoglobulin light- and heavy-chain genes, as well as elements in ribosomal protein genes. These results indicate that UCRBP has unusually diverse DNA-binding specificity and as such is likely to regulate expression of many different genes.

Expression of type C retroviruses, including the murine leukemia viruses (MuLV), is regulated by various sequence motifs in the long terminal repeat (LTR), which are conserved in both the replication-defective endogenous and the exogenous/infectious viruses (14, 46, 47, 53). The 3' end of the LTR contains the promoter elements, the CCAAT and the TATA motifs. The middle part of the LTR contains at least six known cis motifs, and these function as a strong enhancer (27, 46). This middle region may also be involved in negative regulation of viral transcription in undifferentiated embryonal carcinoma (EC) cells (17). The 5' part of the LTR of both the infectious and the MuLV-related endogenous defective retroviruses contains the upstream conserved region (UCR), whose core motif is CGCCATTTT (the location of which is shown in Fig. 1). The UCR and the other downstream motifs are conserved in over 90% of more than 35 mammalian type C retrovirus isolates (13, 15, 24). We have previously shown that the UCR binds a ubiquitous nuclear factor(s) and that in murine L cells, the UCR and its binding factor(s) are involved in negative regulation of MuLV promoter activity (13). In addition to our work, the UCR-binding activity also has been reported in a study of developmental regulation of MuLV in undifferentiated F9 EC cells (51). These authors found that the UCR-binding activity occurs irrespective of retinoic acid-induced differentiation of F9 cells and that developmental control of MuLV is mediated by an element independent of the UCR.

With the aim of further studying the structure and function of the UCR-binding factor, we have cloned a gene encoding a UCR-binding protein (UCRBP) by screening several expression libraries with multimerized UCR. Here we report the isolation and initial characterization of a full-length UCRBP cDNA that encodes a C2H2 zinc finger protein and is capable of specifically binding to the UCR core motif. We show that transient transfection of an expressible UCRBP cDNA leads to specific down-regulation of MuLV promoter activity. The UCRBP sequence is essentially the same as that of δ, cloned by Hariharan et al. (20), and those of two human proteins, NF-E1 and YY-1, simultaneously cloned by Park and Atchison (37) and Shi et al. (43), respectively. It is significant that the same protein is shown to bind multiple regulatory elements unrelated to the UCR and to each other. These findings are discussed in the context of the functional diversity expected of UCRBP.

MATERIALS AND METHODS

Oligonucleotides. All oligonucleotides were synthesized on an Applied Biosystems 380B synthesizer. The upper strands of the double-stranded oligonucleotides used for assessing the binding specificity of UCRBP are shown in Fig. 2. These oligomers had linker sequences to facilitate concatenation. Region I, region II, and ICS are from the regulatory elements of the murine major histocompatibility complex (MHC) class I genes (9, 44). Complementary oligonucleotides were annealed at a concentration of 0.4 mg/ml in a buffer containing 500 mM KCl, 50 mM Tris-HCl (pH 8.0), and 1 mM EDTA and were labeled with 32P by using polynucleotide kinase or DNA polymerase (Klenow fragment). Labeled probes were concatenated by using T4 DNA ligase as described previously (9).

Library screening. λZapII expression libraries (Stratagene) were constructed from poly(A)+ RNA prepared from

* Corresponding author.

VOL. 12, 1992

MuLV DNA-BINDING FACTOR

FIG. 1. The UCR and its position in the U3 region of MuLV. The UCR is located in Rl of the LTRs and is conserved in more than 35 type C mammalian retrovirus isolates (13, 15).

F9 EC cells, mouse adult spleens, and several mouse neonatal tissues. In these λZapII libraries, insert proteins are expressed as a β-galactosidase fusion protein in which the N-terminal β-galactosidase moiety is about 4.3 kDa. Two million clones from unamplified libraries were screened by the method of Vinson et al. (52) with modifications (9). Briefly, 5,000 plaques in a 150-mm petri dish were induced with isopropyl-β-D-thiogalactopyranoside (IPTG) soaked in nitrocellulose filters. The filters were treated with guanidine-HCl, which was then removed in stages, and then were incubated with a 32P-labeled and concatenated UCR oligonucleotide probe (5 × 10^5 cpm/ml) in binding buffer (40 mM KCl, 20 mM N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid [HEPES; pH 7.9], 3 mM MgCl2) supplemented with 0.5% Tween 20, washed, and autoradiographed.

Sequencing. Additional cDNA clones were isolated by screening the F9 cell library (see above) with the 1.2-kb insert of clone W78, isolated as described above. Seven overlapping cDNA clones were sequenced in both directions by the Sanger dideoxy method, using sequence-specific oligonucleotide primers. To avoid sequence compression caused by the high G+C content of the insert, some sequence reactions were carried out by using deoxyinosine in place of deoxyguanosine (2, 18).

UCRBP fusion protein expressed in bacteria. The 1.2-kb insert from clone W78, which produced a UCRBP, was subcloned into pBluescript. Two subclones were prepared,

[Figure 2: alignment of the upper strand of the UCR oligonucleotide (CTGCAGTAACGCCATTTTGCAAGGCAT) with its mutated versions K328, K330, K332, K334, K336, and K338 and with the control oligonucleotides region I, region II, and ICS.]

FIG. 2. Oligonucleotides used for assessing the DNA-binding specificity of UCRBP. Upper strands of double-stranded oligomers are shown. Nucleotides with asterisks indicate the UCR core. K328 through K338 are mutated versions of the UCR oligonucleotide. Dotted positions had nucleotides identical to those of the UCR. Region I, region II, and ICS (40) are cis elements of MHC class I genes and were used as controls. Gel mobility shift experiments were performed with monomeric double-stranded oligomers, while screening and filter binding assays were performed with oligomers concatenated through the 5' overhang sequences as underlined.

SC10 and SC7, containing the UCRBP insert in the correct and reverse orientations, respectively. Escherichia coli strain XL1-Blue (Stratagene) harboring SC10 or SC7 was grown and induced by IPTG, and bacterial pellets were incubated in buffer containing 40 mM HEPES (pH 8.1), 0.2 mM EDTA, 1 mM dithiothreitol, 1 mM phenylmethylsulfonyl fluoride, 1 mM sodium metabisulfite, and 1 mg of lysozyme per ml in 25% sucrose at 4°C for 60 min. Then guanidine-HCl was added to this mixture at a final concentration of 6 M, and the mixture was incubated further at room temperature for 90 min. Lysates were centrifuged at 63,000 × g for 1 h, and the supernatants were dialyzed against binding buffer (see above) with 20% glycerol, with concentrations of guanidine-HCl decreasing in steps to zero. Extracts were frozen in aliquots at -70°C until use.

Gel mobility shift analysis. Detailed conditions for gel mobility shift experiments were described previously (13). Monomers of the double-stranded oligonucleotides corresponding to the UCR (Fig. 2) were labeled with 32P (1.5 × 10^4 cpm) and were incubated with 3 µg of extract proteins and 0.4 µg of poly(dI-dC) (Pharmacia) in 30 µl of buffer containing 50 mM KCl, 20 mM HEPES (pH 7.9), 1 mM MgCl2, 0.2 mM EDTA, 5% glycerol, and 1 mM dithiothreitol for 60 min at 4°C. Reaction mixtures were electrophoresed through a 4% nondenaturing polyacrylamide gel.

Cells and nuclear extract preparation. F9 tk- EC cells, B-cell line P3X63, and T-cell line BW5147 were cultured in Dulbecco's modified Eagle's medium or RPMI 1640 supplemented with 10% fetal bovine serum and gentamicin. F9 cells were induced to differentiate by treatment with 5 × 10^-7 M retinoic acid (all trans; Eastman Kodak) (48). Nuclear extracts from these cells were prepared according to Dignam et al. (8) with modifications.

Northern (RNA) blot analyses. Poly(A)+ RNA was prepared by guanidinium lysis of cells (32), CsCl centrifugation, and oligo(dT)-cellulose separation (41). RNA was separated in formaldehyde agarose gels and transferred onto nylon membranes (41). Blots were hybridized with a 32P-labeled UCRBP probe (5 x 10' cpm/ml) prepared by the random priming method (11) in buffer containing 50% formamide, 1 M NaCl, 1% sodium dodecyl sulfate (SDS), and 10% dextran sulfate at 42°C overnight. Blots were washed at room temperature in 2x SSC (1x SSC is 0.15 M NaCl plus 0.015 M sodium citrate), followed by two additional washes in 0.2x SSC containing 0.1% SDS at 68°C for 30 min each.

UCRBP expression vector and transfection assay. A mammalian expression plasmid containing full-length UCRBP cDNA was made by fusing the EcoRI-PstI fragment from clone 51.2.1 (containing the 5' end of the gene) to the PstI-EcoRI fragment of clone W78 (containing the remainder of the coding sequence), followed by cloning into the multiple cloning site of vector pCDNAI (Invitrogen) in both orientations. The cytomegalovirus promoter upstream of the multiple cloning site drives expression of the insert in the sense orientation in pCMV-UCRBP and in the reverse orientation in pCMV-RUCRBP. As a target reporter gene, we used pMuLVCAT as described previously (31). We used pT81LUC, containing the herpes simplex virus thymidine kinase gene promoter driving the firefly luciferase gene (36), as a control irrelevant target. These constructs were transfected, alone or in combination, into murine L fibroblasts, using the method of calcium phosphate precipitation (19). Cells were harvested 2 days after transfection, and chloramphenicol acetyltransferase (CAT) and luciferase activities were determined as previously described (4, 16).

Nucleotide sequence accession number. The sequence data


FLANAGAN ET AL.

reported here have been assigned GenBank accession number M73963.
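The wash conditions in the Northern blot protocol follow from the stated definition of 1x SSC (0.15 M NaCl plus 0.015 M sodium citrate). As a minimal sketch of that arithmetic, the snippet below scales the 1x recipe to the 2x and 0.2x washes. The function name `ssc_molarity` is illustrative only and is not part of the original methods.

```python
# Composition of 1x SSC as defined in the Northern blot methods.
SSC_1X_MOLAR = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the molar concentration of each SSC component at a given fold strength."""
    if fold <= 0:
        raise ValueError("fold strength must be positive")
    return {component: conc * fold for component, conc in SSC_1X_MOLAR.items()}


if __name__ == "__main__":
    # 2x SSC (room-temperature wash) and 0.2x SSC (68 degree C washes).
    for fold in (2, 0.2):
        mix = ssc_molarity(fold)
        print(f"{fold}x SSC: " + ", ".join(f"{c} {m:.3f} M" for c, m in mix.items()))
```

Lowering the salt concentration tenfold while raising the temperature is what makes the final washes more stringent than the first.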

RESULTS

Isolation of a cDNA clone encoding UCRBP. To isolate cDNA clones encoding a protein that binds to the UCR, we screened λZapII expression libraries prepared from RNA of mouse spleens, neonatal tissues, and F9 EC cells, using the method described above. Approximately 2 million plaques from each λZapII expression library were screened, using concatenated duplex oligonucleotide containing the UCR sequence. We found one cDNA clone, W78 from the library prepared from F9 EC cells, that expressed a protein capable of binding specifically to the UCR. This clone contained 1.2 kb of the UCRBP cDNA (see below) but contained almost all of the open reading frame (amino acids 18 through 414). The DNA-binding specificity of the fusion protein produced in this phage clone was assessed first by the filter binding assay that formed the basis for the screening (52). Purified phages were transferred onto nitrocellulose filters, treated with guanidine-HCl, and incubated with a 32P-labeled concatenated UCR probe under the conditions used for library screening (Materials and Methods). The W78 phage protein bound strongly to the concatenated UCR probes regardless of flanking sequences but failed to bind unrelated probes region I, region II, and ICS (Fig. 2), which corresponded to the regulatory sequences of mouse MHC class I genes (9, 44). Control phages containing no insert or an unrelated sequence, ICSBP (9), did not bind the UCR probe (data not shown).

Specificity of UCRBP binding. DNA-binding specificity was studied in more detail by using the W78 fusion protein expressed by plasmid pBluescript in E. coli XL1-Blue. The pBluescript clone SC10 preserved the orientation of the UCRBP insert, whereas clone SC7 had the insert in the reverse orientation and was used as a control. Bacterial extracts containing products of these clones were tested in gel mobility shift assays with a nonconcatenated double-stranded UCR oligonucleotide with blunt ends as a probe. Extracts from the correctly oriented clone SC10 produced two prominent retarded bands (Fig. 3A). These bands were competed for by a 100-fold molar excess of unlabeled UCR. Extracts from control clone SC7 produced no retarded band. Binding specificity was evaluated with six mutated UCR double-stranded oligonucleotides, each of which contained three base substitutions at a different position (Fig. 2). Mutated double-stranded oligonucleotides 330, 332, and 334, which had substitutions within the UCR core, failed to compete for these retarded bands. On the other hand, mutants 328, 336, and 338, which had substitutions outside the UCR core, retained the ability to compete for the retarded bands. As shown in Fig. 3B, this competition pattern was identical to that seen with nuclear extracts prepared from F9 EC cells. Consistent with these data, we have previously shown that extracts from many other cells give very similar competition patterns (13). Thus, UCRBP binds the UCR core with specificity equal to that of the F9 EC nuclear UCR-binding factor.

Primary sequence of full-length UCRBP cDNA. Sixteen cDNA clones were isolated on the basis of hybridization with the original clone W78. Seven representative cDNA clones with inserts overlapping with W78 and with each other were chosen on the basis of grouping by restriction analysis and were sequenced. Figure 4 shows the DNA and deduced amino acid sequences of the 2,330-bp full-length

FIG. 3. DNA-binding specificity of UCRBP. (A) SC10 bacterial extracts containing the UCRBP fusion protein (3 µg per lane) were tested in gel mobility shift assays using a 32P-labeled UCR oligomer as a probe. The control SC7 extracts were prepared from bacteria harboring the UCRBP insert in the reverse orientation. Unlabeled competitor oligomers were added at a 100-fold molar excess. In addition, each reaction had 0.4 µg of nonspecific competitor poly(dI-dC). (B) Nuclear extracts from F9 cells (8 µg of protein per lane) were tested in gel mobility shift assays with the UCR probe as described above.
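The competition pattern described for the mutated oligonucleotides reduces to a simple rule: a three-base substitution that overlaps the UCR core abolishes competition, while one outside the core does not. The sketch below illustrates that rule on the UCR upper strand as it appears in Fig. 2. The substitution start positions used in the example (3 and 12) are hypothetical and do not correspond to the actual K328 through K338 mutants, and the function names are illustrative.

```python
UCR_OLIGO = "CTGCAGTAACGCCATTTTGCAAGGCAT"  # UCR upper strand as given in Fig. 2
UCR_CORE = "CGCCATTTT"                     # UCR core motif


def core_span(oligo, core=UCR_CORE):
    """Return the half-open (start, end) interval of the core within the oligo (0-based)."""
    start = oligo.find(core)
    if start == -1:
        raise ValueError("core motif not found in oligonucleotide")
    return start, start + len(core)


def substitution_overlaps_core(oligo, sub_start, sub_len=3, core=UCR_CORE):
    """True if a substitution window [sub_start, sub_start + sub_len) touches the core."""
    core_start, core_end = core_span(oligo, core)
    return sub_start < core_end and sub_start + sub_len > core_start


def predicted_to_compete(oligo, sub_start, sub_len=3):
    """Apply the rule from the text: only mutants with an intact core compete."""
    return not substitution_overlaps_core(oligo, sub_start, sub_len)


if __name__ == "__main__":
    print("core interval:", core_span(UCR_OLIGO))
    # Hypothetical substitution positions, chosen only to show both outcomes.
    for hypothetical_start in (3, 12):
        print(hypothetical_start, predicted_to_compete(UCR_OLIGO, hypothetical_start))
```

This is only a positional model of the reported result; it says nothing about partial effects of individual base changes within the core.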

UCRBP cDNA. A long open reading frame begins with the putative ATG codon at nucleotide position 103. This ATG is the likely translation start site, as the sequence flanking this conforms to the consensus sequence (underlined in Fig. 4) for translation initiation proposed by Kozak (26). The open reading frame ends with the stop codon at nucleotide 1345 (asterisk in Fig. 4), indicating that UCRBP is composed of 414 amino acids with an expected molecular mass of 44.7 kDa. There is a 956-bp 3' untranslated region followed by a putative polyadenylation site (double underlined in Fig. 4). Within the open reading frame, the N-terminal region contains two amino acid stretches with strong alternating charges (underlined in Fig. 4). The 11 consecutive Asp and Glu residues (amino acid positions 43 to 53) result in a high local negative charge. This is followed by 12 uninterrupted His residues (amino acid positions 71 to 82), expected to generate a strong local positive charge. A similar polyhistidine tract has been reported for a developmentally controlled homeobox gene ERA-1 (28). The most obvious of the structural characteristics of the UCRBP are the four putative zinc fingers present in the C-terminal part of the insert (amino acid positions 298 to 407; bold and underlined in Fig. 4). These typical C2H2 zinc fingers are composed of two diagnostic amino acid sequence motifs, Cys-XXXX-Cys and His-XXX-His, as well as the conserved intrafinger Phe and Leu. In addition, linker sequences that reside between the second and third and between the third and fourth fingers are similar to the conserved linker motif TGEKPY present in many other C2H2 zinc fingers (5, 33). By analogy to other zinc finger proteins, these zinc fingers are presumably involved in coordinating zinc atoms (29, 35) and most likely represent the DNA-binding domain (24, 29). Ruppert et al. (40) classified C2H2 zinc fingers into three subgroups: C2H2-X5, C2H2-Kruppel, and C2H2-GLI. As shown in Fig. 5, a comparison with other zinc finger proteins on the basis of this classification shows UCRBP zinc fingers to be members of the C2H2-GLI subgroup (25), since there are four amino acids between the conserved Cys and three variable amino acids between the conserved His. Zinc


[Figure 4: nucleotide sequence of the 2,330-bp full-length UCRBP cDNA with the deduced 414-amino-acid translation shown beneath the coding region; the sequence is deposited under GenBank accession number M73963.]

FIG. 4. Primary nucleotide sequence and predicted translation product of the UCRBP full-length cDNA. The putative translation initiation codon within a Kozak consensus sequence is underlined. The N-terminal regions with alternating charges are underlined. The putative zinc fingers are bold and underlined. The stop codon is marked with asterisks. The putative polyadenylation site is double underlined.
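The coordinates quoted for the open reading frame can be checked with a short calculation: an ATG at nucleotide 103 and a stop codon beginning at nucleotide 1345 leave 1,242 coding nucleotides, or 414 codons, in agreement with the stated protein length. The sketch below shows this arithmetic and the mean residue mass implied by the stated 44.7 kDa. The function name `orf_length_codons` is illustrative.

```python
def orf_length_codons(atg_position, stop_position):
    """Number of sense codons between the first base of the ATG and the first base of the stop codon."""
    coding_nt = stop_position - atg_position
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coordinates do not describe an in-frame open reading frame")
    return coding_nt // 3


if __name__ == "__main__":
    residues = orf_length_codons(103, 1345)   # positions quoted in the text
    mean_residue_mass = 44_700 / residues     # Da per residue, from the stated 44.7 kDa
    w78_residues = 414 - 18 + 1               # clone W78 covers amino acids 18 through 414
    print(residues, round(mean_residue_mass, 1), w78_residues)
```

The 397 residues carried by clone W78 correspond to 1,191 coding nucleotides, which accounts for nearly all of its 1.2-kb insert.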

fingers of the C2H2-GLI subgroup are distinct from those of the C2H2-Kruppel group (5, 6, 36), which have two amino acids between the conserved Cys. As seen in Fig. 5, the fingers of CRYBP-1 (34), MBP-1 (1), PRDIIBF-1 (10), SP-1 (22), Egr-1 (49), Zif268 (7), Krox24 (30), and many others (50) are of the Kruppel subgroup. (CRYBP-1, MBP-1, and PRDIIBF-1 are the same gene identified by different groups, as are Egr-1, Zif268, and Krox24.) The UCRBP zinc fingers show 80% amino acid identity with the zinc finger domain of REX-1 (21). Both UCRBP and REX-1 have four fingers, and corresponding fingers are composed of the same numbers of amino acids; the two proteins are thus anticipated to have a very similar overall structure and may have similar DNA-binding specificities. Linker sequences between each finger are also very similar in these proteins. A target DNA sequence for REX-1 has not been reported so far.

RNA expression. Poly(A)+ RNAs from F9 EC cells and from B and T lymphocytes show a predominant UCRBP message of about 2.4 kb (Fig. 6). UCRBP message of the same size was also found in equal abundance in other cell lines (including murine fibroblast, myocyte, and macrophage; data not shown). This size is in accordance with the size of full-length UCRBP (Fig. 4). In the T-cell line, there are fainter bands of higher molecular weight whose identity is not clear. In view of the similarity with REX-1 noted in the


[Figure 5: alignment of individual zinc finger sequences from UCRBP (fingers 1 to 4), REX-1 (21) (fingers 1 to 4), GLI (25) (fingers 1 to 5), CRYBP-1 (34) (fingers 1 and 2), Egr-1 (49) (fingers 1 to 3), and SP-1 (22) (fingers 1 to 3), with the conserved amino acids (C, C, F, L, H, H) and the linker indicated below the alignment.]

FIG. 5. Comparison of the UCRBP finger amino acid sequences with sequences of other C2H2 zinc fingers. The invariant Cys, His, and relatively conserved Phe and Leu residues are set apart by spaces. Positions missing a residue compared with the consensus are marked with a hyphen. On the basis of the spacing between the two Cys residues, UCRBP is classified into the GLI subgroup (37). References are in parentheses. CRYBP-1 (34), MBP-1 (1), PRDIIBF-1 (10), and Egr-1 (49), Zif268 (7), and Krox24 (28) are the same genes.
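The subgroup assignment used in Fig. 5 rests on the number of residues between the two conserved Cys of each finger: four in the C2H2-GLI subgroup and two in the C2H2-Kruppel subgroup. A minimal sketch of that rule follows. The two finger strings are synthetic placeholders built only to show the spacing (they are not sequences from Fig. 5), the function names are illustrative, and the C2H2-X5 subgroup is not modeled because its spacing is not given in the text.

```python
def cys_spacing(finger):
    """Number of residues between the two Cys of a single C2H2 finger."""
    positions = [i for i, residue in enumerate(finger) if residue == "C"]
    if len(positions) != 2:
        raise ValueError("expected exactly two Cys residues in one finger")
    return positions[1] - positions[0] - 1


def classify_finger(finger):
    """Assign a subgroup from the Cys spacing alone, as described in the text."""
    subgroups = {4: "C2H2-GLI", 2: "C2H2-Kruppel"}
    return subgroups.get(cys_spacing(finger), "unclassified")


if __name__ == "__main__":
    # Synthetic placeholder fingers: only the Cys and His positions are meaningful.
    hypothetical_gli_like = "CAAAACAAAFAAAAALAAHAAAH"
    hypothetical_kruppel_like = "CAACAAAFAAAAALAAHAAAH"
    print(classify_finger(hypothetical_gli_like))
    print(classify_finger(hypothetical_kruppel_like))
```

A fuller classifier would also check the His-XXX-His spacing and the conserved Phe and Leu positions mentioned above.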

zinc finger region (Fig. 5), levels of UCRBP mRNA were examined in F9 cells before and after retinoic acid treatment. Expression of REX-1 is developmentally controlled: its mRNA levels were shown to be high in undifferentiated F9 EC cells and to fall precipitously following retinoic acid treatment (21). As seen in Fig. 6, UCRBP mRNA levels did not significantly change following retinoic acid treatment for

FIG. 6. Expression of UCRBP mRNA. Poly(A)+ RNA (3 µg per lane) from murine T (BW5147) or B (P3X63) lymphocytes and from F9 cells before (-, 0) and after retinoic acid (RA) treatment (5 × 10^-7 M) were probed with the 1.2-kb UCRBP insert.

FIG. 7. Down-regulation of MuLV promoter activity by transfected UCRBP. (A) Murine L fibroblasts were transiently transfected with the reporter pMuLVCAT (3 µg) and the expression plasmid pCMV-UCRBP or the negative control plasmid pCMV-RUCRBP (7.5 or 22.5 µg). Lanes C, without expression plasmid. (B) Data are averages of five replicate assays ± standard error. The luciferase reporter gene driven by the herpes simplex virus thymidine kinase (TK) promoter was cotransfected with pCMV-UCRBP or pCMV-RUCRBP (7.5 or 10 µg). Activity was normalized to values obtained without expression plasmid (C).

up to 3 days, with a minor reduction detected on day 5, indicating that the pattern of UCRBP mRNA expression during differentiation of F9 cells is different from that of REX-1.

Function of UCRBP. To test the role of UCRBP in regulating transcription of MuLV, cotransfection experiments were performed with the expression plasmid pCMV-UCRBP, in which a UCRBP cDNA was placed under control of the cytomegalovirus promoter. pCMV-RUCRBP, in which UCRBP was placed in the reverse orientation, was used as a control. As a reporter, we used pMuLVCAT, in which the CAT gene is placed under the control of the MuLV LTR containing the UCR (31). These constructs were transiently introduced into murine fibroblasts, and MuLV CAT activity was measured. Representative data of such cotransfection experiments (Fig. 7A) show that pCMV-UCRBP strongly decreases CAT activity produced by the MuLV reporter, while the control pCMV-RUCRBP had no effect. More detailed tests involving five replicate transfection experiments are summarized in Fig. 7B. These results show that UCRBP decreases MuLV CAT activity in a dose-dependent fashion: a significant reduction in CAT activity was detected when >7.5 µg of pCMV-UCRBP was used. Linear regression analysis of the points involving 0 to 22.5 µg of pCMV-UCRBP demonstrates a negative slope (P value less than 0.0081), whereas analysis of the points involving 0 to 22.5 µg of pCMV-RUCRBP demonstrates no significant trend (P value less than 0.35). The maximum effect of UCRBP of approximately 15-fold reduction is achieved by 10 µg of the expression plasmid. The data in Fig. 7B also show that UCRBP had no effect on a control HSV thymidine kinase promoter construct driving a luciferase reporter gene. These results indicate that UCRBP is involved in negative regulation of MuLV.
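The dose-response analysis described above fits a straight line to normalized reporter activity against the amount of expression plasmid and asks whether the slope is negative. The sketch below shows an ordinary least-squares slope calculation of that kind. The activity values are hypothetical placeholders chosen for illustration and are not the data of Fig. 7B; the function name is illustrative, and the significance test that yields the reported P values is not reproduced here.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    plasmid_ug = [0.0, 7.5, 22.5]                 # expression plasmid doses named in the text
    hypothetical_sense = [1.0, 0.4, 0.1]          # placeholder normalized CAT activity
    hypothetical_reverse = [1.0, 1.0, 1.0]        # placeholder for the reverse-orientation control
    print(least_squares_line(plasmid_ug, hypothetical_sense))
    print(least_squares_line(plasmid_ug, hypothetical_reverse))
```

With replicate measurements at each dose, the same fit would be run on all points, and the slope would then be tested against zero.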

DISCUSSION

Three lines of evidence (binding specificity, ubiquity of expression, and functional role) indicate that the cloned UCRBP gene is likely to be the structural gene for the previously reported UCR-binding factor (13). UCRBP specifically binds the core sequence, CGCCATTTT, which is also bound by the nuclear UCR-binding factors present in all tissue culture cells tested (13). The 2.4-kb UCRBP mRNA is expressed in all cell lines tested, consistent with the ubiquitous presence of UCR-binding activity (13, 51). Furthermore, as expected from the UCR-mediated negative regulation, we have shown that introduction of the UCRBP gene into fibroblasts leads to strong down-regulation of MuLV promoter activity (Fig. 7).

UCRBP has a zinc finger domain composed of four C2H2 fingers (Fig. 4 and 5). There are a number of proteins that contain putative C2H2 zinc fingers, some with many repeats (33, 42, 50) and others with only two to four repeats, constituting a large family of regulatory proteins (6, 40). They are presumed to bind specific regulatory DNA elements and control gene expression. However, target sequences for many of these zinc fingers have yet to be identified; only a handful of finger proteins have been shown to bind specific DNA sequences (1, 22, 25, 38, 49). The UCRBP zinc finger domain is very similar to that of the murine REX-1. Neither DNA-binding activity nor target specificity has been demonstrated for REX-1, however. Given the similarity in the zinc finger domain, it may be postulated that REX-1 binds DNA sequences identical or similar to the UCR. Likewise, because of the similarity in the zinc finger domain, Krox24 and Krox20, as well as ZFX and ZFY, are predicted to have the same or closely related target sequences (30, 42).

Upon revision of this report, we became aware of three independent efforts, in addition to ours, which have resulted in the cloning of genes identical to UCRBP. Shi et al. (43) cloned human YY-1, which binds to the adeno-associated virus P5 promoter and represses transcription of a target gene. Human NF-E1, on the other hand, isolated by Park and Atchison (37), binds a regulatory element of the 3' region of the immunoglobulin κ gene and the µE1 site of the immunoglobulin heavy-chain gene and is capable of down-regulating transcription mediated by a µE1 site. Finally, Hariharan et al. (20) have isolated a clone called δ which binds to downstream elements of ribosomal protein genes. It is clear from these findings that UCRBP is highly conserved between mice and humans, with essentially identical amino acid composition in the zinc finger region. The motif found in the adeno-associated virus P5 promoter contains the sequence CArTTTG, similar to the UCR core. These and our recent observations (2a) indicate that the UCR-binding motif is not confined to MuLV but is widespread in a number of viral and cellular genes. More striking, however, is the dissimilarity of the binding motifs recognized by UCRBP, NF-E1, and δ: NF-E1 and δ are shown to specifically bind CCACCTCCATCTT and GCNGCCATC, respectively, which differ from the UCR motif and from each other. These findings demonstrate that the binding specificity of UCRBP is unusually diverse, which in turn suggests that functional roles of UCRBP are also unusually diverse.

ACKNOWLEDGMENTS

We thank Wai-Han Mak, Apri Chalian, and Gordon L. Bontrager for technical assistance. We thank M. Atchison, R. Perry, and T. Shenk for providing unpublished information. This work was supported in part by the Department of Veterans Affairs Merit Review and Research Associate funds. In addition, partial support was provided by the Roy J. Carver Charitable Trust in the form of a Clinician Scientist award.

REFERENCES

1. Baldwin, A. S., Jr., K. P. LeClair, H. Singh, and P. A. Sharp. 1990. A large protein containing zinc finger domains binds to related sequence elements in the enhancers of the class I major histocompatibility complex and kappa immunoglobulin gene. Mol. Cell. Biol. 10:1406-1414.
2. Barnes, W. M., M. Bevan, and P. H. Son. 1983. Kilo-sequencing: creation of an ordered nest of asymmetric deletions across a large target sequence carried on phage M13. Methods Enzymol. 101:98-122.
2a. Becker, K. G., et al. Unpublished data.
3. Bellefroid, E. J., P. J. Lecocq, A. Benhida, D. A. Poncelet, A. Belayew, and J. A. Martial. 1989. The human genome contains hundreds of genes coding for finger proteins of the Kruppel type. DNA 8:377-387.
4. Brasier, A., J. E. Tate, and J. F. Habener. 1989. Optimized use of the firefly luciferase assay as a reporter gene in mammalian cell lines. BioTechniques 7:1116-1122.
5. Chavrier, P., P. Lemaire, O. Revelant, R. Bravo, and P. Charnay. 1988. Characterization of a mouse multigene family that encodes zinc-finger structures. Mol. Cell. Biol. 8:1319-1326.
6. Chowdhury, K. H., U. Deutsch, and P. Gruss. 1987. A multigene family encoding several "finger" structures is present and differentially active in mammalian genomes. Cell 48:771-778.
7. Christy, B., and D. Nathans. 1989. DNA binding site of the growth factor inducible protein Zif 268. Proc. Natl. Acad. Sci. USA 86:8737-8741.
8. Dignam, J. D., R. M. Lebovitz, and R. G. Roeder. 1983. Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucleic Acids Res. 11:1475-1489.
9. Driggers, P. H., D. L. Ennist, S. L. Gleason, W.-H. Mak, M. S. Marks, B.-Z. Levi, J. R. Flanagan, E. Appella, and K. Ozato. 1990. An interferon regulated protein that binds the interferon-inducible enhancer element of major histocompatibility complex class I genes. Proc. Natl. Acad. Sci. USA 87:3743-3747.
10. Fan, C. M., and T. Maniatis. 1990. A DNA binding protein containing two widely separated zinc finger motifs that recognize the same DNA sequence. Genes Dev. 4:29-42.
11. Feinberg, A. P., and B. Vogelstein. 1983. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.
12. Flamant, F., C. C. Gurin, and J. A. Sorge. 1987. An embryonic DNA-binding protein specific for the promoter of the retrovirus long terminal repeat. Mol. Cell. Biol. 7:3548-3553.
13. Flanagan, J. R., A. M. Krieg, E. E. Max, and A. S. Khan. 1989. Negative control region at the 5' end of the murine leukemia virus long terminal repeats. Mol. Cell. Biol. 8:739-746.
14. Fulton, R., M. Plumb, L. Shield, and J. C. Neil. 1990. Structural diversity and nuclear protein binding sites in the long terminal repeats of feline leukemia virus. J. Virol. 64:1675-1682.
15. Golemis, E. A., N. A. Speck, and N. Hopkins. 1990. Alignment of U3 region sequences of mammalian type C viruses: identification of highly conserved motifs and implications for enhancer design. J. Virol. 64:534-542.
16. Gorman, C. M., L. F. Moffat, and B. H. Howard. 1982. Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells. Mol. Cell. Biol. 2:1044-1051.
17. Gorman, C. M., P. W. Rigby, and D. P. Lane. 1985. Negative regulation of viral enhancers in undifferentiated embryonic stem cells. Cell 42:519-526.
18. Gough, J. A., and N. E. Murray. 1983. Sequence diversity among related genes for recognition of specific targets in DNA molecules. J. Mol. Biol. 166:1-19.
19. Graham, F., and A. van der Eb. 1973. A new technique for the assay of infectivity of human adenovirus 5 DNA. Virology 52:456-457.
20. Hariharan, N., D. E. Kelley, and R. P. Perry. 1991. Delta, a transcription factor which binds to downstream elements in several polymerase II promoters, is a functionally versatile zinc finger protein. Proc. Natl. Acad. Sci. USA, in press.
21. Hosler, B. A., G. J. LaRosa, J. F. Grippo, and L. J. Gudas. 1989. Expression of REX-1, a gene containing zinc-finger motifs, is rapidly reduced by retinoic acid in F9 teratocarcinoma cells. Mol. Cell. Biol. 9:5623-5629.
22. Kadonaga, J. T., K. R. Carner, F. R. Masiarz, and R. Tjian. 1987. Isolation of cDNA encoding transcription factor Sp1 and functional analysis of the DNA-binding domain. Cell 51:1079-1090.
23. Kadonaga, J. T., A. J. Courey, J. Ladika, and R. Tjian. 1988. Distinct regions of Sp1 modulate DNA binding and transcriptional activation. Science 242:1566-1570.
24. Khan, A. S., and M. Martin. 1983. Endogenous murine leukemia proviral long terminal repeats contain a unique 190-base pair insert. Proc. Natl. Acad. Sci. USA 80:2699-2703.
25. Kinzler, K. W., J. M. Ruppert, S. H. Bigner, and B. Vogelstein. 1988. The GLI gene is a member of the Kruppel family of zinc finger proteins. Nature (London) 332:371-374.
26. Kozak, M. 1987. An analysis of 5'-non-coding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 15:8125-8148.
27. Laimins, L. A., P. Gruss, R. Pozzatti, and G. Khoury. 1984. Characterization of enhancer elements in the long terminal repeat of Moloney murine sarcoma virus. J. Virol. 49:183-189.
28. LaRosa, G. J., and L. J. Gudas. 1988. Early retinoic acid-induced F9 teratocarcinoma stem cell gene ERA-1: alternate splicing creates transcripts for a homeobox-containing protein and one lacking the homeobox. Mol. Cell. Biol. 8:3906-3917.
29. Lee, M. S., G. R. Gippert, K. V. Soman, D. A. Case, and P. E. Wright. 1989. Three dimensional solution structure of a single zinc finger DNA binding domain. Science 245:635-637.
30. Lemaire, P., O. Revelant, R. Bravo, and P. Charney. 1988. Two mouse genes encoding potential transcription factors with identical DNA-binding domains are activated by growth factors in cultured cells. Proc. Natl. Acad. Sci. USA 85:4691-4695.
31. Linney, E., B. Davis, J. Overhauser, E. Chao, and H. Fan. 1984. Non-function of a Moloney murine leukemia virus regulatory sequence in F9 embryonal carcinoma cells. Nature (London) 308:470-472.
32. MacDonald, R. J., G. H. Swift, A. E. Przybyla, and J. M. Chirgwin. 1987. Isolation of RNA using guanidinium salts. Methods Enzymol. 152:219-227.
33. Miller, J., A. D. McLachlan, and A. Klug. 1985. Repetitive zinc-binding domains in the protein transcription factor IIIA from Xenopus oocytes. EMBO J. 4:1609-1614.
34. Nakamura, T., D. M. Donovan, K. Hamada, C. M. Sax, B. Norman, J. R. Flanagan, K. Ozato, H. Westphal, and J. Piatigorsky. 1990. Regulation of the mouse αA-crystallin gene: isolation of a cDNA encoding a protein that binds to a cis sequence motif shared with major histocompatibility complex class I gene and other genes. Mol. Cell. Biol. 10:3700-3708.
35. Neuhaus, D., Y. Nakaseko, K. Nagai, and A. Klug. 1990. Sequence-specific [1H] NMR resonance assignments and secondary structure identification for 1- and 2-zinc finger constructs from SWI5. A hydrophobic core involving 4 invariant residues. FEBS Lett. 262:179-184.
36. Nordeen, S. K. 1988. Luciferase reporter gene vectors for analysis of promoters and enhancers. BioTechniques 6:454-457.
37. Park, K., and M. L. Atchison. 1991. Isolation of a candidate repressor/activator, NF-E1 (YY-1, δ) that binds to the Igκ and the IgH μE1 site. Proc. Natl. Acad. Sci. USA, in press.
38. Rauscher, F. J., J. F. Morris, O. E. Tournay, D. M. Cook, and T. Curran. 1990. Binding of the Wilm's tumor locus zinc finger protein to the EGR-1 consensus sequence. Science 250:1259-1262.
39. Rosenberg, U. B., C. Schroder, A. Preiss, A. Kienlin, S. Cote, I. Riede, and H. Jackle. 1986. Structural homology of the product of the Drosophila Kruppel gene with Xenopus transcription factor IIIA. Nature (London) 319:336-339.
40. Ruppert, J. M., K. W. Kinzler, A. J. Wong, S. Bigner, F.-T. Kao, M. L. Law, H. N. Seuanez, S. J. O'Brien, and B. Vogelstein. 1988. The GLI-Kruppel family of human genes. Mol. Cell. Biol. 8:3104-3113.
41. Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.
42. Schneider-Gadicke, A., P. Beer-Romero, L. G. Brown, R. Nussbaum, and D. C. Page. 1989. ZFX and ZFY may bind to the same target. Cell 57:1247-1258.
43. Shi, Y., E. Seto, L.-S. Chang, and T. Shenk. 1991. Transcriptional repression by YY1, a human GLI-Kruppel related protein, and relief of repression by adenovirus E1A protein. Cell, in press.
44. Shirayoshi, Y., J. Miyazaki, P. A. Burke, K. Hamada, E. Appella, and K. Ozato. 1987. Binding of multiple nuclear factors to the 5' upstream regulatory element of the murine major histocompatibility class I gene. Mol. Cell. Biol. 7:4542-4548.
45. Singh, H., J. H. LeBowitz, A. S. Baldwin, Jr., and P. A. Sharp. 1988. Molecular cloning of an enhancer binding protein: isolation by screening of an expression library with a recognition site DNA. Cell 52:415-423.
46. Speck, N., and D. Baltimore. 1987. Six distinct factors interact with the 75-base-pair repeat of the Moloney murine leukemia virus enhancer. Mol. Cell. Biol. 7:1101-1110.
47. Speck, N. A., B. Renjifo, and N. Hopkins. 1990. Point mutations in the Moloney murine leukemia virus enhancer identify a lymphoid-specific viral core motif and 1,3-phorbol myristate acetate-inducible element. J. Virol. 64:543-550.
48. Strickland, S., and V. Mahdavi. 1978. The induction of differentiation in teratocarcinoma stem cells by retinoic acid. Cell 15:393-403.
49. Sukhatme, V. P., X. Cao, L. C. Chang, C.-H. Tsai-Morris, D. Stamenkovich, P. C. P. Ferreira, D. R. Cohen, S. Edwards, T. B. Shows, T. Curran, M. M. Le Beau, and E. D. Adams. 1988. A zinc finger-encoding gene coregulated with c-fos during growth and differentiation, and after cellular depolarization. Cell 53:37-43.
50. Thiesen, H.-J. 1990. Multiple genes encoding zinc finger domains are expressed in human T cells. New Biol. 2:363-374.
51. Tsukiyama, T., O. Niwa, and K. Yokoro. 1989. Mechanism of suppression of the long terminal repeat of Moloney leukemia virus in mouse embryonal carcinoma cells. Mol. Cell. Biol. 9:4670-4676.
52. Vinson, C. R., K. L. LaMarco, P. F. Johnson, W. H. Landschulz, and S. L. McKnight. 1988. In situ detection of sequence-specific DNA binding activity specified by a recombinant bacteriophage. Genes Dev. 2:801-806.
53. Weiss, R., N. Teich, H. Varmus, and J. Coffin (ed.). 1985. RNA tumor viruses, vol. 2. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
