Proc. Nadl. Acad. Sci. USA Vol. 89, pp. 9372-9376, October 1992 Biochemistry

Cloning and chromosomal mapping of a human immunodeficiency virus 1 "TATA" element modulatory factor JOSEPH A. GARCIA*, S.-H. IGNATIUS Out, FOON Wut, ALDONS J. Lusist, ROBERT S. SPARKESt, AND RICHARD B. GAYNORt§ Departments of *Microbiology and Immunology and *Medicine, University of California, Los Angeles School of Medicine, Los Angeles, CA 90024; and tDivision of Molecular Virology, Departments of Medicine and Microbiology, University of Texas, Southwestern Medical School, 5323 Harry Hines Boulevard, Dallas, TX 75235

Communicated by Jonathan W. Uhr, June 8, 1992

vitro transcription assays (7). TBP associates with a number of distinct proteins known as coactivators that are present in both Drosophila and human nuclear extracts (8, 9). This association is presumed to be critical for modulating the transcriptional activity of TBP since the cloned TBP alone is not sufficient to activate gene expression in response to upstream regulatory proteins (8, 9). Given the fact that a single transcriptional regulatory element is frequently capable of binding several different DNAbinding proteins (2), it is possible that cellular factors other than TBP can bind to the TATA element and participate in regulating gene expression. To address this issue, we used Agtll expression cloning to identify factors that bind to the human immunodeficiency virus 1 (HIV-1) TATA element (10). A cellular factor of 123 kDa that we designated the TATA element modulatory factor (TMF) was identified.1 TMF binds to the HIV-1 TATA element and inhibits the ability of TBP to activate basal in vitro transcription of the HIV-1 long terminal repeat (LTR). In addition, the TMF gene localizes to a region of human chromosome 3 that is associated with a variety of malignancies. The functional properties and chromosomal localization of TMF suggest that it may be important in the regulation of viral and cellular gene expression.

ABSTRACT A critical regulatory element in many promoters transcribed by RNA polymerase II is the "TATA" box, which is located 25-30 nucleotides upstream of the transcription initiation site. TFIID is a biochemically defined HeLa cell nuclear fraction containing a transcription factor activity that binds specifically to the TATA box and is critical in determining both basal and regulated promoter activity. Recently, the gene for a TATA-binding protein was cloned and found to bind to various TATA elements and to substitute for TMID in stimulating basal gene expression in in vitro transcription systems. However, it is possible that additional cellular factors can bind to the TATA element and influence the level of gene expression. By using Agtll expression cloning with oligonucleotides corresponding to the human immunodeficiency virus 1 TATA element, we report the identification ofa cellular protein with a calculated molecular mass of 123 kDa that we designate TATA element modulatory factor (TMF). TMF binds to the human immunodeficiency virus 1 TATA element in gelretardation assays and inhibits activation of the viral long terminal repeat by the TATA-binding protein in in vitro transcription assays. TMF contains leucine-zipper amino acid motifs and exhibits homology in its DNA binding domain with the phage-encoded DNA binding protein Ner. Chromosomal mapping localizes the TMF gene to human chromosome 3p12p21, which is a site of frequent rearrangements in lung and renal carcinomas. Thus, TMF is a transcription factor that likely regulates the expression of both viral and cellular genes.

MATERIALS AND METHODS Expression Cloning and Identification of a Composite cDNA Encoding TMF. A random-primed Agtll HeLa-cell cDNA library was screened with a ligated oligonucleotide (10) that contains nucleotides -46 to -1 of the HIV-1 LTR (11). Multiple overlapping clones were identified and analyzed to construct a composite cDNA. Composite cDNA clones encoding TMF were assembled in both pUC18 and pGEM-2. A cDNA fragment containing the original TMF Agtll fragment was cloned downstream of the glutathione S-transferase (GST) gene in the pGEX bacterial expression vector to generate a fusion protein. The expression and purification of the GST-TMF fusion protein were performed as described (12). Analysis of TMF. Northern blot analysis was performed using 10 pug of HeLa poly(A)+ mRNA electrophoresed through a 1% formaldehyde/agarose gel. For in vitro transcription ofTMF, RNA was first synthesized from a modified pGEM vector using T7 polymerase, and the RNA generated was in vitro-translated in reticulocyte lysate in the presence and absence of [35S]methionine as suggested by the manufacturer (Promega). Gel-Retardation Analysis. Gel-retardation analysis was performed with an HIV TATA element probe containing

The regulation of gene expression is a complex process involving the interaction of both sequence-specific DNA binding and general initiation factors (for review, see refs. 1 and 2). A cis-acting regulatory motif known as the TATA element is located 25-30 nucleotides upstream of the transcription initiation site in many RNA polymerase II transcription units (3-6). The initial step in the formation of a functional transcription complex requires the binding to the TATA element of cellular factors present in a biochemically defined fraction known as TFIID. The interaction of factors in the TFIID fraction with other general transcription factors and RNA polymerase II results in a multiprotein complex that is capable of transcription initiation and subsequent elongation (3-7). This initiation complex is the likely target for the regulation mediated by upstream regulatory proteins and viral transactivators. Cloning from yeast, Drosophila, and human sources of the gene encoding a component of the TFIID fraction referred to as the TATA-binding protein (TBP) has been reported (for review, see ref. 6). The cloned TBP can substitute for a heat-sensitive activity in HeLa nuclear extracts and is necessary for basal expression of polymerase II promoters in in

Abbreviations: HIV-1, human immunodeficiency virus 1; LTR, long terminal repeat; TMF, TATA element modulatory factor; TBP, TATA-binding protein; GST, glutathione S-transferase. §To whom reprint requests should be addressed. 1The sequence reported in this paper has been deposited in the GenBank data base (accession no. L01042).

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. 9372

Biochemistry: Garcia et al.

Proc. Natl. Acad. Sci. USA 89 (1992)

9373

through the use of Agtll expression screening with ligated oligonucleotides corresponding to specific DNA binding sequences (10). By employing this technique with a probe containing the TATA element of the HIV-1 LTR, we isolated a cDNA whose protein bound specifically to the HIV-1 TATA element (Fig. 1A). Partial cDNA clones derived from this Agtll clone were used as probes to identify additional cDNA clones from HeLa-cell AgtlO and Agtll libraries. Overlapping sequences from multiple independent clones allowed a 3.5-kilobase composite TMF cDNA to be constructed that terminated in a poly(A) tract and contained multiple stop codons upstream of the initiating methionine and in the C terminus (Fig. 1A). The composite TMF cDNA encoded a protein of 1093 amino acids (Fig. 1A). A search of the protein databases [National Biomedical Research Foundation (release 20) and Swiss-Prot (release 8) (16)] revealed the presence of a TMF domain that exhibited strong homology to the prokaryotic DNA binding protein Ner, encoded by the bacteriophage D108 (17) (Fig. 1B). Analysis of the binding specificity and DNA contact points of Ner indicated that it bound to an element consisting of two direct repeats that flanked an A+T-rich region encompassing the transcription initiation site ofthe bacteriophage Pe and Pc2 promoters (17). A similar DNA sequence arrangement is found in the HIV-1 promoter, which has two imperfect direct repeats containing the sequence CANNTG (where N is any nucleotide) flanking the TATA sequence (11). In other promoters, sequences related to the HIV-1 LTR direct repeats have been demonstrated to bind proteins with a basic helix-loop-helix DNA binding motif (18). Thus several sequence motifs in the HIV-1 TATA

nucleotides -46 to -10 of the HIV TATA element and either 10 ng of purified GST protein alone or 10 ng of the GST-TMF DNA binding domain fusion protein. Unlabeled doublestranded oligonucleotides corresponding to nucleotides -46 to -10 of the HIV-1 LTR (5'-GCGTGCCTCAGATGCTGCATATAAGCAGCTGCTTTTT-3'), a mutated HIV-1 TATA element (5'-GCGTGCCTCAGATGCTGCAiCGAAGCAGCTGCTTTTT-3'; the mutagenized nucleotides are underlined), or a portion of the HIV-1 TAR element containing nucleotides +10 to +52 were used in competition analysis with gel-retardation assays. In Vitro Transcription of the HIV-1 LTR. In vitro transcription analysis of the HIV-1 LTR was performed essentially as described (13). HIV-1 LTR CAT constructs containing LTR nucleotides -179 to +80 were digested with Nco I to generate a 620-nucleotide runoff transcript. HeLa nuclear extract (100 ttg) was added to each reaction mixture. The HeLa nuclear extract was heated at 47C for 15 min. TBP and TMF proteins were in vitro-translated in a rabbit reticulocyte lysate and 3 1.l of unprogrammed lysate, TMF, TBP, or both TMF and TBP was added to in vitro transcription assays to give a final quantity of 6 1,l of lysate. Chromosomal Mapping of TMF. The chromosomal localization of TMF was determined by Southern blot analysis of human-mouse chromosomes (14). Normal human metaphase chromosomes were examined by in situ hybridization with the TMF cDNA followed by exposure to photographic emulsion for 1 week (15). RESULTS Isolation of TMF. The isolation of cDNA clones encoding sequence-specific DNA binding proteins has been facilitated

A MSWFNASQLSSFAKQALSQAQKSIDRVLDIQEEEPSIWAETIPYGEPGISSPVSGGWDTSTWGLKSNTEPQSPPIASPKA ITKPVRRTVVDESENFFSAFLSPTDVQTIQKSPVVSKPPAKSQRPEEEVKSSLHESLHIGQSRTPETTESQVKDSSLCVS

TPSPPVSTFSSGTSTTSDIEVLDHESVISESSASSRQETTDSKSSLHLMQTSFQLLSASACPEYNRLDDFQKLTESCCSS

DAFERIDSFSVQSLDSRSVSEINSDDELSGKGYALVPIIVNSSTPKSKTVESAEGKSEEVNETLVIPTEEAEMEESGRSA

TPVNCEQPDILVSSTPINEGQTVLDKVAEQCEPAESQPEALSEKEDVCKTVEF&EKLEKLAQLLSaKEKAI I--

i EAFD|I I

IODEMFRVKEESSSISSLKDEFTQRIAEAEKKVQLACKERDAAKKEIKNIKEELATRLNSSETADLLKE EQIRGD@EjI I

I

EGEK0 KQQLHgNIIKK AKDKE&NMVAKDIKKVKE EELQHDCQVLDGWEVEKQHRENIKKLNSMVERQEKDLGI RLQVDMDELEEKNRSIQAALDSAYKELTDLHKANAAKDSEAQEAALSREMKAKEELSAALEKA EEbARQETrLAIQVGD

ILRLALQRTEQAAARKEDYLRHEIGELQQRLQEAENRN i I

QELSQSVSS

T

QATLGSQTSSWEKLEKNLSDRL

IGESQTLLAAAVERERAATEELLANKIQMSSMESQNSkLRQENSRFQAQLESEKNRLCKLEDENNRYQVELENLKDEYVRT I _

_w

LEETRKEKTLLNSQLEMERMKVEQERKKAIFTQETIKEKERKPFSVSSTPTMSRSSSISGVDMAGLQTSFLSQDESHDHS FGPMPISAKWKHLYAACKDGSRIKHeNLQS ( LREGEDrHLQLE NLEKTiIMAEEDKLTNQNDELEE iKEIPK

&TQLRDIIQRYN sQMYG&E >EAE B

ner M

TMF V

CNEKARDWHRADVI iKRKKI TRPLLRQIENL

SQT

KTQID-ELLRQSLS

.... SAL

KLE

QE

A

......WPKG

SC

RAAMEL

..... AL

IQMSS

KPEVIWP.

SQNSLLRQE

m

PA

FIG. 1. Amino acid sequence of TMF. (A) The 1093-amino acid open reading frame of TMF is shown. The area boxed with a dotted line indicates the original Agtll fragment identified by oligonucleotide expression cloning. This fragment was also expressed as a fusion protein with GST for use in gel-shift analysis. The regions highlighted by white type on a black background indicate the sequences with homology to ubiquitin-mediated protein degradation consensus sequence elements. The individually boxed amino acids indicate the potential coiled coil regions referred to as leucine zippers in the text. The underlined sequences contain the region of homology between TMF and Ner. (B) The homology of TMF (amino acids 704-788) to Ner (amino acids 1-74) is indicated. Identical amino acids are indicated by boxes, and lines indicate conserved amino acids.

9374

Proc. Natl. Acad. Sci. USA 89 (1992)

Biochemistry: Garcia et al.

element serve as potential binding sites for cellular transcription factors. The amino acid sequence of TMF revealed the presence of several potential leucine zippers consisting of a minimum of three helical turns with every seventh amino acid encoding leucine, isoleucine, or other amino acids that have been found in functional leucine zippers of other transcription factors (19) (Fig. 1A). The N terminus of TMF, in addition, contains numerous serine-proline and threonine-proline repeats that have previously been noted in a variety of cellular regulatory proteins (20). Since TMF is a senine- and threonine-rich protein, it may serve as a target for multiple cellular kinases involved in signal transduction pathways. Finally, two consensus sequences for ubiquitin-mediated protein degradation similar to those found in members of the cyclin family were noted (21). Therefore, it is likely that TMF is regulated by multiple cellular control mechanisms. Northern Blot Analysis and in Vitro Translation of TMF. Northern blot analysis using a portion of the TMF cDNA demonstrated that HeLa cells contain two major TMF transcripts of -4.0 and 4.5 kilobases (Fig. 2A). A minor TMF transcript of =7.5 kilobases was also detected that may represent a related mRNA species or an unspliced or an alternatively spliced form of TMF RNA. Hence, the TMF cDNA that was isolated may represent one isoform of the RNA species detected by Northern blot analysis. Finally, in vitro translation of in vitro-transcribed TMF RNA in an [35S]methionine-containing reticulocyte lysate yielded a major species of between 140 and 150 kDa, which was slightly larger than the predicted size of 123 kDa based on the nucleotide sequence of TMF (Fig. 2B). Binding Specificity of TMF Is Determined by the HIV-1 TATA Element. A region of TMF corresponding to the original Agtll cDNA fragment that contained most of the ner homology region (Fig. 1A) was expressed in bacteria as a GST fusion protein. Gel-retardation analysis was performed with labeled oligonucleotides corresponding to HIV-1 TATA region nucleotides -46 to -10 and either GST protein alone (Fig. 3A, lane 1) or the GST-TMF fusion protein (Fig. 3A, lane 2). GST alone did not result in a gel-retarded complex but GST-TMF resulted in a major gel-retarded species (Fig. 3A, lane 2). Competition analysis with unlabeled oligonucleotides corresponding to either the HIV-1 TATA region (positions -46 to -10) or the HIV-1 TAR element (positions +10 to +52) indicated GST-TMF binding was competed efficiently by oligonucleotides corresponding to the TATA region (Fig. 3A, lanes 3-5) but not the TAR element (Fig. 3A, lanes 6-8).

A

kb

B

To further define the binding specificity of TMF, competition analysis was performed with both wild-type and mutated oligonucleotides from the HIV-1 TATA region. Oligonucleotides corresponding to the HIV-1 TATA element specifically bound GST-TMF (Fig. 3B, lane 3) and a 50-, 100-, or 200-fold molar excess of unlabeled wild-type competitor TATA element oligonucleotides competed for GST-TMF binding (Fig. 3B, lanes 4-6), whereas mutated oligonucleotides with a TATAA -) GCGAA substitution in the TATA element competed poorly for GST-TMF binding (Fig. 3B, lanes 7-9). Oligonucleotides with substitutions in the direct

A 1

2

3

4

5

6

8

7

;I z.

-dit-

I li hi .:

B 1 2

3 4 5 6 7 8

9

a"'

kDa -210

-9.5

-4.4

-110

-2.4

-68 -1.0

-44

FIG. 2. Analysis of TMF mRNA and protein. (A) Northern blot analysis of TMF RNA in 10 ,ug of HeLa poly(A) RNA. (B) In vitro translation of [35S]methionine-labeled TMF using a rabbit reticulocyte lysate. The position of size [kb, kilobase(s)] (A) and molecular mass (B) markers are indicated.

FIG. 3. Binding properties of bacterial-produced TMF. (A) Gelretardation analysis using labeled oligonucleotides corresponding to the HIV-1 TATA element extending from position -46 to -10 in the LTR. Either 10 ng of GST alone (lane 1) or 10 ng of GST-TMF fusion protein (lanes 2-8) was used in gel-retardation analysis in the presence of 5, 10, or 20 ng of oligonucleotide extending from position -46 to -10 in the HIV-1 LTR (a specific competitor, lanes 3-5, respectively) or similar quantities of oligonucleotides extending from positions +10 to +52 in the HIV-1 LTR (a nonspecific competitor, lanes 6-8, respectively). (B) Gel-retardation analysis using the same HIV-1 TATA element oligonucleotides (lane 1) and either GST (lane 2) or GST-TMF fusion protein (lanes 3-9) and 5, 10, or 20 ng of unlabeled competitor oligonucleotides. The oligonucleotide competitor corresponds to a wild-type HIV-1 TATA region (lanes 4-6, respectively) or a mutated TATA sequence (lanes 7-9, respectively).

Biochemistry: Garcia et al. A

B

;,|lu; t

Cloning and chromosomal mapping of a human immunodeficiency virus 1 "TATA" element modulatory factor.

A critical regulatory element in many promoters transcribed by RNA polymerase II is the "TATA" box, which is located 25-30 nucleotides upstream of the...
1MB Sizes 0 Downloads 0 Views