Cell, Vol. 67, 1231-1240, December 20,
1991 by Cell Press
TFIID Binds in the Minor Groove of the TATA Box
D. Barry Starr and Diane K. Hawley Institute of Molecular Biology and Department of Chemistry University of Oregon Eugene, Oregon 97403
Summary
We have analyzed the interaction of the general RNA polymerase II transcription factor TFIID with its DNA-binding site, the TATA box (consensus sequence TATAAAA). We have demonstrated that TFIID, unlike most sequence-specific DNA-binding proteins, interacts primarily within the minor groove of the DNA helix. This was established by a novel approach involving complete replacement of the thymines and adenines in the TATA box with cytosines and inosines, respectively. This substitution exchanged the major groove of TATAAAA for that of the sequence CGCGGGG, without altering the surface of the minor groove. The unusual DNA-binding properties of TFIID revealed by this study have important implications for TFIID specificity and function and, more generally, for sequence-specific recognition by DNA-binding proteins.
Introduction
TFIID is a DNA-binding protein required for RNA polymerase II-mediated transcription of many, if not all, protein-encoding genes in eukaryotic cells. Recent evidence supports an important role for TFIID in the utilization of some promoters transcribed by RNA polymerase III, as well (Simmen et al., 1991; Margottin et al., 1991). TFIID binds to a site known as the TATA box, which has a consensus sequence TATAAA(A) (Breathnach and Chambon, 1981). This binding can occur in the absence of other proteins in vitro and appears to be an early step in the assembly of a multi-protein preinitiation complex at the promoter (Nakajima et al., 1988; Davison et al., 1983; Fire et al., 1984; Buratowski et al., 1989). The TATA box is located about 25 to 30 bp upstream of the start site of transcription in higher eukaryotes and somewhat more distal to the start site in lower eukaryotes (Corden et al., 1980; Breathnach and Chambon, 1981; Struhl, 1987). Mutational analyses have shown that the TATA box sequence is an important determinant of transcription initiation frequency both in vivo and in vitro.
These studies have generally supported the view that the consensus sequence is optimal for promoter function (Myers et al., 1986; Wobbe and Struhl, 1990).
The gene for TFIID has been cloned from a number of sources, including yeasts, vertebrates, insects, and plants (Horikoshi et al., 1989a; Schmidt et al., 1989; Cavallini et al., 1989; Gasch et al., 1990; Hahn et al., 1989a; Hoey et al., 1990; Hoffmann et al., 1990a, 1990b; Kao et al., 1990; Peterson et al., 1990; Muhich et al., 1990). All these proteins share a common 180-amino-acid carboxy-terminal “core” domain, which is essential for DNA binding and sufficient for basal levels of transcription in vitro (Peterson et al., 1990; Horikoshi et al., 1990; Pugh and Tjian, 1990). The amino-terminal regions vary greatly in both length and sequence and have been proposed to be responsible, either directly or indirectly, for interaction between TFIID and transcriptional activators (Pugh and Tjian, 1990; Hoffmann et al., 1990b; Peterson et al., 1990). The highly conserved core domain of TFIID contains two stretches of amino acids that show significant homology with one another (Cavallini et al., 1989; Hoeijmakers, 1990; Nagai, 1990; Stucka and Feldmann, 1990). Various models for how these two regions might interact with DNA have been postulated (Stucka and Feldmann, 1990; Nagai, 1990; Horikoshi et al., 1989a; Gasch et al., 1990; Reddy and Hahn, 1991). Evidence that both repeats contain common structural features involved in DNA recognition was obtained recently from genetic studies in yeast. Reddy and Hahn isolated mutations in the first of the two repeated regions of yeast TFIID that resulted in an apparent loss of specific DNA binding (Reddy and Hahn, 1991). Several of these mutations changed amino acids that were conserved in both copies of the repeated region; introduction of each of the mutations at the analogous positions in the second repeat also resulted in reduced DNA binding.
Identification of the TATA box as the recognition sequence for TFIID binding has been established both by mutational studies and by localization of the region of DNA that TFIID protects from cleavage by enzymatic and chemical reagents. In studies using DNAase I, MPE·Fe(II), and copper phenanthroline, the area protected by TFIID extended beyond the base pairs normally considered part of the TATA box consensus sequence (Sawadogo and Roeder, 1985; Nakajima et al., 1988; Hahn et al., 1989b; Maldonado et al., 1990).
Protection of the phosphodiester backbone from enzymatic or chemical cleavage is usually interpreted as evidence that the bound protein closely approaches the protected position. However, these methods generally cannot reveal the relative importance of individual bases to the interaction with the protein, nor can they distinguish whether the protein is involved in nonspecific or base-specific contacts at any position. In addition, altered DNA structure can contribute to changes in the cleavage pattern upon protein binding (Travers, 1989). The critical role of TFIID binding in formation of the RNA polymerase II preinitiation complex makes systematic characterization of this interaction important. We have used chemical interference assays to obtain a detailed picture of the interactions of recombinant yeast and human TFIID with a consensus TATA box. In these experiments, the DNA was first chemically modified in the absence of TFIID, and the ability of the protein to form stable complexes with the modified DNA was determined. Using these methods, we measured the relative importance of individual bases within the TATA box to TFIID binding. We also showed that methylation of adenine residues on both
Figure 1. Hydroxyl Radical Interference Pattern of TFIID on the ML Promoter A 3’ end-labeled DNA fragment containing the ML TATA box was subjected to hydroxyl radical cleavage as described in the Experimental Procedures. Modified DNA was incubated with yeast TFIID, and complexed and free DNA fragments were separated by electrophoresis on a native polyacrylamide gel. The bound and free DNA fragments were isolated from this gel and electrophoresed on a sequencing gel. (a) and (b) show the data obtained for the top and bottom strand, respectively. In each case, the lanes marked B, F, and G indicate the bound and free DNA and the products of a G-specific sequencing reaction, respectively. The data in (a) and (b) were analyzed by densitometry, and the results are summarized in the histograms shown in (c). The ratio of the intensities of the bands observed in the bound and free DNA fractions were compared for each position in the TATA
strands of the TATA box interfered with this interaction. As adenine methylation changes the surface of the minor groove of the DNA helix (Gilbert et al., 1976) this finding suggested that at least part of the recognition of the TATA box by TFIID occurs through contacts in the minor groove. To test this possibility and at the same time assess the importance of major groove contacts, we synthesized an artificial TATA box sequence in which all of the adenine and thymine residues were replaced by inosines and cytosines. These substitutions completely alter the surface of the major groove, but do not change the minor groove. We found that the interaction of TFIID with this sequence (CICIIII) was essentially indistinguishable from that observed with the normal TATA box. Taken together, our results provide compelling support for the view that TFIID primarily interacts with the minor
groove. This highly unusual mode of recognition has previously been postulated for only one other sequence-specific DNA-binding protein, the bacterial integration host factor (IHF) (Yang and Nash, 1989). These two proteins may define a new class of DNA-binding proteins. The approach we used in completely substituting the DNA recognition site should provide a powerful new tool for directly testing the importance of minor groove contacts in other systems, as well.
Results
Hydroxyl Radical Interference Patterns
In the experiment of Figure 1, we used the hydroxyl radical cleavage method of Tullius and Dombroski (1986) to remove individual bases from a DNA fragment containing a
consensus TATA box sequence derived from the adenovirus major late promoter. DNA labeled at the 3’ end of either the top or bottom strand was subjected to a light cleavage treatment, so that each DNA fragment was modified at no more than one position. TFIID was then incubated with the modified DNA, and the DNA fragments bound to TFIID were separated from unbound fragments by native gel electrophoresis (Horikoshi et al., 1989b; B. C. Hoopes et al., submitted). DNA eluted from the regions of the gel containing the bound and free fragments was electrophoresed on a sequencing gel. The results of this experiment are shown in Figure 1. Densitometry of the autoradiograms of Figures 1a (top strand) and 1b (bottom strand) revealed the relative importance of individual bases to the formation or maintenance of a TFIID-TATA box complex. These data are summarized in the histograms of Figure 1c. On the top DNA strand, removal of the T at position -29 and any of the A’s from -28 to -26 strongly interfered with stable binding of TFIID, while the bases at -30 and -25 contributed to the binding but were apparently less important. On the bottom strand, the T residues at positions -30, -28, -27, and -26 were extremely important, and the A residues at -29 and -31 were moderately important. Removal of the flanking G and T residues at -32 and -25, respectively, resulted in a small, but still significant, reduction in binding. Thus, the primary determinants of TFIID binding appear to be contained in the TATA box consensus sequence, although the data in Figure 1 suggest that bases immediately flanking this sequence might play a small role. In addition, these data have shown that the bases at different positions within the TATA box sequence have different relative importance. The method used to isolate a population of bound DNA fragments for this and subsequent analyses required that the complex be relatively stable during gel electrophoresis.
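The bound-versus-free comparison behind the histograms of Figure 1c can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the choice to normalize to reference positions outside the TATA box, and every band intensity below are hypothetical and are not taken from the experiment.

```python
def interference_ratios(bound, free, reference_positions):
    """Return bound/free band-intensity ratios, scaled so that the mean
    ratio at the reference positions equals 1.0.

    bound, free: dicts mapping promoter position -> band intensity.
    reference_positions: positions assumed not to affect binding
    (a hypothetical normalization choice made for this sketch).
    """
    raw = {pos: bound[pos] / free[pos] for pos in bound}
    ref = sum(raw[p] for p in reference_positions) / len(reference_positions)
    return {pos: raw[pos] / ref for pos in raw}


# Invented intensities in arbitrary densitometer units (not real data).
bound = {-35: 100.0, -29: 10.0, -25: 60.0}
free = {-35: 100.0, -29: 100.0, -25: 100.0}

ratios = interference_ratios(bound, free, reference_positions=[-35])
for pos in sorted(ratios):
    # A ratio well below 1 means fragments modified at this position
    # are under-represented in the bound fraction.
    print(pos, round(ratios[pos], 2))
```

In this sketch a low ratio corresponds to strong interference; as noted in the text, such a ratio cannot by itself distinguish effects on complex formation from effects on complex stability during electrophoresis.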
We have previously found that significant dissociation of TFIID-TATA box complexes does occur during electrophoresis (B. C. Hoopes et al., submitted). Therefore, we cannot distinguish whether particular bases are primarily required for formation or maintenance of the complex.
TFIID Binds in Close Proximity to the Minor Groove
To learn more about the base contacts that contribute to TFIID binding, we treated the DNA with dimethyl sulfate (DMS) and determined whether methylation of A’s and G’s within and surrounding the TATA box interfered with stable complex formation. The primary advantage of this method is that information is obtained about whether the protein contacts the major or minor groove (Siebenlist and Gilbert, 1980). This distinction is possible because A’s become methylated at the N-3 position within the minor groove and G’s at N-7 in the major groove. Using DMS interference assays, we found that methylation of several A’s within the TATA box interfered with TFIID binding. In particular, methylation of any of the A’s from -28 to -26 on the top strand and the A at -29 on the bottom strand strongly interfered with TFIID binding (data not shown; but see Figure 4). Methylation of the A at -25 and G at -23 on the top strand and the A at -31 on the
bottom strand was only mildly deleterious, and methylation of the G at -32 actually resulted in a small enhancement. The results were equivalent when cloned human TFIID was used instead of yeast TFIID (data not shown). The observed interference by adenine methylation suggested that TFIID interacts with the minor groove of the TATA box. We were unable to test the importance of major groove contacts within the TATA box by DMS methylation, however, because the site contains no G residues. In fact, G-C base pairs at almost every position within the TATA box have been shown to reduce TFIID binding and/or promoter strength (Myers et al., 1986; Chen and Struhl, 1988; Wobbe and Struhl, 1990). Although diethylpyrocarbonate can be used to alkylate A residues in the major groove, we decided instead to synthesize a double-stranded oligonucleotide containing an artificial TATA box sequence. This oligonucleotide, IC, consists of the adenovirus major late promoter sequence from -45 to -8, except that in the TATA box region the thymines and adenines were replaced by cytosines and inosines, respectively (Figure 2a). In terms of hydrogen bonding substituents on the purine and pyrimidine rings, CICIIII looks like TATAAAA in the minor groove and like CGCGGGG in the major groove. If major groove contacts are important for TFIID binding, then TFIID would not be expected to bind the IC oligonucleotide.
TFIID Specifically Bound the IC Oligonucleotide
To determine whether TFIID could recognize and bind the IC-substituted TATA box, we compared the ability of TFIID to retard the electrophoretic mobility of the ML and IC oligonucleotides in a bandshift assay. In the experiment of Figure 2b, either yeast TFIID (lanes 1, 4, and 7), human TFIID (lanes 2, 5, and 8), or buffer (lanes 3, 6, and 9) was incubated with the IC (lanes 1-3) or ML (lanes 4-6) oligonucleotide.
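The logic of the inosine-cytosine substitution described above can be checked with a small Python sketch. The one-letter groove codes are a simplified, textbook-style summary of the groups each base pair exposes (in the spirit of Seeman et al., 1976); the tables, function names, and the `c_partner` parameter are assumptions introduced for this illustration and are not part of the original study.

```python
# One-letter codes for groups exposed in each groove, read from the first
# base of the pair to its partner: a = H-bond acceptor, d = H-bond donor,
# m = thymine methyl, h = nonpolar hydrogen. Simplified for illustration.
MAJOR = {"AT": "adam", "TA": "mada", "GC": "aad", "CG": "daa",
         "IC": "aad", "CI": "daa"}
MINOR = {"AT": "aha", "TA": "aha", "GC": "ada", "CG": "ada",
         "IC": "aha", "CI": "aha"}


def to_ic(seq):
    """Replace T with C and A with I, as in the IC oligonucleotide."""
    return seq.translate(str.maketrans("TA", "CI"))


def base_pairs(top, c_partner="G"):
    """List the base pairs for a top strand; C pairs with G or with I."""
    partner = {"A": "T", "T": "A", "G": "C", "I": "C", "C": c_partner}
    return [base + partner[base] for base in top]


def groove_pattern(top, table, c_partner="G"):
    """Look up the groove code of every base pair along the top strand."""
    return [table[pair] for pair in base_pairs(top, c_partner)]


tata = "TATAAAA"
cici = to_ic(tata)

same_minor = (groove_pattern(cici, MINOR, c_partner="I")
              == groove_pattern(tata, MINOR))
major_like_gc = (groove_pattern(cici, MAJOR, c_partner="I")
                 == groove_pattern("CGCGGGG", MAJOR))
major_like_tata = (groove_pattern(cici, MAJOR, c_partner="I")
                   == groove_pattern(tata, MAJOR))
print(cici, same_minor, major_like_gc, major_like_tata)
```

Under these simplified codes the substituted box matches TATAAAA in the minor groove and CGCGGGG in the major groove, which is the property the IC oligonucleotide was designed to have. The codes ignore DNA conformation, so the sketch says nothing about structural differences between the two sequences.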
As a control, we also tested the ability of TFIID to bind a DNA fragment of similar size, derived from the pTZ18U polylinker (lanes 7-9). Both yeast and human TFIID shifted the IC oligonucleotide to the same position on the gel as was observed with the ML oligonucleotide. In contrast, neither protein produced a shifted complex with the nonspecific polylinker DNA. In addition, a single C-G substitution in the ML TATA box (at -31) prevented detection of a bandshift complex with either yeast or human TFIID under the same experimental conditions (Figure 2c). These results provided a strong indication that TFIID was able to bind specifically to the IC-substituted TATA box. As a further test that TFIID was specifically binding to the substituted TATA box, we repeated the hydroxyl radical interference assay on the ML and IC oligonucleotides. If TFIID was bound specifically to the TATA box region of the oligonucleotide, then only missing bases within that sequence should interfere with formation and recovery of a bandshift complex. We found that the patterns observed for the ML and IC oligonucleotides were similar to each other and to the pattern observed for the 110 bp DNA fragment in the experiment of Figure 1 (Figure 2d; data not shown for ML). In particular, modified IC oligonucleotides missing any one of the bases from -29 to -27 (CII) were
[Figure 2a: IC oligonucleotide, positions -35 to -21. Top strand 5’-GGGCCICIIIIGGGG-3’; bottom strand 3’-CCCGICICCCCCCCC-5’.]
Figure 3. Sarkosyl Resistance of TFIID Complexes on the ML- and IC-Substituted TATA Box Yeast TFIID was bound to the ML (open squares) or IC (closed diamonds) DNA fragment, as described in Experimental Procedures. Sarkosyl was added to the final concentrations shown; the samples were incubated an additional 2 min at 30°C and electrophoresed on a native polyacrylamide gel. The fraction of DNA fragment bound by TFIID at each Sarkosyl concentration was normalized to the fraction observed in the absence of Sarkosyl. That value was 0.35 for ML and 0.26 for IC. In this experiment, no correction was made for the dissociation of TFIID during electrophoresis (B. C. Hoopes et al., submitted).
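The normalization described in this legend amounts to dividing each measured fraction bound by the fraction bound without Sarkosyl. A minimal Python sketch follows; the function name is hypothetical, the zero-Sarkosyl values (0.35 for ML, 0.26 for IC) come from the legend, and the values at 0.02% Sarkosyl are invented purely to show the arithmetic.

```python
def normalize_to_no_sarkosyl(fraction_bound):
    """Divide each fraction bound by the value measured without Sarkosyl.

    fraction_bound: dict mapping Sarkosyl concentration (%) -> fraction of
    DNA bound; it must contain the 0.0 (no Sarkosyl) entry.
    """
    baseline = fraction_bound[0.0]
    return {conc: frac / baseline for conc, frac in fraction_bound.items()}


# 0.0 entries: values stated in the legend. 0.02 entries: invented.
ml = {0.0: 0.35, 0.02: 0.28}
ic = {0.0: 0.26, 0.02: 0.208}

ml_norm = normalize_to_no_sarkosyl(ml)
ic_norm = normalize_to_no_sarkosyl(ic)
for conc in sorted(ml_norm):
    print(conc, round(ml_norm[conc], 2), round(ic_norm[conc], 2))
```

Normalizing in this way lets the two titration curves be compared directly even though the starting fractions bound differ between the ML and IC fragments.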
Figure 2. Replacement of TATAAAA with CICIIII Does Not Significantly Affect TFIID Binding (a) The sequence of the double-stranded “IC” oligonucleotide from -35 to -21 is shown. The substituted TATA box sequence is flanked by a DNA sequence corresponding to positions -45 to -8 with respect to the transcription start site of the ML promoter. (b) Yeast TFIID (lanes 1, 4, and 7), human TFIID (lanes 2, 5, and 8), or buffer (lanes 3, 6, and 9) was incubated with either IC (lanes 1-3), ML (lanes 4-6), or a nonspecific (NS) DNA fragment (lanes 7-9) for 60 min at 30°C, using the reaction conditions described in Experimental Procedures. Sarkosyl was added to 0.02% and the reactions were immediately loaded onto a native polyacrylamide gel. Electrophoresis and subsequent steps were as described in the Experimental Procedures. (c) A 10 nM concentration of either yeast TFIID (lanes 3 and 6), human TFIID (lanes 2 and 5), or buffer (lanes 1 and 4) was incubated with approximately 0.5 ng of a DNA fragment containing either CATAAAA (lanes 1-3) or TATAAAA (lanes 4-6), as described in the Experimental Procedures. The reaction conditions and all subsequent steps were as in the experiment of Figure 2b. (d) The IC oligonucleotide labeled at the 5’ end of the top strand was treated with hydroxyl radicals, complexed with yeast TFIID, isolated, and displayed as described in the legend to Figure 1. The positions corresponding to -31 and -25 are marked with a dot and arrowhead, respectively. In this experiment, double bands were observed at some positions, probably reflecting the expected heterogeneity in the 3’ termini left by hydroxyl radical cleavage (Hertzberg and Dervan, 1984).
almost completely excluded from the fraction of DNA bound by TFIID, whereas bases outside the CICI box region were represented equally in the bound and free fractions (Figure 2d). Together, these results demonstrated
that the TATA box was the only site on either oligonucleotide that was significantly contacted by TFIID. To determine how similar the interaction of TFIID was with the TATA and CICI box sequences, we compared several properties of the TFIID complexes with the ML and IC oligonucleotides. First, we measured the rate of formation of TFIID complexes with the IC oligonucleotide and observed the same kinetics, within experimental error, as measured for the ML TATA box at the same protein concentration (data not shown). Second, we carried out competitions and found that similar concentrations of an ML TATA box-containing DNA fragment were required to reduce TFIID binding to either the ML or IC oligonucleotide (data not shown). Third, we formed complexes on both the ML and IC oligonucleotide and measured the resistance of these complexes to different concentrations of Sarkosyl (Figure 3). We have shown that Sarkosyl at a concentration of 0.015% dissociates most nonspecific complexes within 2 min, while dissociation of the TATA box complex is negligible during that same time (B. C. Hoopes et al., submitted). Higher concentrations of Sarkosyl promote a more rapid dissociation of the specific complex, as well. Thus, titration of the amount of Sarkosyl required to dissociate TFIID-DNA complexes in a 2 min incubation is a sensitive test of the relative stability of these complexes. The result of such a titration is shown in Figure 3. The curves defining the amount of complex remaining on either oligonucleotide as a function of Sarkosyl concentration were equivalent, providing further evidence that the interaction of TFIID with the IC oligonucleotide was not only specific but was not detectably different from that observed with the unsubstituted TATA box.
Methylation in the Major Groove Did Not Interfere with TFIID Binding
The experiments of Figures 2 and 3 demonstrated that the
Figure 4. ... to the IC and ML Oligonucleotides The ML and IC oligonucleotides were treated with DMS, complexed with yeast TFIID, isolated, cleaved, and displayed as described in the Experimental Procedures. (a) The methylation pattern for bound (B) and free (F) DNA isolated following native gel electrophoresis is shown for the IC and ML top and bottom strands, as indicated. The dot and the arrowhead indicate the bases corresponding to -30 and -26, respectively, on the top strand, and -32 and -29 on the bottom strand of DNA. The results of densitometric analysis of the data for the top strands are summarized in the histograms in (b).
interaction of TFIID with the IC-substituted TATA box was specific. Furthermore, those experiments did not reveal any significant differences in the properties of the complexes observed with CICIIII and with TATAAAA. Thus, the ring substituents within the major groove are unlikely to contribute substantially to the properties we measured in those experiments. Although these results have clearly established the minor groove as the primary determinant of TFIID binding, we could not rule out participation of the major groove, for example, through recognition of the purine N-7. The fact that TFIID bound specifically to the IC-substituted TATA box allowed us to probe directly the importance of major groove contacts. Inosine, like guanine, is methylated by DMS at the N-7 position (Scheit and Holy, 1967). Figure 4 shows the results of a methylation interference analysis of both strands of the ML and IC oligonucleotides. Methylation of the three I’s at positions -28, -27, and -26 on the top strand resulted in a small reduction in TFIID binding, providing further confirmation that TFIID
specifically interacted with the IC-substituted TATA box. In contrast to the relatively mild interference caused by methylation of these I’s in the major groove of the CICI box, methylation of any of the three A’s at the same positions in the ML TATA box was severely deleterious. This comparison is shown in the histogram of Figure 4b. Similarly, on the bottom strand, methylation of the I at -29 had no significant effect, while methylation of the A at that position strongly interfered with binding to the ML oligonucleotide (Figure 4a). Thus, the methylation interference pattern observed for the IC oligonucleotide is most consistent with a model in which functional groups within the major groove do not contribute substantially to the recognition and binding of TFIID to the TATA box.
Discussion
This study has demonstrated that TFIID recognizes and interacts with the base pairs of the TATA box primarily within the minor groove of the DNA helix. This property
Figure 5. Summary of Contacts Important for TFIID Binding The DNA helix is represented as a cylinder that has been unrolled and flattened onto a surface. The positions at which removal of the base by hydroxyl radical cleavage interfered with TFIID binding are represented with boldface letters. The larger letters indicate those positions at which the interference was the strongest. Positions at which methylation interfered with binding are marked with circles in the minor groove (major groove for the G at -23). The closed and stippled circles indicate strong and weak interference, respectively. The asterisk marks the position at which methylation resulted in enhanced binding.
establishes TFIID as a member of a highly unusual class of proteins, as almost all of the well-characterized sequence-specific DNA-binding proteins recognize the hydrogen bonding surface within the major groove. Indeed, the minor groove has been thought to contain insufficient information to define a specific recognition site for a protein (Seeman et al., 1976). Thus, details of the binding of TFIID to its site should reveal important new information about the determinants of the affinity and specificity of protein-DNA interactions. In addition, our findings have important implications concerning the function of TFIID in transcription. Two main lines of evidence support our conclusion that TFIID binds in the minor groove. First, we found that methylation of A residues on both DNA strands within the TATA box sequence interfered with TFIID binding. Methylation of A alters the minor groove surface both by eliminating a potential hydrogen bond acceptor and by introducing a group that might interfere sterically. A problem in the interpretation of chemical interference studies is that modified bases can also interfere with protein binding by altering the proclivity of the DNA to adopt a necessary structure (Lu et al., 1981; McClarin et al., 1986). For this reason, we decided to employ an alternative approach. We substituted I-C and C-I base pairs for all of the A-T and T-A bases of the TATA box in order to alter the recognition potential of the major groove while leaving the minor groove unchanged. The complexes formed between TFIID and the TATA and CICI boxes were essentially indistinguishable from one another in experiments measuring the kinetics and stability of binding by bandshift assays. I-C base pairs contribute a pattern of hydrogen bond
donors and acceptors that is identical to G-C pairs in the major groove and to A-T pairs in the minor groove. Thus, the finding that I-C substitutions were not deleterious to TFIID binding showed that major groove functional groups were relatively unimportant to the interaction with TFIID. This conclusion was strengthened further by our finding that methylation of the I’s in the major groove caused only a small decrease in binding. We have also used hydroxyl radical cleavage to identify the bases in the TATA box that contribute directly to the formation and maintenance of the TFIID-DNA complex. The contacts revealed by both types of chemical interference experiments are summarized on a flattened cylindrical projection of the DNA (Figure 5). Displaying the methylation interference data in this fashion reveals that TFIID interacts in the minor groove primarily on one side of the DNA helix, centered on the base pairs from -29 to -26 (TAAA on the top strand). The hydroxyl radical interference experiments showed that several bases located on the back side of the helix with respect to these minor groove contacts were also important for TFIID binding. To contact these bases, TFIID could either lie across the minor groove or wrap around to the back of the helix within the minor groove. It is worth noting that we cannot be certain that hydroxyl radical cleavage interference is restricted to positions at which the protein directly contacts the base, although that is the usual interpretation. Hydroxyl radical cleavage removes the base and the deoxyribose moiety, while both the 3’ and 5’ phosphate groups are retained (Hertzberg and Dervan, 1984). It is unclear whether “free” phosphates generated by this treatment would be recognized in the same way as within the context of the intact backbone. In addition, removal of nucleosides could alter protein-DNA interactions by influencing DNA structure.
All of the critical bases identified by hydroxyl radical interference were contained within the consensus TATA box sequence TATAAA. Several flanking bases appeared to make minor contributions. While our data showed that no single flanking nucleoside contributed substantially to TFIID binding, we cannot rule out that the flanking sequences as a whole may be important. Indeed, recent compilations of class II promoters have revealed a conservation of guanine and cytosine residues both upstream and downstream of the TATA box (Sucher, 1990; Penotti, 1990). These sequences may be required to promote a particular DNA structure or to confine TFIID to the TATA box region by placing inhibitory amino groups in the minor groove. As the compilations were based on analyzing functional promoters rather than TFIID-binding sites, it is also possible that the flanking sequences are primarily important for the interaction of other transcription factors with the promoter.
Can the Minor Groove of A-T Base Pairs Constitute a Specific Recognition Site for a Protein?
Seeman et al. (1976) have argued that A-T and T-A base pairs cannot easily be discriminated in the minor groove, because the hydrogen bond accepting C-2 carbonyl of thymine and N-3 of adenine are predicted simply to switch
positions when the base pair is flipped over. DNA structure might provide some specificity, by influencing the precise orientation of the bases within the helix, the relative widths of the grooves, or the alignment of protein-backbone contacts. Indeed, the TATA box sequence has been suggested to have unusual structure, based on reactivity to chemical and enzymatic cleavage, NMR spectroscopy, and drug-binding studies (Kilpatrick et al., 1988; Nussinov, 1990). With respect to this point, comparison of the structure of the natural and IC-substituted TATA box would be very informative in assessing the importance of DNA conformation, as we did not detect a significant difference in the interaction of TFIID with these two sequences. A thorough characterization of the sequence specificity of TFIID binding will also be required to understand the mechanism of transcription initiation by RNA polymerase II and the essential function of TFIID in this process. Despite recent intense interest in TFIID and its binding site, the sequence specificity of this interaction has not been analyzed in detail, particularly in regard to the discrimination between A-T and T-A base pairs at each position. Results of mutational analyses have suggested that the consensus TATA box is optimal for transcription activity in vivo and in vitro. T-A to A-T transversions at several positions within the TATA box were severely deleterious to promoter function, while such changes at other positions had milder effects (Wobbe and Struhl, 1990). However, the interaction of TFIID with most of these sequences has not been directly measured. Our laboratory has begun a comparison of the interaction of TFIID with several variant TATA boxes. We have found that T-A to A-T changes at the positions examined thus far do result in small but significant differences in the properties of the TFIID-DNA complexes (B. C. Hoopes and D. K. H., unpublished data).
This observation, coupled with our finding that TFIID appears primarily to contact the minor groove, suggests that this protein will provide an excellent system for studying the molecular basis for recognition of the minor groove surface. If TFIID is lying in close contact with the minor groove surface, then the major groove may be available for direct recognition by one or more of the other transcription factors. The deleterious effects of some promoter mutations could reflect reduction of this interaction, rather than (or in addition to) the ability of TFIID to bind to the altered sequence. Although the other general transcription factors are not known to bind DNA in a sequence-specific manner, TFIIA and TFIIB both have been shown to bind to a TFIID-TATA box complex (Buratowski et al., 1989; Peterson et al., 1990). Quantitative measurement of these various interactions with different TATA box sequences will be needed to address this interesting possibility.
Comparison with Other DNA-Binding Proteins
TFIID joins a small but growing list of DNA-binding proteins that interact with AT-rich sequences within the minor groove. Examples of such proteins include Escherichia coli IHF protein and members of the eukaryotic HMG family (reviewed in Churchill and Travers, 1991). A number of
other proteins have been shown to contact bases in the minor groove, but also to make important contacts in the major groove (Sluka et al., 1990; Anderson et al., 1987; Otting et al., 1990). To date, IHF is the only protein other than TFIID thought to both recognize a specific DNA sequence and bind primarily to the minor groove (Yang and Nash, 1989). Recently, Nash and Granston have pointed out a similarity between a highly conserved region of TFIID and a region of IHF shown by mutational analysis to be important for DNA binding (Nash and Granston, 1991). Mutations within this region of TFIID have also been shown to reduce or eliminate binding to the TATA box (Reddy and Hahn, 1991). IHF is a member of a family of small, basic proteins with significant sequence and functional homology (Drlica and Rouviere-Yaniv, 1987). The three-dimensional structure of another member of this family, the HU protein of Bacillus stearothermophilus, has been determined by X-ray crystallography (Tanaka et al., 1989). On the basis of this structure, White et al. (1989) have proposed that an HU dimer contacts the DNA by means of a pair of two-stranded, antiparallel β-sheets. The stretch of amino acids in IHF that appears similar to TFIID occurs in a region predicted to be β-sheet by analogy to HU. Therefore, Nash and Granston have suggested that TFIID may also use a β-sheet motif to contact the DNA (Nash and Granston, 1991). A structural complementarity between an antiparallel β-sheet and the minor groove of the DNA helix was first recognized by Church et al. (1977), who proposed that such a motif could accommodate both specific and nonspecific protein-DNA contacts. Many of the proteins that contact the minor groove either specifically or nonspecifically also bend or otherwise distort the DNA helix (White et al., 1989; Suck et al., 1988; Nash, 1990). In at least some cases, this distortion of DNA structure is critical to the protein’s function.
For example, IHF appears to promote condensation of the phage lambda attachment site into a compact structure necessary for integration into the host chromosome (Nash, 1990). HU is believed to be a histone-like protein that wraps and compacts bacterial DNA (Drlica and RouviereYaniv, 1987). It will be interesting to discover whether TFIID also bends the DNA upon binding and, if so, whether this bending isessential for assembly of a functional preinitiation complex. Experimental Procedures DNA Fragments
110 bp DNA fragments used in the experiments of Figures 1 and 2c were purified by gel electrophoresis following digestion of the plasmid pWRM18 or pWRM18-31C with BamHI and HindIII. pWRM18, which has been described elsewhere (B. C. Hoopes et al., submitted), contains adenovirus major late promoter sequence from -52 to +10 followed by 35 bp of G-less cassette. pWRM18-31C is identical except for a T→C change at position -31 in the TATA box. The fragment was end-labeled by the Klenow fill-in reaction (Maniatis et al., 1982). The oligonucleotides used in the experiments of Figures 2-4 were synthesized by the Biotechnology Laboratory at the University of Oregon. Deoxynucleoside phosphoramidates were purchased from Applied Biosystems, Inc. The sequences of the oligonucleotides follow.
ML top strand: 5’-(TCCTGAAGGGGGGCTATAAAAGGGGGTGGGGGCGCGT C)-3’.
ML bottom strand: 5” (GACGAACGCGCCCCCACCCCCTmATAGCCCCCCTTCAGCCCCCCTTCAGGA) 3’.
IC top strand: 5’-(CCTTGAAGGGGGGCCICllllGGGGGTGGGGGCGCG~-3’.
IC bottom strand: 5’+AGAACGCGCCCCCACCCCCCCCClClGCCCCCCTTC.

The single-stranded DNA was end-labeled using T4 polynucleotide kinase (Maniatis et al., 1982) and then annealed. Annealing was done by heating a mixture of complementary oligonucleotides to 90°C in 20 µl of annealing buffer (50 mM NaCl, 3.5 mM MgCl2) for 30 min, cooling to 50°C for 2 hr, and then cooling to room temperature for an additional 2 hr. Double-stranded DNA was separated from unannealed strands by electrophoresis on a 20% polyacrylamide gel in TBE (0.089 M Tris base, 0.089 M boric acid, 2 mM EDTA). DNA was isolated from the gel by standard procedures (Maniatis et al., 1982).

Cloned TFIID from Saccharomyces cerevisiae was overexpressed in and purified from E. coli strain BL21(DE3) as described (B. C. Hoopes et al., submitted). In brief, cell lysates were extracted with 500 mM KCl, dialyzed into BC100 (20 mM Tris-Cl [pH 7.9 at 4°C], 20% glycerol, 100 mM KCl, 0.2 mM EDTA, 1 mM DTT), loaded onto a heparin-agarose column, and eluted with 500 mM KCl. Following dialysis into BC100, the pooled fractions from the heparin column were chromatographed on a DE-52 column. TFIID was recovered in the flow-through fractions. The human TFIID gene was obtained from the laboratory of Robert Tjian (University of California, Berkeley) and inserted into the expression vector pBH500 (B. C. Hoopes et al., submitted). Details of the plasmid construction and protein purification will be described elsewhere.

Chemical Interference and Bandshifting Experiments

The DNA was modified with DMS as follows: Either 20 ng of the 110 bp DNA fragment or 8 ng of double-stranded oligonucleotide was added to a final volume of 20 µl in bandshift reaction buffer (see below).
The reaction was cooled to 0°C on ice for 10 min, after which 0.5 µl of DMS was added. After 10 min, 100 µl of DMS stop mix (20% β-mercaptoethanol, 25 mM EDTA, 1.5 M NH4OAc, and 75 µg/ml RNA) and 240 µl of ethanol were added, and the DNA was precipitated overnight at -20°C. The DNA was pelleted by centrifugation, dried under vacuum, and resuspended in 10 µl of TE (10 mM Tris-HCl [pH 8], 1 mM EDTA).

Hydroxyl radical cleavage reactions were carried out as described by Tullius and Dombroski (1986) with minor modifications. Either 20 ng of the 110 bp DNA fragment or 8 ng of either oligonucleotide was added to a final volume of 35 µl of TE. Eighteen microliters of 3x cleavage solution (20 µl of 200 µM Fe(NH4)2(SO4)2, 20 µl of 400 µM EDTA, 40 µl of 0.3% H2O2, and 40 µl of 10 mM ascorbic acid) was then added. After 5 min at room temperature, 500 µl of HR stop mix (140 mM EDTA, 1.5 M NH4OAc, and 75 µg/ml RNA) was added. The DNA was ethanol-precipitated overnight at -20°C. The DNA was pelleted, dried under vacuum, and resuspended in 10 µl of TE.

The gel retardation assays were performed as described (B. C. Hoopes et al., submitted). For a typical chemical interference experiment, 50 nM TFIID was incubated for 80 min at 30°C with premodified DNA fragment (5 ng; 1 x 10cpm) in a total volume of 20 µl. For other gel retardation experiments, 20 nM TFIID was incubated as above with unmodified DNA (0.5 ng; 1 x 10^5 cpm). The buffer conditions used were 12 mM Tris-Cl (pH 7.9 at 4°C), 20 mM HEPES (pH 8.4), 10 mM MgCl2, 12% glycerol, 60 mM KCl, and 0.5 mM DTT. Sarkosyl was added to a final concentration of 0.02%, after which the reaction was incubated an additional 2 min at 30°C and was then loaded directly onto a native 4% polyacrylamide gel (40:1 mono:bis) containing 10% glycerol. The gel was electrophoresed in TGEM buffer (25 mM Tris base, 190 mM glycine, 1 mM EDTA, 5 mM MgCl2). For the experiments of Figures 2 and 3, the gel was transferred to Whatman 3 MM paper, dried under heat and vacuum, and exposed to Kodak XAR-5 film overnight at -70°C with an intensifying screen. For the experiments of Figures 1 and 4, the bandshift gel was exposed to Kodak XAR-5 film for 1-2 hr at room temperature, and the bound and free DNA fragments were electroeluted onto NA45 membrane (Schleicher and Schuell) at 100 mA in TBE buffer. DNA was eluted from the membrane by incubation in 300 µl of TEN1000 (10 mM Tris [pH 7.5], 1 mM EDTA, 1 M NaCl) at 45°C-55°C for 2 hr. The DNA was precipitated with ethanol overnight at -20°C.

In the case of DMS interference, DNA was cleaved essentially as described (Maxam and Gilbert, 1980). The DNA was resuspended in 45 µl of TE, to which was added 5 µl of piperidine. The reaction was incubated for 30 min at 90°C and the DNA was ethanol precipitated overnight at -20°C. The DNA from both types of premodification interference experiments was then washed with 70% ethanol, dried under vacuum extensively, and resuspended in 10 µl of 95% formamide and dyes. Equivalent amounts of radioactivity corresponding to bound and free DNA were loaded onto either a 10% (bottom strand, 110 bp fragment), 8% (top strand, 110 bp fragment), or 20% (either strand, both oligonucleotides) polyacrylamide-7 M urea-TBE sequencing gel. The gel was electrophoresed at 80 W constant power until the bromphenol blue reached the bottom of the gel (approximately 38 cm). The 8% and 10% polyacrylamide gels were then transferred to Whatman 3 MM paper, dried under heat and vacuum, and exposed to either Kodak XAR-5 or Hyperfilm MP (Amersham) film overnight at -70°C with an intensifying screen. Gels (20%) were exposed directly to film at -70°C with an intensifying screen.

Densitometry of autoradiograms was done using a Zeineh Scanning Densitometer, Model SL-504-XL, and the areas under the peaks corresponding to the bands on the gel were calculated using the Videophoresis II Program (Biomed Instruments, Inc.).
The ratios of intensities of bands observed in the bound and free DNA fractions were normalized using positions outside the TATA box region.

Acknowledgments

We are grateful to Barbara Hoopes and James LeBlanc for providing purified yeast and human TFIID and Margaret Lindorfer and the Biotechnology Laboratory for synthesis of oligonucleotides. We thank Barbara Hoopes, Tom Stevens, Vicki Chandler, Laurakay Bruhn, Cynthia Phillips, Howard Nash, Drew Granston, Eric Selker, Pete von Hippel, and Will McClure for discussions and a critical reading of the manuscript. This work was supported by grants DMB-8703950, DMB-9018875, and a Presidential Young Investigator Award from the National Science Foundation and by a Searle Scholar Award from the Chicago Community Trust (to D. K. H.). D. B. S. was supported, in part, by a graduate training grant from the National Institutes of Health.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 USC Section 1734 solely to indicate this fact.

Received
8, 1991; revised
References

Anderson, J. E., Ptashne, M., and Harrison, S. C. (1987). Structure of the repressor-operator complex of bacteriophage 434. Nature 326, 846-852.

Breathnach, R., and Chambon, P. (1981). Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349-383.

Bucher, P. (1990). Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J. Mol. Biol. 212, 563-578.

Buratowski, S., Hahn, S., Guarente, L., and Sharp, P. A. (1989). Five intermediate complexes in transcription initiation by RNA polymerase II. Cell 56, 549-561.

Cavallini, B., Faus, I., Matthes, H., Chipoulet, J. M., Winsor, B., Egly, J.-M., and Chambon, P. (1989). Cloning of the gene encoding the yeast protein BTF1Y which can substitute for the human TATA box-binding factor. Proc. Natl. Acad. Sci. USA 86, 9803-9807.

Chen, W., and Struhl, K. (1988). Saturation mutagenesis of a yeast his3 "TATA element": genetic evidence for a specific TATA binding protein. Proc. Natl. Acad. Sci. USA 85, 2691-2695.

Church, G. M., Sussman, J. L., and Kim, S.-H. (1977). Secondary structural complementarity between DNA and proteins. Proc. Natl. Acad. Sci. USA 74, 1458-1462.

Churchill, M. E. A., and Travers, A. A. (1991). Protein motifs that recognize structural features of DNA. Trends Biochem. Sci. 16, 92-97.

Corden, J., Wasylyk, B., Buchwalder, A., Sassone-Corsi, P., Kedinger, C., and Chambon, P. (1980). Promoter sequences of eukaryotic protein-coding genes. Science 209, 1406-1414.

Davison, B. L., Egly, J., Mulvihill, E. R., and Chambon, P. (1983). Formation of stable preinitiation complexes between eukaryotic class B transcription factors and promoter sequences. Nature 301, 680-686.

Drlica, K., and Rouviere-Yaniv, J. (1987). Histonelike proteins of bacteria. Microbiol. Rev. 51, 301-319.

Fire, A., Samuels, M., and Sharp, P. A. (1984). Interactions between RNA polymerase II, factors, and template leading to accurate transcription. J. Biol. Chem. 259, 2509-2516.

Gasch, A., Hoffmann, A., Horikoshi, M., Roeder, R. G., and Chua, N.-H. (1990). Arabidopsis thaliana contains two genes for TFIID. Nature 346, 390-394.

Gilbert, W., Maxam, A., and Mirzabekov, A. (1976). Contacts between the LAC repressor and DNA revealed by methylation. In Control of Ribosome Synthesis, N. C. Kjeldgaard and O. Maaløe, eds. (New York: Academic Press), pp. 139-149.

Hahn, S., Buratowski, S., Sharp, P. A., and Guarente, L. (1989a). Isolation of the gene encoding the yeast TATA binding protein TFIID: a gene identical to the SPT15 suppressor of Ty element insertions. Cell 58, 1173-1181.

Hahn, S., Buratowski, S., Sharp, P. A., and Guarente, L. (1989b). Yeast TATA-binding protein TFIID binds to TATA elements with both consensus and nonconsensus DNA sequences. Proc. Natl. Acad. Sci. USA 86, 5718-5722.

Hertzberg, R. P., and Dervan, P. B. (1984). Cleavage of DNA with methidiumpropyl-EDTA-iron(II): reaction conditions and product analysis. Biochemistry 23, 3934-3945.

Hoeijmakers, J. H. J. (1990). Nature 343, 417-418.

Hoey, T., Dynlacht, B. D., Peterson, M. G., Pugh, B. F., and Tjian, R. (1990). Isolation and characterization of the Drosophila gene encoding the TATA box binding protein, TFIID. Cell 61, 1179-1186.

Hoffmann, A., Horikoshi, M., Wang, C. K., Schroeder, S., Weil, P. A., and Roeder, R. G. (1990a). Cloning of the Schizosaccharomyces pombe TFIID gene reveals a strong conservation of functional domains present in Saccharomyces cerevisiae TFIID. Genes Dev. 4, 1141-1148.

Hoffmann, A., Sinn, E., Yamamoto, T., Wang, J., Roy, A., Horikoshi, M., and Roeder, R. G. (1990b). Highly conserved core domain and unique N terminus with presumptive regulatory motifs in a human TATA factor (TFIID). Nature 346, 387-390.

Horikoshi, M., Wang, C. K., Fujii, H., Cromlish, J. A., Weil, P. A., and Roeder, R. G. (1989a). Cloning and structure of a yeast gene encoding a general transcription initiation factor TFIID that binds to the TATA box. Nature 341, 299-303.

Horikoshi, M., Wang, C. K., Fujii, H., Cromlish, J. A., Weil, P. A., and Roeder, R. G. (1989b). Purification of a yeast TATA box-binding protein that exhibits human transcription factor IID activity. Proc. Natl. Acad. Sci. USA 86, 4843-4847.

Horikoshi, M., Yamamoto, T., Ohkuma, Y., Weil, P. A., and Roeder, R. G. (1990). Analysis of structure-function relationships of yeast TATA box binding factor TFIID. Cell 61, 1171-1178.

Kao, C. C., Lieberman, P. M., Schmidt, M. C., Zhou, Q., Pei, R., and Berk, A. J. (1990). Cloning of a transcriptionally active human TATA binding factor. Science 248, 1646-1650.

Kilpatrick, M. W., Torri, A., Kang, D. S., Engler, J. A., and Wells, R. D. (1986). Unusual DNA structures in the adenovirus genome. J. Biol. Chem. 261, 11350-11354.

Lu, A.-L., Jack, W. E., and Modrich, P. (1981). DNA determinants important in sequence recognition by EcoRI endonuclease. J. Biol. Chem. 256, 13200-13206.

Maldonado, E., Ha, I., Cortes, P., Weis, L., and Reinberg, D. (1990). Factors involved in specific transcription by mammalian RNA polymerase II: role of transcription factors IIA, IID, and IIB during formation of a transcription-competent complex. Mol. Cell. Biol. 10, 6335-6347.

Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press).

Margottin, F., Dujardin, G., Gerard, M., Egly, J.-M., Huet, J., and Sentenac, A. (1991). Participation of the TATA factor in transcription of the yeast U6 gene by RNA polymerase C. Science 251, 424-426.

Maxam, A., and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 497-559.

McClarin, J. A., Frederick, C. A., Wang, B.-C., Greene, P., Boyer, H. W., Grable, J., and Rosenberg, J. M. (1986). Structure of the DNA-EcoRI endonuclease recognition complex at 3 Å resolution. Science 234, 1526-1541.

Muhich, M. L., Iida, C. T., Horikoshi, M., Roeder, R. G., and Parker, C. S. (1990). cDNA clone encoding Drosophila transcription factor TFIID. Proc. Natl. Acad. Sci. USA 87, 9148-9152.

Myers, R. M., Tilly, K., and Maniatis, T. (1986). Fine structure analysis of a β-globin promoter. Science 232, 613-618.

Nagai, 416.

Nakajima, N., Horikoshi, M., and Roeder, R. G. (1988). Factors involved in specific transcription by mammalian RNA polymerase II: purification, genetic specificity, and TATA box-promoter interactions of TFIID. Mol. Cell. Biol. 8, 4028-4040.

Nash, H. A. (1990). Bending and supercoiling of DNA at the attachment site of bacteriophage λ. Trends Biochem. Sci. 15, 222-227.

Nash, H. A., and Granston, A. E. (1991). Similarity between the DNA-binding domains of IHF protein and TFIID protein. Cell 67, this issue.

Nussinov, R. (1990). Sequence signals in eukaryotic upstream regions. CRC Crit. Rev. Biochem. 25, 185-224.

Otting, G., Qian, Y. Q., Billeter, M., Muller, M., Affolter, M., Gehring, W. J., and Wuthrich, K. (1990). Protein-DNA contacts in the structure of a homeodomain-DNA complex determined by nuclear magnetic resonance spectroscopy in solution. EMBO J. 9, 3085-3092.

Penotti, F. E. (1990). Human DNA TATA boxes and transcription initiation sites: a statistical study. J. Mol. Biol. 213, 37-52.

Peterson, M. G., Tanese, N., Pugh, B. F., and Tjian, R. (1990). Functional domains and upstream activation properties of cloned human TATA binding protein. Science 248, 1625-1630.

Pugh, B. F., and Tjian, R. (1990). Mechanism of transcriptional activation by Sp1: evidence for coactivators. Cell 61, 1187-1197.

Reddy, P., and Hahn, S. (1991). Dominant negative mutations in yeast TFIID define a bipartite DNA-binding region. Cell 65, 349-357.

Sawadogo, M., and Roeder, R. G. (1985). Interaction of a gene-specific transcription factor with the adenovirus major late promoter upstream of the TATA box region. Cell 43, 165-175.

Scheit, K.-H., and Holy, A. (1967). Die Methylierung von Inosin und Uridylyl-(3'-5')-inosin durch Dimethylsulfat. Biochim. Biophys. Acta 149, 344-354.

Schmidt, M. C., Kao, C. C., Pei, R., and Berk, A. J. (1989). Yeast TATA-box transcription factor gene. Proc. Natl. Acad. Sci. USA 86, 7785-7789.

Seeman, N. C., Rosenberg, J. M., and Rich, A. (1976). Sequence-specific recognition of double helical nucleic acids by proteins. Proc. Natl. Acad. Sci. USA 73, 804-808.

Siebenlist, U., and Gilbert, W. (1980). Contacts between Escherichia coli RNA polymerase and an early promoter of phage T7. Proc. Natl. Acad. Sci. USA 77, 122-126.

Simmen, K. A., Bernues, J., Parry, H. D., Stunnenberg, H. G., Berkenstam, A., Cavallini, B., Egly, J.-M., and Mattaj, I. W. (1991). TFIID is required for in vitro transcription of human U6 gene by RNA polymerase III. EMBO J. 10, 1853-1862.

Sluka, J. P., Horvath, S. J., Glasgow, A. C., Simon, M. I., and Dervan, P. B. (1990). Importance of minor-groove contacts for recognition of DNA by the binding domain of Hin recombinase. Biochemistry 29, 6551-6561.

Struhl, K. (1987). Promoters, activator proteins and the mechanism of transcriptional initiation in yeast. Cell 49, 295-297.

Stucka, R., and Feldmann, H. (1990). An element of symmetry in yeast TATA box-binding protein transcription factor IID. FEBS Lett. 261, 223-225.

Suck, D., Lahm, A., and Oefner, C. (1988). Structure refined to 2 Å of a nicked DNA octanucleotide complex with DNase I. Nature 332, 464-468.

Tanaka, I., Appelt, K., Dijk, J., White, S., and Wilson, K. (1989). 3 Å resolution structure of a protein with histonelike properties in prokaryotes. Nature 310, 376-381.

Travers, A. A. (1989). DNA conformation and protein binding. Annu. Rev. Biochem. 58, 427-452.

Tullius, T. D., and Dombroski, B. A. (1986). Hydroxyl radical "footprinting": high-resolution information about DNA-protein contacts and application to λ repressor. Proc. Natl. Acad. Sci. USA 83, 5469-5473.

White, S. W., Appelt, K., Wilson, K. S., and Tanaka, I. (1989). A protein structural motif that bends DNA. Proteins Struct. Funct. Genet. 5, 281-288.

Wobbe, C. R., and Struhl, K. (1990). Yeast and human TATA-binding proteins have nearly identical DNA sequence requirements for transcription in vitro. Mol. Cell. Biol. 10, 3859-3867.

Yang, C.-C., and Nash, H. A. (1989). The interaction of E. coli IHF protein with its specific binding sites. Cell 57, 869-880.