Cell, Vol. 68, 819-821, March 6, 1992, Copyright © 1992 by Cell Press
TATA-Binding Protein Is a Classless Factor
Phillip A. Sharp
Center for Cancer Research and Department of Biology
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139
The TATA sequence has long been recognized as a critical element for initiation by RNA polymerase II. This element is recognized by a sequence-specific TATA-binding protein, which is a component of a large complex of proteins forming the basal transcription factor TFIID. Until recently, the function of the TATA-binding protein was thought to be restricted to those promoters that are transcribed by RNA polymerase II. However, new data have indicated that the TATA-binding protein is also a component of activities that are central for the transcription of RNA polymerase III- and RNA polymerase I-dependent genes.
Assembly of the TATA-binding protein into protein complexes generates a TFIID transcription factor that is specific for promoters transcribed by RNA polymerase II. The other proteins contained in these complexes are known as TATA-binding protein-associated factors (TAFs) and include six prominent polypeptides of 150, 110, 80, 80, 40, and 31 kd in Drosophila cells (Dynlacht et al., 1991) and approximately 10 polypeptides (10-200 kd) in mammalian cells (Pugh and Tjian, 1991). Although the heterogeneity of these TFIID complexes has not been analyzed, some of the TAF polypeptides appear to be present in submolar ratios. Furthermore, a distinct TATA-binding protein-TAF complex has been resolved that mediates initiation by RNA polymerase II and is significantly smaller than the previously described TFIID complexes (Timmers and Sharp, 1991). Thus, there could be a diverse set of TATA-binding protein-TAF complexes that constitute distinct specificities. For example, in the case of promoters that do not contain the TATA element, sequences that flank the site of initiation are important for mediating the stable association of the TATA-binding protein with the promoter (Pugh and Tjian, 1991).
Thus, it is highly likely that specific TATA-binding protein-TAF complexes recognize these TATA-less promoters, and that the TAFs in these TFIID complexes are critical for this core-sequence recognition. Recently, a protein has been characterized that recognizes sequences encompassing the site of initiation and promotes transcription of TATA-less promoters (Roy et al., 1991). It has not been determined whether this protein is a member of the set of TAFs that are stably associated with TATA-binding protein in the absence of DNA. A central concept that emerges from these studies is that proteins associated with the TATA-binding protein, i.e., TAFs, can have promoter specificity. Whether each of the 20,000 or so promoters in mammalian cells is bound by its own unique TFIID complex, or TATA-binding protein-TAF complex, utilizing different combinations of TAFs, is a matter of interesting speculation!
As reported by Comai et al. (1992), the TATA-binding protein is also part of a factor that is critical for transcription of the ribosomal RNA genes by RNA polymerase I. In this factor, the combination of polypeptides bound to the TATA-binding protein is responsible for sequence-specific recognition of the promoter. Assembly of TATA-binding protein into a complex with three polypeptides, TAFs of molecular weights 110, 63, and 48 kd, generates the polymerase I transcription factor. This complex, SL1 (selectivity factor 1), cooperatively binds with a second factor, upstream binding factor (UBF), to the promoter of genes encoding the precursor to ribosomal RNA. Thus, recognition of the promoter DNA is conferred both by UBF, a member of the high mobility group of proteins, and by the SL1 complex. Surprisingly, SL1 is not interchangeable between human and mouse transcription systems, while all the other components, UBF and polymerase I, have equivalent activity in either system. Since the DNA-binding domain of TATA-binding protein is essentially identical in the two organisms, it is likely that the polypeptides associated with the TATA-binding protein in SL1 specifically recognize DNA sequence. Thus, TAFs may be members of families of sequence-specific DNA-binding proteins that are stably associated with TATA-binding protein in the nucleus of a cell.
Although it has not yet been identified, another complex of TATA-binding protein with a different set of TAFs is likely to be specific for promoters that are transcribed by RNA polymerase III. Evidence for this supposition is that the addition of TATA-binding protein to in vitro transcription reactions stimulates transcription of the U6 snRNA gene by RNA polymerase III (Margottin et al., 1991). Furthermore, addition of competitive DNA fragments containing a TATA element specifically inhibits transcription of tRNA genes by polymerase III (White et al., 1992).
The prototypical genes transcribed by polymerase III encode tRNAs, small viral RNAs, and the ribosomal 5S RNA. Two factors, TFIIIB and TFIIIC, are thought to be adequate for RNA polymerase III-dependent transcription of the first two groups of these genes. The 5S RNA gene, however, requires a third factor, TFIIIA, for activity. TFIIIA has been extensively studied and has been shown to bind specifically to sequences within the 5S RNA gene. TFIIIC binds to internal sequences of the 5S RNA gene only in conjunction with TFIIIA; however, it binds directly to sequences within tRNA genes and small viral RNA genes. In contrast to TFIIIA and TFIIIC, TFIIIB has not been well analyzed and may, in fact, contain the TATA-binding protein or may require this polypeptide for its activity. Yeast TFIIIB contains at least two large polypeptides, 90 kd and 70 kd, that directly contact DNA in the transcriptionally active complex (Bartholomew et al., 1991). This factor will associate with sequences upstream of tRNA genes only when TFIIIC is bound to the internal sequences. Subsequently, RNA polymerase III is assembled on the promoter by recognition of the TFIIIB factor (Lassar et al., 1983). Surprisingly, once assembled on the promoter by
TFIIIC, TFIIIB will remain associated with the DNA template and promote initiation of polymerase III after dissociation of the TFIIIC factor (Kassavetis et al., 1990). Thus, TFIIIB, perhaps a TATA-binding protein complex, is assembled into a highly stable complex on the promoter.
The largest TATA-binding protein-TAF complex, which is equivalent to the TFIID activity, is approximately 10^6 daltons in size. This complex must simultaneously bind the promoter site with another large complex, RNA polymerase II. How could these two large complexes be distributed along 50 bp of DNA? Two recent studies (Starr and Hawley, 1991; Lee et al., 1991) have suggested that the TATA-binding protein binds in the minor groove of DNA. This conclusion fits nicely with the observation that the two repetitive subdomains of TATA-binding protein are homologous to the prokaryotic proteins, integration host factor (IHF) and HU (Nash and Granston, 1991). The atomic structure of HU shows that each subunit has an extended arm of an antiparallel two-stranded β sheet. In the case of the heterodimeric IHF protein, the two arms of the dimer subunits are thought to grasp the minor grooves of alternative faces of the DNA. The sequence homology between TATA-binding protein and IHF/HU suggests that the two repetitive subdomains of TATA-binding protein may also have extended arms that grasp minor grooves of the DNA helix. This arrangement could place the bulk of TATA-binding protein at a distance from the DNA helix and present ribbons of proteins along the minor grooves for contact with other basal transcription factors (Yang and Nash, 1989). However, as mentioned above, the TATA-binding protein is found as part of large complexes containing TAFs. Although such complexes have not been well characterized to date, it is likely that the TAFs also extensively contact DNA.
The recognition of DNA by the TAFs is suggested by the comparison of footprint patterns of the isolated TATA-binding protein with those of TFIID or TATA-binding protein-TAF complexes. TATA-binding protein protects sequences from approximately -31 to -25 from cleavage either by small chemical compounds such as hydroxyl radicals or by DNAase I (Buratowski et al., 1988). Footprints of complexes corresponding to the TFIID activity, and thus representing TATA-binding protein-TAF complexes, are much larger, spanning from approximately -45 to +35 (Sawadogo and Roeder, 1985). Sequences from -45 to -10 are fairly uniformly protected from DNAase I cleavage, while the region from -10 to +35 exhibits a pattern of protected and enhanced cleavage with a periodicity of 10 bp. This pattern is consistent with protection of one side of the helix. Thus, it is possible that the TAFs are arrayed along one side of the DNA in this region, generating a platform for the binding of polymerase and other transcription factors along the opposite side of the helix.
Not only do TAFs participate in sequence recognition for promoter specificity, they are also probably involved in selecting the appropriate RNA polymerase. Promoters are uniquely recognized by either polymerase I, II, or III. If TATA-binding protein is a common component of transcription complexes for all three types of polymerases,
then it is likely that the proteins associated with TATA-binding protein specify which polymerase will initiate at a given promoter. For RNA polymerase II, the 33 kd protein TFIIB binds to the TATA-binding protein and stabilizes the binding of the polymerase. TFIIB probably interacts directly with the TATA-binding protein, since these two proteins, when purified from bacteria, will form a complex on DNA that contains a TATA element and will promote the association of polymerase II (Ha et al., 1991). In the case of RNA polymerase I, proteins associated with the TATA-binding protein in SL1 probably interact with the polymerase. Although the identity of this bridging component of SL1 has not been determined, its existence is suggested by the reasoning that the only other factor necessary for initiation by this polymerase is UBF, a sequence-specific DNA-binding protein. Interestingly, since RNA polymerase II does not initiate at polymerase I promoters, it is likely that the binding site for TFIIB is covered in these SL1-TAF complexes. Otherwise, the binding of TFIIB to the TATA-binding protein that is bound to the polymerase I promoter could result in occupancy of the promoter by RNA polymerase II. For polymerase III, polypeptides in the TFIIIB complex may specify the association of the polymerase, although, again, a specific polypeptide in this complex has not yet been identified as binding to the polymerase. Thus, the TATA-binding protein is assembled into TAF complexes that are likely to be specific for the selection of the RNA polymerase as well as for the promoter DNA sequences.
How could the common polypeptide TATA-binding protein be assembled into complexes that are both polymerase and promoter specific? Furthermore, at what point after its translation is the TATA-binding protein assembled into these stable complexes? As mentioned previously, the RNA polymerase II specificity factor TFIIB associates with the TATA-binding protein in the promoter-bound TFIID complex.
Generation of the TAF complexes for polymerase I and III may block the binding of TFIIB on the assembled TATA-binding protein. It is likely that the specificity component of the TAF complexes for polymerase I and III assembles shortly after the synthesis of the TATA-binding protein. Alternatively, this component may associate with the TATA-binding protein-TAF complex on the promoter. In either case, the promoter sequence must dictate the structure of the TATA-binding protein complex for the association of the appropriate RNA polymerase.
Given the diversity of the TATA-binding protein-TAF complexes and their possible specificity both for DNA sequences and for a particular polymerase, it is interesting to propose that many of the TAFs are assembled after the TATA-binding protein associates with the promoter. The other possibility is that TAFs associate with TATA-binding protein before DNA binding, and this would imply a pool of unbound TATA-binding protein-TAF complexes sufficient to satisfy the sequence and polymerase diversities of cellular promoters. The more frugal mechanism would be to assemble the TATA-binding protein-TAF complex on the promoter using the sequence content to specify the components that become stably associated. In this manner, a low affinity
and relatively nonspecific initial TATA-binding protein complex could be modified into a highly stable and uniquely configured complex. The presence of this complex would account for the frequently described DNAase-sensitive cleavage sites consistently found flanking the initiation site of active promoters. The composition of the ultimate complex could then reflect the interactions between the complex at the promoter and sequence-specific transcription factors bound to enhancers nearby. Enhancer elements would then stimulate or repress transcription by molding the composition of the TATA-binding protein-TAF complex. In some cases, multiple factors bound to the enhancers could become stably associated with the TATA-binding protein-TAF complex and thus become a TAF protein. This would easily account for the puzzling synergistic interactions of two or more transcription factors (Ma and Ptashne, 1989). Two transcription factors that weakly enhance transcription when either binding site is inserted into a promoter can synergistically interact and strongly activate the promoter when both binding sites are inserted. In fact, evidence suggests that the two factors may more tightly bind DNA when synergistically interacting in the activation of transcription. Such synergistic interactions are not dependent upon a unique spacing between the binding sites of the two factors, and this is consistent with the possibility that both are stably interacting with a third body. The body could be the TATA-binding protein-TAF complex assembled at the promoter. Inherent in this assembly model of enhancement is the concept that the final configuration of the TATA-binding protein-TAF complex must determine the rate of initiation of transcription by the polymerase.
Thus, TATA-binding protein-TAF assembly provides a structural model for the mechanism by which a group of regulatory elements, possibly separated by thousands of base pairs, could specify initiation from some mammalian genes as infrequently as once per hour or once per day and from others as frequently as once per two seconds.
Finally, given the centrality of TATA-binding protein in transcription, it is not surprising that a large number of proteins have been shown to specifically interact with it. These include the RNA polymerase II basal transcription factors TFIIA, TFIIB, and TFIIG, the repetitive amino acid sequences in the largest subunit of polymerase II, and the specific regulatory factors adenovirus E1A, Epstein-Barr virus Zta, and herpes simplex virus VP16. Some of these proteins may be members of the sets of TAFs, while others might stimulate or suppress transcription by controlling the association of specific TAFs with TATA-binding protein. Analyzing this network of interactions will require determining the three-dimensional structures of TATA-binding protein alone, complexed with DNA, and complexed with other proteins, as well as several years of good biochemistry and genetics.
References
Bartholomew, B., Kassavetis, G. A., and Geiduschek, E. P. (1991). Mol. Cell. Biol. 17, 5181-5189.
Buratowski, S., Hahn, S., Sharp, P. A., and Guarente, L. (1988). Nature 334, 37-42.
Comai, L., Tanese, N., and Tjian, R. (1992). Cell 68, this issue.
Dynlacht, B. D., Hoey, T., and Tjian, R. (1991). Cell 66, 563-576.
Ha, I., Lane, W. S., and Reinberg, D. (1991). Nature 352, 689-695.
Kassavetis, G. A., Braun, B. R., Nguyen, L. U., and Geiduschek, E. P. (1990). Cell 60, 235-245.
Lassar, A. B., Martin, P. L., and Roeder, R. G. (1983). Science 222, 740-748.
Lee, D. K., Horikoshi, M., and Roeder, R. G. (1991). Cell 67, 1241-1250.
Ma, J., and Ptashne, M. (1989). Cell 48, 847-853.
Margottin, F., Dujardin, G., Gerard, M., Egly, J.-M., Huet, J., and Sentenac, A. (1991). Science 257, 424-426.
Nash, H. A., and Granston, A. E. (1991). Cell 67, 1037-1038.
Pugh, B. F., and Tjian, R. (1991). Genes Dev. 5, 1935-1945.
Roy, A. L., Meisterernst, M., Pognonec, P., and Roeder, R. G. (1991). Nature 354, 245-248.
Sawadogo, M., and Roeder, R. G. (1985). Cell 43, 165-175.
Starr, D. B., and Hawley, D. K. (1991). Cell 67, 1231-1240.
Timmers, H. T. M., and Sharp, P. A. (1991). Genes Dev. 5, 1946-1956.
White, R. J., Jackson, S. P., and Rigby, P. W. J. (1992). Proc. Natl. Acad. Sci. USA 89, 1949-1953.
Yang, C.-C., and Nash, H. A. (1989). Cell 57, 869-880.