Cell, Vol. 67, 1037-1039, December 20, 1991, Copyright © 1991 by Cell Press

Letter to the Editor

Similarity Between the DNA-Binding Domains of IHF Protein and TFIID Protein

IHF Protein

Integration Host Factor (IHF) is a small basic heterodimeric E. coli protein that participates in many processes involving DNA (Friedman, 1988). IHF is homologous to HU, a DNA-binding protein that is ubiquitous in the bacterial kingdom (Drlica and Rouviere-Yaniv, 1987). The molecular structure of one member of the HU/IHF family, the HU protein of Bacillus stearothermophilus, has been determined by X-ray crystallography (Tanaka et al., 1984). Each subunit of the HU dimer has a two-stranded β-sheet that emerges from the body of the protein. These β-strands are arrayed like a pair of arms that could wrap around the DNA helix. Recent proposals for the nature of the interaction between the arms and DNA use information from IHF-DNA complexes. Unlike HU, IHF recognizes specific DNA sites (Goodrich et al., 1990). Protection/interference studies of this interaction imply that IHF primarily contacts the minor groove of DNA (Yang and Nash, 1989). Existing models (reviewed by Phillips, 1991) do not specify how residues in the arms of IHF recognize specific DNA sequences via the minor groove, or how the corresponding residues in the arms of HU lead to nonspecific DNA binding. Nevertheless, genetic studies (Goshima et al., 1990; Sayre and Geiduschek, 1990; Lee et al., 1992) on several members of the HU/IHF family support the notion that DNA binding involves the arm region (residues 53-84; Figure 1).

TFIID Protein

Transcription factor IID (TFIID or TATA-binding factor) is an important element of the transcriptional apparatus of eukaryotes. This protein binds to the TATA box, a conserved sequence found upstream of typical RNA polymerase II-dependent transcription starts; this binding appears to be an early and important step in activating transcription. Genes encoding TFIID have been isolated from several species (reviewed by Greenblatt, 1991). A C-terminal segment of 180 amino acids, which contains the DNA-binding region of TFIID, is highly conserved between species. Within this conserved region are two segments of 60-66 amino acids that are quite similar to each other (Figure 1); Reddy and Hahn (1991) have identified mutations in these repeated regions of yeast TFIID that depress DNA binding.

Alignment of IHFα and TFIID

We have noted a similarity between a portion of the arm region of the α subunit of IHF (IHFα) and a portion of the repeated segment of yeast TFIID. The aligned segments (Figure 2) include and abut residues that are, in each protein, implicated in binding to DNA by mutagenesis studies.

Figure 1. Features of TFIID and IHF Proteins
Numbers denote amino acid positions; asterisks indicate residues at which mutations alter DNA-binding ability (Reddy and Hahn, 1991; Lee et al., 1992; A. G. and H. N., unpublished). Structural features of IHFα, as predicted from the structure of HU from B. stearothermophilus (Tanaka et al., 1984), are indicated: α, alpha helix; β, beta structure. Double-headed arrows represent similarities between TFIID and the helix-loop-helix protein c-Myc (Gasch et al., 1990). Striped arrows represent the tandem repeats in TFIID.

When assessed by the ALIGN program (Dayhoff et al., 1983), the quality of the alignment for these 18 residues is more than six standard deviations above that found for randomized sequences of identical length and composition. Many of the residues in the aligned segments are conserved in other members of each family (Figure 2). It should be noted, however, that the N-terminal repeat of TFIID matches IHFα much better than does the C-terminal repeat of TFIID. Moreover, IHFα matches the TFIID repeats somewhat better than does IHFβ. Thus, although both IHF and TFIID have dual DNA-binding motifs, the two elements in each protein might play different roles in binding to DNA. Moreover, none of the HU subunits match TFIID as well as IHFα does, suggesting that some of the features shown in the alignment may be involved in determining sequence-specific DNA binding.

To further assess the significance of this imperfect alignment, we used a search motif consisting of residues that are identical or similar in both subunits of IHF and the N-terminal repeat of TFIID (RXP[R/K][T/S]XXX[I/V]XXX[G/A][R/K]XV) to search a protein database with over 80,000 entries. The only matching sequences were subunits of IHF from diverse bacteria and TFIID from diverse eukaryotes. A less stringent search motif, which contained residues common to both N- and C-terminal repeats of TFIID as well as both subunits of IHF, detects only four additional proteins; the possibility that one or more of these has a functional relationship to IHF or TFIID is open. In any case, the limited number of matches to the search motif demonstrates the high information content of our alignment.

Structural Implications

The segment of IHF that aligns with TFIID is believed to be part of a β-sheet. We therefore propose that the aligned portion of TFIID also adopts this conformation. Based on secondary structure predictions, Stucka and Feldmann (1990) have also suggested that residues in this region of TFIID adopt a β-strand conformation. IHF appears to place its β-strands in the minor groove; if our alignment indicates similar mechanisms for DNA binding, one would expect TFIID to contact DNA via the minor groove, and this prediction is supported by recent footprinting and chemical substitution studies (Starr and Hawley, 1991; Lee et al., 1991). The quality of the alignment and the minor groove interaction lead us to favor the resemblance between IHF and TFIID over an alternate suggestion (Gasch et al., 1990) of a resemblance between a segment of helix-loop-helix proteins and the repeated region of TFIID.

The consensus sequence (5′-TATAAA) (Wobbe and Struhl, 1990) for binding of TFIID is similar to part (5′-WATCAA, W = A/T) of the typical IHF site. This portion of the IHF consensus sequence is proposed to be contacted by one of the arms of the protein (the remainder of the consensus is proposed to be in contact with the body of the protein) (Yang and Nash, 1989). Genetic experiments indicate that residues in the arm of IHF contact this region and also provide evidence that this contact is with the α subunit (Lee et al., 1992). These observations suggest that IHF and TFIID might even use similar protein-DNA contacts to recognize their specific targets. Interestingly, although both proteins have a well-defined target consensus sequence, both are known to also bind tightly to DNA sequences that match this consensus poorly (Hahn et al., 1989; Kur et al., 1989; Wobbe and Struhl, 1990). Perhaps the limited opportunity for hydrogen bond formation via the minor groove accounts for this feature.

It must be emphasized, however, that WATCAA is only part of the IHF consensus sequence (Goodrich et al., 1990) and that the protein makes tight contacts with a large stretch of DNA (Yang and Nash, 1989). In contrast, TFIID has a short consensus sequence and a compact footprint (Hahn et al., 1989; Starr and Hawley, 1991; Lee et al., 1991). This supports the idea that regions outside the aligned segment are responsible for the additional contacts made by IHF and further suggests that the principal DNA-binding surface of TFIID is contained within the aligned region.
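
The motif search described earlier in this letter can be sketched as a regular-expression scan over a set of protein sequences. The sketch below is only an illustration and not the authors' procedure: the function names and the three toy sequences are hypothetical, the database is a small in-memory dictionary rather than the collection of over 80,000 entries that was searched, and the ninth motif position is read here as [I/V], a reading that should be checked against the printed motif.

```python
import re

# Motif in the letter's notation: X = any residue, [A/B] = either residue.
# Reading position 9 as [I/V] is an assumption to verify against the source.
MOTIF = "RXP[R/K][T/S]XXX[I/V]XXX[G/A][R/K]XV"


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate the bracket/X notation into a compiled regular expression."""
    body = motif.upper().replace(" ", "").replace("/", "").replace("X", ".")
    # A lookahead with a capturing group reports overlapping matches too.
    return re.compile(f"(?=({body}))")


def scan(database: dict, pattern: re.Pattern) -> list:
    """Return (entry name, 1-based start position, matched segment) per hit."""
    hits = []
    for name, sequence in database.items():
        for match in pattern.finditer(sequence.upper()):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Hypothetical toy entries, invented to exercise the pattern; they are not
# real IHF or TFIID sequences.
toy_database = {
    "toy_hit_1": "MKLRAPKSGGGIAAAGKLVDE",
    "toy_hit_2": "RQPRTAAAVAAAARWVK",
    "toy_miss": "MKLRAPKSGGGLAAAGKLVDE",  # L at motif position 9 fails [I/V]
}

hits = scan(toy_database, motif_to_regex(MOTIF))
for name, start, segment in hits:
    print(name, start, segment)
```

Relaxing the motif, as in the less stringent search mentioned in the text, amounts to widening the bracketed sets or replacing fixed residues with X before compiling the pattern.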

Figure 2. Alignment of TFIID and IHF
Sequences from E. coli IHFα are aligned with sequences from the N-terminal repeat of S. cerevisiae TFIID. Vertical lines and colons, respectively, indicate identical residues and conservative substitutions. Asterisks denote positions at which amino acid substitutions alter DNA binding (Reddy and Hahn, 1991; Lee et al., 1992; A. G. and H. N., unpublished). (The TFIID mutations indicated correspond to the rightmost three in each repeat; Figure 1.) Sequences from these regions of additional members of the TFIID family and the HU/IHF family are also listed. C-terminal repeats of the TFIID proteins are shown at the top with the positions of additional mutations that affect DNA binding of the S. cerevisiae protein. TFIID sequences are from references in Greenblatt (1991) except TFIID from D. discoideum (GenBank accession number M64861). IHF and HU sequences are from Drlica and Rouviere-Yaniv (1987) and references therein, except HU from B. subtilis (GenBank accession number X52418).
Column headings in the figure: Protein, Species, Residues, Amino Acid Sequence.
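
The significance estimate quoted for this alignment, a score more than six standard deviations above that of randomized sequences of identical length and composition, rests on a shuffling test. A minimal sketch of that idea follows, under several assumptions that are not from the letter: the scoring scheme (2 for identity, 1 for a conservative substitution, 0 otherwise), the conservative groups, the ungapped comparison, and the two 18-residue toy sequences are all hypothetical. The ALIGN program itself used a mutation data matrix and is not reproduced here. The `match_line` helper mimics the vertical-line and colon notation of the figure.

```python
import random
import statistics

# Hypothetical conservative-substitution groups, chosen for illustration only.
CONSERVATIVE_GROUPS = ("KR", "DE", "ST", "AG", "ILVM", "FWY", "NQ")


def pair_score(a: str, b: str) -> int:
    """Score one aligned pair: 2 = identical, 1 = conservative, 0 = other."""
    if a == b:
        return 2
    if any(a in group and b in group for group in CONSERVATIVE_GROUPS):
        return 1
    return 0


def alignment_score(seq1, seq2) -> int:
    """Sum pair scores over an ungapped alignment of equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


def match_line(seq1: str, seq2: str) -> str:
    """Mark identities with '|' and conservative substitutions with ':'."""
    symbols = {2: "|", 1: ":", 0: " "}
    return "".join(symbols[pair_score(a, b)] for a, b in zip(seq1, seq2))


def shuffle_z_score(seq1: str, seq2: str, n_shuffles: int = 1000, seed: int = 0) -> float:
    """Standard deviations by which the observed score exceeds shuffled controls.

    Shuffling seq2 preserves its length and composition.
    """
    rng = random.Random(seed)
    observed = alignment_score(seq1, seq2)
    residues = list(seq2)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        null_scores.append(alignment_score(seq1, residues))
    spread = statistics.stdev(null_scores)
    if spread == 0:
        raise ValueError("shuffled scores have zero variance")
    return (observed - statistics.mean(null_scores)) / spread


# Hypothetical 18-residue toy segments, not the sequences shown in the figure.
toy_a = "RAPKSGGGIAAAGKLVDE"
toy_b = "RQPRTGAAVSAAAKMVEE"

print(toy_a)
print(match_line(toy_a, toy_b))
print(toy_b)
print(round(shuffle_z_score(toy_a, toy_b), 2))
```

Fixing the seed makes the estimate reproducible; a larger `n_shuffles` reduces the sampling noise in the mean and standard deviation of the shuffled scores.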

Howard A. Nash and Andrew E. Granston
Laboratory of Molecular Biology
National Institute of Mental Health
Bethesda, Maryland 20892

References

Dayhoff, M. O., Barker, W. C., and Hunt, L. T. (1983). Meth. Enzymol. 91, 524-545.
Drlica, K., and Rouviere-Yaniv, J. (1987). Microbiol. Rev. 51, 301-319.
Friedman, D. I. (1988). Cell 55, 545-554.
Gasch, A., Hoffmann, A., Horikoshi, M., Roeder, R. G., and Chua, N. H. (1990). Nature 346, 390-394.
Goodrich, J. A., Schwartz, M. L., and McClure, W. R. (1990). Nucl. Acids Res. 18, 4993-5000.
Goshima, N., Kohno, K., Imamoto, F., and Kano, Y. (1990). Gene 96, 141-145.
Greenblatt, J. (1991). Cell 66, 1067-1070.
Hahn, S., Buratowski, S., Sharp, P. A., and Guarente, L. (1989). Proc. Natl. Acad. Sci. USA 86, 5718-5722.
Kur, J., Hasan, N., and Szybalski, W. (1989). Gene 81, 1-15.
Lee, D. K., Horikoshi, M., and Roeder, R. G. (1991). Cell, this issue.
Lee, E. C., Hales, L. M., Gumport, R. I., and Gardner, J. F. (1992). EMBO J., in press.
Phillips, S. E. V. (1991). Curr. Op. Struct. Biol. 1, 89-98.
Reddy, P., and Hahn, S. (1991). Cell 65, 349-357.
Sayre, M. H., and Geiduschek, E. P. (1990). J. Mol. Biol. 216, 819-833.
Starr, D. B., and Hawley, D. K. (1991). Cell, this issue.
Stucka, R., and Feldmann, H. (1990). FEBS Lett. 261, 223-225.
Tanaka, I., Appelt, K., Dijk, J., White, S. W., and Wilson, K. S. (1984). Nature 310, 376-381.
Wobbe, C. R., and Struhl, K. (1990). Mol. Cell. Biol. 10, 3859-3867.
Yang, C. C., and Nash, H. A. (1989). Cell 57, 869-880.
