Cell, Vol. 66, 11-12, July 12, 1991, Copyright © 1991 by Cell Press
Letter to the Editor
The TEA Domain: A Novel, Highly Conserved DNA-Binding Motif
Many of the nuclear regulatory proteins cloned to date can be grouped into distinct classes based on conservation of their primary sequence. A number of the motifs thus identified have been shown to be DNA-binding domains. Yet there are still many nuclear proteins that are not obviously homologous to any other known proteins. Using the BLAST program (Altschul et al., 1990) to search the GenBank and PIR databanks with the sequence of the SV40 enhancer factor TEF-1 (Xiao et al., 1991), I detected sequence similarity to the yeast trans-acting factor TEC1 (Laloux et al., 1990), required for Ty1 enhancer activity, and to the Aspergillus abaA regulatory gene product (Mirabito et al., 1989), which is necessary for spore differentiation. These genes define a previously unidentified DNA-binding motif that I term the TEA domain.
The Homology
The most striking sequence conservation between TEF-1, TEC1, and ABAA is a block of 66-68 amino acids located toward the amino terminus in all three proteins (Figure 1). TEF-1 is 44% (30/68) identical to TEC1, TEF-1 is 65% (44/68) identical to ABAA, and TEC1 is 44% (29/66) identical to ABAA in this region. The best identity (71%) exists in the last 53-55 amino acids between ABAA and TEF-1. This similarity is better than the sequence conservation of the homeodomain between Drosophila and yeast (~37% identity; Bürglin, 1988). TEF-1 and TEC1 also share some weaker similarity just downstream of this main conserved block. Xiao et al. (1991) provide evidence that the DNA-binding domain of TEF-1 is within amino acids 26 to 96. Thus the conserved sequence motif (amino acids 30-97 in TEF-1) can be inferred to be a DNA-binding domain, for which I propose the name TEA domain (TE for TEF-1 and TEC1, A for ABAA). An additional small block of weak sequence similarity about 100 to 130 residues downstream from the main motif was also detected (lower panel, Figure 1). About 25 residues further downstream from this second block, all three proteins could potentially form an amphipathic helix (TEF-1, position ~284, 15 amino acids; TEC1, position ~345, ~25 amino acids; ABAA, position ~348, ~16 amino acids; see also Mirabito et al., 1989). In TEF-1 this region is located in a part of the protein that is able to produce a selective squelching phenomenon (Xiao et al., 1991), but it can be removed from TEC1 apparently without interfering with its transcriptional activity (Laloux et al., 1990).
Structural Predictions
The availability of sequences from three highly divergent organisms allows a tentative identification of the residues necessary for the structural integrity of the TEA domain (Figure 1). Since many DNA-binding domains use an α helix to contact DNA (Pabo and Sauer, 1984; Otting et al., 1990; Pavletich and Pabo, 1991), the sequences were examined for the presence of helix-breaking residues and absolutely conserved hydrophobic residues that might contribute to an amphipathic helix (Figure 1). This pattern of conservation, together with Chou and Fasman (1978) secondary-structure prediction and amino acid frequencies in α helices (Richardson and Richardson, 1988), allows prediction of three helices (Figure 1). The boundary of the N-terminus of helix 3 is unclear. All three predicted helices have hydrophobic patches that may contribute to the folding of the domain. Helix 1 is separated from helix 2 by about 16-18 residues; this stretch may form a random coil, since it contains multiple helix-breaking residues and the gap in TEC1 and ABAA. Helix 2 and helix 3 are close together, but do not display the characteristic amino acids of a helix-turn-helix motif
[Figure 1. Alignment of TEF-1 (from position 28), TEC1 (from position 124), and ABAA (from position 133).]
The TEA domain is marked with a bar. Identical and similar amino acids between the sequences are boxed, and identical amino acids between all three sequences are in bold face. A gap of two amino acids was introduced in TEC1 and ABAA, at the same position relative to TEF-1, to maintain optimal alignment. TEF-1 and TEC1 share some less conserved regions immediately downstream of the TEA domain. Helix-breaking residues (G, P) are indicated as "b." Residues present in all three sequences are indicated: +, basic; -, acidic; o, hydrophobic. Predicted helices are shown on top. Amino acid positions are indicated at the beginning of the line. The lower panel shows a small patch of sequence similarity approximately 100-130 amino acids downstream of the TEA domain. Similar amino acids are: Y/F, I/L/V, K/R, D/E.
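The identity percentages quoted in the text, and the similarity classes listed in this legend, amount to simple column counting over a fixed alignment. The following is a minimal sketch of that bookkeeping, not the procedure actually used for the letter. The function names and the toy sequences are hypothetical, and counting gap columns in the denominator is an assumption, chosen because the letter reports identities over 68 or 66 positions.

```python
# Similarity classes named in the legend; K/R is the lysine/arginine pair.
SIMILAR_GROUPS = [set("YF"), set("ILV"), set("KR"), set("DE")]
GAP = "-"


def classify_column(a: str, b: str) -> str:
    """Return 'identical', 'similar', 'different', or 'gap' for one aligned pair."""
    if a == GAP or b == GAP:
        return "gap"
    if a == b:
        return "identical"
    if any(a in group and b in group for group in SIMILAR_GROUPS):
        return "similar"
    return "different"


def alignment_stats(seq_a: str, seq_b: str) -> dict:
    """Count column classes for two pre-aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {"identical": 0, "similar": 0, "different": 0, "gap": 0}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        counts[classify_column(a, b)] += 1
    return counts


def percent(numerator: int, denominator: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * numerator / denominator)


# Hypothetical toy fragments, NOT the real TEF-1, TEC1, or ABAA sequences.
toy_a = "MKDIWSPEL"
toy_b = "MRDLW--EF"
stats = alignment_stats(toy_a, toy_b)

# Assumption: gap columns count toward the denominator (full alignment length).
toy_identity = percent(stats["identical"], len(toy_a))

# The TEF-1/ABAA figure quoted in the text: 44 identical positions out of 68.
tef1_abaa_identity = percent(44, 68)
```

With real sequences, the two aligned strings would be taken directly from the rows of Figure 1, with the two-residue gap written as `--`.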
Thomas R. Bürglin
Department of Genetics
Harvard Medical School
and Department of Molecular Biology
Wellman 8
Massachusetts General Hospital
Boston, Massachusetts 02114
Figure 2. [...] Sites of TEF-1 [...] to Ty1 Sequences
The GT-IIC, Sph-I, and Sph-II binding sites of TEF-1 are boxed. Sequences of the Ty1 enhancer similar to GT-IIC are aligned underneath. The bold face nucleotide in site 2, when mutated, reduces Ty1 enhancer activity (Errede et al., 1987). Similarities to the a1/α2 control site and the MATa diploid control sites (binding sites of the yeast homeodomain proteins MATa1 and MATα2) are underlined and overlap the regions of similarity to the GT-IIC site (Errede et al., 1987; Company and Errede, 1988). The Sph-I and Sph-II sites of TEF-1 in the SV40 enhancer overlap the octamer binding site of the POU-homeodomain protein Oct-1 (underlined).
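One way to look for sequences similar to a known binding site, as in the comparison of the GT-IIC site with the Ty1 enhancer, is a sliding-window scan that tolerates a few mismatches on either strand. The sketch below illustrates that idea only; the letter does not say how the similarities were found. The motif and target sequence are made-up placeholders, not the GT-IIC site or the Ty1 enhancer, and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_similar_sites(sequence: str, motif: str, max_mismatches: int = 1) -> list:
    """Return (start, strand, window, mismatch_count) for every window that
    matches the motif or its reverse complement within the mismatch limit.
    Start positions are 0-based on the given (forward) strand."""
    sequence, motif = sequence.upper(), motif.upper()
    width = len(motif)
    patterns = (("+", motif), ("-", reverse_complement(motif)))
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        for strand, pattern in patterns:
            count = mismatches(window, pattern)
            if count <= max_mismatches:
                hits.append((start, strand, window, count))
    return hits


# Placeholder data for illustration only (NOT GT-IIC, NOT Ty1).
toy_motif = "GATTACA"
toy_target = "CCGATTACAGGTGTAATCTTGATTGCA"

exact_hits = find_similar_sites(toy_target, toy_motif, max_mismatches=0)
loose_hits = find_similar_sites(toy_target, toy_motif, max_mismatches=1)
```

A mismatch threshold is a crude stand-in for judging similarity by eye; whether a candidate site is actually bound by TEC1 is an experimental question.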
(Pabo and Sauer, 1984). Since the region of protein-DNA interaction is very well conserved in homeodomains and zinc-finger motifs, one might speculate that helix 2 and/or helix 3 interacts with DNA.
Binding Sites
TEF-1 can bind to two distinct sequences in the SV40 enhancer (Davidson et al., 1988; Figure 2). Sequences similar to the GT-IIC site exist in many retroviral enhancers (called "core") and can contribute strongly to viral enhancer activity (Speck et al., 1990 and references therein). Possibly TEA domain proteins bind to these sites. Yeast Ty elements are retrotransposons that share functional similarities with retroviruses. Sequences similar to the GT-IIC site are present in the Ty1 enhancer in regions that have been shown to be important for enhancer function (Errede et al., 1987; Company and Errede, 1988 and references therein). It remains to be shown whether TEC1 binds to any of these regions, and whether binding sites have been conserved in evolution.
Function
Both TEF-1 and TEC1 were identified in studies of enhancer control of transcription. In contrast, abaA was isolated as a gene that controls expression of several sets of genes, including another regulatory gene, in the pathway of asexual spore (conidium) differentiation (Mirabito et al., 1989; Marshall and Timberlake, 1991). TEC1 is not required for sporulation or mating, nor is it necessary for cellular viability (Laloux et al., 1990). TEF-1 shows tissue-specific expression, being present in HeLa and F9 cells but absent from lymphoid tissue (Xiao et al., 1991). This apparent diversity of function suggests that TEA domain proteins have numerous regulatory roles in development. TEA domain proteins will certainly be found throughout the animal kingdom, although they may not be as numerous as homeodomain and zinc-finger proteins.
References
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). A basic local alignment search tool. J. Mol. Biol. 215, 403-410.
Bürglin, T. R. (1988). The yeast homeo box. Cell 53, 339-340.
Chou, P. Y., and Fasman, G. D. (1978). Empirical predictions of protein conformation. Ann. Rev. Biochem. 47, 251-276.
Company, M., and Errede, B. (1988). A Ty1 cell-type-specific regulatory sequence is a recognition element for a constitutive binding factor. Mol. Cell. Biol. 8, 5299-5309.
Davidson, I., Xiao, J. H., Rosales, R., Staub, A., and Chambon, P. (1988). The HeLa cell protein TEF-1 binds specifically and cooperatively to two SV40 enhancer motifs of unrelated sequence. Cell 54, 931-942.
Errede, B., Company, M., and Hutchison, C. A., III (1987). Ty1 sequence with enhancer and mating-type-dependent regulatory activities. Mol. Cell. Biol. 7, 258-265.
Laloux, I., Dubois, E., Dewerchin, M., and Jacobs, E. (1990). TEC1, a gene involved in the activation of Ty1 and Ty1-mediated gene expression in Saccharomyces cerevisiae: cloning and molecular analysis. Mol. Cell. Biol. 10, 3541-3550.
Marshall, M. A., and Timberlake, W. E. (1991). Aspergillus nidulans wetA activates spore-specific gene expression. Mol. Cell. Biol. 11, 55-62.
Mirabito, P. M., Adams, T. H., and Timberlake, W. E. (1989). Interactions of three sequentially expressed genes control temporal and spatial specificity in Aspergillus development. Cell 57, 859-868.
Otting, G., Qian, Y. Q., Billeter, M., Müller, M., Affolter, M., Gehring, W. J., and Wüthrich, K. (1990). Protein-DNA contacts in the structure of a homeodomain-DNA complex determined by nuclear magnetic resonance spectroscopy in solution. EMBO J. 9, 3085-3092.
Pabo, C. O., and Sauer, R. T. (1984). Protein-DNA recognition. Ann. Rev. Biochem. 53, 293-321.
Pavletich, N. P., and Pabo, C. O. (1991). Zinc finger-DNA recognition: crystal structure of a Zif268-DNA complex at 2.1 Å. Science 252, 809-817.
Richardson, J. S., and Richardson, D. C. (1988). Amino acid preferences for specific locations at the ends of α helices. Science 240, 1648-1652.
Speck, N. A., Renjifo, B., Golemis, E., Fredrickson, T. N., Hartley, J. W., and Hopkins, N. (1990). Mutation of the core or adjacent LVb elements of the Moloney murine leukemia virus enhancer alters disease specificity. Genes Dev. 4, 233-242.
Xiao, J. H., Davidson, I., Matthes, H., Garnier, J.-M., and Chambon, P. (1991). Cloning, expression, and transcriptional properties of the human enhancer factor TEF-1. Cell 65, 551-568.