Gene, 117 (1992) 91-97 © 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00


GENE 06538

Cloning and expression of the Acanthamoeba castellanii gene encoding transcription factor TFIID
(Recombinant DNA; gene cloning; DNA-binding protein; RNA polymerase II; keratitis)

Jie-Min Wong, Feng Liu and Erik Bateman
Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405-0068 USA
Received by J. Marmur: 6 January 1992; Revised/Accepted: 19 February/20 February 1992; Received at publishers: 1 April 1992

SUMMARY

We have cloned and characterized the cDNA encoding transcription factor TFIID from the eukaryote, Acanthamoeba castellanii. The gene occurs as a single species, encodes one mRNA and, presumably, a single protein. A. castellanii TFIID contains two recognizable domains, a nonconserved N-terminal domain and a highly conserved C-terminal domain. Similarities between the amino acid (aa) sequences of TFIID from several organisms are also found within the N-terminal 78 aa, suggesting a potential role in TFIID function. Full-length or truncated A. castellanii TFIID produced in Escherichia coli binds to a TATA box and is able to activate transcription in a TFIID-depleted HeLa cell extract, but the C-terminal 180-aa domain was found to be less efficient in these reactions.

INTRODUCTION

Transcription factor TFIID is essential for the formation of the RNA polymerase II initiation complex at most, if not all, promoters. It is thought to act by recognizing and binding to the asymmetric sequence 5'-TATAAA, but it can also bind other, somewhat related sequences (Van Dyke et al., 1988; 1989; Buratowski et al., 1989; Hahn et al., 1989; Sawadogo and Sentenac, 1990; Singer et al., 1990; Nakajima et al., 1988). Once bound, TFIID acts as a nucleation site for other general factors such as TFIIA, B, E and F, which together direct binding and permit transcription initiation by RNA polymerase II (Sawadogo and Sentenac, 1990; Saltzman and Weinmann, 1989). TFIID may be considered a multi-functional protein: it must bind its recognition sequence and it must interact with one or more proteins in a specific manner. In addition, either TFIID or another component of the complex must be responsive to the panoply of upstream regulators (Mitchell and Tjian, 1989; Maniatis et al., 1987) that collectively regulate gene expression and downstream events in the cell's life cycle.

A. castellanii is a small free-living amoeba found in soil and fresh water (Byers, 1986). It is the causative agent of Acanthamoeba keratitis, but has been generally useful as a model organism for studies of transcription and cellular differentiation (Paule, 1990). In order to examine directly the interactions between TFIID, DNA and other factors, we have cloned and characterized TFIID from A. castellanii. A. castellanii TFIID, or a C-terminal derivative, expressed in E. coli is active for TATAAA recognition and binding and for transcription activation. Features of the predicted aa sequence for A. castellanii TFIID are discussed.

Correspondence to: Dr. E. Bateman, Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, VT 05405-0068 USA. Tel. (802)656-8608; Fax (802)656-8749.
Abbreviations: A., Acanthamoeba; aa, amino acid(s); bp, base pair(s); cDNA, DNA complementary to RNA; D., Dictyostelium; ds, double strand(ed); IPTG, isopropyl-β-D-thiogalactopyranoside; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; PAGE, polyacrylamide-gel electrophoresis; PCR, polymerase chain reaction; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; SDS, sodium dodecyl sulfate; TFIID, transcription factor IID; TFIID, gene (DNA) encoding TFIID.

RESULTS AND DISCUSSION

(a) Amplification of Acanthamoeba castellanii TFIID genomic DNA
The predicted aa sequences of TFIID cloned from a variety of organisms, including A. castellanii, show that the C-terminal 180 aa are remarkably conserved, whereas the N-terminal portion is more variable with respect to length and aa composition. This feature allowed us to design four degenerate oligo primers (corresponding to portions of the conserved regions of yeast and Drosophila TFIID aa sequences), which were used in PCR of A. castellanii genomic DNA (Fig. 1A). All four pair-wise combinations produced a PCR product of between 400 and 460 bp. One of the products was subcloned and sequenced, which allowed us to identify a TFIID ORF interrupted by an intron of about 100 bp.
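To make the cost of degenerate primer design concrete, the following minimal Python sketch counts how many distinct nucleotide sequences could encode each of the four conserved peptides listed in the legend to Fig. 1 (ALHARNA, EYNPKRF, YEPELFP, IVLLIFV) under the standard genetic code. The function name `degeneracy` and the compact table-building shortcut are illustrative assumptions; the degenerate positions actually chosen by the authors are not recoverable from the text, so this is not a reconstruction of their primer design.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons available for each amino acid.
CODONS_PER_AA = {}
for codon, aa in CODON_TABLE.items():
    CODONS_PER_AA[aa] = CODONS_PER_AA.get(aa, 0) + 1


def degeneracy(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (hypothetical helper)."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


if __name__ == "__main__":
    for pep in ("ALHARNA", "EYNPKRF", "YEPELFP", "IVLLIFV"):
        print(pep, degeneracy(pep))
```

Restricting each position to codons ending in C or G, as the codon bias discussed in section (d) would suggest, reduces these counts considerably.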

(b) The Acanthamoeba castellanii TFIID gene
An A. castellanii cDNA library in λZAP (Short et al., 1988) was screened using a labeled probe derived from the subcloned PCR product described above. Of several positive clones, one was rescued as a ds clone in the vector pSK(-) (Short et al., 1988) and sequenced completely (Fig. 1B). The A. castellanii TFIID gene is contained on a 1050-bp cDNA clone which contains a 49-bp adenine tract

A
[Diagram, image not reproduced: scale of 1-250 aa, with tracks labeled Repeats, Basic aa, SQP and PCR primers I-IV.]

B
   1 CATCGAATTCAAGGGAGAAGGAGTCGATTCANACATACAACAAG ATG AGC GGG ATT ACT
   1                                              M   S   G   I   T
  60 CTA CCG AGC TTG ACG AAT GTC CTT CAG AGC GCG GGC ATG GCG GTG CAC GGC
   6 L   P   S   L   T   N   V   L   Q   S   A   G   M   A   V   H   G
 111 CAC CCG TCG GCA CCC GGG AGC ACC CAG CTA CCC CCA CTT CAT CAA TTG AAC
  23 H   P   S   A   P   G   S   T   Q   L   P   P   L   H   Q   L   N
 162 ATC TCG TCA CAG CCC TCC TCG CAG CCT CCT CAG CCT TCA CTG CAG TAC TCT
  40 I   S   S   Q   P   S   S   Q   P   P   Q   P   S   L   Q   Y   S
 213 GAG CCC GCG CAA TCG ACT GCT GCC AGC GAC GAT ATG GAC AGC GAC GTG GAT
  57 E   P   A   Q   S   T   A   A   S   D   D   M   D   S   D   V   D
 264 CGC ACC AAG CAC CCG TCG GGC ATT GTC CCT ACT CTG CAA AAC ATC GTC TCC
  74 R   T   K   H   P   S   G   I   V   P   T   L   Q   N   I   V   S
 315 ACG GTG AAT TTG GGG TGC AAG CTC GAC CTC AAG AAC ATC GCA CTG CAT GCA
  91 T   V   N   L   G   C   K   L   D   L   K   N   I   A   L   H   A
 366 CGT AAC GCC GAG TAC AAC CCG AAG CGT TTT GCT GCC GTC ATC ATG AGA ATT
 108 R   N   A   E   Y   N   P   K   R   F   A   A   V   I   M   R   I
 417 CGC GAG CCG AAG ACG ACC GCG CTC ATC TTT GCA TCG GGC AAG ATG GTG TGT
 125 R   E   P   K   T   T   A   L   I   F   A   S   G   K   M   V   C
 468 ACG GGC GCC AAG AGC GAA GAG GCA TCT CGT CTG GCT GCC AGG AAG TAC GCT
 142 T   G   A   K   S   E   E   A   S   R   L   A   A   R   K   Y   A
 519 CGC ATC ATC CAG AAG CTC GGA TTC GCC GCC AAG TTC CTC GAT TTC AAG ATT
 159 R   I   I   Q   K   L   G   F   A   A   K   F   L   D   F   K   I
 570 CAG AAC ATC GTC GGC TCG TGC GAT GTG CGA TTC CCC ATT CGT CTC GAG GGT
 176 Q   N   I   V   G   S   C   D   V   R   F   P   I   R   L   E   G
 621 CTC GCC TTT GCC CAC AAC CAC TAC TGC AGC TAC GAG CCA GAG CTG TTC CCG
 193 L   A   F   A   H   N   H   Y   C   S   Y   E   P   E   L   F   P
 672 GGT CTC ATC TAC CGC ATG GTG CAG CCC AAG ATC GTG CTG CTC ATC TTC GTG
 210 G   L   I   Y   R   M   V   Q   P   K   I   V   L   L   I   F   V
 723 TCG GGA AAG ATC GTG CTC ACG GGC GCC AAG GTG CGA GAG GAG ATC TAC GAG
 227 S   G   K   I   V   L   T   G   A   K   V   R   E   E   I   Y   E
 774 GCC TTC GAG AAC ATC TAC CCG GTG TTG ACC GAG TAC AAG AAG ACC TAA
 244 A   F   E   N   I   Y   P   V   L   T   E   Y   K   K   T   *
 822 GCCATCCACCTCACCTGTACAAACTACCAACCCCCCCAACCATCGCCGCCGTCACTTGG
 881 ATCTACTCCCTCCCTCCCGCCAACCAACTACTCAGCCCACCGCGGTCTGCCACCGGCGCG
 941 TGCACGAATGTACAATCACCAGGGGGCGTTGAGAACAAAAACATACACCGGTTTTTTTAA

Fig. 1. The A. castellanii TFIID gene and deduced protein sequence. (A) Diagram of the A. castellanii TFIID gene showing the positions of oligo primers used to amplify genomic DNA. The large open box represents the conserved C-terminal 180 aa. The small blackened box indicates the position of aa conserved between wheat, D. discoideum, A. thaliana and A. castellanii. SQP is a region containing only Ser, Gln and Pro. Basic aa are shown as asterisks (dots), and the aa direct repeats are noted with long arrows. PCR primers are depicted beneath the diagram by short arrows. (B) The nt sequence of A. castellanii TFIID cDNA and the deduced aa sequence. The sequence underlined is a putative polyadenylation signal. Asterisk indicates a stop codon. Oligo primers corresponding to conserved portions of the yeast (Horikoshi et al., 1989; Schmidt et al., 1989) and Drosophila (Muhich et al., 1990; Hoey et al., 1990) TFIID genes were synthesized and used for PCR employing A. castellanii genomic DNA. The oligos corresponded to the following aa and are shown with their sequences:
oligo I, ALHARNA: 5'-GC~ CT~ CAC GC~ CG~AAC GC~; G
oligo II, EYNPKRF: 5'-GAG TAC AAC CC~I AAG CG.~ITT.~"G
oligo III, YEPELFP: 3'-ATG CTC GGG CTC GAG AAT GGG C
oligo IV, IVLLIFV: 3'-TAG CAG GAG GAG TAG AAG CAG A
After PCR, the products were gel-purified, the ends were repaired with PolIk, and the products were ligated into the SmaI site of pSK(-) (Short et al., 1988). Transformation of E. coli JM109 and plasmid preparation were by standard procedures. An A. castellanii cDNA library in phage λZAP (Short et al., 1988) was prepared from poly(A)+ RNA after cDNA synthesis using standard procedures and those recommended by the supplier (Stratagene, La Jolla, CA). The library was amplified once. 107 plaques were screened after transfer to nitrocellulose using a 32P-labeled probe prepared by random-primed synthesis from the subcloned PCR TFIID fragment described above, and 14 positive clones were identified. Several of these were rescued as ds DNA in pSK(-) (Short et al., 1988), and one was sequenced completely. The plasmid containing the A. castellanii TFIID cDNA was dubbed pAcTFIID. The GenBank accession No. for the A. castellanii TFIID cDNA sequence is M87493.
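As a quick consistency check on Fig. 1B, the sketch below locates the first ATG in the first 110 nt of the cDNA and translates from there using the standard genetic code. The fragment is transcribed from the figure, with the single illegible base in the 5' untranslated region written as N; the helper name `translate_from_first_atg` is hypothetical, and the authoritative sequence is the GenBank entry M87493 rather than this excerpt.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in T, C, A, G order per position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# First 110 nt of the cDNA as read from Fig. 1B (assumption: N marks one
# base that is illegible in the source; it lies upstream of the start codon).
CDNA_1_110 = (
    "CATCGAATTCAAGGGAGAAGGAGTCGATTCANACATACAACAAG"         # nt 1-44
    "ATGAGCGGGATTACT"                                      # nt 45-59
    "CTACCGAGCTTGACGAATGTCCTTCAGAGCGCGGGCATGGCGGTGCACGGC"  # nt 60-110
)


def translate_from_first_atg(seq: str):
    """Return (1-based position of the first ATG, translated peptide)."""
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    peptide = []
    # Step through complete codons only; stop at the first stop codon.
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return start + 1, "".join(peptide)


if __name__ == "__main__":
    position, peptide = translate_from_first_atg(CDNA_1_110)
    print(position, peptide)
```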

at the 3' end and a putative polyadenylation signal (underlined in Fig. 1). The deduced aa sequence of A. castellanii TFIID beginning at the first Met codon and a schematic diagram of the gene are also shown in Fig. 1. The cDNA contains the entire TFIID coding region as determined by comparison to a genomic TFIID clone (E.B. and J.-M.W., unpublished).

(c) Acanthamoeba castellanii TFIID protein
Taking the first Met encoded by the cDNA sequence as the start codon, A. castellanii TFIID contains 258 aa and has a predicted Mr of 28 351. A. castellanii TFIID contains two easily recognizable domains based on comparisons to TFIID from other organisms: a 78-aa N-terminal domain and a 180-aa C-terminal domain. The A. castellanii TFIID N-terminal domain is rich in Pro (11/78), Ser (14/78) and Gln (8/78), with one stretch of 12 aa being composed of only these residues (aa 41-52). The N-terminal domain is also characterized by a lack of basic aa (Fig. 1), and by a 10-aa stretch between aa 1 and 10 that closely resembles aa 79-89 in the conserved domain. At the C-terminal end of the A. castellanii N-terminal domain, several aa are found to have identity with D. discoideum, wheat and Arabidopsis TFIID (Fig. 2A). The functional significance of these features remains to be elucidated, but it is tempting to suggest that the unusual sequences of the A. castellanii TFIID N-terminal domain, such as the run of Ser, Gln and Pro or the aa conserved between plants, A. castellanii and D. discoideum, will be involved in interactions with regulatory proteins or will be required for TFIID function. Interestingly, in this regard, the Pro-rich N-terminal region of A. castellanii TFIID is reminiscent of the transcription-activating domain from the human factor CTF (Santoro et al., 1988; Mermod et al., 1989).

A
                               79
                               *
A. castellanii  ASDDMDSDVDRTKHPSG IVPTLQ
Wheat           AVHEGAQPVDLAKHPSG IVPVLQ
A. thaliana 1   QGTEGSQPVDLTKHPSG IVPTLQ
A. thaliana 2   QGLEGSNPVDLSKHPSG IVPTLQ
D. discoideum   TTSTPAQNVDLSKHPSG IIPTLQ
Yeast           IKRATPESEKDTSATSG IVPTLQ

B
 86 QNIVSTVNLGCKLDLKNIALHARNAEYNPKRFAAVIMRIREPKTTALIFASGKMVCTGAKSEEASRLAAR 155
176 QNIVGSCDVRFPIRLEGLAFAHNHYCSYEPELFPGLIYRMVQPKIVLLIFVSGKIVLTGAKVREEIYEAFE 246

C
  1 MSGITLPSLTNVLQSAG 17
 79 SGIVPT LQNIV STV  92

Fig. 2. Amino acid sequence features of A. castellanii TFIID. (A) Comparison of aa sequences at the border of the N- and C-terminal domains from various organisms. Those aa in the N-terminal domain that are common to plants, A. castellanii and D. discoideum are underlined. The C-terminal domain is in bold. The asterisk marks Ser79 (see Fig. 1B). (B) Alignment of the direct repeats within the C-terminal 180 aa of A. castellanii TFIID. (C) Alignment of A. castellanii TFIID aa 1-17 with aa 79-92. Identical aa in B and C are marked with asterisks; numbers refer to the first and last aa in each line.

The C-terminal portion of A. castellanii TFIID is well conserved with regard to size and aa sequence when compared to the predicted aa sequences from other organisms (Gasch et al., 1990; Hoffmann et al., 1990b; Fikes et al., 1990; Horikoshi et al., 1989; Schmidt et al., 1989; Hoey et al., 1990; Peterson et al., 1990; Kao et al., 1990; Muhich et al., 1990). For example, the A. castellanii C-terminal domain has 85% identity with human TFIID and a correspondingly higher level of similarity to TFIID from more closely related organisms such as yeast or D. discoideum (not shown). Where changes in aa sequence do occur, many, but not all, are conservative. Several sequence features, such as the direct repeats separated by a basic region and a weak homology to E. coli sigma factor, have previously been described for the C-terminal region of TFIID from yeast or other organisms (Hoffmann et al., 1990a), and these are also found in A. castellanii TFIID. The direct repeats of the A. castellanii TFIID C-terminal region are diagrammed in Fig. 1, and an alignment is shown in Fig. 2B, which illustrates primarily that the match between the repeats is rather poor. There is an additional repeat in A. castellanii TFIID: the 10 aa at the N terminus have similarity to the beginning of the conserved C-terminal domain (Fig. 2C). The major direct repeats are separated by a region which contains several basic aa repeated at more or less regular intervals; however, basic aa are also well represented throughout the conserved C-terminal region (Fig. 1A).
Based on a helical wheel analysis, it was suggested that the basic residues between the direct repeats could be presented as one face of a helix (Horikoshi et al., 1989) and, in the case of human TFIID, this region may be necessary for interactions with the adenovirus E1A protein (Lee et al., 1991). More recently, on the basis of homology to the prokaryotic integration host factor protein, it has been proposed that portions of the conserved C-terminal region may form regions of β-sheet that participate in DNA binding (Nash and Granston, 1991). The conserved C-terminal portion contains the TFIID functions necessary to assist transcription initiation in vitro and in vivo (Horikoshi et al., 1990; Reddy and Hahn, 1991). Deletions to the C terminus of yeast TFIID abolished its ability to bind DNA and to activate transcription (Horikoshi et al., 1990). In those studies, the essential aa mapped between aa 63-240 (equivalent to A. castellanii TFIID aa 81-258). Examination of point mutations to TFIID from

yeast has shown that the direct repeats both contain aa that are important in DNA binding (Reddy and Hahn, 1991). The role of the TFIID N-terminal domain is not known. It has been suggested that in some species the TFIID N-terminal domain may be necessary for interactions with regulatory transcription factors (Peterson et al., 1990; Pugh and Tjian, 1990). In contrast, it is possible to completely remove the N-terminal domain of yeast TFIID and retain full basal TFIID activity in vitro (Horikoshi et al., 1990). Similarly, the N-terminal domain is not necessary for viability of yeast as shown in gene disruption experiments (Short et al., 1988), but in some strains a cluster of aa near the C-terminal domain may be essential for efficient TFIID function in vivo (Zhou et al., 1991). The positions of these aa correspond approximately to those conserved between A. castellanii, D. discoideum and plant TFIID noted above, but there is no similarity to the yeast TFIID sequence.
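The composition figures quoted for the N-terminal domain in section (c) can be recomputed from the deduced sequence. The following is a minimal sketch, assuming the first 78 aa as read from Fig. 1B; the variable names are illustrative only.

```python
import re
from collections import Counter

# N-terminal domain (aa 1-78) of A. castellanii TFIID, transcribed from
# Fig. 1B; this transcription is an assumption of the sketch.
N_TERM = (
    "MSGITLPSLTNVLQSAGMAVHG"  # aa 1-22
    "HPSAPGSTQLPPLHQLN"       # aa 23-39
    "ISSQPSSQPPQPSLQYS"       # aa 40-56
    "EPAQSTAASDDMDSDVD"       # aa 57-73
    "RTKHP"                   # aa 74-78
)

# Residue counts for the whole domain.
composition = Counter(N_TERM)

# Longest uninterrupted run made up only of Ser, Gln and Pro.
runs = re.finditer(r"[SQP]+", N_TERM)
longest = max(runs, key=lambda m: len(m.group()))
run_start, run_end = longest.start() + 1, longest.end()  # 1-based, inclusive

if __name__ == "__main__":
    for residue in "PSQ":
        print(residue, f"{composition[residue]}/{len(N_TERM)}")
    print("longest S/Q/P run:", longest.group(), f"aa {run_start}-{run_end}")
```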

(d) Codon bias within the Acanthamoeba castellanii TFIID gene
It had been previously reported that A. castellanii genes use codons which end in C or G where there is a choice (Nellen and Gallwitz, 1982; Hammer et al., 1987). Because of the rather limited number of genes sequenced, it was possible that the bias resulted from use within one type of gene rather than all genes from A. castellanii. Codon usage for the TFIID gene is shown in Table I. As for the A. castellanii actin and myosin II heavy-chain genes (Nellen and Gallwitz, 1982; Hammer et al., 1987), there is a marked preference for C or G in the third position of most codons. The bias is not quite as extreme as that reported for the myosin II or actin genes, with some exceptions. For example, in codons found in actin, myosin and TFIID-encoding genes, ACA is never used for Thr, ATA is never used for Ile, CTA is never used for Leu, GTA is never used for Val, and other examples can be seen in Table I. While these unfavored codons may eventually be found to be used in particular genes, the codon bias appears to be general in A. castellanii and could be exploited when using PCR methodology based on aa sequences.

(e) Analysis of TFIID mRNA and the genomic copy of the TFIID gene
Southern and Northern blots of A. castellanii genomic DNA and mRNA, respectively, were hybridized with a labeled TFIID gene-derived probe. These assays were used to determine the number of TFIID-related genes in genomic DNA, whether the TFIID gene is expressed in A. castellanii and the number of TFIID messenger RNA species. All but one of the genomic DNA digests produced a single band in the Southern analysis (Fig. 3A), indicating that there is only one form of the TFIID gene. Genomic DNA cut with PstI gives rise to two hybridizing species

TABLE I
Codon usage in the Acanthamoeba castellanii TFIID gene

aa     First and     Third nt
       second nt     A     C     G     T
Ala    GC            5    11     4     4
Asn    AA           --     8    --     2
Asp    GA           --     4    --     4
Cys    TG           --     3    --     1
Gln    CA            3    --     9    --
Glu    GA            1    --    12    --
Gly    GG            2     7     3     2
His    CA           --     5    --     2
Ile    AT            0    14    --     5
Lys    AA            0    --    16    --
Met    AT           --    --     6    --
Phe    TT           --     7    --     3
Pro    CC            2     6     7     4
Thr    AC            0     5     5     3
Trp    TG           --    --     0    --
Tyr    TA           --     9    --     0
Val    GT            0     5    11     0
Arg    CG            2     4     0     4
       AG            1    --     1    --
Ser    TC            2     2     8     2
       AG           --     8    --     0
Leu    TT            0    --     4    --
       CT            2    10     6     2
Stop   TA            1    --     0    --
       TG            0    --    --    --
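The third-position bias summarized in Table I can be tallied directly from the coding sequence. The sketch below is an illustration only: the ORF is transcribed from Fig. 1B (start codon through stop codon), and the names `ORF_CODONS`, `usage` and `third_position_counts` are hypothetical.

```python
from collections import Counter

# Coding region of the A. castellanii TFIID cDNA as read from Fig. 1B,
# one figure row per line (assumption: transcription of the figure).
ORF_CODONS = """
ATG AGC GGG ATT ACT
CTA CCG AGC TTG ACG AAT GTC CTT CAG AGC GCG GGC ATG GCG GTG CAC GGC
CAC CCG TCG GCA CCC GGG AGC ACC CAG CTA CCC CCA CTT CAT CAA TTG AAC
ATC TCG TCA CAG CCC TCC TCG CAG CCT CCT CAG CCT TCA CTG CAG TAC TCT
GAG CCC GCG CAA TCG ACT GCT GCC AGC GAC GAT ATG GAC AGC GAC GTG GAT
CGC ACC AAG CAC CCG TCG GGC ATT GTC CCT ACT CTG CAA AAC ATC GTC TCC
ACG GTG AAT TTG GGG TGC AAG CTC GAC CTC AAG AAC ATC GCA CTG CAT GCA
CGT AAC GCC GAG TAC AAC CCG AAG CGT TTT GCT GCC GTC ATC ATG AGA ATT
CGC GAG CCG AAG ACG ACC GCG CTC ATC TTT GCA TCG GGC AAG ATG GTG TGT
ACG GGC GCC AAG AGC GAA GAG GCA TCT CGT CTG GCT GCC AGG AAG TAC GCT
CGC ATC ATC CAG AAG CTC GGA TTC GCC GCC AAG TTC CTC GAT TTC AAG ATT
CAG AAC ATC GTC GGC TCG TGC GAT GTG CGA TTC CCC ATT CGT CTC GAG GGT
CTC GCC TTT GCC CAC AAC CAC TAC TGC AGC TAC GAG CCA GAG CTG TTC CCG
GGT CTC ATC TAC CGC ATG GTG CAG CCC AAG ATC GTG CTG CTC ATC TTC GTG
TCG GGA AAG ATC GTG CTC ACG GGC GCC AAG GTG CGA GAG GAG ATC TAC GAG
GCC TTC GAG AAC ATC TAC CCG GTG TTG ACC GAG TAC AAG AAG ACC TAA
""".split()

# Exclude the stop codon from the usage statistics.
sense_codons = ORF_CODONS[:-1]

# Per-codon usage and a tally of the base found in the third position.
usage = Counter(sense_codons)
third_position_counts = Counter(codon[2] for codon in sense_codons)
gc3 = third_position_counts["C"] + third_position_counts["G"]

if __name__ == "__main__":
    print("sense codons:", len(sense_codons))
    print("third-position counts:", dict(third_position_counts))
    print(f"codons ending in C or G: {gc3 / len(sense_codons):.1%}")
```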

(Fig. 3A, lane 4), due to a PstI site roughly in the middle of the PCR-derived probe, and in the genomic copy of the gene. As shown in Fig. 3B, only one band is detectable by Northern analysis, suggesting that only one form of mature mRNA encoding TFIID is expressed. We infer that the A. castellanii TFIID gene encodes one form of mRNA and is present within the genome as a unique gene. TFIID mRNA, and presumably TFIID itself, is somewhat rare, in that it can only be detected using poly(A)+ mRNA, and not total RNA, in Northern analysis. TFIID is similarly found as a single copy or single type of gene in all organisms examined, with the exception of Arabidopsis thaliana, which contains two TFIID genes, both of which are expressed (Gasch et al., 1990).

(f) Expression of Acanthamoeba castellanii TFIID in Escherichia coli
In order to demonstrate that the A. castellanii TFIID gene encodes a protein having the activities expected of TFIID, we produced intact TFIID or its C-terminal 180 aa (C-180) in E. coli. The C-180 fragment was analyzed in order to determine which aa of TFIID form an active core. Induction of appropriately transformed E. coli BL21(DE3) with IPTG resulted in modest expression of

[Fig. 3, panels A and B: blot images not reproduced.]
