Gene, 111 (1992) 165-173 © 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00


GENE062~

Structure of a gene encoding heat-shock protein HSP70 from the unicellular alga Chlamydomonas reinhardtii


(cDNA; codon usage; exon intron boundaries; introns; G-box motif; promoter; recombinant DNA; transcription start point)

Frank W. Müller, Gabor L. Igloi and Christoph F. Beck

Institut für Biologie III, Albert-Ludwigs-Universität, D-7800 Freiburg (F.R.G.)
Received by J.K.C. Knowles: 16 August 1991
Revised/Accepted: 30 October/1 November 1991
Received at publishers: 5 December 1991

SUMMARY
The structure of a gene encoding a 70-kDa heat-shock protein (HSP70) from the unicellular alga, Chlamydomonas reinhardtii, is described. This gene shows a remarkable expression pattern, because it is inducible by light as well as by elevated temperature [von Gromoff et al., Mol. Cell. Biol. 9 (1989) 3911-3918]. As a first step in the investigation of trans-acting factors involved in environmentally controlled expression of this hsp70 gene, the nucleotide sequence of the entire gene, including its 5'- and 3'-flanking regions, was determined. Although the deduced amino acid sequence exhibits a high degree of conservation with the HSP70 from higher plants, the C. reinhardtii gene has a unique structure among the members of the hsp70 gene family. While most hsp70 genes have only one or no intron, the coding region of the C. reinhardtii gene is interrupted by six introns. Besides putative TATA and CCAAT boxes, two heat-shock elements (HSE) were found in the promoter region, and a third HSE motif was located within the fourth intron. A computer search for regulatory cis-acting elements revealed a notable similarity of a 5'-upstream sequence motif to the G-box motif conserved in higher plants. A polyadenylation recognition sequence canonical for nuclear genes of C. reinhardtii is located downstream from the coding sequence.

INTRODUCTION When cells or organisms are exposed to elevated temperatures or other forms of environmental stress they respond with a transient alteration of their gene expression

Correspondence to: Dr. C.F. Beck, Institut für Biologie III, Schänzlestrasse 1, D-7800 Freiburg (F.R.G.) Tel. 49-761-2032713; Fax 49-761-2032745.
Abbreviations: aa, amino acid(s); C., Chlamydomonas; CIAP, calf intestinal alkaline phosphatase; cabII, gene encoding chlorophyll-a/b-binding protein; cDNA, complementary DNA; ds, double strand(ed); HSE, heat-shock element; HSP, heat-shock protein(s); hsp, gene encoding HSP; HSTF, heat-shock transcription factor; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); Pipes, 1,4-piperazinediethanesulfonic acid; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; r-, restriction-negative; ss, single strand(ed); tsp, transcription start point(s); u, unit(s).

patterns. This results in the induction of a small set of proteins called HSP. In contrast to most mRNAs, the translation of hsp transcripts increases upon temperature shift. Extended studies have revealed that the heat-shock response is a universal phenomenon in all organisms investigated so far (for a review see Lindquist and Craig, 1988). Heat-shock proteins can be grouped into five families based upon their molecular masses: 17-30 kDa, 60 kDa (chaperonins), 70 kDa, 83 kDa, and greater than 100 kDa. Among the different classes of HSP, the HSP70 group is the most conserved in evolution, with homologous genes present in bacteria, fungi, plants, and animals (Lindquist and Craig, 1988; Neumann et al., 1989). The hsp70 gene family consists of genes which are strictly stress-inducible and heat-shock cognate genes (hsc) that are constitutively expressed.

and pCB410) containing small inserts of 126 and 127 bp, respectively, were isolated. Interestingly, the two inserts display a heterogeneity at the 3' end of the hsp70 mRNA, as the sequence adjacent to the poly(A) tail in pCB409 is 3 bp longer than that in pCB410. To find out whether the cDNA was co-linear to genomic sequences at the 3' end of the gene, an additional genomic clone (pCB411) was generated and sequenced.

Recent work from several groups suggests that certain members of the HSP70 family interact with nascent polypeptides (Beckmann et al., 1990), maintaining them in a conformation suitable for transport through organelle membranes (Chirico et al., 1988; Deshaies et al., 1988). Another proposed function of HSP70 is the capability of retrieving incorrectly folded proteins in an ATP-dependent manner (Lewis and Pelham, 1985). Here we report the nt sequence of an hsp70 gene from the unicellular green alga C. reinhardtii. The cloning of this gene is part of our investigation of signal chains involved in environmentally controlled gene expression in this organism.

(b) Structure of the Chlamydomonas reinhardtii hsp70 gene
The complete primary sequence of the hsp70 gene from C. reinhardtii, including 759 bp of 5'- and 207 bp of 3'-flanking regions, was determined by dideoxy sequencing (Sanger et al., 1977; Chen and Seeburg, 1985) (Fig. 2). The presence of introns in the hsp70 gene was suggested by the absence of an intact reading frame. Comparison of the deduced aa sequence with that of other HSP70 led to the assumption that the Chlamydomonas hsp70 gene is interrupted by six introns. The introns were located in the coding region on the basis of the conserved nt at the exon-intron boundaries in other C. reinhardtii genes. Intron 6 was also confirmed by aligning the nt sequences of the genomic clone pCB353 and the cDNA clone pCB369. The existence of six introns in an hsp70 gene is a novel finding, and also interesting from an evolutionary perspective. Although a sea urchin hsp70 gene with five introns has been reported (La Rosa et al., 1989), hsp70 genes in plants contain either one intron at a conserved position (maize: Rochester et al., 1986; petunia: Winter et al., 1988) or no intron (soybean: Roberts and Key, 1991). None of the reported introns match the six introns in the Chlamydomonas hsp70

RESULTS AND DISCUSSION
(a) Isolation of genomic and cDNA clones covering the entire Chlamydomonas reinhardtii hsp70 gene
The construction of genomic clone pCB353 has been described previously (von Gromoff et al., 1989). The 3.8-kb SalI fragment in pCB353 contains the entire protein-coding sequence and 0.76 kb of 5'-flanking sequences. The downstream SalI site (Fig. 1) comprises the last two codons of the hsp70 gene. The 3.8-kb SalI fragment was used to screen a cDNA library prepared from poly(A)+ RNA isolated from heat-treated Chlamydomonas cells. One clone (pCB369) carrying a 1.7-kb insert was obtained. Although first-strand cDNA synthesis was primed with oligo(dT), this clone lacked a poly(A) tail. To obtain cDNA clones with polyadenylated 3' ends the cDNA library was rescreened with the insert of pCB369. Two clones (pCB409

Fig. 1. Structure of the C. reinhardtii hsp70 gene. The top of the diagram shows the inserts in plasmid clones used for sequence determination. In the lower panel the complete gene structure combined from the nt sequences of the five clones is presented. Stippled boxes represent introns, blackened boxes denote protein-coding regions, and open boxes mark untranslated sequences. Restriction sites for PstI (Ps), PvuII (Pv), SalI (Sa), SnaBI (Sn), and SstI (Ss) are also noted. For the construction of a cDNA library 5 µg of poly(A)+ RNA isolated from heat-treated C. reinhardtii cells (strain 137c, mating type +) according to von Gromoff et al. (1989) were reverse transcribed with a cDNA synthesis kit (Pharmacia). After addition of EcoRI adaptors ds cDNA was ligated into EcoRI-digested/CIAP-treated pUC18 and used to transform competent E. coli JM83 r-. Recombinant bacterial colonies were replica-plated at a density of 2000 colonies/cm2 (Hanahan and Meselson, 1983) onto nitrocellulose filters (Hybond C, Amersham) and screened with the 32P-labelled (Feinberg and Vogelstein, 1983) 3.8-kb SalI fragment from pCB353. The cDNA clone pCB369 carries an insert of 1654 bp. Additional small cDNA clones pCB409 and pCB410 were identified by rescreening the library with the insert of pCB369 as a probe. To obtain a genomic clone spanning the 3' end of the gene, a 1.0-kb SalI-SstI fragment from the λ-clone hsp70-2 (von Gromoff et al., 1989) was cloned into pUC18 to generate pCB411.

[Fig. 2. Sequence listing: 5'-flanking region and first part of the hsp70 coding region, with the deduced aa sequence below the coding strand.]

gene in position and size. Nevertheless, the occurrence of multiple introns appears to be a common feature of C. reinhardtii genes. The cabII-1 gene, for example, is interrupted by three introns while its counterparts in higher plants have none or only one intron (Imbault et al., 1988). Another structural feature of the hsp70 gene is its 804-nt long 3'-untranslated sequence, comprising almost 30% of the mRNA (2839 nt). Long 3'-untranslated sequences have

[Fig. 2, continued. Sequence listing: remainder of the hsp70 coding region with the deduced aa sequence, followed by the 3'-untranslated and 3'-flanking regions.]

Fig. 2. Nucleotide sequence of the hsp70 gene including its 5'- and 3'-flanking regions. The aa sequence deduced from the nt sequence is given below the coding strand (aligned with the second nt of each codon). Positive nt numbering begins at the tsp (Fig. 3); negative numbers are used for upstream sequences. Noted in the figure are a TATA box (+++), a G+C cluster adjacent to the TATA box (double underlined), a CCAAT-like sequence (×××), three HSE (the inverted orientation of the 5-bp units is marked by >>>, <<<), a 12-bp direct repeat sequence partly overlapping with HSE (underlined), a G-box-like motif (OOO), a TGTAA polyadenylation motif canonical for C. reinhardtii (overlined), and the 3' end of a cDNA (pCB409; T). First and last nt of each intron are marked below the sequence by upward arrowheads. Methods: For sequencing both strands pCB409 and pCB410 were used directly, while from pCB353, pCB369, and pCB411 nested sets of deletion subclones were generated by exonuclease III digestion according to Henikoff (1987), except that mung-bean nuclease was used instead of S1 nuclease. The nt sequence was determined on ds plasmid templates by the dideoxy chain-termination method (Sanger et al., 1977; Chen and Seeburg, 1985) using a modified T7 DNA polymerase (Tabor and Richardson, 1987). Sequence data were generated using the semi-automated nonradioactive sequencing system developed and manufactured by EMBL (Heidelberg, F.R.G.) (Ansorge et al., 1986; Voss et al., 1989). These sequence data will appear in the EMBL/GenBank/DDBJ nt sequence databases under accession No. M76725.

Fig. 3. Mapping the tsp. (Top) Scheme for generating an internally radiolabeled probe for nuclease protection assay. A 1.9-kb SalI-PstI fragment (nt -759 to +1125, see Fig. 1) from pCB353 was cloned into M13mp18 and ssDNA prepared. The probe was synthesized by extending a synthetic oligodeoxyribonucleotide (5'-CCGATAGCGGGGGCCTCCTTGCC, complementary to codons 2 to 9, Fig. 2), annealed to the template, using PolIk in the presence of 50 µCi [α-35S]dATP (Amersham), and then cleaved with SstI. The radiolabeled probe was separated from the template by boiling and electrophoresis through a 1.8% agarose gel. The 316-nt fragment was purified with glass milk/NaI according to Vogelstein and Gillespie (1979). Total RNA from exponentially growing C. reinhardtii cells subjected to a 40°C heat-shock for 30 min was isolated as described by von Gromoff et al. (1989). RNA (25 µg) was mixed with the probe (200000 cpm) and co-precipitated with ethanol. The resulting pellet was resuspended in 20 µl of 80% formamide/400 mM NaCl/40 mM Pipes pH 6.4/1 mM EDTA, heated to 85°C for 5 min and then hybridized for 18 h at 52°C. The annealing reaction was diluted with 180 µl of mung-bean nuclease mix (16.5 mM Na-acetate pH 5.0/55 mM NaCl/0.11 mM ZnCl2/1.1 mM cysteine/5.5% glycerol/200 or 400 u mung-bean nuclease) and incubated at 30°C

also been reported for α- and β-tubulin genes in Chlamydomonas (Silflow et al., 1985; Youngblom et al., 1984). The putative polyadenylation recognition motif of the hsp70 gene (at +3823) is identical to the canonical 'TGTAA' sequence (Silflow et al., 1985) found at the 3' ends of C. reinhardtii genes 10 to 15 nt upstream from the polyadenylation site. The reading frame starts with ATG within the sequence 'AAACATGGG', a context which is optimal for translational initiation in eukaryotes (Kozak, 1984). In contrast to the long 3'-untranslated sequence, the 5'-untranslated sequence is much shorter. The tsp as determined by a nuclease protection assay maps 89 nt upstream from the start codon (Fig. 3). A TATA box was found another 23 bp upstream from the putative tsp, a distance typical for eukaryotic genes (Breathnach and Chambon, 1981). The TATA box is flanked by a G+C-rich cluster noted also in other C. reinhardtii genes (tubulin: Brunke et al., 1984; rbcS: Goldschmidt-Clermont and Rahire, 1986; cabII-1: Imbault et al., 1988). In addition to the TATA box a sequence resembling a CCAAT motif (at -216) was located. The characteristic feature of heat-shock promoters is the presence of several HSE. In the C. reinhardtii hsp70 promoter two HSE (HSE1 at -96, HSE2 at -358) are present (Fig. 2). The first HSE shows a perfect match with the palindromic canonical motif 5'-CNNGAANNTTCNNG (Bienz and Pelham, 1986), while the second HSE contains one mismatch. Both HSE also comprise a 15-bp structure of three 5-bp units (NGAAN) inverted relative to each other. This arrangement is consistent with the finding that HSTF from Drosophila and yeast bind to DNA as a trimer (Perisic et al., 1989; Sorger and Nelson, 1989). Both HSE are flanked by 12-bp direct repeats which partly extend into the actual HSE sequences. Interestingly, a sequence resembling a third HSE was found in the fourth intron (Fig. 2). By computer analysis a sequence displaying a remarkable similarity to a G-box motif was located between nt -569 and -578. G-box motifs are binding sites for nuclear factors, and have been found in various plant genes, but also in yeast and mammalian genes (for a review see Weising and Kahl, 1991; Oeda et al., 1991).
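The comparison of a candidate HSE with the canonical motif 5'-CNNGAANNTTCNNG can be illustrated with a small scan that counts mismatches at the non-N positions. This is a minimal sketch and not the program used in the study; the function names and the demo sequence are hypothetical and are not taken from the hsp70 promoter.

```python
# Canonical HSE motif quoted in the text (Bienz and Pelham, 1986); N matches any base.
HSE_CONSENSUS = "CNNGAANNTTCNNG"


def mismatches(window, consensus=HSE_CONSENSUS):
    """Count positions where `window` differs from the consensus, ignoring N positions."""
    return sum(
        1
        for base, ref in zip(window.upper(), consensus)
        if ref != "N" and base != ref
    )


def find_hse(seq, max_mismatches=1):
    """Return (0-based start, window, mismatch count) for every window that
    matches the consensus with at most `max_mismatches` mismatches."""
    seq = seq.upper()
    width = len(HSE_CONSENSUS)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        n = mismatches(window)
        if n <= max_mismatches:
            hits.append((start, window, n))
    return hits


# Hypothetical demo sequence: a perfect 14-nt match flanked by two bases on each side.
demo = "ATCTAGAAGCTTCTAGAT"
perfect_hits = find_hse(demo, max_mismatches=0)
print(perfect_hits)
```

With `max_mismatches=0` only perfect matches such as the one described for HSE1 are reported; allowing one mismatch corresponds to the situation described for HSE2.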

for 1 h. The protected fragment was extracted with phenol and precipitated with ethanol. The size of the protected fragment was determined on a 6% polyacrylamide/8 M urea gel with a DNA sequencing ladder derived from the same template/primer combination as size marker. (Bottom) Autoradiogram of a nuclease protection assay. Radiolabeled probe was mixed with 25 µg of yeast tRNA and digested with 200 u mung-bean nuclease (lane 1). C. reinhardtii total RNA (25 µg) hybridized to the probe was incubated with 200 u (lane 2) or 400 u (lane 3) of mung-bean nuclease. In lanes 4 to 7 the dideoxy sequencing ladder (GATC) of the strand complementary to the hsp70 mRNA is given. The position of the protected fragment is indicated by an arrow. Note that only the area of the autoradiogram showing the protected fragment but not the unprotected probe is displayed.
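The statement in the Fig. 3 legend that the primer is complementary to codons 2 to 9 can be checked by reverse-complementing the oligonucleotide and translating the result. The sketch below assumes the standard genetic code; the helper names are hypothetical, and the primer sequence is the one quoted in the legend.

```python
# Primer quoted in the Fig. 3 legend (5' to 3').
PRIMER = "CCGATAGCGGGGGCCTCCTTGCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Standard genetic code, built in the conventional TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


coding_strand = reverse_complement(PRIMER)
peptide = translate(coding_strand)
print(coding_strand)  # GGCAAGGAGGCCCCCGCTATCGG
print(peptide)        # GKEAPAI
```

The 23-nt primer covers seven complete codons plus the first two nt of the next one, which agrees with the span of codons 2 to 9 given in the legend.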

TABLE I
Codon usage in the Chlamydomonas reinhardtii hsp70 gene

Each entry gives the codon, the number of occurrences, and (in parentheses) the % of the codons for that amino acid.

Amino acid  Total  Codons
Ala         62     GCT 10 (16)  GCC 44 (71)  GCA 0 (0)  GCG 8 (13)
Arg         32     CGT 3 (9)  CGC 28 (88)  CGA 0 (0)  CGG 0 (0)  AGA 1 (3)  AGG 0 (0)
Asn         29     AAT 1 (3)  AAC 28 (97)
Asp         44     GAT 5 (11)  GAC 39 (89)
Cys         6      TGT 0 (0)  TGC 6 (100)
Gln         24     CAA 0 (0)  CAG 24 (100)
Glu         54     GAA 0 (0)  GAG 54 (100)
Gly         54     GGT 10 (18)  GGC 43 (79)  GGA 0 (0)  GGG 1 (2)
His         8      CAT 1 (12)  CAC 7 (88)
Ile         38     ATT 12 (32)  ATC 26 (68)  ATA 0 (0)
Leu         54     TTA 0 (0)  TTG 0 (0)  CTT 1 (2)  CTC 5 (9)  CTA 0 (0)  CTG 48 (89)
Lys         47     AAA 0 (0)  AAG 47 (100)
Met         16     ATG 16
Phe         21     TTT 1 (5)  TTC 20 (95)
Pro         25     CCT 2 (8)  CCC 18 (72)  CCA 0 (0)  CCG 5 (20)
Ser         32     TCT 2 (6)  TCC 12 (38)  TCA 0 (0)  TCG 13 (40)  AGT 0 (0)  AGC 5 (16)
Thr         42     ACT 4 (9)  ACC 31 (74)  ACA 0 (0)  ACG 7 (17)
Trp         3      TGG 3
Tyr         13     TAT 0 (0)  TAC 13 (100)
Val         45     GTT 1 (2)  GTC 11 (25)  GTA 0 (0)  GTG 33 (73)
Stop        1      TAA 1  TAG 0  TGA 0

(c) The hsp70 gene product
The nt-sequence-deduced C. reinhardtii HSP70 is a 70.8-kDa protein of 649 aa. Alignment with the HSP70 from maize (monocotyledonous plant), petunia (dicotyledonous plant), human (vertebrate), Drosophila (invertebrate), and yeast (fungi) revealed a close relationship between HSP70 from algae and higher plants (Fig. 4). Homology is 77% in either case as calculated with the CLUSTAL program, while the overall homology to the other three HSP70 is lower than 65%. Although HSP show a high degree of conservation in the N-terminal portion, their C-terminal portions diverge significantly. The final tetrapeptide 'EEVD', however, is found in all HSP70, whereas an extended motif 'GAGPKIEEVD' is present in all plant HSP70 (Rochester et al., 1986; Winter et al., 1988; Roberts and Key, 1991) including the Chlamydomonas gene product.

One of the characteristics of HSP70 is the ability to bind and hydrolyze ATP. Indeed, a stretch of aa (C. reinhardtii HSP70: LGGGTFDVS--8 aa--FEVKATA) similar to the ATP-binding site of protein kinases is present in all HSP70 (Neumann et al., 1990). Another interesting feature of HSP70 is the existence of a calmodulin-binding domain (Stevenson and Calderwood, 1990) (Fig. 4).

(d) Codon usage in the hsp70 gene
As reported for other C. reinhardtii genes (α-tubulin: Silflow et al., 1985; β-tubulin: Youngblom et al., 1984; rbcS: Goldschmidt-Clermont and Rahire, 1986) the codon usage in the hsp70 gene is strongly biased (Table I). In general, C or G residues at the third position of codons are preferred (% NNC/G = 91.7), and, where possible, also in the first position. Exceptions are the codons GCT for alanine and GGT for glycine which occur more often than the

Fig. 4. Comparison of aa sequences of C. reinhardtii (C.r.), maize (Z.m.; Rochester et al., 1986), petunia (P.h.; Winter et al., 1988), human (H.s.; Hunt and Morimoto, 1985), Drosophila (D.m.; Karch et al., 1981), and yeast (S.c.; Lindquist, 1986) hsp70 genes. Indicated are homologies (shaded boxes), gaps (dashes), and ambiguous aa (question marks). Also noted are the putative ATP-binding site (single overlined) and the calmodulin-binding domain (double overlined). Proteins were aligned with the CLUSTAL program (window size 10; filtering level 2.5; gap penalty 4) (Higgins and Sharp, 1988).

codons GCG and GGG, respectively. The biased codon usage reflects the high G + C content (62.5%) of the DNA sequenced. In monocotyledonous plants genes with a high % of NNC/G codons are often strongly inducible by internal

and external stimuli (Campbell and Gowri, 1990). De Hostos et al. (1989) suggested that codon usage in C. reinhardtii is biased for genes which encode abundant proteins like tubulins and the small subunit of ribulose bisphosphate carboxylase. In less frequent transcripts like those for cytochrome c-552 (Merchant and Bogorad, 1987) and arylsulfatase (de Hostos et al., 1989) codon usage is considerably more balanced.

(e) Conclusions
(1) This paper describes the structural analysis of a C. reinhardtii hsp70 gene, from genomic and cDNA clones, comprising the entire transcription unit (3842 nt), plus 759 nt of 5'- and 207 nt of 3'-flanking sequences. The gene, whose coding region is interrupted by six introns, none of which is similar to any intron in other known hsp70 genes, exhibits an unusual organization.
(2) In the 5'-flanking region a TATA box, a putative CCAAT element, and two HSE were localized. In the fourth intron a third HSE motif was found. A G-box-like motif found in the regulatory upstream regions of other plant genes is also present in the promoter region of the hsp70 gene.
(3) The deduced polypeptide is 649 aa long and displays a marked similarity to HSP70 from monocotyledonous and dicotyledonous plants.
(4) Like other genes in C. reinhardtii induced by environmental stimuli, the hsp70 gene displays a remarkable bias for codons with C or G residues at the first and third position.
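The codon bias summarized in Table I can be recomputed from the per-codon counts. The sketch below transcribes the sense-codon counts of Table I (the stop codon is left out, which is an assumption about how the published figure was derived) and computes the share of codons ending in C or G; the variable and function names are hypothetical.

```python
# Sense-codon counts transcribed from Table I (stop codon excluded by assumption).
CODON_COUNTS = {
    "Ala": {"GCT": 10, "GCC": 44, "GCA": 0, "GCG": 8},
    "Arg": {"CGT": 3, "CGC": 28, "CGA": 0, "CGG": 0, "AGA": 1, "AGG": 0},
    "Asn": {"AAT": 1, "AAC": 28},
    "Asp": {"GAT": 5, "GAC": 39},
    "Cys": {"TGT": 0, "TGC": 6},
    "Gln": {"CAA": 0, "CAG": 24},
    "Glu": {"GAA": 0, "GAG": 54},
    "Gly": {"GGT": 10, "GGC": 43, "GGA": 0, "GGG": 1},
    "His": {"CAT": 1, "CAC": 7},
    "Ile": {"ATT": 12, "ATC": 26, "ATA": 0},
    "Leu": {"TTA": 0, "TTG": 0, "CTT": 1, "CTC": 5, "CTA": 0, "CTG": 48},
    "Lys": {"AAA": 0, "AAG": 47},
    "Met": {"ATG": 16},
    "Phe": {"TTT": 1, "TTC": 20},
    "Pro": {"CCT": 2, "CCC": 18, "CCA": 0, "CCG": 5},
    "Ser": {"TCT": 2, "TCC": 12, "TCA": 0, "TCG": 13, "AGT": 0, "AGC": 5},
    "Thr": {"ACT": 4, "ACC": 31, "ACA": 0, "ACG": 7},
    "Trp": {"TGG": 3},
    "Tyr": {"TAT": 0, "TAC": 13},
    "Val": {"GTT": 1, "GTC": 11, "GTA": 0, "GTG": 33},
}


def third_position_gc_percent(counts):
    """Percentage of codons whose third nt is C or G (% NNC/G)."""
    total = 0
    gc_ending = 0
    for codons in counts.values():
        for codon, n in codons.items():
            total += n
            if codon[2] in "CG":
                gc_ending += n
    return 100.0 * gc_ending / total


total_codons = sum(sum(codons.values()) for codons in CODON_COUNTS.values())
nnc_g = third_position_gc_percent(CODON_COUNTS)
print(total_codons)     # 649
print(round(nnc_g, 1))  # 91.7
```

The total of 649 sense codons matches the length of the deduced polypeptide, and the computed value reproduces the % NNC/G = 91.7 quoted in section (d). The counts also show the two exceptions noted there: GCT is used more often than GCG, and GGT more often than GGG.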

ACKNOWLEDGEMENTS
This work was supported by a grant from the Deutsche Forschungsgemeinschaft (Be 903/4-1) and by a fellowship from the Konrad-Adenauer-Stiftung to F.W.M. We thank Hans Kössel and Tom Quayle for helpful comments on the manuscript.

REFERENCES Ansorge, W., Sproat, B.S., Stegemann, J. and Schwager, C.: A nonradioactive automated method for DNA sequence determination. J. Biochem. Biophys. Methods 13 (1986) 315-323. Beckmann, R.P., Mizzen, LA. and Welch, WJ.: Interaction of Hsp70 with newly synthesizedproteins: implicationsfor protein folding and assembly. Science 248 (1990) 850-854. Bienz, M. and Pelhanl, H.R.B.: Heat-shock regulatoryelements function as an inducible enhancer in the Xenopus hspTOgene and when linked to a heterologouspromoter. Cell 45 (1986) 753-760. Breathnaeh, R. and Chambon, P.: Organization and expression of enkaryotie split genes coding for proteins. Annu. Rev. Bivchem. 50 (1981) 349-383. Brunke, K.J., Anthony, J.G., Sternberg, E.J. and Weeks, D." Repeated consensus sequence and pseudopromoters in the four coordinately regulated tuhulin genes of Chlamydomonasreinhardtii.Moi. Cell. Biol. 4 (1984) 1115-1124. Campbell, W.H. and Gowri, G.: Codon usage in higher plants, green algae, and cyanobacteria. Plant Physiol. 92 (1990) 1-11.

Chen, E.J. and Seeburg, P.H.: Supercoil sequencing: a fast and simple method for sequencing plasmid DNA. DNA 4 (1985) 165-170.
Chirico, W.J., Waters, M.G. and Blobel, G.: 70K heat-shock related proteins stimulate protein translocation into microsomes. Nature 332 (1988) 805-810.
de Hostos, E.L., Schilling, J. and Grossman, A.R.: Structure and expression of the gene encoding the periplasmatic arylsulfatase of Chlamydomonas reinhardtii. Mol. Gen. Genet. 218 (1989) 229-239.
Deshaies, R.J., Koch, B.D., Werner-Washburne, M., Craig, E.A. and Schekman, R.: A subfamily of stress proteins facilitates translocation of secretory and mitochondrial precursor polypeptides. Nature 332 (1988) 800-805.
Feinberg, A.P. and Vogelstein, B.: A technique for radiolabelling DNA restriction endonuclease fragments to high activity. Anal. Biochem. 132 (1983) 2-13.
Goldschmidt-Clermont, M. and Rahire, M.: Sequence, evolution and differential expression of the two genes encoding variant small subunits of ribulose bisphosphate carboxylase/oxygenase in Chlamydomonas reinhardtii. J. Mol. Biol. 191 (1986) 421-432.
Hanahan, D. and Meselson, M.: Plasmid screening at high colony density. Methods Enzymol. 100 (1983) 333-342.
Henikoff, S.: Unidirectional digestion with exonuclease III in DNA sequence analysis. Methods Enzymol. 155 (1987) 156-165.
Higgins, D.G. and Sharp, P.M.: CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73 (1988) 237-244.
Hunt, C. and Morimoto, R.: Conserved features of eukaryotic hsp70 genes revealed by comparison with the nucleotide sequence of human hsp70. Proc. Natl. Acad. Sci. USA 82 (1985) 6455-6459.
Imbault, P., Wittemer, C., Johanningmeier, U., Jacobs, J.D. and Howell, S.H.: Structure of the Chlamydomonas reinhardtii cabII-1 gene encoding a chlorophyll-a/b-binding protein. Gene 73 (1988) 397-407.
Karch, F., Török, I. and Tissieres, A.: Extensive regions of homology in front of the two hsp70 heat-shock variant genes in Drosophila melanogaster. J. Mol. Biol. 148 (1981) 219-230.
Kozak, M.: Compilation and analysis of sequences upstream from the translational start site in eukaryotic mRNAs. Nucleic Acids Res. 12 (1984) 857-872.
La Rosa, M., Sconzo, G., Giudice, G., Roccheri, M.C. and Di Carlo, M.: Sequence of a sea urchin hsp70 gene and its 5'-flanking region. Gene 96 (1990) 295-300.
Lewis, M.J. and Pelham, H.R.B.: Involvement of ATP in the nuclear and nucleolar functions of the 70 kD heat-shock proteins. EMBO J. 4 (1985) 3137-3143.
Lindquist, S.: The heat-shock response. Annu. Rev. Biochem. 55 (1986) 1151-1191.
Merchant, S. and Bogorad, L.: The Cu(II)-repressible plastidic cytochrome c. J. Biol. Chem. 262 (1987) 9062-9067.
Neumann, D., Nover, L., Parthier, B., Rieger, R., Scharf, K.-D., Wollgiehn, R. and zur Nieden, U.: Heat-shock and other stress response systems of plants. Biol. Zentralbl. 108 (1989) 1-156.
Oeda, K., Salinas, J. and Chua, N.-H.: A tobacco bZip transcription activator (TAF-1) binds to a G-box-like motif conserved in plant genes. EMBO J. 10 (1991) 1793-1802.
Pelham, H.: Speculation on the functions of the major heat-shock and glucose-regulated proteins. Cell 46 (1986) 959-961.
Perisic, O., Xiao, H. and Lis, J.T.: Stable binding of Drosophila heat-shock factor to head-to-head and tail-to-tail repeats of a conserved 5-bp recognition unit. Cell 59 (1989) 797-806.
Roberts, J.K. and Key, J.L.: Isolation and characterization of a soybean hsp70 gene. Plant Mol. Biol. 16 (1991) 671-683.
Rochester, D.E., Winter, J.A. and Shah, D.M.: The structure and expression of maize genes encoding the major heat-shock protein, hsp70. EMBO J. 5 (1986) 451-458.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Silflow, C.D., Chisholm, R.L., Conner, T.W. and Ranum, L.P.: The two alpha-tubulin genes of Chlamydomonas reinhardtii code for slightly different proteins. Mol. Cell. Biol. 5 (1985) 2389-2398.
Sorger, P.K. and Nelson, H.C.M.: Trimerization of a yeast transcriptional activator via a coiled-coil motif. Cell 59 (1989) 807-813.
Tabor, S. and Richardson, C.C.: DNA sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA 84 (1987) 4767-4771.
Vogelstein, B. and Gillespie, D.: Preparative and analytical purification of DNA from agarose. Proc. Natl. Acad. Sci. USA 78 (1979) 615-619.
von Gromoff, E.D., Treier, U. and Beck, C.F.: Three light-inducible heat-shock genes of Chlamydomonas reinhardtii. Mol. Cell. Biol. 9 (1989) 3911-3918.

Voss, H., Schwaiger, C., Kristensen, T., Duthie, S., Olsson, A., Erfle, H., Stegemann, J., Zimmerman, J. and Ansorge, W.: One step reaction protocol for automated DNA sequencing with T7 DNA polymerase results in uniform labelling. Methods Mol. Cell. Biol. 1 (1989) 155-159.
Weising, K. and Kahl, G.: Towards an understanding of plant gene regulation: the action of nuclear factors. Z. Naturforsch. 46c (1991) 1-11.
Winter, J., Wright, R., Duck, N., Gasser, C., Fraley, R. and Shah, D.: The inhibition of petunia hsp70 mRNA processing during CdCl2 stress. Mol. Gen. Genet. 211 (1988) 315-319.
Youngblom, J., Schloss, J.A. and Silflow, C.D.: The two β-tubulin genes of Chlamydomonas reinhardtii code for identical proteins. Mol. Cell. Biol. 4 (1984) 2686-2696.
