Plant Molecular Biology 7: 385-392 (1986) © Martinus Nijhoff Publishers, Dordrecht - Printed in the Netherlands


Structures of tobacco chloroplast genes for tRNA(Ile) (CAU), tRNA(Leu) (CAA), tRNA(Cys) (GCA), tRNA(Ser) (UGA) and tRNA(Thr) (GGU): a compilation of tRNA genes from tobacco chloroplasts*

Tatsuya Wakasugi, Masaru Ohme, Kazuo Shinozaki & Masahiro Sugiura
Center for Gene Research, Nagoya University, Chikusa, Nagoya 464, Japan

Keywords: chloroplast, codon usage, DNA sequence, tRNA gene, tobacco

Summary

The locations and nucleotide sequences of the tobacco chloroplast genes for tRNA(Ile) (CAU), tRNA(Leu) (CAA), tRNA(Cys) (GCA), tRNA(Ser) (UGA) and tRNA(Thr) (GGU) (trnI-CAU, trnL-CAA, trnC-GCA, trnS-UGA and trnT-GGU, respectively) have been determined. trnI and trnL are located in the inverted repeat region. trnC, trnS and trnT are present in the large single-copy region. These five tRNA genes, together with the 25 different tRNA genes previously published, have been compiled and compared. These 30 tRNA genes, corresponding to 20 amino acids, are most likely all of the tRNA genes encoded in the tobacco chloroplast genome.

Introduction

Chloroplasts contain their own protein-synthesizing apparatus. All tRNA species involved in chloroplast protein synthesis are believed to be encoded in the chloroplast genome. The structures of some chloroplast tRNAs and their genes have been determined in various plants and algae (22, 23). To analyze the organization and fine structures of tRNA genes from tobacco chloroplasts, we have sequenced the chloroplast DNA fragments that hybridized with total chloroplast tRNAs. Twenty-five tRNA genes from tobacco chloroplasts have so far been reported (9, 13, 25, 26, 30). Here we show the locations and structures of trnI-CAU, trnL-CAA, trnC-GCA, trnS-UGA and trnT-GGU. We also present a compilation of the tRNA genes of tobacco chloroplast DNA and a summary of their structural features.

Materials and methods

The cloned BamHI fragments of tobacco (Nicotiana tabacum var. Bright Yellow 4) chloroplast DNA were those described previously (27). DNA sequences were determined by the chemical method (12) and the dideoxy chain termination method (19) using the M13mp10/11 and M13mp18/19 phages and E. coli JM109 (31). Nucleotide sequences were analyzed using the GENETYX program (Software Development Co. Ltd., Tokyo).

Results and discussion

Nucleotide sequences of the regions hybridizing with total tobacco chloroplast tRNAs in the BamHI fragments Ba7, Ba8, Ba6a, Ba6b, Ba9a, Ba9b and Ba10b were determined. The DNA sequencing revealed the presence of five new tRNA genes (see Figs. 1 and 2). The fact that the tRNA gene sequences hybridize to total chloroplast tRNAs indicates that these genes are functional. Cloverleaf structures of these tRNAs predicted from the DNA sequences are shown in Fig. 3.

Fig. 1. Location of the 37 tRNA genes on the physical map of tobacco chloroplast DNA. S and Ba indicate SalI and BamHI fragments, respectively, and bold lines show the inverted repeat (26). Asterisks show tRNA genes containing introns. Triangles indicate tRNA genes described in this paper.

Address for offprints: Masahiro Sugiura, Center for Gene Research, Nagoya University, Chikusa, Nagoya 464, Japan.
*This paper is dedicated to Professor Morio Ikehara on the occasion of his retirement from Osaka University in March 1986.

trnI-CAU

The trnI-CAU gene (in Ba7 and Ba8) is located 166 bp upstream from rpl23 on the same strand in each segment of the inverted repeat (28). Both copies of trnI-CAU have been sequenced and found to be identical. Although this gene has a methionine anticodon (CAU), we concluded that it is trnI on the basis of its homology with the spinach trnI-CAU (8). The tobacco trnI-CAU is 74 bp long and shows 100%, 100%, 53%, 57% and 43% sequence homology with spinach and N. debneyi trnIs-CAU, tobacco trnI-GAU, tobacco trnM-CAU and tobacco trnfM-CAU, respectively (8, 26, 32). Tobacco chloroplast tRNA(Ile) (CAU) predicted from the gene probably recognizes the AUA codon and contains an extra non-base-paired nucleotide (C42') in the anticodon stem (or a mismatched base pair, C27-C43; see Fig. 3), as has been reported for the spinach trnI-CAU (8). There is an A30-U40 base pair in the anticodon stem, whereas all the other tRNA genes found in the tobacco chloroplast genome contain a G-C or C-G base pair at the corresponding positions (see Table 1). The flanking regions of the tobacco and spinach trnIs-CAU are nearly identical (Fig. 2a). Sequences similar to the prokaryotic promoter are found in the 5' flanking regions and short palindromic sequences in the 3' flanking regions, suggesting that trnI-CAU is transcribed monocistronically. There are two trnIs in tobacco chloroplast DNA. In addition to trnI-CAU, trnI-GAU is also located in the inverted repeat. This is the only case in which two isoaccepting species are located within the inverted repeat.

Fig. 2. Nucleotide sequences of the regions containing trnI-CAU (a), trnL-CAA (b), trnC-GCA (c), trnS-UGA (d) and trnT-GGU (e). Tobacco sequences (T) are compared with those of spinach (S), maize (M), broad bean (B), wheat (W) and pea (P). RNA-like strands are presented. Deletions are indicated by dashes. Homologous nucleotides are shown by asterisks. Coding regions are boxed. Arrows indicate palindromic sequences. -35 and -10 indicate promoter-like sequences.

Fig. 3. Cloverleaf structures of tRNAs deduced from the DNA sequences.

trnL-CAA

The trnL-CAA gene (in Ba6a and Ba6b) is located in the inverted repeat on the strand opposite to the rRNA genes. Both copies of trnL have been sequenced and found to be identical. The tobacco trnL-CAA gene is 81 bp long and shows 99%, 98%, 98%, 65% and 59% sequence homology with broad bean, pea and maize trnLs-CAA, tobacco trnL-UAG and tobacco trnL-UAA, respectively (2, 9, 20, 24, 30). The tRNA(Leu) (CAA) deduced from the DNA sequence shows 95% and 96% sequence homology with common bean and soybean tRNA(Leu) (CAA)s, respectively (15, 16). The 5' flanking regions are highly conserved between tobacco and maize (Fig. 2b). Prokaryotic promoter-like sequences are observed in the homologous regions. Palindromic sequences are found in the 3' regions that are homologous between tobacco and maize. Interestingly, the 5' and 3' flanking regions of the tobacco trnL are similar to those of the trnLs from broad bean and pea chloroplasts, whose gene organization (lacking the inverted repeat) is quite different from that of tobacco (2, 20). There are three trnLs, each located in a different region of the genome: trnL-CAA is found in the inverted repeat, trnL-UAA in the large single-copy region and trnL-UAG in the small single-copy region.

trnC-GCA

The trnC-GCA gene (in Ba10b) is located 1.3 kbp upstream from the putative gene for the β subunit of RNA polymerase (14) on the opposite strand (strand A). The tobacco trnC-GCA gene is 72 bp long and shows 94%, 88% and 85% sequence homology with spinach, wheat and Euglena trnCs-GCA, respectively (3, 4, 18). An A-C mismatched base pair was found in the TΨC stem, as has been reported for the spinach trnC-GCA (4) (see Fig. 3). There is partial homology in the 5' and 3' flanking regions of trnCs-GCA from tobacco, spinach and wheat (Fig. 2c). No promoter- or terminator-like sequences can be found in the homologous regions.

Table 1. Nucleotide sequences of 30 tRNA genes from tobacco chloroplasts.

Column order in the 5' part: aminoacyl stem, D stem, D loop, D stem, anticodon stem, anticodon loop, anticodon stem, extra arm. Column order in the 3' part: TΨC stem, TΨC loop, TΨC stem, aminoacyl stem, 3' terminal nucleotide.

tRNA (anticodon) | 5' part | 3' part
*Ala (UGC) | GGGGATATAGCTCAGTTGGTAGAGCTCCGCTCTTGCAAGGCGGATGTC | AGCGG TTCGAGT CCGCT TATCTCC A
Arg (ACG) | GGGCCTGTAGCTCAGAGGATTAGAGCACGTGGCTACGAACCACGGTGTC | GGGGG TTCGAAT CCCTC CTCGCCC A
Arg (UCU) | GCGTCCATTGTCTAATGGATAGGACAGAGGTCTTCTAAACCTTTGGT | ATAGG TTCAAAT CCTAT TGGACGC A
Asn (GUU) | TCCTCAGTAGCTCAGTGGTAGAGCGGTCGGCTGTTAACCGATTGGTC | GTAGG TTCGAAT CCTAC TTGGGGA G
Asp (GUC) | GGGATTGTAGTTCAATTGGTCAGAGCACCGCCCTGTCAAGGCGGAAGCT | GCGGG TTCGAGC CCCGT CAGTCCC
Cys (GCA) | GGCGACATGGCCGAGTGGTAAGGCAGAGGACTGCAAATCCTTTTTTC | CCCAG TTCAAAT CCGGG TGTCGCC T
Gln (UUG) | TGGGGCGTGGCCAAGTGGTAAGGCAACGGGTTTTGGTCCCGCTATTC | GGAGG TTCGAAT CCTTC CGTCCCA G
Glu (UUC) | GCCCCCATCGTCTAGTGGTTTAGGACATCTCTCTTTCAAGGAGGCAGC | GGGGA TTCGAAT TCCCC TGGGGGT A
Gly (GCC) | GCGGATATGGTCGAATGGTAAAATTTCTCTTTGCCAAGGAGAAGAT | GCGGG TTCGATT CCCGC TATCCGC C
*Gly (UCC) | GCGGGTATAGTTTAGTGGTAAAACCCTAGCCTTCCAAGCTAACGAT | GCGGG TTCGATT CCCGC TACCCGC T
His (GUG) | GGCGGATGTAGCCAAGTGGATCAAGGCAGTGGATTGTGAATCCACCATGC | GCGGG TTCAATT CCCGT CGTTCGC C
*Ile (GAU) | GGGCTATTAGCTCAGTGGTAGAGCGCGCCCCTGATAAGGGCGAGGTC | TCTGG TTCAAGT CCAGG ATGGCCC A
Ile (CAU) | GCATCCATGGCTGAATGGTTAAAGCGCCCAACTCATAATTGGCGAATTC | GTAGG TTCAATT CCTAC TGGATGC A
*Leu (UAA) | GGGGATATGGCGAAATCGGTAGACGCTACGGACTTAAAATCCGTCGACTTTAAAAATCGT | GAGGG TTCAAGT CCCTC TATCCCC A
Leu (CAA) | GCCTTGGTGGTGAAATGGTAGACACGCGAGACTCAAAATCTCGTGCTAAATAGCGT | GGAGG TTCGAGT CCTCT TCAAGGC A
Leu (UAG) | GCCGCTATGGTGAAATTGGTAGACACGCTGCTCTTAGGAAGCAGTGCTAATGCAT | CTCGG TTCGAGT CCGAG TGGCGGC A
*Lys (UUU) | GGGTTGCTAACTCAACGGTAGAGTACTCGGCTTTTAACCGACTAGTT | CCGGG TTCGAAT CCCGG GCAACCC A
fMet (CAU) | CGCGGGGTAGAGCAGTTTGGTAGCTCGCAAGGCTCATAACCTTGAGGTC | ACGGG TTCAAAT CCTGT CTCCGCA A
mMet (CAU) | ACCTACTTAACTCAGTGGTTAGAGTACTGCTTTCATACGGCGGGAGTC | ATTGG TTCAAAT CCAAT AGTAGGT A
Phe (GAA) | GCCGGGATAGCTCAGTTGGTAGAGCAGAGGACTGAAAATCCTCGTGTC | ACCAG TTCAAAT CTGGT TCCTGGC A
Pro (UGG) | AGGGATGTGGCGCAGCTTGGTAGCGCGTTTGTTTTGGGTACAAAATGTC | ACAGG TTCAAAT CCTGT CATCCCT A
Ser (GGA) | GGAGAGATGGCCGAGTGGTTGAAGGCGTAGCATTGGAACTGCTATGTAGGCTTTTGTTTACC | GAGGG TTCGAAT CCCTC TCTTTCC G
Ser (UGA) | GGAGAGATGGCTGAGCGGTTGATAGCCCCGGTCTTGAAAACCGGTATAGTTTTAACAAAGAACTATC | GAGGG TTCGAAT CCCTC TCTCTCC T
Ser (GCU) | GGAGAGATGGCTGAGTGGACTAAAGCGGCGGATTGCTAATCCGTTGTACGAGTTAATCGTACC | GAGGG TTCGAAT CCCTC TCTTTCC G
Thr (GGU) | GCCCTTTTAACTCAGTGGTAGAGTAACGCCATGGTAAGGCGTAAGTC | ATCGG TTCAAAT CCGAT AAGGGGC T
Thr (UGU) | GCCCGCTTAGCTCAGAGGTTAGAGCATCGCATTTGTAATGCGATGGTC | ATCGG TTCGATT CCGAT AGCCGGC T
Trp (CCA) | GCGCTCTTAGTTCAGTTCGGTAGAACGTGGGTCTCCAAAACCCGATGTC | GTAGG TTCAAAT CCTAC AGAGCGT G
Tyr (GUA) | GGGTCGATGCCCGAGCGGTTAATGGGGACGGACTGTAAATTCGTTGGCAATATGTCTACC | GCTGG TTCAAAT CCAGC TCGGCCC A
Val (GAC) | AGGGATATAACTCAGCGGTAGAGTGTCACCTTGACGTGGTGGAAGT | ATCAG TTCGAGC CTGAT TATCCCT A
*Val (UAC) | AGGGCTATAGCTCAGTTGGTAGAGCAACTCGTTTACACCGAGAAGGTC | TACGG TTCGAGT CCGTA TAGCCCT A

Invariant nucleotides (column alignment not preserved): 5' part, T RY AR GG R RY YT R Y; 3' part, RG TTCRA Y CY.
*, gene containing an intron. R = A or G, Y = T or C, M = G or C.
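The mismatches mentioned in the text (A-C in the TΨC stem of trnC-GCA, C27-C43 in the anticodon stem of trnI-CAU) can be checked from the stem sequences listed in Table 1. The following is a minimal sketch, not the authors' procedure: the function name `stem_pairs` is hypothetical, the stem strands are transcribed from Table 1 under the assumption that the stems are five base pairs long, and G-T is treated as the DNA spelling of a G-U wobble pair.

```python
# Classify the base pairs of a tRNA stem from its two strands (DNA alphabet).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "T"), ("T", "G")}  # G-U wobble, written with T in gene sequences


def stem_pairs(five_prime, three_prime):
    """Pair base i of the 5' strand with base (n-1-i) of the 3' strand."""
    if len(five_prime) != len(three_prime):
        raise ValueError("both strands must have the same length")
    pairs = []
    for a, b in zip(five_prime, reversed(three_prime)):
        if (a, b) in WATSON_CRICK:
            kind = "WC"
        elif (a, b) in WOBBLE:
            kind = "wobble"
        else:
            kind = "mismatch"
        pairs.append((a, b, kind))
    return pairs


# Stem strands as read from Table 1 (assumed 5-bp stems).
cys_t_stem = stem_pairs("CCCAG", "CCGGG")          # TPsiC stem of trnC-GCA
ile_anticodon_stem = stem_pairs("CCCAA", "TTGGC")  # anticodon stem of trnI-CAU

for name, pairs in (("trnC-GCA TPsiC stem", cys_t_stem),
                    ("trnI-CAU anticodon stem", ile_anticodon_stem)):
    print(name, ["%s-%s:%s" % p for p in pairs])
```

Under these assumptions the fourth pair of the trnC-GCA stem comes out as an A-C mismatch, and the first pair of the trnI-CAU anticodon stem as a C-C mismatch followed by two C-G and two A-T pairs, in agreement with the description in the text.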

trnS-UGA

The trnS-UGA gene (in Ba9b) is located 240 bp downstream from psbC and 363 bp upstream from ORF62 (a putative gene for a membrane protein) on the opposite strand (strand B). The tobacco trnS-UGA is 92 bp long and shows 92%, 91%, 85%, 70% and 67% sequence homology with spinach, maize and liverwort trnSs-UGA, tobacco trnS-GCU and tobacco trnS-GGA, respectively (6, 10, 26, 29, 30). trnS-UGA is the longest tRNA gene in the tobacco chloroplast genome. There is moderate homology in the 5' and 3' flanking regions of trnSs-UGA from tobacco, spinach and maize (Fig. 2d). No promoter-like sequences can be found in the homologous 5' regions, whereas short palindromic sequences are observed in the homologous 3' regions. These might be involved in termination of transcription from trnS-UGA or psbC. There are three trnSs in the genome, and these are located away from each other in the large single-copy region.
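The gene lengths quoted in the text for the five new genes can be cross-checked against the sequences in Table 1. The sketch below is illustrative only: the dictionary name `TABLE1_PARTS` is hypothetical, the sequences are transcribed from Table 1 with its column spacing kept, and the split into a 5' and a 3' part merely follows the table layout.

```python
# (5' part, 3' part) of each gene as transcribed from Table 1; spaces mark the
# table columns and are removed before counting nucleotides.
TABLE1_PARTS = {
    "trnI-CAU": ("GCATCCATG GCTGAAT GGTTA AAGC G CCCAACTCATAA TTGGCGAATT C",
                 "GTAGG TTCAATT CCTAC TGGATGC A"),
    "trnL-CAA": ("GCCTTGGTG GTGA AAT GGTAG ACAC G CGAGACTCAAAA TCTCGTGCTAAATAGCG T",
                 "GGAGG TTCGAGT CCTCT TCAAGGC A"),
    "trnC-GCA": ("GGCGACATG GCCGAGT GGT A AGGC A GAGGA CTGCAAA TCCTTTTTT C",
                 "CCCAG TTCAAAT CCGGG TGTCGCC T"),
    "trnS-UGA": ("GGAGAGATG GCTGAGC GGTTGATAGC C CCGGT CTTGAAA "
                 "ACCGGTATAGTTTTAACAAAGAACTATC",
                 "GAGGG TTCGAAT CCCTC TCTCTCC T"),
    "trnT-GGU": ("GCCCTTTTA ACTC AGT GGT A GAGT A ACGCCATGGTAA GGCGTAAGT C",
                 "ATCGG TTCAAAT CCGAT AAGGGGC T"),
}


def gene_length(parts):
    """Number of nucleotides in the concatenated parts, ignoring column spaces."""
    return sum(len("".join(part.split())) for part in parts)


gene_lengths = {gene: gene_length(parts) for gene, parts in TABLE1_PARTS.items()}
longest = max(gene_lengths, key=gene_lengths.get)

for gene, length in gene_lengths.items():
    print(gene, length, "bp")
print("longest:", longest)
```

With the sequences as transcribed here, the lengths come out as 74, 81, 72, 92 and 72 bp, matching the values given in the text, with trnS-UGA the longest of the five.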

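Table 2. Anticodons of tRNA genes and codon usage in 39 protein genes in tobacco chloroplasts.

Each entry gives the codon, the amino acid, the number of occurrences and the anticodon of the tRNA gene whose product reads the codon (-, none; ..., value not legible).

UUU F 321 GAA | UCU S 191 GGA | UAU Y 259 GUA | UGU C 64 GCA
UUC F 161 GAA | UCC S 112 GGA | UAC Y 61 GUA | UGC C 18 GCA
UUA L 359 UAA* | UCA S 95 UGA | UAA stop 25 - | UGA stop 5 -
UUG L 212 UAA*, CAA(2) | UCG S 48 UGA | UAG stop 9 - | UGG W 184 CCA

CUU L 202 - | CCU P 196 - | CAU H 208 GUG | CGU R 190 ACG(2)
CUC L 52 - | CCC P 79 - | CAC H 60 GUG | CGC R 44 -
CUA L 123 UAG | CCA P 118 UGG | CAA Q 285 UUG | CGA R 155 -
CUG L 67 UAG | CCG P 65 UGG | CAG Q 95 UUG | CGG R 42 -

AUU I ... GAU*(2) | ACU T ... GGU | AAU N 305 GUU(2) | AGU S 143 GCU
AUC I 148 GAU*(2) | ACC T 130 GGU | AAC N 110 GUU(2) | AGC S 45 GCU
AUA I 222 CAU(2) | ACA T 152 UGU | AAA K 373 UUU* | AGA R 183 UCU
AUG M 231 CAU | ACG T 57 UGU | AAG K 104 UUU* | AGG R 65 UCU

GUU V 230 GAC(2) | GCU A 342 - | GAU D 291 GUC | GGU G 327 GCC
GUC V 73 GAC(2) | GCC A 128 - | GAC D 85 GUC | GGC G 100 GCC
GUA V 269 UAC* | GCA A 232 UGC*(2) | GAA E 405 UUC | GGA G 311 UCC*
GUG V 80 UAC* | GCG A 73 UGC*(2) | GAG E 120 UUC | GGG G 142 UCC*

Anticodons mark the codons recognized by tRNAs; *, gene containing an intron; (2), two copies.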

trnT-GGU

The trnT-GGU gene (in Ba9a) is located 0.9 kbp upstream from trnEYD (13) on the opposite strand (strand A). The tobacco trnT-GGU is 72 bp long and shows 99%, 100%, 96% and 72% sequence homology with spinach, broad bean and wheat trnTs-GGU, and tobacco trnT-UGU, respectively (5, 7, 11, 17, 30). The 5' and 3' flanking regions of trnTs-GGU from tobacco, spinach and broad bean show partial homology (Fig. 2e). No promoter- or terminator-like sequences can be found in the flanking regions.
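The text does not state how the percentage homology values were computed. One common definition, shown here only as an illustrative sketch, is the percentage of identical columns in a pairwise alignment, with a gap counted as a mismatch. The function name `percent_identity` and the toy alignment are hypothetical and are not taken from the paper.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical columns in a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for a deletion).
    A column with a gap in one sequence counts as a mismatch; columns that
    are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignment columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment: 8 columns, one substitution and one gap.
toy_identity = percent_identity("ACGTAC-T", "ACGTTCGT")
print(toy_identity)  # 6 identical columns out of 8
```

Other conventions (for example, excluding gapped columns from the denominator) give slightly different percentages, so values computed this way need not reproduce the published figures exactly.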

Compilation of tRNA genes from tobacco chloroplasts

We have so far reported the locations and structures of 25 different tRNA genes of the tobacco chloroplast genome (9, 13, 25, 26, 30). Here we present five additional tRNA genes. As the complete nucleotide sequence of the tobacco chloroplast genome has been determined, we have searched it for tRNA genes using a computer (21). No additional tRNA genes could be found, and hence these 30 tRNA genes are most likely all of the tRNA genes encoded in the tobacco chloroplast genome. Seven of them are located in the inverted repeat, and therefore the total number of tRNA genes is 37. Their map positions are shown in Fig. 1. The location of most trns is consistent with that based on the tRNA/DNA fragment hybridization studies reported by Bergmann et al. (1). The sequences of the 30 tRNA genes are shown in Table 1.

General features of the chloroplast tRNA genes and of the tRNAs deduced from the DNA sequences are as follows. No tRNA gene codes for the 3'-CCA end. Six tRNA genes harbor long single introns (503-2526 bp). Many of the tRNA genes contain sequences similar to the prokaryotic promoter in their upstream regions. A unique feature is the presence of a G2-C72 or C2-G72 base pair in the aminoacyl stem of all the tRNAs. All the tRNAs can form the cloverleaf structure and none has an abnormal structure such as has been reported for some mammalian tRNAs. All the tRNA sequences show higher homology with the corresponding bacterial tRNAs than with the corresponding eukaryotic cytoplasmic and mitochondrial tRNAs.

Table 2 summarizes the anticodons of tobacco chloroplast tRNA genes and the codon usage in the sum of the 39 protein genes. All possible codons are used in the sequences coding for proteins in tobacco chloroplasts. The minimum number of tRNA species required for translation of all codons is 32 if normal wobble base pairing is involved in codon-anticodon recognition. Either tRNA(Leu) (UAA) or tRNA(Leu) (CAA) is likely to be dispensable. As shown in Table 2, no tRNA that recognizes the codons CUU/C (L), CCU/C (P), GCU/C (A) and CGC/A/G (R) has been found. A unique mechanism must operate in the chloroplast if the 30 tRNAs predicted from the gene sequences are all the tRNA species present in the organelle (21). Structural studies of the chloroplast tRNAs are needed to resolve this question. Seven tRNA genes are present in two copies each because they are located in the inverted repeat. However, there is no apparent relationship between gene copy number and codon usage frequency. Judging from the codon usage, all six tRNA species whose genes contain long introns are frequently used tRNAs. Therefore, the presence of long introns seems not to interfere with effective expression of these tRNA genes.
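The list of codons for which no tRNA has been found can be reproduced from the anticodons in Table 1. The sketch below is a simplified model rather than the authors' analysis: it assumes unmodified anticodons and standard wobble pairing at the first anticodon position (G reads C or U, U reads A or G, C reads G, A reads U), and it follows the text in assigning the AUA codon to tRNA(Ile) (CAU). All names in the code are hypothetical.

```python
from itertools import product

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
# Codon third bases read by each anticodon first base (assumed wobble rules).
WOBBLE_PARTNERS = {"G": "CU", "U": "AG", "C": "G", "A": "U"}

# Amino acid and anticodon of the 30 genes listed in Table 1.
ANTICODONS = [
    ("Ala", "UGC"), ("Arg", "ACG"), ("Arg", "UCU"), ("Asn", "GUU"),
    ("Asp", "GUC"), ("Cys", "GCA"), ("Gln", "UUG"), ("Glu", "UUC"),
    ("Gly", "GCC"), ("Gly", "UCC"), ("His", "GUG"), ("Ile", "GAU"),
    ("Ile", "CAU"), ("Leu", "UAA"), ("Leu", "CAA"), ("Leu", "UAG"),
    ("Lys", "UUU"), ("fMet", "CAU"), ("Met", "CAU"), ("Phe", "GAA"),
    ("Pro", "UGG"), ("Ser", "GGA"), ("Ser", "UGA"), ("Ser", "GCU"),
    ("Thr", "GGU"), ("Thr", "UGU"), ("Trp", "CCA"), ("Tyr", "GUA"),
    ("Val", "GAC"), ("Val", "UAC"),
]


def codons_read(amino_acid, anticodon):
    """Codons read by one anticodon (5'->3') under the assumed wobble rules."""
    if amino_acid == "Ile" and anticodon == "CAU":
        return {"AUA"}  # assignment taken from the text, not from wobble rules
    first, second, third = anticodon
    prefix = COMPLEMENT[third] + COMPLEMENT[second]
    return {prefix + base for base in WOBBLE_PARTNERS[first]}


STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = {"".join(c) for c in product("UCAG", repeat=3)} - STOP_CODONS
read = set().union(*(codons_read(aa, ac) for aa, ac in ANTICODONS))
unread = sorted(sense_codons - read)
print(len(read), "codons read;", "unread:", unread)
```

Under these assumptions the nine unread codons are CUU, CUC, CCU, CCC, GCU, GCC, CGC, CGA and CGG, the same set named in the text. The model also shows why one of the two tRNA(Leu) species that read UUG looks redundant: UUG is covered by both the UAA and the CAA anticodon.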

Acknowledgements

We thank Dr M. Sugita for his help in sequencing part of trnI-CAU. This work was supported in part by a Grant-in-Aid from the Ministry of Education, Science and Culture, and a grant from the Toray Science Foundation.

References

1. Bergmann P, Seyer P, Burkard G, Weil JH: Mapping of transfer RNA genes on tobacco chloroplast DNA. Plant Mol Biol 3:29-36, 1984.
2. Bonnard G, Weil JH, Steinmetz A: The intergenic region between the Vicia faba chloroplast tRNA(Leu) (CAA) and tRNA(Leu) (UAA) genes contains a partial copy of the split tRNA(Leu) (UAA) gene. Curr Genet 9:417-422, 1985.
3. Hallick RB, Hollingsworth MJ, Nickoloff JA: Transfer RNA genes of Euglena gracilis chloroplast DNA. Plant Mol Biol 3:169-175, 1984.
4. Holschuh K, Bottomley W, Whitfeld PR: Sequence of the genes for tRNA(Cys) and tRNA(Asp) from spinach chloroplasts. Nucl Acids Res 11:8547-8554, 1983.
5. Holschuh K, Bottomley W, Whitfeld PR: Organization and nucleotide sequence of the genes for spinach chloroplast tRNA(Glu) and tRNA(Tyr). Plant Mol Biol 3:313-317, 1984.
6. Holschuh K, Bottomley W, Whitfeld PR: Structure of the spinach chloroplast genes for the D2 and 44 kd reaction-center proteins of photosystem II and for tRNA(Ser) (UGA). Nucl Acids Res 12:8819-8834, 1984.
7. Kashdan MA, Dudock BS: Structure of a spinach chloroplast threonine tRNA gene. J Biol Chem 257:1114-1116, 1982.
8. Kashdan MA, Dudock BS: The gene for a spinach chloroplast isoleucine tRNA has a methionine anticodon. J Biol Chem 257:11191-11194, 1982.
9. Kato A, Takaiwa F, Shinozaki K, Sugiura M: Location and nucleotide sequence of the genes for tobacco chloroplast tRNA(Arg) (ACG) and tRNA(Leu) (UAG). Curr Genet 9:405-409, 1985.
10. Krebbers E, Steinmetz A, Bogorad L: DNA sequences for the Zea mays tRNA genes tV-UAC and tS-UGA: tV-UAC contains a large intron. Plant Mol Biol 3:13-20, 1984.
11. Kuntz M, Weil JH, Steinmetz A: Nucleotide sequence of a 2 kbp BamHI fragment of Vicia faba chloroplast DNA containing the genes for threonine, glutamic acid and tyrosine transfer RNAs. Nucl Acids Res 12:5037-5047, 1984.
12. Maxam AM, Gilbert W: A new method for sequencing DNA. Proc Natl Acad Sci USA 74:560-564, 1977.
13. Ohme M, Kamogashira T, Shinozaki K, Sugiura M: Structure and cotranscription of tobacco chloroplast genes for tRNA(Glu) (UUC), tRNA(Tyr) (GUA) and tRNA(Asp) (GUC). Nucl Acids Res 13:1045-1056, 1985.
14. Ohme M, Tanaka M, Chunwongse J, Shinozaki K, Sugiura M: A tobacco chloroplast DNA sequence possibly coding for a polypeptide similar to E. coli RNA polymerase β-subunit. FEBS Lett 200:87-90, 1986.
15. Osorio-Almeida ML, Guillemaut P, Keith G, Canaday J, Weil JH: Primary structure of three leucine transfer RNAs from bean chloroplast. Biochem Biophys Res Commun 92:102-108, 1980.
16. Pillay DTN, Guillemaut P, Weil JH: Nucleotide sequences of three soybean chloroplast tRNAs(Leu) and re-examination of bean chloroplast tRNA(Leu)2 sequence. Nucl Acids Res 12:2997-3001, 1984.
17. Quigley F, Weil JH: Organization and sequence of five tRNA genes and of an unidentified reading frame in the wheat chloroplast genome: evidence for gene rearrangements during the evolution of chloroplast genomes. Curr Genet 9:495-503, 1985.
18. Quigley F, Grienenberger JM, Weil JH: Localization and nucleotide sequences of the tRNA(Gly) (GCC), tRNA(Asp) (GUC) and tRNA(Cys) (GCA) genes from wheat chloroplasts. Plant Mol Biol 4:305-310, 1985.
19. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467, 1977.
20. Shapiro DR, Tewari KK: Nucleotide sequences of transfer RNA genes in the Pisum sativum chloroplast DNA. Plant Mol Biol 6:1-12, 1986.
21. Shinozaki K, Ohme M, Tanaka M, Wakasugi T, Hayashida N, Matsubayashi T, Zaita N, Chunwongse J, Obokata J, Yamaguchi-Shinozaki K, Ohto C, Torazawa K, Meng BY, Sugita M, Deno H, Kamogashira T, Yamada K, Kusuda J, Takaiwa F, Kato A, Tohdoh N, Shimada H, Sugiura M: The complete nucleotide sequence of tobacco chloroplast genome: its gene organization and expression. EMBO J 5: in press, 1986.
22. Sprinzl M, Moll J, Meissner F, Hartmann T: Compilation of tRNA sequences. Nucl Acids Res 13:r1-r49, 1985.
23. Sprinzl M, Vorderwulbecke T, Hartmann T: Compilation of sequences of tRNA genes. Nucl Acids Res 13:r51-r104, 1985.
24. Steinmetz AA, Krebbers ET, Schwartz Z, Gubbins EJ, Bogorad L: Nucleotide sequences of five maize chloroplast transfer RNA genes and their flanking regions. J Biol Chem 258:5503-5511, 1983.
25. Sugita M, Shinozaki K, Sugiura M: Tobacco chloroplast tRNA(Lys) (UUU) gene contains a 2.5 kilobase pair intron: An open reading frame and a conserved boundary sequence in the intron. Proc Natl Acad Sci USA 82:3557-3561, 1985.
26. Sugiura M, Shinozaki K, Ohme M: Tobacco chloroplast genes for transfer RNAs. In: Vloten-Doting L, Groot GSP, Hall TC (eds) Molecular Form and Function of the Plant Genome. Plenum Publishing Corp, New York, 1985, pp 325-334.
27. Sugiura M, Shinozaki K, Zaita N, Kusuda M, Kumano M: Clone bank of the tobacco (Nicotiana tabacum) chloroplast genome as a set of overlapping restriction endonuclease fragments: Mapping of eleven ribosomal protein genes. Plant Science 44:211-216, 1986.
28. Tanaka M, Wakasugi T, Sugita M, Shinozaki K, Sugiura M: Genes for the eight ribosomal proteins are clustered on the chloroplast genome of tobacco (Nicotiana tabacum): Similarity to the S10 and spc operons of Escherichia coli. Proc Natl Acad Sci USA 83, in press, 1986.
29. Umesono K, Inokuchi H, Ohyama K, Ozeki H: Nucleotide sequence of Marchantia polymorpha chloroplast DNA: a region possibly encoding three tRNAs and three proteins including a homologue of E. coli ribosomal protein S14. Nucl Acids Res 12:9551-9565, 1984.
30. Yamada K, Shinozaki K, Sugiura M: DNA sequences of tobacco chloroplast genes for tRNA(Ser) (GGA), tRNA(Thr) (UGU), tRNA(Leu) (UAA), tRNA(Phe) (GAA): the tRNA(Leu) gene contains a 503 bp intron. Plant Mol Biol 6:193-199, 1986.
31. Yanisch-Perron C, Vieira J, Messing J: Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33:103-119, 1985.
32. Zurawski G, Bottomley W, Whitfeld PR: Junctions of the large single copy region and the inverted repeats in Spinacia oleracea and Nicotiana debneyi chloroplast DNA: sequence of the genes for tRNA(His) and the ribosomal proteins S19 and L2. Nucl Acids Res 12:6547-6558, 1984.

Received 17 June 1986; accepted 24 July 1986.
