Nucleic Acids Research, Vol. 18, No. 20 6133

Complete nucleotide sequence of the aroA gene from Salmonella typhi encoding 5-enolpyruvylshikimate 3-phosphate synthase

S.Chatfield, G.Dougan and I.Charles*

Department of Molecular Biology, Wellcome Biotech, Langley Court, Beckenham, Kent BR3 3BS, UK

Submitted September 17, 1990

EMBL accession no. X54545

As part of a programme of research to generate fully characterized aro- mutants of Salmonella typhi for use as live attenuated vaccines, we have cloned and sequenced the aroC (1) and aroD (submitted to J. Gen. Microbiol.) genes of S. typhi. We report here the nucleotide sequence of the S. typhi aroA gene, which encodes 5-enolpyruvylshikimate 3-phosphate synthase (EPSP synthase). An aroA-containing cosmid was isolated by complementation of E. coli BRD048 aroA (2). The full-length sequence encodes a protein of 427 amino acids that shows 97.7% similarity to the S. typhimurium EPSP synthase (3).
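A similarity figure like the one reported for the two EPSP synthases is usually derived by counting matching positions in a pairwise alignment. The sketch below is only an illustration of that arithmetic: the text does not say which alignment program or scoring scheme was used, or whether "similarity" means strict identity, so the function names `percent_identity` and `orf_length_nt` and the toy sequences are assumptions introduced here, not the authors' method.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' marking gaps).
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no comparable columns")
    # A gap opposite a residue counts as a mismatch.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def orf_length_nt(n_residues: int) -> int:
    """Nucleotides in a coding region: one codon per residue plus one stop codon."""
    return 3 * (n_residues + 1)


# Toy aligned fragments, invented for illustration only
# (not taken from either aroA sequence).
toy_a = "ACDEFGHIKL"
toy_b = "ACDEFGHIKV"
print(percent_identity(toy_a, toy_b))  # 90.0
print(orf_length_nt(427))              # 1284
```

Under the strict-identity assumption, a value near 97.7% over 427 residues would correspond to roughly ten differing positions; a similarity score that also credits conservative substitutions would allow more.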

REFERENCES

1. Charles,I.G., Lamb,H.K., Pickard,D., Dougan,G. and Hawkins,A.R. (1990) J. Gen. Microbiol. 136, 353-358.
2. Dougan,G., Chatfield,S., Pickard,D., Bester,J., O'Callaghan,D. and Maskell,D. (1988) J. Infect. Dis. 158, 1329-1335.
3. Stalker,D.M., Hiatt,W.F. and Comai,L. (1985) J. Biol. Chem. 260, 4724-4728.

[Figure: nucleotide sequence of the S. typhi aroA gene (numbered to 1330) with the deduced amino acid sequence of EPSP synthase in single-letter code; see EMBL accession no. X54545.]

* To whom correspondence should be addressed
