YEAST

VOL. 8: 577-586 (1992)


Yeast Sequencing Reports


The Complete Sequence of a 9,543 bp Segment on the Left Arm of Chromosome III Reveals Five Open Reading Frames Including Glucokinase and the Protein Disulfide Isomerase

B. SCHERENS,* F. MESSENGUY, D. GIGOT AND E. DUBOIS

Institut de Recherches du CERIA-COOVI, Laboratoire de Microbiologie, Université Libre de Bruxelles and Laboratorium voor Erfelijkheidsleer en Microbiologie, Vrije Universiteit Brussel, Avenue Emile Gryson 1, B-1070 Brussels, Belgium

Received 2 March 1992; accepted 30 March 1992

We report here the DNA sequence of a 9.5 kb segment of chromosome III. The sequence was determined by subcloning the segment into subfragments generated by appropriate restriction enzymes, followed by oligonucleotide-directed sequencing. The segment contains at least five open reading frames: YCL311, YCL312, YCL313, YCL314 and YCL315. YCL311 and YCL315 extend into the adjacent fragments, A4H and A6C respectively. YCL312 encodes glucokinase, and YCL313 the protein disulfide isomerase. Disruption of YCL311, YCL314 and YCL315 by insertion of a URA3 cassette does not lead to a detectable phenotype, whereas disruption of YCL313 provokes cell lethality.

KEY WORDS: yeast; chromosome III; gene disruption

INTRODUCTION

As a part of the European project to sequence chromosome III of Saccharomyces cerevisiae, we have sequenced and analysed a 9,543 bp DNA fragment located between HIS4 and HML on the left arm of chromosome III (the YIp5 A1G clone prepared by C. Newlon). No genetic locus was reported in this fragment. In addition to determining the nucleotide sequence, we have also established the transcriptional map of the fragment and disrupted the putative open reading frames by insertion of a URA3 cassette.

MATERIALS AND METHODS

Plasmids, strains and media

The yeast strain 10R34d (ura3, Mata) (Dubois et al., 1987) was used as a recipient strain in transformations; 01884b (ura3, leu2, Mata) and the diploid strain 01884b × 10R34d (ura3/ura3, leu2/LEU2) were used in the gene disruption experiments. The Escherichia coli strain XL1-B from Stratagene was used for amplification of plasmids. The plasmid-harboring yeast strains were grown in a minimal medium with 0.02 M (NH4)2SO4 as nitrogen source. The composition of the minimal medium was described previously (Messenguy, 1976). The phagemid pBS (+KS) from Stratagene was used as a vector for subcloning and sequencing. The plasmid pUC19 from Pharmacia was used in the constructions for gene disruption experiments. Plasmid pFL44 (pUC19, URA3, 2µ; a gift from F. Lacroute, described in Bonneaud et al., 1991) was used as a vector for expression in yeast.

*Corresponding author.

0749-503X/92/07057749$09.50 © 1992 by John Wiley & Sons Ltd

Procedures for gene disruptions

To disrupt the YCL311 ORF, the 1.1 kb SalI-BamHI fragment was inserted into a pUC19 vector. The 0.6 kb XhoI-XhoI fragment of this plasmid was replaced by the 1.1 kb BglII-BglII fragment containing the URA3 gene, after treatment with T4 DNA polymerase. Haploid and diploid ura3 strains were transformed with the SalI-BamHI fragment disrupted by insertion of URA3.

To disrupt the YCL314 and YCL315 ORFs, the 4.1 kb BamHI-SalI fragment was inserted into a pUC19 vector. To disrupt YCL314, the 0.1 kb PstI-EcoRV fragment of this plasmid was replaced by the BglII-BglII URA3 fragment after treatment with T4 DNA polymerase; to disrupt YCL315, the 1.75 kb HindIII-XhoI fragment was replaced by the same URA3 fragment, again after treatment with T4 DNA polymerase. These two BamHI-SalI fragments carrying URA3 were used to transform haploid and diploid ura3 strains.
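Each disruption replaces an internal restriction fragment with the URA3 cassette, so the expected size of the transforming fragment follows from simple arithmetic. The sketch below works this out from the approximate sizes quoted above. It assumes the same 1.1 kb BglII-BglII URA3 fragment was used in all three constructs and ignores any bases gained or lost during T4 DNA polymerase treatment. The function and variable names are illustrative and do not come from the paper.

```python
# Expected sizes (kb) of the disruption fragments described in the text.
# Sizes are the approximate values quoted in the Methods; all names here
# are illustrative assumptions, not part of the original work.

URA3_CASSETTE_KB = 1.1  # BglII-BglII URA3 fragment (assumed for all constructs)


def disrupted_size(parent_kb, removed_kb, insert_kb=URA3_CASSETTE_KB):
    """Size of a fragment after an internal piece is replaced by the insert."""
    if removed_kb > parent_kb:
        raise ValueError("removed fragment cannot be larger than its parent")
    return round(parent_kb - removed_kb + insert_kb, 2)


# ORF -> (parent fragment size, size of the fragment removed)
constructs = {
    "YCL311": (1.1, 0.6),   # SalI-BamHI parent, XhoI-XhoI removed
    "YCL314": (4.1, 0.1),   # BamHI-SalI parent, PstI-EcoRV removed
    "YCL315": (4.1, 1.75),  # BamHI-SalI parent, HindIII-XhoI removed
}

expected = {orf: disrupted_size(*sizes) for orf, sizes in constructs.items()}

for orf, size in expected.items():
    print(f"{orf}: expected disrupted fragment of about {size} kb")
```

Under these assumptions the disrupted fragments are about 1.6 kb (YCL311), 5.1 kb (YCL314) and 3.45 kb (YCL315). These are the kind of sizes one would compare against bands in a Southern analysis of transformants.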

Transformations

Yeast strains were transformed using the method of Hinnen et al. (1978) with some modifications. Glucuronidase/arylsulfatase (from Boehringer Mannheim, FRG) was used instead of glusulase to generate spheroplasts, and the regeneration medium was the selective medium containing 3% agar and 1 M sorbitol. E. coli strain XL1-B was transformed by the method described by Petes et al. (1978), except that after growth in LB broth for 45 min at 37°C, the cells were plated directly on selective medium without suspension in soft agar.

Molecular cloning procedures and restriction analysis

Recombinant DNA methods followed standard protocols (Maniatis et al., 1982). Restriction endonucleases were obtained from BRL; T4 ligase, E. coli DNA polymerase (Klenow fragment) and alkaline phosphatase were obtained from Boehringer Mannheim, FRG. All were used according to the manufacturers' instructions.

DNA preparations

Yeast DNA was prepared by the method of Cryer et al. (1975) with minor modifications. E. coli plasmid DNA was prepared by the method of Birnboim and Doly (1979). The DNA fragments were isolated from low melting agarose gels (BRL).

Template preparation and sequencing

The complete A1G fragment (9.5 kb) was cut into three smaller fragments using SalI restriction sites. These subfragments were analysed with different restriction enzymes to establish a restriction map. The appropriate restriction fragments were then isolated from low melting agarose gels and subcloned in the pBS (+KS) phagemid to be sequenced. Double-stranded templates were prepared using an alkaline lysis small-scale preparation, followed by RNase treatment and PEG precipitation. These were then denatured prior to annealing using an alkaline denaturation method (Chen and Seeburg, 1985; Hattori and Sakaki, 1986). The templates were sequenced by the dideoxy chain-termination method (Sanger et al., 1977), using synthetic oligonucleotides as primers, with the Sequenase kit (USB) and [35S]ATP from Amersham.

Northern analysis

Total RNA extraction, electrophoresis, hybridization and washing of DBM paper were described previously (Messenguy and Dubois, 1983).

Labelling

Nick translation of the 1.1 kb SalI-BamHI, 1.2 kb BglII-BglII, 1.4 kb BglII-HindIII, 0.6 kb BstEII-EcoRV and 1.8 kb HindIII-XhoI DNA fragments used for Southern and Northern analysis was performed following the procedure of Rigby et al. (1977).

Computer-assisted sequence comparison

Comparisons were done via BITNET. The FASTAP program was used for comparing the ORFs with the known protein sequences in the SwissProt databank at EMBL in Heidelberg. The MacMolly and DNA Strider programs (Marck, 1988) were used to calculate hydrophobicity and folding. The algorithms of Kyte and Doolittle (1982) and Garnier (1978) are used in these programs.
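The hydrophobicity calculation mentioned under computer-assisted sequence comparison rests on the Kyte and Doolittle (1982) scale, averaged over a sliding window along the protein. The sketch below illustrates that general approach only; it is not a reconstruction of what MacMolly or DNA Strider do internally. The window length of 9 and the example peptide are assumptions chosen for illustration.

```python
# Sliding-window hydropathy profile on the Kyte-Doolittle scale.
# Window length and the example peptide are illustrative assumptions.

KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=9):
    """Mean Kyte-Doolittle hydropathy of every full window along the protein."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the protein length")
    scores = [KYTE_DOOLITTLE[aa] for aa in protein]  # KeyError on unknown residue
    total = sum(scores[:window])
    profile = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(scores)):
        total += scores[i] - scores[i - window]
        profile.append(total / window)
    return profile


# Hypothetical peptide: a hydrophobic stretch followed by charged residues.
example = "MKKLLLVVIIAAGGDDEE"
for position, value in enumerate(hydropathy_profile(example), start=1):
    print(f"window starting at residue {position}: {value:+.2f}")
```

Positive values indicate hydrophobic stretches and negative values hydrophilic ones. Shorter windows give noisier profiles, while longer ones smooth out short features.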

RESULTS AND DISCUSSION

Sequence analysis

The sequence of the A1G fragment (9,543 bp), determined on both strands with largely overlapping subclones, is given in Figure 1. The restriction map of the fragment is shown in Figure 2 along with the sequencing strategy. The sequence reveals five ORFs (YCL311: 600 aa; YCL312: 500 aa; YCL313: 522 aa; YCL314: 417 aa;

Figure 1. Complete sequence of the 9,543 bp segment (A1G) on the left arm of chromosome III. The five major ORFs are shown from their first possible AUG codon to the first encountered stop codon. Only one DNA strand is given. The ORFs encoded by this strand are listed above the DNA sequence, while the ORFs encoded by the complementary strand are listed below the DNA sequence. First and last codons are in bold. The restriction sites indicated in the sequence are the ones that were used to create the gene disruptions.

[Figure 1, sequence listing: nucleotide sequence of the 9,543 bp A1G segment with the deduced amino acid sequences of YCL311, YCL312, YCL313, YCL314 and YCL315; restriction sites marked in the listing include BamHI, XhoI, HindIII, SalI, EcoRV and BstEII.]
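The Figure 1 caption defines each ORF as running from its first possible AUG codon to the first stop codon encountered, with ORFs present on both strands. The sketch below applies that rule to a DNA string in all six reading frames. It is a minimal illustration rather than the software used by the authors. The `min_codons` threshold and the short test sequence are assumptions, and coordinates on the complementary strand are reported relative to the reverse-complemented sequence.

```python
# Six-frame ORF scan: first ATG after a stop (or frame start) to the next stop.
# The min_codons default and the toy sequence below are illustrative assumptions.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Return (strand, start, end, codons) tuples for ORFs on both strands.

    start and end are 0-based, end-exclusive positions on the scanned strand;
    end includes the stop codon, codons counts ATG but not the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon == "ATG" and start is None:
                    start = i
                elif codon in STOP_CODONS and start is not None:
                    codons = (i - start) // 3
                    if codons >= min_codons:
                        orfs.append((strand, start, i + 3, codons))
                    start = None
    return orfs


# Toy sequence: ATG AAA TTT TAG embedded in flanking bases.
toy = "CCATGAAATTTTAGGG"
print(find_orfs(toy, min_codons=3))
```

With a real sequence, a threshold around 100 codons is a common way to suppress short chance ORFs, but the appropriate cut-off depends on the analysis.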
