YEAST

VOL. 7: 533-538 (1991)


Yeast Sequencing Reports

The Complete Sequence of a 11,953 bp Fragment from C1G on Chromosome III Encompasses Four New Open Reading Frames

MASSOUD RAMEZANI RAD, KATJA LUTZENKIRCHEN, GANG XU, ULRICH KLEINHANS AND CORNELIS P. HOLLENBERG

Institut für Mikrobiologie, Heinrich-Heine-Universität, Universitätsstr. 1, D-4000 Düsseldorf 1, Germany

Received 6 March 1991; revised 9 March 1991

The DNA sequence of an 11,953 bp segment of chromosome III encompassing part of C1G is reported.

Plasmid C1G was received from S. Oliver (UMIST). It is a YIp5 derivative containing a 22 kb BamHI fragment of the small ring chromosome III in the HIS4 region (Newlon et al., 1986). The right part of the C1G fragment, including the HIS4 gene, has been sequenced over 11.9 kb. Recombinant DNA methods followed standard protocols (Sambrook et al., 1989). The sequence was determined by dideoxy sequencing, using directed techniques. The C1G plasmid was digested with EcoRI and appropriate restriction fragments (cf. Figure 1) were subcloned into pUC derivatives. Progressive unidirectional deletions were constructed in both orientations from double-stranded DNA with exonuclease III (Henikoff, 1984). Nucleotide sequencing was performed using the dideoxy chain termination procedure (Sanger et al., 1977) with T7 DNA polymerase. Custom-synthesized oligonucleotide primers were used to sequence the gaps and junctions between the subfragments. Microgenie and GCG programs were used to assemble and edit the DNA sequence. The sequencing strategy is outlined in Figure 1. The sequence presented in Figure 2 was submitted to MIPS in July 1990.

SEQUENCE ANALYSIS

This sequence overlaps the previously published sequences of the HIS4 locus (Donahue et al., 1982) and the BIK1 locus (Trueheart et al., 1987). Table 1 shows the differences between the sequence reported here and the previously published ones over the 5.5 kb overlap. None of these differences changes the length of the open reading frames (ORFs). This suggests that at least part of these differences is due to polymorphism among the different strains used to construct yeast DNA libraries.

Table 1. Differences in the nucleotide sequence of the C1G segment where it overlaps the previously published BIK1 and HIS4 regions.

Position    Previous sequence    New sequence    Change in ORF

CC

20,21 965 992 1058 2533,2534

TT G T A CG

A C G GC

3501 3532 3581-83

C T TTT

G C CCC

4314 4756 5314 5336 5338 5477

T T C A G G

C A T G A C

T to A -

H to A R to A S to R T to H V to A F to L -

F to I P to L -

R to H Q to D
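The entries in the "Change in ORF" column follow from translating the affected codon before and after the substitution. The sketch below illustrates that step with the standard genetic code. The helper name `codon_change` and the example codons are hypothetical and chosen for illustration; they are not read from Table 1.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_change(codon, offset, new_base):
    """Describe the amino acid change caused by one base substitution.

    codon    -- the original codon (DNA alphabet, e.g. "TTT")
    offset   -- 0-based position of the substituted base within the codon
    new_base -- the base found in the new sequence
    Returns "X to Y" for a replacement, or "-" for a silent change.
    """
    codon = codon.upper()
    mutated = codon[:offset] + new_base.upper() + codon[offset + 1:]
    old_aa, new_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    return "-" if old_aa == new_aa else f"{old_aa} to {new_aa}"


# Hypothetical examples (not taken from Table 1):
print(codon_change("TTT", 0, "A"))  # first-position T -> A
print(codon_change("GGT", 2, "C"))  # third-position change, silent
```

A substitution that creates or removes a stop codon (shown as `*` in the table) would change the ORF length; the text states that none of the observed differences does so.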

Correspondence: Dr. M. Ramezani Rad, Institut für Mikrobiologie, G. 26. 12, Universitätsstr. 1, D-4000 Düsseldorf 1, Germany.

0749-503X/91/050533-06$05.00 © 1991 by John Wiley & Sons Ltd


[Figure 1: restriction map and sequencing strategy of the sequenced part of C1G. Labelled ORFs: BIK1, YCL181, HIS4, YCL187. Legible restriction sites and positions: EcoRI 1, 3094, 10045; EspI 3388; PvuII 7578; DraIII 7893; SpeI 8465; AvrII 8694; PstI 8844; BglII 9586; AflII 10504; BamHI.]

200 amino acids) downstream of the HIS4 region, YCL184, STE50 (YCL185), YCL186 and YCL187 (Figures 1 and 2). These ORFs show no significant similarities with other sequences in data banks. Sequences upstream and downstream of the ORFs were examined for the presence of basic transcription elements. STE50, YCL186 and YCL187 show putative TAT(TA)A boxes at positions 7334, 8686 and 11744, respectively. STE50 and YCL187 show the transcription termination signal TTTTATA downstream of the ORFs, at positions 6078 and 9434, respectively. We have not found any intron or potential splicing signal in this 12 kb fragment, which reflects the compact character of the yeast genome.

Between ORFs STE50 and YCL187 there are 2737 bp with a number of small direct and inverted repeats, a partial Alu-like sequence and one autonomously replicating sequence (ARS) consensus element. There are two oligo(dA-dT) tracts at positions 7941 and 9900. Such segments of natural DNA can adopt the Z conformation (Chamberlain et al., 1986). At positions 8376 to 8395 there is a 20 bp Alu-like sequence, similar to a common parental Alu sequence (Deininger and Slagel, 1988). The presence of direct and inverted repeat sequences and of a Z-DNA-like structure around the ARS hints at the occurrence of recombination events in this region (Slightom et al., 1980).
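A scan for the two signals named above can be sketched with regular expressions. This is an illustration only, not the procedure the authors used. Reading TAT(TA)A as "TAT, then T or A, then A" is an assumption, and the function name `find_motifs` and the toy sequence are hypothetical.

```python
import re

# Motifs named in the text; the regex for the TATA-like box is an assumed
# reading of "TAT(TA)A" as TAT followed by T or A, then A.
MOTIFS = {
    "TATA-like box": "TAT[TA]A",
    "termination signal": "TTTTATA",
}


def find_motifs(seq, motifs=MOTIFS):
    """Return sorted (position, name, match) tuples; positions are 1-based.

    A zero-width lookahead is used so that overlapping occurrences
    are all reported.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in motifs.items():
        for m in re.finditer(f"(?=({pattern}))", seq):
            hits.append((m.start() + 1, name, m.group(1)))
    return sorted(hits)


# Toy sequence for illustration (not part of the C1G sequence):
for hit in find_motifs("GGTATAAACCTTTTATAGG"):
    print(hit)
```

The sketch scans one strand only. For ORFs encoded on the opposite strand, the same scan would be run on the reverse complement and the positions converted back.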

100 200 300 400 500 TGCATGGTTTCCTTGAGAAAAATGAGACTCAGCCTCTGAGATTAACTTATCCGTATC~TTCAGATCTTTGCTATACGTTTGTATCGCTATATGTACGT 600

ACACCAAAAACGTTTGGACGAGACAGGCATCAAAGGACAAGGTAAAAGGCGTTGAGCTGTGGCTGGCTGTGTATGCGTlTGAAATACC~GATAGATAT 800 CAAAGAAAGATAGGATGTTTCATACAAATCCCAAATTTGGGGCGCGGACAACTGAAATACGTGGGTCCAGTGGACACGAAAGCTGGAATGTTTGCTGGTG TAGACTTACTTGCCAACATTGGTAAGAACGATGGATCATTCATGGGGAAGAAGTATTTTCAAACAGAGTATCCTCAAAGTGGACTATTTATCCAGTTGCA AAAAGTCGCATCATTGATCGAGAAGGCATCGATATCGCAAACCTCGAGAAGAACGACGATGGAACCGCTATCAATACCCAAAAACAGATCTATTGTGAGG CTCACTAACCAGTTCTCTCCCATGGATGATCCTAAATCCCCCACACCCATGAGAAGTTTCCGGATCACCAGTCGGCACAGCGGTAATCAACAGTCGATGG ACCAGGAGGCATCGGATCACCATCAACAGCAAGAATTTGGTTACGATAACAGAGAAGACAGAATGGAGGTCGACTCTATCCTGTCATCAGACAGAAAGGC TAATCACAACACCACCAGCGATTGGAAACCGGACAATGGCCACATGAATGACCTCAATAGCAGCGAAGTTACAATTGAATTACGAGAAGCCCAATTGACC ATCGAAAAGCTACAAAGGAAACAACTACACTACAAAAGGCTACTCGATGACCAAAGAATGGTCCTCGAAGAAGTGCAACCGACTTTTGATAGGTATGAAG CCACAATACAAGAAAGAGAGAAAGAGATAGACCATCTCAAGCAACAATTGGAGCTCGAACGCAGACAGCAAGCCAAACAAAAGCAGTTTTTTGACGCTGA GAATGAACAGCTACTTGCTGTCGTAAGCCAACTACACGAAGAGATCAAAGAAAACGAAGAGAGAAATCTTTCTCATAATCAACCCACTGGTGCCAACGAA GATGTCGAACTCCTGAAAAAACAGCTGGAACAATTACGCAACATAGAAGACCAATTTGAGTTACACAAGACAAAGTGGGCTAAAGAACGCGAACAATTGA AAATGCATAACGATTCGCTCAGTAAAGAATACCAAAATTTGAGCAAGGAACTATTTTTGACAAAACCACAAGATTCCTCATCGGAAGAGGTGGCATCCTT

AACGAAAAAACTTGAAGAGGCTAATG~AAAATCAAACAGTTGGAACAGGCTCAAGCACAAACAGCCGTGGAATCGTTGCCAATTTTCGACCCCCCTGCA CCAGTCGATACCACGGCAGGAAGACAACAGTGGTGTGAGCATTGCGATACGATGGGTCATAATACAGCAGAATGCCCCCATCACAATCCTGACAACCAGC AGTTCTTCTAGGCAGTCGAACTGACTCTAATAGTGACTCCGGTAAATTAGTTAATTAATTGCTAAACCCATGCACAGTGACTCACGTTTTTTTATCAGTC

900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100 2200

ATTCGATATAGAAGGTAAGAAAAGGATATGACTATGAACAGTAGTATACTGTGTATATAATAGATATGGAACGTTATATTCACCTCCGATGTGTGTTGTA 2300

HIS4 (YCLltU) => CATACATAAAAATATCATAGCACAACTGCGCTGTGTAATAGTAATACAATAGTTTACAAAATTTTTTTTCTGAATA~GTTTTGCCGATTCTACCGTTA 2400 2500 ATTGATGATCTGGCCTCATGGAATAGTAAGAAGGAATACGTTTCACTTGTTGGTCAGGTACTTTTGGATGGCTCGAGCCTGAGTAATGAAGAGATTCTCC 2600 2700 2800 2900 3000 AAAGGCCATCGATTTGGGTCGTGGCGTTTATTATTCTC~TTCTAGGAATGAAATCTGGATCAAGGGTGAAACTTCTGGCAATGGCCAAAAGCTTTTACAA 3100 ATCTCTACTGACTGTGATTCGGATGCCTTAAAGTTTATCGTTGAACAAGAAAACGTTGGATTTTGCCACTTGGAGACCATGTCTTGCTTTGGTGAATTCA AGCATGGTTTGGTGGGGCTAGAATCTTTACTAAAACAAAGGCTACAGGACGCTCCAGAGGAATCTTATACTAGAAGACTATTCAACGACTCTGCATTGTT 3200 3300 AGATGCCAAGATCAAGGAAGAAGCTGAAGAACTGACTGAGGCAAAGGGTAAGAAGGAGCTTTCTTGGGAGGCTGCCGATTTGTTCTACTTTGCACTGGCC 3400 AAATTAGTGGCCAACGATGTTTCATTGAAGGACGTCGAGAATAATCTGAATATGAAGCATCTGAAGGTTACAAGACGGAAAGGTGATGCTAAGCCAAAGT 3500 TTGTTGGACAACCAAAGGCTGAAGAAGAAAAACTGACCGGTCCAATTCACTTGGACGTGGTGAAGGCTTCCGACAAAGTTGGTGTGCAGAAGGCTTTGAG GAGACCAATCCAAAAGACTTCTG~TTATGCATTTAGTCAATCCGATCATCGAAAATGTTAGAGACAAAGGTAACTCTGCCCTTTTGGAGTACACAGAA 3600 3700 AAGTTTGATGGTGTAAAATTATCCAATCCTGTTCTTAATGCTCCATTCCCAGAAGAATACTTTGAAGGTTTAACCGAGGAAATGAAGGAAGCTTTGGACC 3800 TTTCAATTGAAAACGTCCGCAAATTCCATGCTGCTCAATTGCCAACAGAGACTCTTGAACTTGAAACCCAACCTGGTGTCTTGTGTTCCAGATTCCCTCG 3900 TCCTATTGAAAAAGTTGGTTTGTATATCCCTGGTGGCACTGCCATTTTACCAAGTACTGCATTAATGCTTGGTGTTCCAGCACAAGTTGCCCAATGTAAG 4000 GAGATTGTGTTTGCATCTCCACCAAGAAAATCTGATGGTAAAGTTTCACCCGAAGTTGTTTATGTCGCAGAAAAAGTTGGCGCTTCCAAGATTGTTCTAG 4100 CTGGTGGTGCCCAAGCCGTTGCTGCTATGGCTTACGGGACAGAAACTATTCCTAAAGTGGATAAGATCTTGGGTCCAGGTAATCAATTTGTGACTGCCGC CAAAATGTATGTTCAAAATGACACTCAAGCTCTATGTTCCATTGATATGCCAGCTGGCCCAAGTGAAGTTTTGGTTATTGCCGATGAAGATGCCGATGTG 4200 4300 GATTTTGTTGCAAGTGATTTGCTATCGCAAGCTGAACACGGTATTGACTCCCAAGTTATCCTTGTTGGTGTTAACTTGAGCGAAAAGAAAATTCAAGAGA 4400 TTCAAGATGCTGTCCACAATCAAGCTTTACAACTGCCACGTGTGGATATTGTTCGTAAATGTATTGCTCACAGTACGATCGTTCTTTGTGACGGTTACGA 4500 AGAAGCCCTTGAAATGTCCAACCAATATGCACCAGAACATTTGATTCTACAAATCGCCAATGCTAACGATTATGTTAAATTGGTTGACAATGCAGGGTCC 
GTATTTGTGGGTGCTTACACTCCAGAATCGTGCGGTGACTATTCAAGTGGTACTAACCATACATTACCAACCTATGGTTACGCTAGGCAGTACAGTGGTG 4600 CCAACACTGCAACCTTCCAAAAGTTTATCACTGCCCAAAACATTACCCCTGAAGGTTTAGAAAACATCGGTAGAGCTGTTATGTGCGTTGCCAAGAAGGA 4700 G G G T C T AG A C GGT C A C A GA A A C GC T GT GA A A A T C A GA A T GA G T AAGC T T GGGT T GAT C C C AAAGGAT T T C C AG~ ~ T T AT T T C T AAC T T GGAAAC C GAA 4800

AGTTCTCCAAAGAGGAAGAAGTTCCATTGGTGGCTTTGTCCTTGCCAAGTGGTAAATTCAGCGATGATGAAATCATTGCCTTCTTGAACAACGGAGTTTC TTCTCTGTTCATTGCTAGCCAAGATGCTAAAACAGCCGAACACTTGGTTGAACAATTGAATGTACCAAAGGAGCGTGTTGTTGTGGAAGAGAACGGTGTT TTCTCCAATCAATTCATGGTAAAACAAAAATTCTCGCAAGATAAAATTGTGTCCATAAAGAAATTAAGCAAGGATATGTTGACCAAAGAAGTGCTTGGTG AAGTACGTACAGACCGTCCTGACGGTTTATATACCACCCTAGTTGTCGACCAATATGAGCGTTGTCTAGGGTTGGTGTATTCTTCGAAGAAATCTATAGC

CACTAACGAAAATAATATGTATATATACATATATATATCAAACAAAATACAGTCTTCAATGMTAGAGATACACTATGTAATGAATGGTAACGTAAAAAT TGTAATTTTGGATTAAAAGAGAGGTAGTAGCAAGAGTGGGTATCAAATAGCGATTAATAAATGAATATCCTTATTGTCATCACTTCGAACGACGTAAGTT AAATTGGAAAATTTTTCACTTTTTGGTCACCTAAGAAATAAGCAGGAAAAAGAAGAGAACATTGAGAGGATGGTAAAGCAAGAGGCATTTAGGGCGAACG

4900 5000 5100

YCL184 => AACTAGGTAACACATAACAACCTCAGATAGACTGTTACGGG~CCGTATTGAAGACATTAGCGCCATGAAGAACGGGTTTATAGTGGTGCCGTTCAAATT 5200

ACCGGATCACAAGGCACTACCCAAAAGCCAGGAAGCTTCGTTGCATTTCATGTTTGCTAA~GACACCAGAGTTCAAATTCCAACGAGTCTGACTGTTTG TTTTTGGTCAACCTTCCATTATTATCTAACATAGAGCACATGAAGAAATTTGTCGGGCAGCTCTGTGGGAAATACGATACAGTATCGCATGTAGAGGAAC TACTATATAACGATGAATTTGGATTACATGAAGTAGATTTATCGGCATTGACCTCCGATCTGATGTCCTCCACTGACGTCAACGAGAAGAGATACACACC AAGAAACACGGCGCTATTAAAATTTGTTGATGCTGCAAGTATAAATAACTGCTGGAATGCTTTGAAAAAATACTCGAATTTGCATGCCAAACATCCAAAT GAACTATTTGAATGGACATATACGACTCCATCATTCACAACTTTTGTTAACTTCTACAAACCACTGGATATTGATTATTTGAAAGAAGATATTCATACAC ATATGGCAATTTTTGAACAGCGTGAAGCTCAAGCACAAGAGGATGTTCAAAGTTCTATAGTGGATGAGGATGGATTCACATTAGTTGTAGGAAAGAACAC CAAGTCATTGAATTCCATAAGAAAGAAAATATTAAACAAAAATCCATTATCCAAACATGAAAATAAGGCCAAGCCAATTTCAAATATAGATAAAAAGGCA AAGAAAGATTTCTATAGATTTCAGGTCAGAGAACGTAAAAAACAGGAGATCAATCAACTGTTAAGTAAATTTAAGGAAGATCAAGAAAGAATCAAGGTAA

5300 5400 5500 5600 5700 5800 5900 6000

Figure 2. Nucleotide sequence of the 11.9 kb fragment. The sequence of the strand is in the 5' to 3' orientation from BIK1-HIS4 towards the telomere. The seven major ORFs are shown from their first AUG (=>) to the first encountered stop codon (bold and underlined). Sequences matching the ARS core consensus are (+) underlined and two oligo(dA-dT) tracts are marked with asterisks.
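The ORF convention used in the figure (first AUG to first in-frame stop codon) can be sketched as a six-frame scan. This is a minimal illustration, not the authors' software (they used Microgenie and GCG). The function names and the default threshold `min_codons=200`, which echoes the 200 amino acid figure in the text, are assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=200):
    """Find ORFs running from the first ATG to the first in-frame stop.

    Returns (strand, start, end, codons) tuples. start and end are 1-based
    coordinates on the input strand and include the stop codon; codons
    counts the coding codons, excluding the stop.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the first ATG since the last stop
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    codons = (i - start) // 3
                    if codons >= min_codons:
                        lo, hi = start + 1, i + 3
                        if strand == "-":
                            # map back to coordinates on the input strand
                            lo, hi = n - hi + 1, n - lo + 1
                        orfs.append((strand, lo, hi, codons))
                    start = None
    return orfs


# Toy sequence for illustration; a low threshold is used so the short ORF shows.
print(find_orfs("CCATGAAATTTTAGGG", min_codons=3))
```

An ORF that is still open at the end of the sequence is not reported by this sketch, which matters for fragments that cut through a gene.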

TGAAAGCTAAGAGAAAATTCAATCCATACACT~CCACCATCCTCGTCATCATATTCAACATATCTCAATATATCTATAAAACTAATAATATAATATA

6100

ACATAATATAGTATAATAGTATAAGTTTAAAGGCATCAATAATTTTTTTCTTTCATATATATTCTGCTAGCATACATGTTATCCTATTCGTGTAAGTTTA GTATGATGATGTGCATGACAACTGCACATTAGAGTCTTCCACCGGGGGTGACATTGTCACTTCCGTTCATCATTGCTACTTCTTCGAAATCACCTCTTCT

6200 6300

6400 6500 6600 6700 6800 TCGGAGAAGAGCTTGAGCTGGTCTTCATTACGTCCAAGACATCCATCCTCAGCCTTGTGTACTGCGATTGAAATTCTTGCAAlTTCGCAGATGTAGTAGT 6900 7000 GTACAAGTTTTTCAGTACCGTTATCATGTCCTCTTGAGTCTTGTCGTCCTTCCACTCCAACTTGCTGTCTCTCATCTTATTGATCAGTATCTTGAATTTT ATGGCCTTATTCAAATCACCGTCACACAAGTCCTGGCAATCTTGCAAGCACAATTCCGGCAAAAGATCTCCTACAATATCATTTTCTCGCAGTCTCTGAC 7100 7200 ATAATGGATCGGTTTCTTCCACCTCCAGCGTGGATATACACCAAGTTATCACATCATCAACCGACCACTGGGAAAAGTCTTCATTATTCATCAATATTGT GCCATTCACGTCCAGATCCGGCGAAGCATCGTTTGATCCCTCATTGATGGCCTGTTTACCGTCCTC~CTGATTTGCTATCTCTGCTAGTACGACCTCT 7300
