
Biochimica et Biophysica Acta, 1132 (1992) 195-198 © 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$05.00

BBAEXP 90391

Short Sequence-Paper

Yeast ribosomal proteins: XIV. Complete nucleotide sequences of the two genes encoding Saccharomyces cerevisiae YL16

Tetsuo Hashimoto, Katsuyuki Suzuki, Keiko Mizuta and Eiko Otaka

Department of Biochemistry and Biophysics, Research Institute for Nuclear Medicine and Biology, Hiroshima University, Hiroshima (Japan)

(Received 4 June 1992)

Key words: Ribosomal protein gene; Sequence analysis; (Yeast)

We isolated and sequenced YL16A and YL16B encoding ribosomal protein YL16 of Saccharomyces cerevisiae. The two nucleotide sequences within coding regions retain 91.1% identity, and their predicted sequences of 176 amino acids show 93.8% identity. Out of the ribosomal protein sequences from various organisms currently available, no counterpart to YL16 could be found.

The ribosomal proteins (r-proteins) are well suited to evolutionary studies because they provide an opportunity to examine both the evolution of individual protein species and the co-evolution of the related proteins within the supermolecule [1]. In order to understand thoroughly the mechanism of biological evolution as revealed by the r-proteins, we have reported the primary structures of r-proteins, especially from yeasts [2-9]. Here, we report the primary structures of yeast r-protein YL16 (L17 in the nomenclature of Michel et al.; see the nomenclature table in Ref. 10) as predicted from the nucleotide sequences of genes YL16A and YL16B. When the oligonucleotide mixture synthesized as described in the legend of Fig. 1 was hybridized with yeast genomic DNA, two bands were found (Fig. 1A), most likely indicating the existence of two copies of the gene for YL16 in the haploid genome. To clone the YL16 gene, a YEp24-based yeast genomic DNA library constructed by Carlson and Botstein [11], kindly provided by Prof. A. Toh-e, was propagated in Escherichia

Correspondence to: T. Hashimoto (present address), The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan. The nucleotide sequence data reported in this paper appear in the DDBJ, EMBL and GenBank Nucleotide Sequence Databases under the accession numbers D10225 and D10226.

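[Fig. 1 image: Southern blots, panels A and B; lanes labeled by restriction enzyme (E, B, H); size markers at 23.1, 9.4, 6.6, 4.4, 2.3 and 2.0 kb.]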

Fig. 1. Southern blot analyses of chromosomal DNA from Saccharomyces cerevisiae. Chromosomal DNA of S. cerevisiae strain A364A (ATCC22244) was isolated as described by Cryer et al. [20] with slight modification. 5 µg of chromosomal DNA was digested with the restriction endonucleases indicated. Southern blotting and labeling of DNAs were performed as previously described [7-9]. E: EcoRI, B: BamHI, H: HindIII. HindIII-digested phage λ DNA was used as a size marker. (A) A synthetic oligonucleotide DNA mixture was used as a probe. Residue positions 15 to 24 of the known N-terminal amino acid sequence of ribosomal protein YL16 [4], AAPKKTRKAV, were chosen to obtain oligonucleotides whose sequences had been designed according to the codon usage preferred by yeasts [7,16]. The mixture of the 16 differently designed DNAs, each 29 nucleotides in length, was synthesized on an Applied Biosystems model 381A DNA Synthesizer. (B) A 1.05 kb fragment including a part of the coding and 3' flanking regions of the YL16A gene was used as a probe.
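The probe design described in the legend of Fig. 1 restricts a very large space of possible coding sequences to a mixture of only 16 oligonucleotides. The following minimal sketch counts how many distinct DNA sequences could encode AAPKKTRKAV under the standard genetic code, to show how strongly the codon-usage restriction narrows the mixture. The function name and table are illustrative and are not taken from the paper; the actual codon choices made by the authors are not given in the text.

```python
# Number of synonymous codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_count(peptide: str) -> int:
    """Return how many distinct coding sequences can encode the peptide."""
    total = 1
    for residue in peptide:
        total *= CODON_COUNT[residue]
    return total


PEPTIDE = "AAPKKTRKAV"  # residues 15 to 24 of YL16, from the Fig. 1 legend

full_length = 3 * len(PEPTIDE)            # nucleotides needed for 10 codons
all_sequences = backtranslation_count(PEPTIDE)

print(f"full coding length: {full_length} nt")
print(f"possible coding sequences: {all_sequences}")
print(f"reduction to a 16-member mixture: {all_sequences // 16}-fold")
```

Ten codons span 30 nucleotides, one more than the 29-nucleotide probes. This is consistent with omitting the final, most degenerate base, although that reading is an assumption and the source does not state it.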

coli strain JM109 and screened with the same oligonucleotide mixture by colony hybridization. A positive colony was isolated, and the target gene was termed YL16A. Sequence analysis showed, however, that the clone did not contain the first exon of YL16A. A 1.05 kb KpnI-PstI fragment of the clone, which includes a part of the coding region and the 3' flanking region of YL16A, was prepared. Genomic Southern analysis using this fragment as a probe revealed two clear bands (Fig. 1B). The strongly hybridizing band most likely corresponds to YL16A, and the other band to another copy of the YL16 gene, termed YL16B. In order to obtain YL16A in sufficient length, genomic DNA digested with EcoRI and BamHI was subjected to agarose gel electrophoresis. From the separated DNA fragments the appropriate fraction, about 5-6 kb, was extracted and ligated with the vector pUC118 digested with EcoRI and BamHI. The resultant genomic DNA library was propagated in E. coli JM105 and screened using the above 1.05 kb probe; two positive colonies out of 1200 transformants were obtained. A 2.7 kb PstI fragment including YL16A from one of them was subcloned into the PstI site of pUC119 to give pYL16A. To isolate YL16B, the appropriate fraction of EcoRI-digested DNA, 3.5-4.5 kb, was extracted and ligated


with the vector pUC118 digested with EcoRI. The resultant genomic library was screened using a 0.4 kb fragment, a part of the coding region of YL16A. One positive colony obtained from 1100 transformants finally yielded pYL16B. Both strands of YL16A and YL16B were sequenced following the strategy shown in Fig. 2. The open reading frame of YL16A is interrupted by an intron of 416 bp and encodes protein YL16A, predicted to consist of 176 amino acids (Fig. 3). The 5' flanking sequence contains a UASrpg-like sequence, which exists upstream of almost all r-protein genes in Saccharomyces cerevisiae, at nucleotide positions -354 to -340, and a T-rich region from -325 to -296. A polyadenylation signal, AATAA, is found 27 bases downstream of the stop codon. As in YL16A, the open reading frame of YL16B is interrupted by an intron, here of 384 bp, and encodes protein YL16B of 176 amino acids (Fig. 4). The UASrpg-like sequence and the T-rich region are found at positions -456 to -442 and -430 to -391, respectively. The polyadenylation signal is found 93 bases downstream of the stop codon. Comparison between the coding regions of the two genes shows great similarity (91.1% identity). Only 47 base substitutions, including 32 synonymous ones, were observed. On


Fig. 2. Restriction map and sequencing strategy. The arrows show the extent of nucleotide sequencing and the solid bar represents the coding region. (A) The 2.7 kb insert of pYL16A; (B) the 3.6 kb insert of pYL16B.

[Fig. 3 sequence listing: nucleotide sequence of the YL16A gene from about position -500 to beyond +1000, with the predicted amino acid sequence of YL16A shown beneath the two exons.]

Fig. 3. Nucleotide sequence of the YL16A gene and predicted amino acid sequence of yeast ribosomal protein YL16A. DNA sequencing was performed as previously described [7-9]. The nucleotides are numbered with respect to the translation initiation codon, designated +1. Different amino acid positions between YL16A and YL16B are underlined, and the numbers from the first methionine residue are shown. A candidate for a potential UASrpg and a consensus polyadenylation signal are underlined. Arrows indicate the deduced splice sites. Consensus sequences for introns of Saccharomyces cerevisiae are underlined with wavy lines. Several restriction sites are shown as reference to the restriction map presented in Fig. 2.
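The identity figures quoted in the text (91.1% for the coding regions, 93.8% for the predicted proteins) can be checked from the reported counts alone. The sketch below is a worked arithmetic example, not the authors' procedure. It assumes that the two coding regions align without gaps, which is consistent with both proteins having 176 residues, and it shows the nucleotide identity both with and without the stop codon, since the text does not say which length was used.

```python
def percent_identity(differences: int, length: int) -> float:
    """Percent identity for an ungapped alignment of the given length."""
    return 100.0 * (length - differences) / length


AA_LENGTH = 176          # residues in each predicted protein
AA_DIFFERENCES = 11      # 176 - 165 identical positions
NT_SUBSTITUTIONS = 47    # base substitutions in the coding region
SYNONYMOUS = 32          # substitutions that leave the amino acid unchanged

# Coding-region length; whether the stop codon was counted is an assumption.
CODING_NT_NO_STOP = 3 * AA_LENGTH
CODING_NT_WITH_STOP = 3 * (AA_LENGTH + 1)

aa_identity = percent_identity(AA_DIFFERENCES, AA_LENGTH)
nt_identity_no_stop = percent_identity(NT_SUBSTITUTIONS, CODING_NT_NO_STOP)
nt_identity_with_stop = percent_identity(NT_SUBSTITUTIONS, CODING_NT_WITH_STOP)
nonsynonymous = NT_SUBSTITUTIONS - SYNONYMOUS

print(f"amino acid identity: {aa_identity:.2f}%")
print(f"nucleotide identity (528 nt): {nt_identity_no_stop:.2f}%")
print(f"nucleotide identity (531 nt): {nt_identity_with_stop:.2f}%")
print(f"nonsynonymous substitutions: {nonsynonymous}")
```

Either coding length rounds to the 91.1% reported. The 15 nonsynonymous substitutions exceed the 11 amino acid differences, which suggests that some codons carry more than one substitution.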

the other hand, the introns and the 5' and 3' flanking regions differ significantly. It is clear that after the duplication of the YL16 gene the noncoding regions have diverged considerably, though the coding region has been tightly conserved. The amino acid sequences predicted from the two copies of the gene are identical at 165 out of 176 amino acid positions (93.8% identity). In the duplicated yeast r-proteins previously reported, relatively large amino acid differences between the two copies were found in L30 (5 residues) [12] and L4 (7 residues) [13,14], except for 'A' protein, which has more than two copies [1]. Thus, the 11 differences out of 176 residues seen between YL16A and YL16B are considerable. The molecular weights calculated from the predicted sequences are 19 830 for YL16A and 19 870 for YL16B when the first methionine residue, which is not found in the mature protein [4], is excluded. These values are close to that of the purified YL16 protein previously estimated by SDS-polyacrylamide gel electrophoresis [15]. Except for a tryptophan residue not determined in the previous analysis, the sequence at amino acid positions 2 to 25 in YL16B is identical to the known N-terminal sequence used for designing the synthetic probe [4]. In YL16A, on the other hand, four residues differ from the known 24 amino acid sequence. The amino acid compositions of both predicted sequences are in good agreement with the previously determined one [2] (data not shown). The biased codon usage of YL16A and YL16B resembles that of highly expressed genes in yeasts [7,16] (data not shown). To examine YL16 gene expression, an oligonucleotide probe corresponding to a region identical in the coding sequences of the two genes was synthesized. When poly(A)+ RNA was probed with it, two bands of nearly equal intensity, 0.74 kb and 1.00 kb in length, were detected (lane 1 in Fig. 5). This indicates balanced expression of the two genes, whose transcripts differ in length. When a synthetic oligonucleotide probe specific to the YL16B gene was used, a single band of 1.00 kb was detected (lane 2 in Fig. 5). These results suggest that the transcripts of YL16A and YL16B are about 0.74 kb and 1.00 kb in length, respectively. Although the poly(A) tail length of each transcript is not known, these sizes are reasonable in view of the DNA sequencing results.

[Fig. 4 sequence listing: nucleotide sequence of the YL16B gene from about position -500 to beyond +1000, with the predicted amino acid sequence of YL16B shown beneath the two exons.]

Fig. 4. Nucleotide sequence of the YL16B gene and predicted amino acid sequence of yeast ribosomal protein YL16B. For details, see the legend of Fig. 3.

[Fig. 5 image: Northern blot, lanes 1 and 2; bands marked at 1.00 kb and 0.74 kb.]

Fig. 5. Northern blot hybridization of poly(A)+ RNA from S. cerevisiae. Total RNAs from S. cerevisiae, strain A364A, were isolated according to Jensen et al. [21], and poly(A)+ RNA was purified with oligo(dT)-cellulose (Pharmacia LKB Biotechnology, Inc.) as recommended by the supplier. Northern blotting and labeling of probes were performed as previously described [7-9]. Poly(A)+ RNA (0.42 µg) was analyzed using oligonucleotide probes synthesized on an Applied Biosystems model 391 PCR-MATE EP DNA synthesizer. Lane 1: 5'-CTGGAACTAGAGAGGCACGCAACTTTTG-3', corresponding to positions +498 to +525 and +466 to +493 of the YL16A and YL16B genes, respectively; lane 2: 5'-CAGATCAACGCATTCTCTATCAATTCTCC-3', corresponding to positions +964 to +992 of the YL16B gene.
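The Northern probes listed in the legend of Fig. 5 must be complementary to the mRNA, so each oligonucleotide is the reverse complement of the sense strand over the stated coordinates. The following minimal sketch reverse-complements both probes and checks that each probe length matches the span of its coordinates. The dictionary layout and helper name are illustrative only; the sequences and positions are those given in the legend.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Probe sequence (5' to 3') with the first and last gene positions it covers.
PROBES = {
    "lane 1 (YL16A coordinates)": ("CTGGAACTAGAGAGGCACGCAACTTTTG", 498, 525),
    "lane 1 (YL16B coordinates)": ("CTGGAACTAGAGAGGCACGCAACTTTTG", 466, 493),
    "lane 2 (YL16B coordinates)": ("CAGATCAACGCATTCTCTATCAATTCTCC", 964, 992),
}

for label, (oligo, first, last) in PROBES.items():
    span = last - first + 1
    sense = reverse_complement(oligo)
    status = "ok" if span == len(oligo) else "length mismatch"
    print(f"{label}: {len(oligo)} nt, span {span} nt ({status})")
    print(f"  sense strand 5'-{sense}-3'")
```

The sense-strand segment recovered for the lane 1 probe lies within the coding region shared by both genes, while the lane 2 segment falls downstream of the YL16B stop codon. This agrees with the two-band and single-band patterns described in the text.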

A sequence similarity search based on the algorithm of Lipman and Pearson [17] found no counterpart of YL16 among the prokaryotic, metabacterial (so-called archaebacterial) or eukaryotic r-proteins whose structures are known to date. As the primary structures of all the main r-proteins in E. coli have already been determined, YL16 may be regarded as one of the 'excess' proteins not found in prokaryotic ribosomes, which comprise fewer proteins than eukaryotic ones. The metabacteria have already been shown to be phylogenetically closely related to eukaryotes [18,19]. In terms of evolutionary study, it is therefore of interest whether counterparts of YL16 are to be found in metabacterial ribosomes.

We thank A. Tokui, C. Oda and K. Fukushima for technical assistance.

References

1 Otaka, E., Suzuki, K. and Hashimoto, T. (1990) Protein Seq. Data Anal. 3, 11-19.
2 Otaka, E., Higo, K. and Osawa, S. (1982) Biochemistry 21, 4545-4550.
3 Otaka, E., Higo, K. and Itoh, T. (1983) Mol. Gen. Genet. 191, 519-524.
4 Otaka, E., Higo, K. and Itoh, T. (1984) Mol. Gen. Genet. 195, 544-546.
5 Itoh, T., Otaka, E. and Matsui, K.A. (1985) Biochemistry 24, 7418-7423.
6 Suzuki, K. and Otaka, E. (1988) Nucleic Acids Res. 16, 6223.
7 Suzuki, K., Hashimoto, T. and Otaka, E. (1990) Curr. Genet. 17, 185-190.
8 Mizuta, K., Hashimoto, T., Suzuki, K. and Otaka, E. (1991) Nucleic Acids Res. 19, 2603-2608.
9 Mizuta, K., Hashimoto, T. and Otaka, E. (1992) Nucleic Acids Res. 20, 1011-1016.
10 Michel, S., Traut, R.R. and Lee, J.C. (1983) Mol. Gen. Genet. 191, 251-256.
11 Carlson, M. and Botstein, D. (1982) Cell 28, 145-154.
12 Baronas-Lowell, D.M. and Warner, J.R. (1990) Mol. Cell. Biol. 10, 5235-5243.
13 Arevalo, S.G. and Warner, J.R. (1990) Nucleic Acids Res. 18, 1447-1449.
14 Yon, J., Giallongo, A. and Fried, M. (1991) Mol. Gen. Genet. 227, 72-80.
15 Otaka, E. and Kobata, K. (1978) Mol. Gen. Genet. 162, 259-268.
16 Warner, J.R. (1989) Microbiol. Rev. 53, 256-271.
17 Lipman, D.J. and Pearson, W.R. (1985) Science 227, 1435-1441.
18 Iwabe, N., Kuma, K., Hasegawa, M., Osawa, S. and Miyata, T. (1989) Proc. Natl. Acad. Sci. USA 86, 9355-9359.
19 Woese, C.R., Kandler, O. and Wheelis, M.L. (1990) Proc. Natl. Acad. Sci. USA 87, 4576-4579.
20 Cryer, D.R., Eccleshall, R. and Marmur, J. (1978) Methods Cell Biol. 20, 39-44.
21 Jensen, R., Sprague, Jr. G.F. and Herskowitz, I. (1983) Proc. Natl. Acad. Sci. USA 80, 3035-3039.
