YEAST VOL. 8: 681-687 (1992)

Yeast Sequencing Reports
Nucleotide Sequence of D10B, a BamHI Fragment on the Small-ring Chromosome III of Saccharomyces cerevisiae

ELS DEFOOR, REGINE DEBRABANDERE, BRUNHILDE KEYERS, MARLEEN VOET AND GUIDO VOLCKAERT*

University of Leuven, Laboratory of Gene Technology, Willem De Croylaan 42, B-3001 Leuven, Belgium

Received 21 April 1992; accepted 30 April 1992

D10B is a 5.3 kb BamHI fragment located between the HML and HIS4 loci on the small-ring chromosome III of the yeast Saccharomyces cerevisiae, at about 15 kb from HML and about 35 kb from HIS4 (Newlon et al., 1991). In this region of chromosome III, few genes have been found and characterized so far (Mortimer et al., 1989). We have determined the complete sequence of 5337 base pairs of this fragment by a combination of chemical and enzymatic sequencing methods. The D10B sequence was rather A+T rich, with an overall G+C content of 33.5%, and several stretches of extremely high A+T content were found as well. Four open reading frames (ORFs) were detected, two of which were completely confined to the D10B fragment. One of the latter appeared to be virtually unclonable in Escherichia coli when located on a smaller DNA fragment and is probably responsible for the growth-inhibiting effect of D10B on E. coli. Comparison of the ORF regions with nucleic acid and protein databases did not reveal any significant homologies to catalogued sequences of any source. A large (2.5 kb) non-coding region was present in the central part of D10B. Although D10B subfragments displaying autonomous replicating sequence (ARS) activity were located here (ARS304; Newlon et al., 1991), no perfect match to the core ARS consensus was found and imperfect matches were scarce as well.

*Corresponding author.

0749-503X/92/080681-07 $08.50 © 1992 by John Wiley & Sons Ltd

SEQUENCE ANALYSIS

D10B was isolated from a chromosome III library in the vector YIp5, hereafter designated YIp5::D10B (Newlon et al., 1991). Initial observations on this clone (communicated to us by the DNA coordinator of the Yeast Chromosome III Sequencing Project in the framework of the European Biotechnology Action Programme) were an apparent lack of commonly used restriction recognition sites in this fragment, except for the presence of two PstI sites, and a deleterious effect of the plasmid on growth of the E. coli clone. These properties could hinder conventional subcloning approaches to sequencing. Therefore two strategies were adopted: (1) to avoid direct subcloning of D10B, the YIp5::D10B plasmid was adapted for insertion into pGV451 (Volckaert, 1987) under positive selection (see below) and re-configured for deletion subcloning. Subclones were then sequenced by chemical degradation (Maxam and Gilbert, 1977; Volckaert, 1987) and dideoxynucleotide termination approaches (Sanger et al., 1977); (2) 'walking-primer' dideoxy-sequencing on double-stranded DNA of either YIp5::D10B or derivatives generated by the former strategy. In addition, an extensive range of restriction enzymes was tested and mapped. A unique SacI site was found at position 943.

In the first strategy (Figure 1), an NdeI fragment from the YIp5::D10B vector was deleted (yielding YIp5::D10BV). Subsequently, YIp5::D10BVB was produced by deletion of the DNA between the unique sites SacI (D10BV) and EcoRV (YIp5). Thus, 0.94 kb of D10B DNA and one of the BamHI sites were removed. This facilitated the introduction of YIp5::D10BVB into the sequencing vector pGV451 (Volckaert, 1987) by ligating their unique BamHI sites and transforming E. coli strain JM83 under positive selection with ampicillin and chloramphenicol. A single clone was isolated and named D10Bp451. Finally, deletion of a ClaI-SphI fragment resulted in


Figure 1 Strategy for systematic deletion subcloning of D10B. YIp5 vector sequences are shown as thick lines, D10B as a filled bar and pGV451 as a shaded bar. See text for details. The route to unidirectional deletions starting at SmaI in D10Bd451 is pictured. For deletion subcloning in the other direction, SalI was used instead of SmaI in the step preceding partial digestion with MseI. In that case, the D10B sequence was left at the other side of the vector.

D10Bd451, which contains 4.4 kb of D10B DNA in a suitable configuration for stepwise deletion and sequencing in both orientations. An overlapping set of deletion derivatives was constructed by a first cleavage of D10Bd451 with either SmaI or SalI to linearize the DNA, followed by limited digestion with MseI. We had observed previously that MseI cleaves D10B DNA into fragments smaller than about 400 bp. This is to be expected, since yeast DNA is rather A+T rich and MseI recognizes TTAA as its target for cleavage.

Chemical sequencing of the subclones was done as described previously (Volckaert, 1987). The chemical degradation reactions were adapted for manipulation on the Biomek 1000 Automated Laboratory Workstation (Beckman Instruments, Fullerton, California), using the Micronic System Tube Holder. In addition, nucleotide sequences were analysed by the dideoxy chain termination method (Sanger et al., 1977) on double-stranded DNA of the subclones, using T7 polymerase, [α-35S]dATPαS


Figure 2 Functional map of the D10B BamHI fragment. The location and polarity of the open reading frames are indicated. (YCL431-434 have recently been renamed YCL54w, 55w, 56c and 57w by Oliver et al., Nature, in press.) The unique SacI site is marked by an upright arrow. The ARS304-containing fragment (Newlon et al., 1991) is hatched. Regions of high A+T (>85%) are shown in black.

and a flanking sequence of pGV451 as primer (5'-GCGTCGATTTTTGTGATGCTCG-OH). Although MseI sites are distributed over the complete D10B fragment (as confirmed by sequencing), several regions did not show up as deletion end-points in the subcloning procedure. Moreover, all attempts to clone the 900 bp BamHI-SacI fragment (Figure 2), originally removed as a separate subfragment, failed. Some of these regions probably contain sequences responsible for the deleterious effect that D10B has on the growth of E. coli. Therefore, to avoid any further selectivity in subcloning, a second strategy was initiated early on to join the contigs and to finalize the D10B sequence. This involved primer-walking dideoxy-sequencing with chemically synthesized primers based on sequences from the subterminal regions of the different contigs. Double-stranded DNA of the YIp5::D10B clone, YIp5::D10BV or D10Bd451 subclones was purified on a Qiagen-tip-20 (Diagen GmbH, Düsseldorf, Germany). The T7 DNA polymerase kit (Pharmacia-LKB AB, Uppsala, Sweden) and the Sequenase version 2.0 kit (USB, Cleveland, Ohio) were used. All compressions in dideoxy-sequencing ladders could be resolved by using dITP/dGTP nucleotides with the latter kit. A compilation of all sequencing runs from both strategies is shown in Figure 3. Both strands were sequenced completely, with an average redundancy of 7 to 8 readings per base pair.
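The MseI mapping mentioned above (TTAA sites spread over the whole fragment, with digestion products below about 400 bp) can be illustrated with a short sketch. It assumes the D10B sequence is available as a plain string; the function names and the toy sequence are hypothetical and are not part of the original work. The cut is placed after the first T of TTAA, the usual description of MseI cleavage.

```python
def find_sites(seq: str, site: str = "TTAA") -> list[int]:
    """Return the 1-based start position of every occurrence of `site`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)          # convert 0-based index to 1-based
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq_len: int, cut_after: list[int]) -> list[int]:
    """Lengths of the pieces of a linear molecule cut after the given bases.

    A value p in `cut_after` means the backbone is cut between base p and
    base p + 1 (1-based). For MseI (T^TAA) this is the site start position.
    """
    bounds = [0] + sorted(cut_after) + [seq_len]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    toy = "GGTTAACCCTTAAG"                   # hypothetical toy sequence, not D10B
    sites = find_sites(toy)
    pieces = fragment_lengths(len(toy), sites)
    print("TTAA sites:", sites)
    print("fragment lengths:", pieces)
    print("largest fragment:", max(pieces))
```

Run on the real 5337 bp sequence, the largest value in `fragment_lengths` would be the number to compare with the roughly 400 bp upper limit reported for a complete MseI digest.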

SEQUENCE FEATURES

The complete nucleotide sequence of D10B is shown in Figure 4 and amounts to 5337 base pairs. The overall G+C content is 33.5%. Internally, the sequence is quite A+T rich. In particular, in the non-coding region between positions 2180 and 4690, some stretches exceeding 85% A+T were found (Figure 2). Moreover, at position 904, a row of 23 A residues was present. During the sequencing project, we observed the gradually increasing appearance of a weak 24th band on the sequencing ladders at the latter site, which was accompanied by band doubling beyond this stretch on the sequencing gel. Apparently, this homopolymeric stretch of A.T base pairs is unstable in E. coli. Similarly, while analysing the junction between D10B and the flanking J10A fragment in YCp50::J10A-D10B-B9G (a plasmid containing about 8 kbp of chromosome III DNA), a series of 24 A residues instead of 23 was found at this position.

D10B contains four ORFs (Figures 2 and 4), two of which, YCL433 and YCL432, are entirely confined to the D10B fragment. YCL434 starts in J10A and has its stop codon in D10B, while YCL431 starts in D10B and continues in B9G. YCL433, YCL432 and YCL431 encode proteins of 144, 335 and 461 amino acids, respectively. Searching nucleic acid and protein databases did not reveal any significant similarities with known sequences. Yoshikawa and Isono (1990)


Figure 3 Survey of sequence readings. Arrows show the direction of dideoxy sequencing runs. Line segments without arrows represent chemical degradation sequencing gels. The asterisk marks the (3') end-label.

analysed mRNA transcripts of chromosome III and assigned mRNA transcripts to the D10B region. Potential transcriptional and translational consensus sequences preceding the ORFs are shown in Figure 4. A TTTTAAAT sequence is located about 60 base pairs downstream of the termination codon of YCL434. This sequence appears to be a consensus transcription termination signal specific for galactose genes (Tajima et al., 1985). No polyadenylation signal AATAAA is found in this region.

Codon usage of YCL433, YCL432 and YCL431 agrees with the codon usage in yeast (Bennetzen and Hall, 1982). YCL433 shows a codon usage typical of genes expressed at a low level, whereas the codon usage of YCL432 is closer to that of highly expressed genes (Sharp et al., 1986).

YCL433 has a central hydrophilic, basic region, which has a beta-sheet structure. The encoded protein would be basic (calculated pI = 10.08) and contain a remarkably high percentage of the chemically related amino acids valine, leucine and isoleucine (49 of 144 residues, i.e. 34%). Preliminary cloning experiments for expression of this ORF in E. coli failed unless expression could be efficiently blocked. Switching on the expression unit has a deleterious effect on growth of the host. Since YCL433 is entirely located in the 900 bp BamHI-SacI fragment (Figure 2), this might also explain why we failed to subclone this fragment for sequencing, why YIp5::D10B slows down growth of E. coli and, similarly, why this region is not found in yeast gene libraries after amplification steps (Newlon et al., 1991). The natural function of this gene is currently being studied.

A large non-coding region is present between positions 2180 and 4690. Several A+T-rich stretches can be found in this region (Figure 2). Even more striking is the multiple occurrence of TTTR (or RTTT) repeats in several parts of this region (Figure 4). It is known that D10B contains an ARS, which has been mapped between the SpeI and DraI sites at positions 3444 and 4306, respectively (ARS304; Newlon et al., 1991). However, no perfect 12/12 bp match to the modified and expanded ARS core consensus sequence (WTTTAYRTTTWB; Van Houten and Newlon, 1990) has been observed. A single 11/12 bp mismatched core is found in the ARS fragment, but several such sequences are present in the surrounding DNA as well (see Figure 4).

ACKNOWLEDGEMENTS

This work was supported by an EEC-BAP contract and a grant from the Belgian 'Dienst voor Programmatie van het Wetenschapsbeleid'. We thank S.G. Oliver for the supply of the D10B clone DNA, J. Sgouros for searching the databases for homologies, and J. Robben for help with the artwork.
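The composition figures quoted under Sequence Features (overall G+C content, stretches exceeding 85% A+T, the run of 23 A residues) are simple to recompute once the sequence is held as a string. The following is a minimal sketch under stated assumptions: the 50 bp window is an arbitrary choice, since the text does not say which window size was used, and all function names and toy sequences are hypothetical.

```python
import re


def gc_content(seq: str) -> float:
    """Percentage of G and C bases in `seq` (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def at_rich_windows(seq: str, window: int = 50, threshold: float = 85.0):
    """Merged 1-based (start, end) intervals covered by windows whose A+T
    percentage is strictly above `threshold`. The window size is an assumption."""
    seq = seq.upper()
    n = len(seq)
    if n < window:
        return []
    is_at = [1 if base in "AT" else 0 for base in seq]
    count = sum(is_at[:window])
    intervals = []
    for start in range(n - window + 1):
        if start > 0:                        # slide the window by one base
            count += is_at[start + window - 1] - is_at[start - 1]
        if 100.0 * count / window > threshold:
            s, e = start + 1, start + window
            if intervals and s <= intervals[-1][1] + 1:
                intervals[-1] = (intervals[-1][0], e)   # extend previous interval
            else:
                intervals.append((s, e))
    return intervals


def longest_run(seq: str, base: str = "A"):
    """Return (1-based start, length) of the longest run of `base`, or (0, 0)."""
    best = (0, 0)
    for match in re.finditer(f"{base}+", seq.upper()):
        length = match.end() - match.start()
        if length > best[1]:
            best = (match.start() + 1, length)
    return best


if __name__ == "__main__":
    toy = "G" * 20 + "A" * 30 + "G" * 20     # hypothetical toy sequence, not D10B
    print(gc_content(toy))
    print(at_rich_windows(toy, window=10))
    print(longest_run(toy))
```

Applied to the real sequence, `gc_content` would be the value to compare with the reported 33.5%, and `longest_run` with the poly(A) tract near position 904.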

Figure 4 Nucleotide sequence of the D10B fragment. The deduced amino acid sequence of each ORF is shown in one-letter code underneath the nucleotide sequence. Consensus sequences of putative expression signals such as start- and stop-triplets, TATA-boxes and polyadenylation sites, and ARS304 are marked by single underlining. The sequence TTTTAAAT is a transcription termination signal reportedly specific for galactose genes (Tajima et al., 1985). The 11/12 bp mismatched ARS core sequences (Van Houten and Newlon, 1990) in the non-coding region are underlined twice: one in ARS304 (SpeI-DraI), five in the remainder of the non-coding region. Note the GTTT and ATTT (imperfectly) repeating structures, e.g. between positions 2546-2594 and 2691-2760.
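The search for ARS core matches marked in Figure 4 can be sketched as a scan of both strands against the degenerate consensus WTTTAYRTTTWB, counting how many of the 12 positions agree. This is an illustrative sketch only: the function names and toy sequences are hypothetical, the IUPAC codes are the standard ones (W = A/T, Y = C/T, R = A/G, B = C/G/T), and nothing here is taken from the software the authors actually used.

```python
# Standard IUPAC nucleotide codes needed for the ARS core consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "Y": "CT", "R": "AG", "B": "CGT",
}
ARS_CORE = "WTTTAYRTTTWB"   # Van Houten and Newlon (1990), as quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_matches(window: str, consensus: str = ARS_CORE) -> int:
    """Number of positions at which `window` agrees with the consensus."""
    return sum(base in IUPAC[code] for base, code in zip(window, consensus))


def scan(seq: str, consensus: str = ARS_CORE, min_matches: int = 11):
    """Return sorted (start, strand, matches, window) tuples for both strands.

    `start` is the 1-based position of the leftmost base of the hit on the
    given (top) strand; `window` is read 5'->3' on the matching strand.
    """
    seq = seq.upper()
    n, k = len(seq), len(consensus)
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(n - k + 1):
            window = s[i:i + k]
            matches = count_matches(window, consensus)
            if matches >= min_matches:
                start = i + 1 if strand == "+" else n - i - k + 1
                hits.append((start, strand, matches, window))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGGATTTATGTTTTCGGG"   # hypothetical sequence with one perfect core
    for hit in scan(toy):
        print(hit)
```

With `min_matches=11` the scan reports both perfect (12/12) and single-mismatch (11/12) cores; filtering the hits by position would separate those inside the SpeI-DraI ARS304 fragment (3444-4306) from those in the surrounding DNA.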

REFERENCES

Bennetzen, J.L. and Hall, B.D. (1982). Codon selection in yeast. J. Biol. Chem. 257, 3026-3031.

Maxam, A.M. and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74, 560-564.

Mortimer, R.K., Schild, D., Contopoulou, C.R. and Kans, J.A. (1989). Genetic map of Saccharomyces cerevisiae, Edition 10. Yeast 5, 321-403.

Newlon, C.S., Lipchitz, L.R., Collins, I., Deshpande, A., Devenish, R.J., Green, R.P., Klein, H.L., Palzkill, T.G., Ren, R., Synn, S. and Woody, S.T. (1991). Analysis of a circular derivative of Saccharomyces cerevisiae chromosome III: a physical map and identification and location of ARS elements. Genetics 129, 343-357.

Sanger, F., Nicklen, S. and Coulson, A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Sharp, P.M., Tuohy, T.M.F. and Mosurski, K.R. (1986). Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucl. Acids Res. 14, 5125-5143.

Tajima, M., Nogi, Y. and Fukasawa, T. (1985). Primary structure of the Saccharomyces cerevisiae GAL7 gene. Yeast 1, 67-77.

Van Houten, J.V. and Newlon, C.S. (1990). Mutational analysis of the consensus sequence of a replication origin from yeast chromosome III. Mol. Cell. Biol. 10, 3917-3925.

Volckaert, G. (1987). A systematic approach to chemical DNA sequencing by subcloning in pGV451 and derived vectors. Methods Enzymol. 155, 231-249.

Yoshikawa, A. and Isono, K. (1990). Chromosome III of Saccharomyces cerevisiae: an ordered clone bank, a detailed restriction map and analysis of transcripts suggest the presence of 160 genes. Yeast 6, 1-19.
