TIBTECH - SEPTEMBER 1990 [Vol. 8]


Yeast: the model 'eurokaryote'?

L. A. Grivell and R. J. Planta

L. A. Grivell is at the Section for Molecular Biology, Department of Molecular Cell Biology, University of Amsterdam, Amsterdam, The Netherlands. R. J. Planta is at the Biochemical Laboratory, Free University, de Boelelaan 1083, 1081 HV Amsterdam, The Netherlands. © 1990, Elsevier Science Publishers Ltd (UK) 0167-9430/90/$2.00

At the beginning of 1989, the European Community allotted funds to a collaborative project involving 35 European laboratories. The goal: a complete sequence determination of chromosome III, one of the smallest of the 16 chromosomes that make up the genetic blueprint for the yeast Saccharomyces cerevisiae. By the end of 1990, the project, which is a first step towards sequence determination of the whole yeast genome, should have met that goal, presenting the scientific community with the first complete DNA sequence of a eukaryotic chromosome and a wealth of information on the organization of the several hundred genes contained within it.

Why yeast?

Like other decisions to tackle the large-scale sequencing of eukaryotic genomes, the decision to undertake sequence analysis of the yeast genome could have been criticized on the grounds that such projects are costly and divert resources from other, perhaps more deserving, research programmes. Furthermore, in the absence of rapid, reliable automated methods for analysis, they tie up highly qualified scientific manpower in monotonous, repetitive work. So why yeast? The decision of the European Community to push ahead was based on four main arguments:

(1) The yeast Saccharomyces is one of the simplest eukaryotes known. As Table 1 shows, its genome is small - barely four times the size of that of the bacterium E. coli, yet it is organized in chromosomes whose structure (Ref. 1) is like that of all other eukaryotes, from worms to man. Similarities do not stop there, however. Recent work in many laboratories has confirmed what many yeast researchers have long suspected: namely that the organism is a model eukaryote in many respects, including the way it regulates gene expression, carries out housekeeping tasks necessary for cell growth and division, responds to signals from the environment and targets proteins to various destinations within or outside the cell. Genomic sequence information is thus likely to provide important insights into the general principles of eukaryotic cell function.

(2) Yeast, Saccharomyces in particular, is one of man's oldest biotechnological partners. For more than 6000 years, yeast has been intimately associated with the progress and well-being of mankind. Today, it is one of the most important commercial microorganisms, with yeast and yeast products representing an annual financial turnover of tens of billions of US dollars in the chemical, pharmaceutical and agricultural industries. Indeed, there is potential for still more, given the ease with which both classical genetic and recombinant DNA technology can be used to create production strains capable of using new raw materials, or of expressing foreign genes whose products have significant pharmaceutical applications. Sequence information is vital to researchers in their efforts to tailor specific genes to production needs.

(3) Unlike the human genome, most of whose three billion base pairs do not code for protein, the genome of yeast is compact and efficiently organized. An estimated 5000-6000 genes are contained within its 14-15 x 10^6 base pairs (Refs 2, 3). This implies a high density of information packing that is unrivalled by any other organism. For the sequencer, this means a gratifyingly high return for a minimum of effort and the acquisition of valuable expertise that can readily be applied to the sequencing of more complex genomes.

(4) Last, but by no means least, yeast offers a unique opportunity to put predictions of gene function to the test. Any gene sequenced can be isolated and modified or inactivated in the test tube. Then, using the organism's ability to promote specific recombination between any DNA fragment and a homologous site on the chromosome, the modified sequence can be introduced into the genome, thus disrupting the function of the original chromosomal gene (see Fig. 1). This procedure yields a mutant cell (disruptant), whose biochemistry and physiology can be analysed, thus providing in days or weeks information that might require years to obtain in mammalian or other types of cells. In view of the strong evolutionary conservation of many cellular functions between yeast and man, it can

Table 1. Physical sizes of yeast (Saccharomyces) chromosomes (a)

Chromosome    Size (kb)
I             242
II            840
III           355
IV            1622
V             590
VI            270
VII           1130
VIII          595
IX            450
X             760
XI            677
XII (b)       ~2600
XIII          957
XIV           822
XV            1140
XVI           995

(a) Chromosome sizes are based on electrophoretic mobility in pulsed field electrophoresis (Ref. 1).
(b) Within chromosome XII, the repeated genes for the ribosomal RNAs occupy some 1500 kb of sequence.
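As a rough arithmetic check on Table 1, the short Python sketch below sums the listed sizes, picks out the three smallest chromosomes and estimates the share of the genome that chromosome III represents. The names `CHROMOSOME_KB`, `total_kb` and so on are illustrative only, chromosome XII is entered at its approximate value of 2600 kb, and the gene-density range simply reuses the estimate of 5000-6000 genes quoted in the text.

```python
# Sizes in kb, copied from Table 1 (chromosome XII is approximate: ~2600 kb).
CHROMOSOME_KB = {
    "I": 242, "II": 840, "III": 355, "IV": 1622,
    "V": 590, "VI": 270, "VII": 1130, "VIII": 595,
    "IX": 450, "X": 760, "XI": 677, "XII": 2600,
    "XIII": 957, "XIV": 822, "XV": 1140, "XVI": 995,
}

# Total physical size of the genome according to the table.
total_kb = sum(CHROMOSOME_KB.values())

# The three smallest chromosomes, smallest first.
smallest_three = sorted(CHROMOSOME_KB, key=CHROMOSOME_KB.get)[:3]

# Fraction of the genome represented by chromosome III.
share_iii = CHROMOSOME_KB["III"] / total_kb

# Average spacing between genes, using the 5000-6000 gene estimate from the text.
kb_per_gene_low = total_kb / 6000
kb_per_gene_high = total_kb / 5000

print(f"Total genome size: {total_kb} kb")
print(f"Three smallest chromosomes: {smallest_three}")
print(f"Chromosome III share: {100 * share_iii:.1f}%")
print(f"Average kb per gene: {kb_per_gene_low:.1f}-{kb_per_gene_high:.1f}")
```

On these figures the table total falls within the 14-15 x 10^6 base pairs quoted for the genome, and chromosome III comes out at roughly one fortieth of it.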


be expected that for many human genes whose function is at present unknown, the first indications of their function will come from an analysis of the corresponding genes in yeast.

Setting up the network

Although small in relative terms, analysis of the yeast genome, together with the appropriate checks on overlap of the sequenced clones and accuracy of the data, still represents approximately 450 man-years of effort. How can this task best be tackled? Rather than founding a centralized institution, with sequence analysis as its sole activity, the Commission of the European Community opted for a set-up in which initial effort in a pilot project could be divided over a network of laboratories, each specialized in particular aspects of yeast molecular biology or physiology. There are obvious advantages to this arrangement: such a network requires little in the way of investment in infrastructure and makes optimal use of existing expertise, thus expediting functional analysis of the sequenced genes. Furthermore, it creates a climate favourable to the exchange of ideas, manpower and students, and a broad base of expertise necessary to carry the programme through into later phases, both of which are aspects of vital importance to the future of European science.

The division of labour over many groups of course requires strict coordination of the distribution of DNA clones and, subsequently, of the storage and analysis of the collection of sequences produced. With respect to both points, the consortium has been fortunate. Coordination of clone distribution has been in the capable hands of Prof. Steve Oliver (UMIST, UK), who has maintained all the necessary clones, checked mapping information and kept the various groups supplied with high-quality material for their sequence efforts. Coordination of the storage and analysis of sequence data is the task of the Martinsried Institute for Protein Sequence Data (MIPS), which maintains computer links with the participating laboratories and is currently developing suitable software for accessing the rapidly growing databank of new sequences, for extracting information from it and

[Figure 1. Labels in the diagram: gene X; LEU2 gene; gene X-LEU2 fusion; chromosomal copy of gene X; homologous recombination; chromosome.]

Fig. 1. Gene disruption in yeast. A cloned fragment, containing the coding sequence of gene X, is cut with a restriction enzyme that recognizes a single site within the gene. A selectable yeast marker (LEU2 in this example) is then cloned into this site and a fragment containing the sequences of both gene X and the selectable marker is liberated from the plasmid by digestion with a second restriction enzyme. This fragment is used to transform yeast cells. Homologous recombination between the fragment and the corresponding site in the chromosome results in substitution of the chromosomal sequence by the linear disrupted copy. Adapted from Ref. 4.
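The steps in the Fig. 1 legend can be pictured with a toy string model. The Python sketch below is only an illustration under loud assumptions: the sequences are invented, `[LEU2]` is a placeholder token rather than a real marker sequence, the function names `build_disruption_cassette` and `integrate_by_homology` are hypothetical, and homologous recombination is reduced to exact matching of the two ends of the fragment against the chromosome.

```python
def build_disruption_cassette(gene_x: str, site: str, marker: str) -> str:
    """Insert the marker at the single restriction site inside gene X."""
    if gene_x.count(site) != 1:
        raise ValueError("expected exactly one restriction site in gene X")
    # Toy cut position: the middle of the recognition site.
    cut = gene_x.index(site) + len(site) // 2
    return gene_x[:cut] + marker + gene_x[cut:]


def integrate_by_homology(chromosome: str, cassette: str, flank: int) -> str:
    """Replace the chromosomal region bounded by the cassette's two flanks.

    Assumption: recombination is modelled as exact string matching of the
    first and last `flank` characters of the cassette.
    """
    left, right = cassette[:flank], cassette[-flank:]
    start = chromosome.find(left)
    if start < 0:
        raise ValueError("left flank not found on chromosome")
    end = chromosome.find(right, start + flank)
    if end < 0:
        raise ValueError("right flank not found on chromosome")
    return chromosome[:start] + cassette + chromosome[end + flank:]


# Invented example data (not real yeast sequence).
GENE_X = "ATGGCTGAATTCCGTTAA"        # contains one GAATTC site
MARKER = "[LEU2]"                    # placeholder for the selectable marker
CHROMOSOME = "TTTT" + GENE_X + "CCCC"

cassette = build_disruption_cassette(GENE_X, "GAATTC", MARKER)
disrupted = integrate_by_homology(CHROMOSOME, cassette, flank=6)

print(cassette)
print(disrupted)
```

In this toy run the intact copy of gene X no longer occurs in the resulting string, which mirrors the substitution of the chromosomal sequence by the disrupted copy described in the legend.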

for performing sensitive but rapid sequence comparisons.

Why chromosome III?

Working on the principle that small is not only beautiful, but also quickly sequenced, the consortium was faced with a choice of three chromosomes, namely I, VI or III, which (having lengths of 240, 280 and 355 kb, respectively) represent manageable chunks of the genome (see Table 1). Although good arguments could be made for choosing any one of the three, the availability of extensive genetic mapping data on chromosome III swung the final decision in favour of this chromosome (see Fig. 2), since it was felt that such data would permit the more rapid assignment of a biological context to the sequence. From a practical point of view, this chromosome also offered the advantage that clones derived from a small ring derivative (formed by recombination between HML, close to the telomere on the left arm, and MAT on the right arm; see Ref. 5 and Fig. 2) had already been mapped and ordered and were freely available from Dr Carol Newlon (Newark, NJ). Together with a set of clones in bacteriophage λ, constructed and mapped in Dr Maynard Olson's laboratory (St Louis), virtually the whole chromosome could be covered.

Results so far

At a recent meeting of the consortium at Tutzing, FRG (Ref. 6), representatives of the 35 participating laboratories presented analyses of data accumulated in the past year, amounting to almost 160 kb of sequence, or about 45% of chromosome III. Although there is probably still much to be gained from a detailed and systematic analysis of the data by MIPS, results are fascinating and fully live up to expectations. Over 60 reading frames were discovered, confirming the predicted high density of gene packing. Most of these display no clear resemblance to protein sequences at present stored in various databanks, and thus represent new genes. Others display sequence similarity to known proteins in other organisms, thus giving clues to their identity, but at the same time raising some


intriguing questions as to their function in the yeast cell. Members of this group include proteins resembling halorhodopsin, a bacteriophage receptor, an adhesion peptide with epidermal growth factor-like repeats, a RAS oncogene and the transport protein thought to be involved in the etiology of the human disease cystic fibrosis. Clearly, the technique of gene disruption will yield valuable information on what these proteins do in yeast.

Besides information on possible new genes, examination of the sequence gathered so far has given two additional bonuses.

(1) It has revealed quite significant differences from genetic estimates of the order of, and distances between, certain genes. The origin of such differences is at present unknown, but an interesting possibility is that specific sequences have influenced the outcome of the recombination events on which such mapping is based.

(2) The inadvertent duplication of previously published sequences has brought two valuable pieces of information to light: first, such sequences are far from error-free and, in at least one case, correction has resolved details of protein structure which have worried protein chemists for some time; and second, different yeast strains display a remarkable conservation of sequence, at least as far as protein-coding regions are concerned. Most changes found concern intergenic regions; large-scale rearrangements, when present, are invariably the result of the movement of Ty elements in or out of the DNA. (Ty elements are transposable elements with a strong resemblance to the retrotransposons of other eukaryotic cells.) Study of the factors governing integration of these elements and of the consequences of their movement is thus likely to have wide application.

What next?

Exciting as these results are in terms of our fundamental understanding of yeast, there is much here for the biotechnologist too. As mentioned above, the DNA sequence is a first step along a path towards an understanding of yeast physiology and thus towards a better ability to manipulate this organism to new heights of versatility in terms of the nature and range of raw materials

[Figure 2. (a) Genetic map and (b) physical map of chromosome III. Loci legible on the physical map include HML, HIS4, LAHS, LEU2, CDC10, SUF2, PGK1, RAHS, PET18, CRY1, MAT, THR4 and MAL21-23; the remaining labels are not recoverable from the extracted text.]

Fig. 2. (a) The genetic map of chromosome III, based on data summarized by Mortimer et al. (Ref. 7). (b) The physical map of chromosome III, based mainly on a compilation of data so far submitted to the MIPS databank by laboratories participating in the BAP sequencing programme. It shows the positions of a selection of known genes relative to recognition sites for the restriction enzymes HindIII (shown to the left of the bar) and EcoRI (shown to the right of the bar), as revealed by DNA sequence analysis. In the case of the MAL21-23 loci, positioning is based only on data cited by Mortimer et al. (Ref. 7), since this region of the chromosome, which is highly variable in structure, has yet to be sequenced. LAHS and RAHS: left and right arm hypervariable sites, two regions containing sites at which insertion of Ty elements frequently occurs. See Ref. 1 for the nomenclature of the remaining genetic loci.

used for fermentation. In addition, newly discovered proteins or enzymes may have potential as marketable products, while the ability to manipulate genes for any yeast enzyme by in vitro mutagenesis opens up interesting possibilities for improvement of the enzyme in relation to the process in which it is used. Such improvements might include increased stability under certain conditions, new substrate specificity, or even - by changes in the signals that govern protein routing within the cell - a new location in or outside the cell.

In addition to these prospects, spin-off from the sequence project is also to be expected in the areas of yeast identification and classification, since cloned and sequenced chromosome fragments provide excellent high-resolution taxonomic probes. Such probes will be invaluable for tracing lineages in support of patent specifications of production strains and in the assessment of the short- or long-term stability of such strains.

All being well, the final bases of chromosome III will be slotted into place before the end of this year, thus completing our 'jigsaw' picture of the first eukaryotic chromosome, but representing only about 2.5% of the yeast genome. What next? Prospects are bright. The European Community has already given the green light to funding of a follow-up project within the framework of BRIDGE (Biotechnology Research for Innovation, Development and Growth in Europe). Within this programme, sequencing activities will be intensified by an order of magnitude, allowing several chromosomes to be tackled simultaneously by smaller consortia working in parallel. Together with projects currently being initiated in Japan, Canada and the USA, this should result in completion of the genome around the year 2000, providing food for thought for both the yeast community and the scientific community as a whole until well into the 21st century.

References

1 Mortimer, R. K., Schild, D., Contopoulou, C. R. and Kans, J. A. (1989) Yeast 5, 321-403
2 Kaback, D. B., Angerer, L. M. and Davidson, N. (1979) Nucleic Acids Res. 6, 2499-2517
3 Goebl, M. G. and Petes, T. D. (1986) Cell 46, 983-992
4 Rothstein, R. J. (1983) Methods Enzymol. 101, 202-211
5 Strathern, J. N., Newlon, C. S., Herskowitz, I. and Hicks, J. B. (1979) Cell 18, 309-319
6 BAP meeting on Sequencing of the Yeast Chromosome III (1989) Commission of the European Communities
