Plant Molecular Biology 11:255-269 (1988) © Kluwer Academic Publishers, Dordrecht - Printed in the Netherlands

Molecular cloning and analysis of four potato tuber mRNAs

Willem J. Stiekema, Freek Heidekamp,1 Wim G. Dirkse, Joke van Beckum, Peter de Haan, Carolien ten Bosch1 and Jeanine D. Louwerse 1

Research Institute Ital, P.O. Box 48, 6700 AA Wageningen, Netherlands; 1present address: TNO-CIVO, P.O. Box 360, 3700 AJ Zeist, Netherlands

Received 11 March 1988; accepted in revised form 19 May 1988

Key words: potato, patatin, proteinase inhibitor, cloning, gene expression

Abstract

Tuberization in potato is a complex developmental process involving the expression of a specific set of genes leading to the synthesis of tuber proteins. Here we report the cloning and analysis of mRNAs encoding tuber proteins. From a potato tuber cDNA library four different recombinants were isolated which hybridized predominantly with tuber mRNAs. Northern blot hybridization experiments showed that three of them, pPATB2, p303 and p340, can be regarded as tuber-specific while the fourth, p322, hybridizes to tuber and stem mRNA. Hybrid-selected in vitro translation and nucleotide sequence analysis indicate that pPATB2 and p303 represent the patatin and the proteinase inhibitor II mRNA respectively. Recombinant p322 represents an mRNA encoding a polypeptide having homology with the soybean Bowman-Birk proteinase inhibitor while p340 represents an mRNA encoding a polypeptide showing homology with the winged bean Kunitz trypsin inhibitor. In total, these four polypeptides constitute approximately 50% of the soluble tuber protein. Using Southern blot analysis of potato DNA we estimate that these mRNAs are encoded by small multigene families.

Introduction

Tuberization in potato is a complex process leading to the differentiation of an underground stem, also called a stolon, into a specialized storage organ, the tuber [1]. During this developmental process morphological and genetic changes such as radial expansion of the stolon take place and the expression of specific genes is dramatically influenced. This is reflected by the synthesis of starch and specific proteins [33, 35]. This differentiation process can be influenced by environmental conditions [3, 5, 7]. Factors such as short daylength, low temperature or low nitrogen supply favour tuber formation. Plant hormones also play a role [15, 22]. Cytokinin [8, 22, 34] enhances tuber formation while gibberellic acid [10, 17, 23, 27] inhibits this developmental process. On the other hand, coumarin has been described as an enhancer of tuber formation [49]. Therefore, theories based on both physiological and hormonal factors controlling tuber formation have been put forward. However, despite the large amount of descriptive work favouring one of these theories, the molecular mechanisms underlying tuberization remain totally unknown.

As shown by Paiva et al. [33], a set of specific proteins is present in mature tubers; the mRNAs of only two of them have been isolated. Mignery et al. [29] and Rosahl et al. [39] reported the molecular cloning of the patatin mRNA; Sanchez-Serrano et al. [43] reported the cloning of the proteinase inhibitor II mRNA. Recently the isolation and analysis of patatin genes has been accomplished [2, 36, 40]. Pikaard et al. [37] showed that two classes of patatin genes exist. Class I is exclusively expressed in the tuber while Class II patatin genes are expressed in both tubers and roots. In tubers equal amounts of Class I and Class II mRNAs are present. In roots Class II genes are expressed at a 50-100-fold lower level. Recently, Twell and Ooms [52] presented data showing that the 5' flanking region of one of the patatin genes is able to direct tuber-specific gene expression. A 3800 base pair (bp) DNA fragment located upstream of the patatin protein-coding sequences contains the regulatory signals conferring organ-specific expression upon a reporter gene.

Patatin is the most abundant tuber protein [35], suggesting that it serves as a storage protein. Recently, it has been shown that patatin also displays a lipid acyl hydrolase activity [38, 41]. This enzymatic activity of patatin might have a function in the transition of the tuber from dormancy to vegetative growth [41] or in protection against microbial growth [38].

The mRNA encoding a second tuber protein, the proteinase inhibitor II, has been cloned by Sanchez-Serrano et al. [43]; Keil et al. [19] and Thornburg et al. [51] isolated the corresponding genes. Under normal conditions the potato proteinase inhibitor II gene is only expressed in the tuber. Upon wounding the expression of this gene is systemically induced in potato [13, 43], suggesting that the proteinase inhibitor II protein is involved in the defence response of the plant against pathogens. As shown by Sanchez-Serrano et al. [44] and Thornburg et al. [51], 1000 bp of 5' and 3' flanking regions of the proteinase inhibitor II gene are sufficient to confer wound-inducible expression upon a reporter gene in transgenic tobacco plants.

To enlarge our knowledge of tuber proteins and the regulatory mechanisms involved in the developmental expression of their structural genes we decided to clone and characterize a number of genes encoding these proteins.
As a first step towards this goal we report here the cloning and analysis of four different tuber mRNAs. Northern blot analysis shows that these mRNAs are predominantly present in tubers. Two of them encode the patatin and proteinase inhibitor II proteins respectively of potato cv. Bintje. A third cloned mRNA codes for a tuber protein having homology to the Bowman-Birk proteinase inhibitor described in soybean [16]. The fourth cloned tuber mRNA encodes a protein showing homology to the Kunitz trypsin inhibitor of winged bean [55].

Materials and methods

Growth conditions for plants

Potato plants (Solanum tuberosum cv. Bintje and monoploid Solanum tuberosum clone AM 79.7322) were grown from seed potatoes in 4-litre containers in a growth chamber at 18 °C with a light intensity of 10-15 klux for 14 hours followed by 10 hours of darkness at 15 °C.

RNA isolation

Tissue was harvested from young potato plants, immediately frozen and kept at -80 °C until use. Frozen tissue was ground in a mortar to a fine powder under liquid nitrogen. To 1 g of tissue a mixture of 2 ml 0.2 M sodium acetate, pH 5.0, 1% SDS, 0.01 M EDTA and 2 ml distilled phenol containing 0.1% 8-hydroxyquinoline was added. After vigorous shaking and centrifugation the aqueous phase was removed and the phenol phase re-extracted with 0.2 M sodium acetate, pH 5.0, 1% SDS, 0.01 M EDTA. The aqueous phases were collected and re-extracted twice with an equal volume of phenol:chloroform (1:1) and chloroform, respectively. Subsequently LiCl was added to a final concentration of 2.5 M. The RNA was precipitated overnight at 4 °C, collected by centrifugation, washed once with 2.5 M LiCl and twice with 70% ethanol. The dried pellet was dissolved in water and stored in portions at -80 °C. By this procedure total RNA was isolated from different plant tissues. Poly(A)+ RNA was isolated by oligo-dT cellulose chromatography [25].

Construction of cDNA library

A quick one-tube method was used for the construction of the cDNA library using tuber poly(A)+ RNA as a template. Tuber poly(A)+ RNA (5 µg) and oligo-dT (1.5 µg) were incubated at 80 °C for 30 s in a volume of 20 µl double-distilled water. Subsequently mix I (20 units reverse transcriptase (Boehringer) in 30 µl 0.1 M Tris-HCl pH 8.3, 0.01 M MgCl2, 0.15 M KCl, 0.02 M DTT, 1 mM dCTP, 1 mM dGTP, 1 mM dTTP, 1 mM dATP) was added and the mixture incubated at 43 °C for 45 min. After the first-strand synthesis 150 µl ice-cold mix II (40 mM Tris-HCl pH 7.5, 10 mM MgCl2, 75 mM KCl, 25 µg/ml BSA) was added and the mixture put on ice for 5 min. Second-strand synthesis was performed by adding 2 µl α-32P-dATP (3000 Ci/mmol = 110000 GBq/mmol; 10 mCi/ml; New England Nuclear), 8 µl DNA polymerase I (5 U/µl; Boehringer) and 1 µl E. coli RNase H (2 U/µl; Boehringer), and the mixture was incubated at 11 °C for 60 min. Subsequently 2 µl 20 mM ATP and 1 µl T4 DNA ligase (1 U/µl; Boehringer) were added and the incubation prolonged for 2 h at 18 °C. To stop the reaction 200 µl 0.4 M EDTA and 1 µl 20% SDS were added and the ds-cDNA was separated from unincorporated α-32P-dATP by Sephadex G-50 column chromatography in 0.01 M Tris, 1 mM EDTA, 0.3 M NaCl pH 7.5. The ds-cDNA was precipitated with ethanol, dried and dissolved in double-distilled water. Ds-cDNA was size-fractionated on a 5-30% sucrose gradient in 0.1 M NaCl, 0.01 M Tris-HCl pH 7.5, 1 mM EDTA for 16 h at 4 °C and 30000 rpm. Ds-cDNA having a length of 500 bp or more was used for the tailing reaction in 120 mM potassium cacodylate pH 6.9, 1 mM dCTP, 1 mM CoCl2, 50 units terminal deoxynucleotidyltransferase (Boehringer). Tailed ds-cDNA (1 ng) was precipitated with ethanol, dried and annealed to 10, 20, 50 or 100 ng Pst I-cut oligo-dG-tailed pBR322 (Bethesda Research Laboratories) in 50 µl 0.1 M NaCl, 1 mM Tris-HCl pH 7.5, 0.1 mM EDTA at 58 °C for 2 h. The annealed mixture was used to transform E. coli MH1 [11]. For selection of transformants 10 µg/ml tetracycline was used. On average 3000 transformants were obtained per µg of poly(A)+ RNA.
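The specific activity of the labelled dATP is quoted above in both curies and becquerels. As a quick arithmetic check, and assuming only the standard definition 1 Ci = 37 GBq, the conversion can be sketched as follows. The function name `ci_to_gbq` is illustrative and not part of the original protocol.

```python
# Convert a specific activity from Ci/mmol to GBq/mmol.
# Assumption: 1 Ci = 3.7e10 Bq = 37 GBq (standard definition of the curie).
GBQ_PER_CI = 37.0


def ci_to_gbq(ci_per_mmol: float) -> float:
    """Return the specific activity expressed in GBq/mmol."""
    return ci_per_mmol * GBQ_PER_CI


if __name__ == "__main__":
    # Specific activity of the labelled dATP used for second-strand synthesis.
    print(ci_to_gbq(3000))
```

For 3000 Ci/mmol this gives 111000 GBq/mmol, in line with the rounded value of 110000 GBq/mmol quoted in the protocol.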

Differential screening of the cDNA library

Individual transformants stored in 96-well microtitre plates were grown on GeneScreen Plus (New England Nuclear) and hybridized to radioactively labelled probes essentially as described by Franssen et al. [9]. For selection 10 µg/ml tetracycline was used in Luria Broth (LB) agar. Single-stranded cDNA probes for differential screening of the cDNA library were prepared from poly(A)+ RNA according to the first-strand synthesis procedure used in the construction of the cDNA library except that 50 µCi of α-32P-dATP and α-32P-dCTP (specific activity 3200 Ci/mmol; New England Nuclear) and 0.2 mM unlabelled dATP and dCTP were added.

Hybrid-selected translation

Plasmid DNA was isolated by the alkaline lysis method [25]. For hybrid-selected translation inserts of selected recombinants (5-10 µg of DNA) were denatured and applied to 0.5 cm2 discs of diazophenylthioether paper (BioRad) essentially as described [25]. Total potato tuber RNA (750 µg) was hybridized to the filter-bound DNA in 300 µl 50% (v/v) deionized formamide, 0.1% SDS, 0.6 M NaCl, 4 mM EDTA, 80 mM Tris-HCl pH 7.8. Hybridization was initiated at 56 °C and the temperature was slowly decreased to 37 °C over a period of 6 hours. After thorough washing, the bound RNA was eluted at 90 °C in H2O, ethanol-precipitated, dried and dissolved in 5 µl H2O; 1.5 µl was translated in a wheat germ extract (Bethesda Research Laboratories) in a 7.5 µl mixture containing 20 µCi [35S]methionine, 2.5 µl wheat germ extract in 20 mM magnesium acetate, 30 mM potassium acetate and 0.5 mM each of 19 amino acids. Translation products were separated by electrophoresis on a 12.5% SDS-polyacrylamide slab gel according to Dorssers [12]. The translation products were visualized by fluorography using a Kodak XAR-5 film.

Northern blot analysis

Total RNA was denatured in dimethylsulphoxide and glyoxal, subjected to electrophoresis in 1% agarose [25], transferred to GeneScreen and treated further according to the manufacturer's manual. The blots were prehybridized for 4 hours in 50% formamide, 1 M NaCl, 0.05 M Tris-HCl pH 7.5, 10× Denhardt's solution, 1% SDS, 100 µg denatured salmon sperm DNA per ml and hybridized with nick-translated [25] probes. Hybridization was performed for 16 h at 42 °C. Blots were washed in 0.3 M NaCl, 0.03 M sodium citrate, 0.1% SDS at 42 °C and subsequently several times in 75 mM NaCl, 7.5 mM sodium citrate, 0.1% SDS.

Southern blot analysis

DNA was isolated from young potato leaves as described by Dellaporta [6]. Southern blot analysis was performed on nitrocellulose according to Maniatis et al. [25].

DNA sequence analysis

Standard procedures were used for cloning into M13 vectors [28], for the chain termination [45, 46] and the chemical sequencing methods [26]. The DNA sequence data were stored and analysed using Staden programs [48] on a VAX computer.

Results

Molecular cloning of tuber-specific mRNAs

Poly(A)+ RNA isolated from potato tubers was transcribed into ds-cDNA and cloned after GC-tailing into the Pst I site of pBR322. A quick one-tube method was used which avoided both the hairpin formation normally used for priming the second-strand synthesis and the S1 nuclease digestion necessary for removal of this hairpin. It is most likely that fewer cloning artefacts are introduced using this strategy than in the standard approach [50]. Part of the 6000 cDNA recombinants obtained was screened by differential colony filter hybridization using 32P-labelled cDNA probes against leaf and tuber mRNAs. About 20% of the recombinants hybridized predominantly to the tuber mRNA probe. Cross-hybridization analysis showed that several of these cDNA clones could be assigned to one of four groups. Subsequently one clone of each group was analysed further by Northern blot hybridizations as shown in Fig. 1. Total RNA isolated from tubers, stolons (subsoil and aerial), stems, leaves and roots was separated on a denaturing glyoxal-agarose gel and transferred onto nylon membranes. Recombinant p303 hybridized exclusively to an 800-nucleotide tuber mRNA and thus this recombinant contained a copy of a tuber-specific mRNA. The three other cDNA recombinants, p322, p340 and p207, hybridized mainly to tuber mRNAs having a length of 600, 900 and 1500 nucleotides respectively. The mRNAs cloned in recombinants p207 and p340 could also be detected in underground stolons and stems. We estimated the amount of these two mRNAs in stolons and stems as approximately 1% of the amount present in tubers. Based on these data we also classified recombinants p340 and p207 as containing copies of tuber-specific mRNAs. Recombinant p322 contained a copy of an mRNA which is about 10 times more abundant in tubers than in stems. Therefore, we regarded p322 as a cDNA recombinant containing a stem-tuber-specific mRNA.

Fig. 1. Occurrence of tuber mRNAs in different tissues of potato. Autoradiographs are shown of Northern blots containing RNA isolated from underground stolon (1), aerial stolon (2), root (3), tuber (4), stem (5) and leaf (6) tissue and separated by electrophoresis on 1% agarose gels, blotted onto GeneScreen filters and hybridized against 32P-labelled p303, p322, p340 and p207. The position of the ribosomal RNAs, visualized after ethidium bromide staining of the agarose gel, is indicated.

Protein coding capacity of the cloned tuber mRNAs

The translation products of total tuber mRNA have been analysed by electrophoresis on a denaturing SDS-polyacrylamide gel. A number of abundant polypeptide bands could be identified having a molecular weight of 10000, 12500, 16000, 21000, 23000 and 43000 (Fig. 2). Only three potato tuber proteins have been characterized thoroughly until now: the patatin protein of Mr 43000 [29], the proteinase inhibitor I of Mr 12600 [4] and the proteinase inhibitor II of Mr 16000 [51, 55]. Patatin can account for up to 40% of the tuber protein [35] while the proteinase inhibitors can represent up to 10% [4]. So, most likely the dominant Mr 43000, 12500 and 16000 polypeptide bands detected after in vitro translation of tuber poly(A)+ RNA represented the patatin, proteinase inhibitor I and proteinase inhibitor II polypeptides, respectively. The identity of the cloned tuber mRNAs was established by hybrid selection and in vitro translation. Polyacrylamide gel electrophoresis (PAGE) demonstrated that polypeptides of Mr 16000 and 43000 were encoded by mRNAs selected by cDNA clones p303 and p207 (Fig. 2) respectively. This strongly suggested that p207 contained a copy of cv. Bintje patatin mRNA while cDNA clone p303 contained a copy of the cv. Bintje proteinase inhibitor II mRNA. Recombinant p322 selected an mRNA encoding an Mr 10000 polypeptide while p340 hybrid-selected mRNAs appeared to code for polypeptides of Mr 23000, 21000 and 19000. This might be explained by assuming that p340 contains a copy of an mRNA encoded by a diverged multigene family. Tuber proteins showing these molecular weights have not been described in the literature.

Fig. 2. Characterization of the coding capacity of tuber cDNA clones by hybrid-selected translation. Tuber RNA eluted from filter-bound p207, p303, p322 and p340 as well as total RNA isolated from tuber, leaf and stem tissue was translated in a wheat germ extract in the presence of 35S-methionine. The products obtained were separated on a 12.5% polyacrylamide gel and fluorographed. The molecular weight of marker proteins is indicated as Mr × 10-3.

Nucleotide sequence analysis of the cDNA recombinants containing tuber mRNAs

pPATB1 and pPATB2

Recombinant p207 contained an insert of ca. 750

B2 BI B2 BI

I i0 20 30 40 50 60 GAAAACACTTTGAACATTTGCAAA ATG GCA ACT ACT AAA TCT TTT TTA ATT TTA TTT TTT Met Ala Thr Thr Lys Ser Phe Leu Ile Leu Phe Phe

70 80 90 i00 ii0 120 B2 ATG ATA TTA GCA ACT ACT AGT TCA ACA TGT GCT AAG TTG GAA GAA ATG GTT ACT GTT CTA C BI B2 Met lle Leu Ala Thr Thr Ser Ser Thr Cys Ala Lys Leu Glu Glu Met Val Thr Val Leu BI 130 140 150 160 170 180 B2 AGT ATT GAT GGA GGT GGA ATT AAG GGA ATC ATT CCA GCT ATC ATT CTC GAA TTT CTT GAA BI B2 Ser lle Asp Gly Gly Gly lle Lys Gly lle lle Pro Ala lle lle Leu Glu Phe Leu Glu BI

190 200 210 220 230 240 B2 GGA CAA CTT CAG GAA GTG GAC AAT AAT AAA GAT GCA AGA CTT GCA GAT TAC TTT GAT GTA B1 B2 Gly Gln Leu Gln Glu Val Asp Asn Asn Lys Asp Ala Arg Leu Ala Asp Tyr Phe Asp Val BI 250 260 270 280 290 300 B2 ATT GGA GGA ACA AGT ACA GGA GGT TTA TTG ACT GCT ATG ATA ACT ACT CCA AAT GAA AAC BI B2 lle Gly Gly Thr Ser Thr Gly Gly Leu Leu Thr Ala Met lle Thr Thr Pro Asn Glu Asn BI

310 320 330 340 350 360 B2 AAT CGA CCC TTT GCT GCT GCC AAA GAT ATT GTA CCC TTT TAC TTC GAA CAT GGC CCT CAT B1

B2 Asn Arg Pro Phe Ala Ala Ala Lys Asp lie Val Pro Phe Tyr Phe Glu His Gly Pro His B1 370 380 390 400 410 420 B2 ATT TTT AAT TAT AGT GGT TCA ATT TTA GGC CCA ATG TAT GAT GGA AAA TAT CTT CTG CAA T G BI B2 lle PhelAsn Tyr SerlGly Ser lie Leu Gly Pro Met Tyr Asp Gly Lys Tyr Leu Leu Gln BI Phe Arg 430 440 450 460 470 480 B2 GTT CTT CAA GAA AAA CTT GGA GAA ACT CGT GTG CAT CAA GCT TTG ACA GAA GTT GCC ATC BI B2 Val Leu Gin Glu Lys Leu Gly Glu Thr Arg Val His Gin Ala Leu Thr Glu Val Ala lie BI 490 500 510 520 530 540 B2 TCA AGC TTT GAC ATC AAA ACA AAT AAG CCA GTA ATA TTC ACT AAG TCA AAT TTA GCA AAG A BI B2 Ser Ser Phe Asp lle Lys Thr Asn Lys Pro Val lle Phe Thr Lys Ser Asn Leu Ala Lys 550 560 570 580 590 600 B2 TCT CCA GAA TTG GAT GCT AAG ATG TAT GAC ATA TGC TAT TCC ACA GCA GCA GCT CCA ATA T BI B2 Ser Pro Glu Leu Asp Ala Lys Met Tyr Asp lie Cys Tyr Ser Thr Ala Ala Ala Pro lie BI lie 610 620 630 640 650 660 B2 TAT TTT CCT CCA CAT CAC TTT GTT ACT CAT ACT AGT AAT GGT GCT AGA TAT GAG TTC AAT BI C B2 Tyr Phe Pro Pro His His Phe Val Thr His Thr Set Asn Gly Ala Arg Tyr Glu Phe Asn BI Thr

670 680 690 700 710 720 B2 CTT GTT GAT GGT GCT GTT GCT ACT GTT GGT GAT CCG GCG TTA TTA TCC CTT AGC GTT GCA G BI B2 Leu Val Asp Gly Ala Val Ala Thr Val Gly Asp Pro Ala Leu Leu Ser Leu Ser Val Ala BI Gly 730 740 750 760 770 780 B2 ACG AGA CTT GCA CAA GAG GAT CCA GCA TTT TCT TCA ATT AAG TCA TTG GAT TAC AAG CAA B1

B2 Thr Ar~ Leu Ala Gln Glu Asp Pro Ala Phe Ser Ser lie Lys Ser Leu Asp Tyr Lys Gin B1

790 800 810 820 830 840 B2 ATG TTG TTG CTC TCA TTA GGC ACT GGC ACT AAT TCA GAG TTT GAT AAA ACA TAT ACA GCA B1

B2 Met Leu Leu Leu Ser Leu Gly Thr Gly Thr Asn Ser Glu Phe Asp Lys Thr Tyr Thr Ala B1

850 860 870 880 890 900 B2 GAA GAG GCA GCT AAA TGG GGT CCT CTA CGA TGG ATG TTA GCT ATA CAG CAA ATG ACT AAT T BI B2 Glu Glu Ala Ala Lys Trp Gly Pro Leu Arg Trp Met Leu Ala lle Glu Glu Met Thr Asn BI Leu 910 920 930 940 950 960 B2 GCA GCA AGT TCT TAC ATG ACT GAT TAT TAC ATT TCT ACT GTT TTT CAA GCT CGT CAT TCA B1 B2 Ala Ala Ser Ser Tyr Met Thr Asp Tyr Tyr lie Ser Thr Val Phe Gin Ala Arg His Ser BI 970 980 990 i000 i010 1020 B2 CAA AAC AAT TAC CTC AGG GTT CAA GAA AAT GCA TTA AAT GGC ACA ACT ACT GAA ATG GAT BI CA B2 Gin Asn Asn Tyr Leu Arg Val Gln Glu Asn Ala Leu Asn Gly Thr Thr Thr Glu Met Asp BI Thr 1030 1040 1050 1060 1070 1080 B2 GAT GCG TCT GAG GCT AAT ATG GAA TTA TTA GTA CAA GTT GGT GAA ACA TTA TTG AAG AAA B1

B2 Asp Ala Ser Glu Ala Asn Met Glu Leu Leu Val Gln Val Gly Glu Thr Leu Leu Lys Lys B1 1090 II00 iii0 1120 1130 1140 B2 CCA GTT TCC AAA GAC AGT CCT GAA ACC TAT GAG GAA GCT CTA AAG AGA TTT GCA AAA TTG BI G B2 Pro Val Ser Lys Asp Ser Pro Glu Thr Tyr Glu GLu Ala Leu Lys Arg Phe Ala Lys Leu BI 1150 1160 1170 1180 1190 1200 B2 CTC TCT GAT AGG AAG AAA CTC CGA GCA AAC AAA GCT TCT CAT TAATTCAAGGTCCCGGGTTGTAG A G T T BI B2 Leu Ser Asp Arg Lys Lys Leu Arg Ala Asn Lys Ala Ser His BI Asn Tyr 1210 1220 1230 1240 1250 1260 1270 1280 B2 TAGTTAAATAATAAGCGCTTGCAATATTTATGATCTGCACGCATTTAAATATTTCAACCCTCAAAC BI G AACCTTACTATGC -T G T A 1290

1300 1310 1320 1330 1340 1350 1360

B2 TAAAAGGAGTTTGAGGGATAAATTTCAATAGAAATGTCTCTCTATGTAATGTGTGCTTGGATTATGTAACCTTTTGGTT BI

1370 1380 1390 1400 B2 G T G T T A A A T A T T T A A A T A A - T T A T C C T T T A ~ B1 A G ATTTATGTTCAAGT

Fig. 3. Nucleotide sequence of the inserts of clones pPATB1 and pPATB2. The copy of the patatin mRNA was cloned by GC tailing in the Pst I site of pBR322. These tails are not shown here. The 5'-terminal nucleotide of the insert of pPATB1 is indicated by the triangle. Putative polyadenylation sites are underlined, and the glycosylation site is boxed.

nucleotides (data not shown) while it hybridized to an mRNA having a length of ca. 1500 nucleotides (Fig. 1). Obviously p207 did not contain a full-length copy of the cloned mRNA and therefore the cDNA library was rescreened using the insert of p207 as a probe. A number of cross-hybridizing cDNA clones were detected, two of which, pPATB1 and pPATB2, were analysed further. The complete nucleotide sequence of the inserts of pPATB1 and pPATB2 was determined (Fig. 3). The insert of pPATB2 showed a length of 1377 bp (the GC tails and the poly(A)+ tail not included). The insert of pPATB1, lacking 50 nucleotides at the 5' end compared to pPATB2, showed 13 nucleotide changes and 9 amino acid substitutions resulting in an overall homology of more than 97% with pPATB2. In both recombinants one open reading frame could be detected. In pPATB2 this sequence could accommodate a potential polypeptide of 386 amino acids having a molecular weight of 42610. The deduced amino acid sequence of this polypeptide shows more than 95% homology with the patatin amino acid sequence as determined by Mignery et al. [29]. The homology of pPATB2 with the patatin amino acid sequence as determined by Rosahl et al. [39] in another potato cultivar extends even to 385 out of 386 amino acids. Only the carboxy-terminal (C-terminal) tyrosine residue as determined by Rosahl et al. has changed to histidine in pPATB2 (TAT→CAT). These data imply that pPATB1 and pPATB2 contain copies of two different cv. Bintje patatin mRNAs. In total, we have isolated cDNA clones containing copies of five different patatin mRNAs (data not shown), which indicates that at least five different patatin genes are actively expressed in the tetraploid potato cv. Bintje. The patatin mRNA copy cloned in pPATB2 contained an untranslated leader sequence of 23 nucleotides. Recently Pikaard et al. [37] reported a Class I patatin mRNA leader of 37 nucleotides in cv. Superior as determined by S1 nuclease protection experiments.
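The reading of an open reading frame from a cDNA sequence, as described above for pPATB2, can be illustrated with a minimal sketch. It assumes the standard genetic code and uses only the first twelve codons of the pPATB2 reading frame as printed in Fig. 3; the helper `translate` and the constant names are illustrative and are not the Staden programs used for the original analysis.

```python
# Translate a cDNA reading frame with the standard genetic code.
# The codon table is built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cdna: str) -> str:
    """Translate complete codons from the first base up to the first stop codon."""
    cdna = cdna.upper().replace(" ", "")
    protein = []
    for pos in range(0, len(cdna) - 2, 3):
        residue = CODON_TABLE[cdna[pos:pos + 3]]
        if residue == "*":  # stop codon ends the reading frame
            break
        protein.append(residue)
    return "".join(protein)


# First twelve codons of the pPATB2 open reading frame as printed in Fig. 3.
PPATB2_START = "ATG GCA ACT ACT AAA TCT TTT TTA ATT TTA TTT TTT"

if __name__ == "__main__":
    print(translate(PPATB2_START))
    # The C-terminal difference discussed above: TAT versus CAT.
    print(translate("TAT"), translate("CAT"))
```

The single-base change TAT to CAT at the last codon corresponds to the tyrosine to histidine substitution noted above.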
The 23-nucleotide cv. Bintje patatin leader showed complete homology to the corresponding part of the cv. Superior patatin mRNA leader. By analogy we concluded that the pPATB2 mRNA is encoded by a Class I patatin gene which is only expressed in tubers, in contrast to Class II genes which are expressed in tubers and to a very low extent in roots [37]. Class II mRNAs contain a 22 bp insert 9 nucleotides upstream of the ATG translation initiation codon, as opposed to Class I mRNAs. In accordance with the in vivo glycosylated nature of the patatin polypeptide the patatin precursor protein contains a signal peptide of 23 amino acids. A putative glycosylation site in the mature protein could be assumed on the basis of the N-glycosylation site rule as formulated by Sharon and Lis [47] and formed by Asn-Tyr-Ser in both pPATB1 and pPATB2. The 3' untranslated regions contained four putative polyadenylation signals (AATAA), two of which were located near the polyadenylation site. It is tempting to speculate that the pPATB2 mRNA transcript has used the more distal of these two polyadenylation sites (Fig. 3) for termination in comparison to the pPATB1 transcript.

p303

Part of the insert of cDNA clone p303 (200 nucleotides) has been sequenced (data not shown). Comparison of the determined nucleotide sequence with the potato inhibitor II mRNA nucleotide sequence as published recently by Sanchez-Serrano et al. [43] shows more than 90% homology. So, this cDNA clone contained a copy of the cv. Bintje proteinase inhibitor II mRNA. We did not examine this clone further.

p322

As shown in Fig. 4, the insert of recombinant p322 has a length of 470 bp (excluding the GC tails and a poly(A)+ tail of 30 residues). The determined nucleotide sequence allowed for one open reading frame coding for a polypeptide of 76 amino acids having a molecular weight of 8614. The coding capacity of p322 as determined by hybrid-release translation and SDS-polyacrylamide gel electrophoresis has been estimated at ca. Mr 10000 (Fig. 2). This discrepancy might be explained by anomalous migration of the polypeptide on the gel system used, possibly due to differences in the extent of SDS binding [18], or by assuming that the first methionine codon (nucleotides 7-9) does not function as a start codon. The sequences surrounding this triplet do

  1 CTA TCC ATG CGT TTC TTT GCT ACT TTC TTT CTT CTA GCT ATG CTT GTC GTG GCT ACT AAG
    Leu Ser Met Arg Phe Phe Ala Thr Phe Phe Leu Leu Ala Met Leu Val Val Ala Thr Lys

 61 ATG GGA CCA ATG AGA ATT GCA GAG GCA AGA CAT TGC GAG TCG TTG AGC CAT CGT TTC AAG
    Met Gly Pro Met Arg Ile Ala Glu Ala Arg His Cys Glu Ser Leu Ser His Arg Phe Lys

121 GGA CCA TGT ACG AGA GAT AGC AAT TGT GCT TCG GTC TGT GAG ACC GAA AGA TTT TCC GGT
    Gly Pro Cys Thr Arg Asp Ser Asn Cys Ala Ser Val Cys Glu Thr Glu Arg Phe Ser Gly

181 GGC AAT TGC CAT GGA TTC CGT CGC CGT TGC TTT TGC ACT AAG CCA TGC TAAATGAGTATTAAAAAT
    Gly Asn Cys His Gly Phe Arg Arg Arg Cys Phe Cys Thr Lys Pro Cys

    TATGTGTAATAGAAGAAGTTTGAGAAAAAAATTATGTACTCTTGAATAAAGTACACTATGATTGTTCAAAGATATATGTGGT
    GCTAGTTTTGTTTGTAAAACTAGTCGTGATCTTTGAATTTATATGCAATTATGGTGCACTAGACTTGTTAATTTCTTCATG
    TGATGTATTTTTTGCTCTTTTGTTATGAAATATTATGGATAAAATTTGTCTTTTAGTCTTTA

Fig. 4. Nucleotide sequence of the insert of clone p322 (except the GC tails). The putative polyadenylation signals are underlined as well as the putative methionine initiation codon (nucleotides 7-9). The direct inverted repeat in the 3' untranslated region is overlined. The putative proteolytic cleavage site is indicated by an arrow.

not obey the Kozak rule [20]. Instead of purines, a thymidine residue is found at position -3 while a cytidine residue is located at position +4. Therefore, recombinant p322 might not contain a full-length copy of the corresponding mRNA. On the other hand, comparison of the amino-terminal (N-terminal) amino acid sequence to published protein sequences showed homology to some of these. Starting at the methionine residue mentioned, the p322 polypeptide showed 54% identity to the N-terminal part of the signal peptide of the human lactalbumin precursor protein [14] (7 out of 13 N-terminal amino acids), suggesting the presence of a signal peptide in the N-terminal domain of p322. This was further supported by the presence of the amino acid sequence phe-phe-leu-leu-ala (nucleotides 25 to 40), also found in the signal peptide of the proteinase inhibitor I of potato ([4], W. J. Stiekema, unpublished results). Inspection of the hydrophobicity profile (data not shown) indeed showed a strongly hydrophobic region at the N-terminus of the p322 polypeptide. The cleavage site is predicted between ala-29 and arg-30. This site obeys the (-3, -1) rule proposed by Von Heijne [53], which requires a small residue at -1 (ala) whereas at -3 the amino acid may not be aromatic, charged or large and polar (ala). Moreover, the putative signal peptide contains a hydrophobic core of 14 amino acids which is flanked at its N-terminal side by an arginine at position 4 and at its C-terminal side by 10 amino acids, of which two are basic (lys-20, arg-25) and one is acidic (glu-28). Cloning of the p322 gene and S1 nuclease analysis or primer extension experiments will ultimately show whether or not the p322 cDNA recombinant contains a full-length copy of the mRNA.

[Fig. 5 alignment; the arrow, the asterisk and the identity and substitution markers are not reproduced.]
I:  LSMRFFATFFLLAMLVVATKMGPMRIAEARHCESLSHRFK GPCTRNSNCASVCETERFSGGNCHGFRRRCFCTKPC
II: SDQSSSYDDDEYSKPCCDLCMCTRSMPPQCSCEDIRLN -SC--HSDCKS-CMCTRSQPGQCRCLDTNDFCYKPCKSRDD

Fig. 5. Comparison of the amino acid sequence of the p322 polypeptide (I) and the mature amino acid sequence of the proteinase inhibitor CII (Bowman-Birk) of soybean (II) as determined by protein sequencing. The arrow shows the putative proteolytic cleavage site in I. Identical amino acids are depicted by ●, whereas conservative amino acid substitutions are depicted by ○. The putative active site of the proteinase inhibitor II is Arg-38 as indicated by *.

At the C-terminus of the p322 polypeptide another remarkable homology is found (Fig. 5). Five of the six amino acids of the C-terminal sequence phe-cys-tyr-lys-pro-cys appeared to be identical to the C-terminal part of the Bowman-Birk trypsin inhibitor of soybean [31, 32]. Comparison of the mature part of this proteinase inhibitor to the putative mature part of the p322 polypeptide showed an overall homology of 30%. If conservative amino acid changes are disregarded this homology increases to more than 50%. It is also striking that 8 of the 14 cysteine residues of the soybean proteinase inhibitor are found in similar positions in the p322 polypeptide, while the other six cysteine residues are part of an N-terminal domain of the soybean mature inhibitor not present in the p322 polypeptide. This suggests similar types of disulphide bridges in both polypeptides. The 3'-terminal untranslated region of the p322 mRNA contains a number of palindromes. The most remarkable one consists of 25 nucleotides present immediately after the stop codon. Putative polyadenylation sites (GATAA and AATAT) are found 18 bp and 30 bp upstream of the poly(A)+ tail respectively.
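The start-codon context argument made for p322 can be checked with a small sketch. It follows the wording of the text, which expects purines at positions -3 and +4 relative to the A of the ATG (numbered +1); the function name `kozak_context` and the dictionary keys are illustrative assumptions, and the test sequence is the first twelve nucleotides of the p322 insert as printed in Fig. 4.

```python
# Inspect the bases at -3 and +4 around a putative ATG start codon.
# Numbering assumption: the A of the ATG is +1 and there is no position 0,
# so -3 lies three bases upstream of the A and +4 directly follows the G.
PURINES = {"A", "G"}


def kozak_context(mrna: str, atg_index: int) -> dict:
    """Report the bases at -3 and +4 for an ATG starting at atg_index (0-based)."""
    seq = mrna.upper().replace(" ", "")
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("sequence too short to inspect positions -3 and +4")
    minus3 = seq[atg_index - 3]
    plus4 = seq[atg_index + 3]
    return {
        "minus3": minus3,
        "plus4": plus4,
        "minus3_purine": minus3 in PURINES,
        "plus4_purine": plus4 in PURINES,
    }


# First twelve nucleotides of the p322 insert (Fig. 4); the ATG occupies
# nucleotides 7-9, i.e. 0-based index 6.
P322_START = "CTA TCC ATG CGT"

if __name__ == "__main__":
    print(kozak_context(P322_START, 6))
```

For the p322 sequence this reports T at -3 and C at +4, neither of which is a purine, matching the observation in the text.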

p340

The nucleotide sequence of the insert of cDNA clone p340 and part of the sequence of the insert of a cross-hybridizing clone, p34021, were determined (Fig. 6). The insert of p340 has a length of 560 nucleotides (excluding the A and GC tails). One open reading frame could be detected potentially encoding a polypeptide of 145 amino acids having an Mr of 16445. No N-terminal methionine was found, indicating that p340 did not contain a full-length copy of the cloned mRNA. The cross-hybridizing clone p34021 contained an insert of 786 nucleotides (excluding the A and GC tails). This insert completely overlapped p340, but additionally it contained extra nucleotides at the 5' end compared to p340. The nucleotide sequences of only strategic regions of the p34021 insert were determined (Fig. 6), including the 5' and 3' terminal regions: bp 1-303 (50 bp overlap with the 5' end of p340) and bp 609-819 (complete overlap with the 3' end of p340). The overlapping regions of p340 and p34021 showed complete sequence identity. A combination of both sequences would lead to an open reading frame of 220 amino acids having an Mr of 23915, in agreement with the hybrid-selected translation data. Unfortunately, p34021 also did not contain a putative N-terminal methionine. Nevertheless the hydrophobicity pattern (data not shown) predicted a partial signal peptide of at least 20 amino acids. The proteolytic processing site is most likely located between ala-20 and arg-21. The (-3, -1) rule [53] is obeyed in this region. This suggests that p34021 contained a nearly full-length copy of the cloned mRNA. Comparison of the p34021 amino acid sequence to published protein sequences showed 30% homology to the C-terminal 90 amino acids of the Kunitz trypsin inhibitor of winged bean (Fig. 7) [55]. This homology would increase to more than 50% if conservative amino acid changes were to be disregarded. It is remarkable that in a small region of the C-terminal part an even larger homology could be detected: 70% homology was shown in the region comprising amino acids 199 to 207. Moreover three of the four cysteine residues were located in identical positions while the fourth was found in corresponding regions of both peptides. Two overlapping AATAAA sequences are present 26 nucleotides upstream of the polyadenylation site, in agreement with the consensus sequence AATAAA found 25-30 nucleotides in front of the poly(A)+ addition site in most eukaryotic genes analysed.
It is therefore likely that this sequence serves as a poly(A) ÷ additional signal. A second AATAAA sequence is present 83 nucleotides upstream of the poly(A) ÷ tail. The significance of this sequence is unknown.
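The distances quoted for the AATAAA sequences can be checked against the 3' untranslated region shown in Fig. 6. The following is a minimal sketch, not the authors' procedure. It assumes that the untranslated sequence transcribed here from Fig. 6 (positions 661-786, starting with the TAG stop codon) is correct and that the poly(A)+ tail begins immediately after position 786. The names `UTR` and `motif_gaps` are illustrative only.

```python
# 3' untranslated region, positions 661-786, transcribed from Fig. 6.
# Assumption: the poly(A)+ tail starts directly after the last base given here.
UTR = (
    "TAGAATGCTAATTAGCTGGCTAGCTTGTAGCTTTCTAAATAAAGTGCATATATCCTTCTA"
    "TCGCTCCATGTAAATTAATGTATGCTTATCAATAAATAAACAAGCTAGCAATTAGCCTAT"
    "TACCTT"
)
MOTIF = "AATAAA"


def motif_gaps(seq, motif):
    """Return, for every (possibly overlapping) occurrence of motif, the
    number of nucleotides between the end of the motif and the end of seq."""
    gaps = []
    start = seq.find(motif)
    while start != -1:
        gaps.append(len(seq) - (start + len(motif)))
        # Advance by one base only, so that overlapping copies are found.
        start = seq.find(motif, start + 1)
    return gaps


gaps = motif_gaps(UTR, MOTIF)
print(gaps)  # [83, 30, 26]
```

Counted this way, from the last base of each motif to the start of the poly(A)+ tail, the upstream copy lies 83 nucleotides and the second of the two overlapping copies 26 nucleotides from the tail, as stated in the text.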

   1  TCG ATT AAT ATT TTG AGT TTC CTC TTG CTT TCA AGT ACC CTC TCT TTG GTT GCC TTT GCT   60
      Ser Ile Asn Ile Leu Ser Phe Leu Leu Leu Ser Ser Thr Leu Ser Leu Val Ala Phe Ala ↓

  61  CGA TCT TTC ACT TCT GAG AAT CCA ATT GTC CTC CCC ACA ACT TGT CAT GAT GAT GAT AAT  120
      Arg Ser Phe Thr Ser Glu Asn Pro Ile Val Leu Pro Thr Thr Cys His Asp Asp Asp Asn

 121  CTT GTA CTC CCT GAA GTT TAT GAC CAA GAT GGC AAT CCG CTG AGG ATT GTG AGA GGT ACA  180
      Leu Val Leu Pro Glu Val Tyr Asp Gln Asp Gly Asn Pro Leu Arg Ile Val Arg Gly Thr

 181  TTA TTA ACA ATC CTC TCC TCG GGG CCG GAG CCG TAT ACT TGT ACA ATA TTG GAA ACC TTC  240
      Leu Leu Thr Ile Leu Ser Ser Gly Pro Glu Pro Tyr Thr Cys Thr Ile Leu Glu Thr Phe

 241  AAT GCC CAA ATG CAC GTG TTG CAG CAC ATG TCG ATT CCC CAA TTT TTG GGA GAA GGC ACG  300
      Asn Ala Gln Met His Val Leu Gln His Met Ser Ile Pro Gln Phe Leu Gly Glu Gly Thr

 301  CCC GTC GTG TTC GTT CGT AAG TCG GAG TCG GAT TAT GGT GAT GTG GTG CGT GTA ATG ACT  360
      Pro Val Val Phe Val Arg Lys Ser Glu Ser Asp Tyr Gly Asp Val Val Arg Val Met Thr

 361  GTT GTT TAT ATC AAG TTC TTT GTT AAA ACA ACA AAG TTG TGT GTT GAC CAA ACT GTT TGG  420
      Val Val Tyr Ile Lys Phe Phe Val Lys Thr Thr Lys Leu Cys Val Asp Gln Thr Val Trp

 421  AAA GTT AAT GAT GAA CAG TTG GTG GTA ACT GGT GGT AAG GTA GGA AAT GAA AAC GAC ATC  480
      Lys Val Asn Asp Glu Gln Leu Val Val Thr Gly Gly Lys Val Gly Asn Glu Asn Asp Ile

 481  TTC AAG ATT ATG AAA ACT GAC TTG GTG ACA CCA AGA GGT TCC AAA TAT GTA TAC AAG TTA  540
      Phe Lys Ile Met Lys Thr Asp Leu Val Thr Pro Arg Gly Ser Lys Tyr Val Tyr Lys Leu

 541  CTG CAT TGT CCC TCT CAT CTT GGG TGC AAA AAT ATC GGC GGC AAC TTT AAA AAT GGA TAT  600
      Leu His Cys Pro Ser His Leu Gly Cys Lys Asn Ile Gly Gly Asn Phe Lys Asn Gly Tyr

 601  CCT CGT CTG GTG ACT GTC GAT GAC GAT AAG GAC TTT ATT CCA TTT GTG TTC ATC AAG GCG  660
      Pro Arg Leu Val Thr Val Asp Asp Asp Lys Asp Phe Ile Pro Phe Val Phe Ile Lys Ala

 661  TAGAATGCTAATTAGCTGGCTAGCTTGTAGCTTTCTAAATAAAGTGCATATATCCTTCTA  720
 721  TCGCTCCATGTAAATTAATGTATGCTTATCAATAAATAAACAAGCTAGCAATTAGCCTAT  780
 781  TACCTT AAAAAAAAAAAAAAAA

Fig. 6. Nucleotide sequence of the insert of clone p340 (except GC tails) and part of the nucleotide sequence of p34021 (nucleotides 1-303 and 609-786), with the deduced amino acid sequence. Positions 1-225 derive from p34021 only; p340 begins at position 226. The two clones are identical wherever both were sequenced, so a single sequence is shown. The putative proteolytic cleavage site, between Ala-20 and Arg-21, is indicated by the arrow. The putative polyadenylation signals (AATAAA) lie at positions 698-703 and, as two overlapping copies, at positions 751-760. The poly(A) tail is that of p340.
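The reading-frame lengths given in the text can be reproduced from the sequence in Fig. 6 by a straightforward translation. The sketch below is illustrative and is not part of the original analysis. It assumes the standard genetic code and the coding sequence as transcribed here from Fig. 6 (positions 1-663, including the TAG stop codon), with p340 starting at position 226. The names `ROWS`, `CODON_TABLE` and `translate` are hypothetical helpers.

```python
# Coding region of the combined p34021/p340 sequence (positions 1-663),
# transcribed from Fig. 6; a transcription error would change the result.
ROWS = [
    "TCG ATT AAT ATT TTG AGT TTC CTC TTG CTT TCA AGT ACC CTC TCT TTG GTT GCC TTT GCT",
    "CGA TCT TTC ACT TCT GAG AAT CCA ATT GTC CTC CCC ACA ACT TGT CAT GAT GAT GAT AAT",
    "CTT GTA CTC CCT GAA GTT TAT GAC CAA GAT GGC AAT CCG CTG AGG ATT GTG AGA GGT ACA",
    "TTA TTA ACA ATC CTC TCC TCG GGG CCG GAG CCG TAT ACT TGT ACA ATA TTG GAA ACC TTC",
    "AAT GCC CAA ATG CAC GTG TTG CAG CAC ATG TCG ATT CCC CAA TTT TTG GGA GAA GGC ACG",
    "CCC GTC GTG TTC GTT CGT AAG TCG GAG TCG GAT TAT GGT GAT GTG GTG CGT GTA ATG ACT",
    "GTT GTT TAT ATC AAG TTC TTT GTT AAA ACA ACA AAG TTG TGT GTT GAC CAA ACT GTT TGG",
    "AAA GTT AAT GAT GAA CAG TTG GTG GTA ACT GGT GGT AAG GTA GGA AAT GAA AAC GAC ATC",
    "TTC AAG ATT ATG AAA ACT GAC TTG GTG ACA CCA AGA GGT TCC AAA TAT GTA TAC AAG TTA",
    "CTG CAT TGT CCC TCT CAT CTT GGG TGC AAA AAT ATC GGC GGC AAC TTT AAA AAT GGA TAT",
    "CCT CGT CTG GTG ACT GTC GAT GAC GAT AAG GAC TTT ATT CCA TTT GTG TTC ATC AAG GCG",
    "TAG",
]

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate dna in frame 1 (one-letter code) up to the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


cds = "".join(ROWS).replace(" ", "")
protein = translate(cds)             # combined p34021 + p340 reading frame
p340_protein = translate(cds[225:])  # p340 alone, starting at position 226

print(len(protein), len(p340_protein))        # 220 145
print(protein[17], protein[19], protein[20])  # A A R
```

The translation gives 220 residues for the combined sequence and 145 for p340 alone, matching the text. Residues 18 and 20 (positions -3 and -1 relative to the proposed cleavage site) are both alanine, and residue 21 is arginine.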

[Fig. 7, partial: p34021 amino acid sequence around positions 130-160, FVKTTKLCVDQTVWKVNDEQLVVTGGKVGN-ENDI, from the alignment with the winged bean Kunitz trypsin inhibitor.]
