Plant Molecular Biology 20: 311-313, 1992. © 1992 Kluwer Academic Publishers. Printed in Belgium.


Update section

Short communication

Isolation and sequence analysis of the genomic DNA fragment encoding an aspartic proteinase inhibitor homologue from potato (Solanum tuberosum L.)

Darja Barlič Maganja, Borut Štrukelj, Jože Pungerčar, Franc Gubenšek, Vito Turk and Igor Kregar

Department of Biochemistry, Jožef Stefan Institute, Jamova 39, 61000 Ljubljana, Slovenia

Received 1 April 1992; accepted in revised form 10 April 1992

Key words: potato, Solanum tuberosum, aspartic proteinase inhibitor, gene structure

Abstract

A genomic DNA clone encoding an aspartic proteinase inhibitor of potato was isolated from a lambda EMBL3 phage library using the aspartic proteinase inhibitor cDNA as a hybridization probe. The gene has all the characteristic sequences normally found in eucaryotic genes. Typical CAAT and TATA box sequences were found in the 5'-upstream region, which also contains two putative regulatory AGGA box sequences. No introns interrupt the coding region of the genomic sequence. An open reading frame of the gene encodes a precursor protein of 217 amino acids which shows high percent identity with the protein deduced from the aspartic proteinase inhibitor cDNA.

Plant storage organs such as seeds and tubers are a rich source of proteins that potently inhibit several proteolytic enzymes [10]. Several inhibitors of metallocarboxypeptidases and of serine, cysteine and aspartic proteinases have been isolated from potato tubers [11]. The first aspartic proteinase inhibitor was isolated from potato tubers in 1976 [5]. It is an inhibitor of cathepsin D, the major mammalian lysosomal aspartic proteinase, and it also inhibits trypsin, a serine proteinase. Two isoforms of the cathepsin D inhibitor from potato have been sequenced [6, 9], and the primary structure of a third was deduced from its cDNA sequence [15]. Southern blot analysis of potato genomic DNA reveals more than ten genomic fragments per haploid genome (Štrukelj et al., in press), indicating that the aspartic proteinase inhibitor homologue genes belong to a moderately sized multigene family.

We screened a genomic library of monohaploid potato calli (genotype AM 57/93 from the collection of the Max Planck Institute in Cologne) constructed in the lambda replacement vector EMBL3. Escherichia coli Y1090 was used as the host. Screening of about 3 × 10⁴ recombinant plaques with the 35S-labeled, random-primed [1] cDNA probe [15] yielded one positive clone. The insert of this clone was digested with EcoRI and BamHI. A 2.0 kb internal EcoRI/BamHI fragment hybridizing to the same probe was isolated and subcloned into the pUC19 vector. For sequence analysis, the DNA insert was further digested with appropriate restriction enzymes and subcloned into convenient restriction sites of plasmid pUC19. The nucleotide sequences were determined from both ends of the inserts using the dideoxy chain termination method [14].

The genomic EcoRI/Sau3A fragment of about 1.3 kb (Fig. 1) contains the entire RNA coding region as well as 252 nucleotides of the 5'-upstream and 401 nucleotides of the 3'-downstream region. Two putative regulatory sequences are present in the gene: a CAAT sequence beginning at position 135 and a TATAAA sequence beginning at position 177. At positions 61 and 121 there are two sequences (TAAGGAATT and TAAGGAACT) that are related to the AGGA box consensus sequence YA(2-5)KNGA(2-4)YY. The AGGA box is found in a number of other plant genes about 80 bp upstream of the transcription initiation site. It might be involved in the regulation of gene transcription, as it bears some resemblance to the animal core enhancer sequence GTGGWWW [4]. At the 3'-end, two polyadenylation signals (AATAAA) are found beginning at positions 977 and 1023. The gene belongs to the rare class of eucaryotic genes that do not contain intervening sequences. Although it is intronless, it shows all the characteristics of a functional gene in its promoter, protein-coding and 3'-untranslated regions.

Fig. 1. Nucleotide sequence and deduced amino acid sequence of the cathepsin D inhibitor homologue gene. The AGGA, CAAT, TATA and polyadenylation sequences are underlined. The presumptive signal sequence is also underlined. The two stop codons at the end of the coding region are marked with asterisks.

An open reading frame of the gene encodes a precursor protein of 217 amino acids. Comparison of its amino acid sequence with the protein sequences of the aspartic proteinase isoinhibitors [6, 9] and with the sequence deduced from the cDNA [15] indicates that the first amino acid residue of the mature protein is probably an aspartic acid at position 32. The first 31 amino acid residues show characteristics similar to those of other eucaryotic signal sequences [8, 16]. We propose that this region is removed post-translationally, during or after transport into the vacuole, as has been shown for some other proteinase inhibitors isolated from tomato and potato plants [2, 3, 7, 13, 17]. Over the entire protein-coding region, the genomic sequence shares 83% nucleotide and 75% amino acid identity with the cDNA encoding an aspartic proteinase inhibitor [15]. The sequences of the mature proteins deduced from the genomic and cDNA nucleotide sequences show 73% identical amino acid residues. Comparison of the amino-terminal sequence of the mature form deduced from the genomic clone with that of the Kunitz-type proteinase inhibitor PKI-1 [18] shows strong similarity in the first 42 amino acid residues. The three-dimensional structure and the active site of the molecule responsible for the inhibition of cathepsin D are still a subject of current research [6]. However, the putative active center for trypsin inhibition has been identified as Arg-Phe at positions 95-96 [6].

The nucleotide sequence data reported will appear in the EMBL, GenBank and DDBJ Nucleotide Sequence Databases under the accession number X64370. The standard abbreviations for nucleotides were used in agreement with the Nomenclature Committee of the International Union of Biochemistry, 1985 [12]: Y = C or T, K = G or T, W = A or T and N = A, C, G or T.
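The ambiguity codes used in the consensus sequences of this paper (Y, K, W and N) can be expanded mechanically into a search pattern. The following is a minimal sketch in Python, not the procedure used by the authors: the function names `consensus_to_regex` and `find_motif` and the toy sequence are hypothetical and serve only as illustration. Only the four ambiguity codes defined in the paper are supported.

```python
import re
from typing import List

# Ambiguity codes as defined in the paper; plain bases map to themselves.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]",    # Y = C or T
    "K": "[GT]",    # K = G or T
    "W": "[AT]",    # W = A or T
    "N": "[ACGT]",  # N = A, C, G or T
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus written with IUB codes into a regex pattern."""
    return "".join(IUB_CODES[base] for base in consensus.upper())


def find_motif(consensus: str, sequence: str) -> List[int]:
    """Return the 1-based start position of every match, overlaps included."""
    # Wrapping the pattern in a lookahead reports overlapping occurrences too.
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [match.start() + 1 for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical toy sequence, NOT taken from the potato gene.
    toy_sequence = "CCGTGGAATCCGTGGTTACC"
    print(consensus_to_regex("GTGGWWW"))        # GTGG[AT][AT][AT]
    print(find_motif("GTGGWWW", toy_sequence))  # [3, 12]
```

Positions are reported 1-based to match the numbering convention used for the gene sequence. Repeat ranges, such as those in the AGGA box consensus, are not handled by this sketch and would have to be written out as explicit regular-expression quantifiers.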

Acknowledgements

We are grateful to Dr. L. Willmitzer (Institut für Genbiologische Forschung, Berlin) for kindly providing us with the potato genomic library. This work was supported by the Ministry for Science and Technology of the Republic of Slovenia.

References

1. Feinberg AP, Vogelstein B: A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal Biochem 137:266-267 (1984).
2. Graham JS, Pearce G, Merryweather J, Titani K, Ericsson L, Ryan CA: Wound-induced proteinase inhibitors from tomato leaves. J Biol Chem 260:6555-6560 (1985).
3. Graham JS, Pearce G, Merryweather J, Titani K, Ericsson L, Ryan CA: Wound-induced proteinase inhibitors from tomato leaves. J Biol Chem 260:6561-6564 (1985).
4. Heidecker G, Messing J: Structural analysis of plant genes. Annu Rev Plant Physiol 37:439-466 (1986).
5. Keilova H, Tomasek V: Isolation and some properties of cathepsin D inhibitor from potatoes. Collect Czech Chem Commun 41:489-497 (1976).
6. Mareš M, Meloun B, Pavlik M, Kostka V, Baudyš M: Primary structure of cathepsin D inhibitor from potatoes and its structure relationship to soybean trypsin inhibitor family. FEBS Lett 251:94-98 (1989).
7. Nelson CE, Ryan CA: In vitro synthesis of pre-proteins of vacuolar compartmented proteinase inhibitors that accumulate in leaves of wounded tomato plants. Proc Natl Acad Sci USA 77:1975-1979 (1980).
8. Perlman D, Halvorson HO: A putative signal peptidase recognition site and sequence in eucaryotic and procaryotic signal peptides. J Mol Biol 167:391-409 (1983).
9. Ritonja A, Križaj I, Meško P, Kopitar M, Lučovnik P, Štrukelj B, Pungerčar J, Buttle DJ, Barrett AJ, Turk V: The amino acid sequence of a novel inhibitor of cathepsin D from potato. FEBS Lett 267:13-15 (1990).
10. Ryan CA: Proteolytic enzymes and their inhibitors in plants. Annu Rev Plant Physiol 24:173-196 (1973).
11. Ryan CA: Proteinase inhibitors. In: Marcus A (ed) The Biochemistry of Plants, vol. 6, Proteins and Nucleic Acids, pp. 351-370. Academic Press, New York (1981).
12. Sambrook J, Fritsch EF, Maniatis T: Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1989).
13. Sanchez-Serrano J, Schmidt R, Schell J, Willmitzer L: Nucleotide sequence of proteinase inhibitor II encoding cDNA of potato (Solanum tuberosum) and its mode of expression. Mol Gen Genet 203:15-20 (1986).
14. Sanger F, Nicklen S, Coulson AR: DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA 74:5463-5467 (1977).
15. Štrukelj B, Pungerčar J, Ritonja A, Križaj I, Gubenšek F, Kregar I, Turk V: Nucleotide and deduced amino acid sequence of an aspartic proteinase inhibitor homologue from potato tubers (Solanum tuberosum L.). Nucl Acids Res 18:4605 (1990).
16. Von Heijne G: Patterns of amino acids near signal-sequence cleavage sites. Eur J Biochem 133:17-21 (1983).
17. Walker-Simmons M, Ryan CA: Immunological identification of proteinase inhibitors I and II in isolated tomato leaf vacuoles. Plant Physiol 60:61-63 (1977).
18. Walsh TA, Twitchell WP: Two Kunitz-type proteinase inhibitors from potato tubers. Plant Physiol 97:15-18 (1991).
