Protein Engineering vol. 4 no. 1 pp. 33-37, 1990

Cleavage-site motifs in mitochondrial targeting peptides

Ylva Gavel and Gunnar von Heijne¹,²

Research Group for Theoretical Biophysics, Department of Theoretical Physics, Royal Institute of Technology, S-100 44 Stockholm and ¹Department of Molecular Biology, Karolinska Institute Center for Biotechnology, NOVUM, S-141 52 Huddinge, Sweden

²To whom correspondence should be addressed

© Oxford University Press

Introduction

Although the mitochondrion has a genome of its own, most mitochondrial proteins are encoded in the nucleus. The cytoplasmic precursors of the nuclear-encoded proteins contain N-terminal extensions (targeting peptides or mTPs) that are responsible for targeting to the mitochondrial matrix (Roise and Schatz, 1988). The mTPs are cleaved off by matrix proteases upon import. mTPs do not share any distinct consensus sequences, but they do exhibit some common features. Their most obvious characteristic is the amino acid composition: they are rich in basic, hydrophobic and hydroxylated residues, but lack acidic amino acids (von Heijne, 1986).

Within the targeting peptides, two domains can be discerned. In the N-terminal part, there is usually a segment that has the potential to form a positively charged amphiphilic α-helix (Epand et al., 1986; von Heijne, 1986; Bedwell et al., 1989; Endo et al., 1989; Lemire et al., 1989; von Heijne et al., 1989). This domain seems to be associated with the initial targeting of the cytoplasmic precursor to the mitochondrion. The C-terminal domains of mTPs do not as a rule have a strong amphiphilic-helix potential, and probably serve as recognition domains for the matrix proteases (von Heijne et al., 1989).

The sequence requirements for correct targeting to the mitochondrion seem to be rather unspecific. Both experimental data (Allison and Schatz, 1986; Horwich et al., 1987; Roise et al., 1988) and theoretical considerations (Gavel et al., 1988) indicate that almost any N-terminal segment with the correct

Materials and methods

Sequence collection

Mitochondrial precursor sequences for which the cleavage site has been determined by amino acid sequencing of the mature N-terminus were collected from the literature. Closely related sequences were removed, leaving a sample of 69 non-homologous sequences. An annotated list of the sequences is available from the authors.

Downloaded from http://peds.oxfordjournals.org/ at McGill University Libraries on November 19, 2013

Although mitochondrial targeting peptides lack a common consensus sequence, a certain bias in the positional distribution of amino acids has recently been found. These patterns seem to be associated with cleavage of the precursor proteins by matrix processing proteases. We have extended the previous studies and found new sequence motifs that are conserved within subgroups of mitochondrial targeting peptides. These motifs have certain common themes, indicating that they are associated with cleavage by one single protease. Two of the conserved patterns have a high predictive value, but even for sequences that do not possess these patterns, a fairly accurate prediction of the cleavage site is shown to be possible. We also suggest that a well-conserved RXY↓(S/A) pattern may be used to engineer efficiently recognized cleavage sites into uncleaved or artificial mitochondrial targeting peptides.

Key words: cleavage site/matrix processing proteases/mitochondrial targeting peptides/precursor proteins/sequence motifs

overall amino acid composition can promote some level of import of a protein into the mitochondrion. However, functional artificial mTPs are usually not cleaved upon import, suggesting that more precise amino acid patterns are required for recognition by the matrix protease. In agreement with this, positional amino acid preferences have been found in the region immediately upstream from the mature amino terminus (Hendrick et al., 1989; von Heijne et al., 1989). In particular, Arg is enriched in positions -2, -3, -10 and -11 relative to the cleavage site. A bias towards hydrophobic residues in position -8 of those mTPs that have Arg in position -10 has also been noted.

The Arg in position -2 seems to be part of a recognition signal for the major matrix protease [called MPP+PEP in yeast and Neurospora, protease I in higher eukaryotes (Hawlitschek et al., 1988; Kalousek et al., 1988; Pollock et al., 1988; Ou et al., 1989)]. A number of mTPs with Arg in position -10 (or, in a few cases, -11) have been shown to be cleaved in two steps (Rosenberg et al., 1983; Schmidt et al., 1984; Hurt et al., 1985; Hartl et al., 1986; Sztul et al., 1987, 1988; Kalousek et al., 1988; Tropschug et al., 1988). In mammalian mTPs, the bond between positions -9 and -8 (or -10 and -9) is first cleaved by protease I (i.e. according to the Arg-2 signal), whereupon an additional eight- (or nine-) residue segment is removed by a distinct protease (protease II), thereby generating the mature form of the protein. Although no protease-II-like activity has so far been isolated from yeast or Neurospora mitochondria, both statistical and experimental indications for two-step cleavage have been found (Hurt et al., 1985; Hartl et al., 1986; Tropschug et al., 1988; Hendrick et al., 1989; von Heijne et al., 1989).

Since arginines are found throughout mTPs, it is obvious that the Arg-2 signal alone is not sufficient to promote cleavage by protease I. Neither is it necessary: there are many examples of mTPs that do not have arginines in positions -2 or -10, and at least some of them nevertheless seem to be cleaved by protease I (Hendrick et al., 1989).

To study the processing specificity of the matrix proteases further, we have collected a large number of mTPs with known cleavage sites. By analyzing this collection, we have found additional cleavage-site motifs that are conserved within or between subgroups of mTPs. This offers possibilities for rather reliable predictions of the position of cleavage sites, and suggests designs for 'cleavage cassettes' that may be introduced into otherwise uncleaved mTPs to direct their intra-mitochondrial removal.

Y.Gavel and G.von Heijne

Results

Amino acid frequencies as a function of position relative to the N-terminus of the mature protein were obtained for samples of mTPs with Arg in one of positions -10, -3 and -2 relative to the cleavage site (located between positions -1 and +1), and for sequences with Arg in none of these positions, as described in Materials and methods.

The pattern R-X-(F/I/L) is conserved between R-10 mTPs

In agreement with the findings of von Heijne et al. (1989) and Hendrick et al. (1989), the hydrophobic residues Phe, Ile and Leu are enriched in position -8 for the R-10 group (Figure 1). In particular, the distribution of Phe has a highly significant peak in position -8 (P < 10⁻⁶). The other groups do not show any distinct peaks in the positional distribution of Phe, Ile and Leu.

The pattern R-X-Y-(S/A) is conserved between R-3 mTPs

Unexpectedly, in the R-3 group Tyr is strongly preferred two steps downstream from the Arg (i.e. in position -1, P < 10⁻¹⁴) (Figure 2). When a Tyr is present in position -1, it is always followed by a Ser or an Ala. This results in a significant Ser peak in position +1 (P < 10⁻⁴) (Figure 3a).

A Ser is often found three residues downstream of Arg-2

For the R-2 group, no special residue seems to be enriched in the position two steps downstream from the Arg. However, just like the R-3 peptides, the R-2 peptides often have a Ser three steps downstream from the Arg, i.e. in position +2 (P < 10⁻⁴) (Figure 3b). The Ser peak in position +2 is also found for the R-none group (Figure 3c). However, due to the small size of the latter group, this observation should be viewed with some caution. Although weaker than for the R-2 and R-3 groups, an apparent enrichment of Ser three steps downstream of the critical Arg (i.e. in position -7) is found for the R-10 peptides as well

Fig. 1. Positional distribution of Phe, Ile and Leu residues for the R-10 group (horizontal axis: Position, -20 to +21).

Fig. 2. Positional distribution of Tyr residues for the R-3 group (horizontal axis: Position, -20 to +21).

(Figure 3d). As was noted by Hendrick et al. (1989), Ser is also frequently found in position -5 in this group, but no similar enrichment of Ser five residues downstream of the critical Arg is seen for any of the other groups. Although these peaks are not statistically significant on their own, the fact that the -7 peak is in the same position relative to the critical Arg as the Ser peaks in the R-2 and R-3 groups suggests that it may not be a simple statistical fluctuation. For all the groups, the residue in the position of the conserved Ser peak tends to be one of the six smallest amino acids (i.e. Gly, Ala, Ser, Pro, Val or Thr), which are found there in 72% of the sequences; the expected frequency is 40% for mature parts and 50% for mTPs.

Discussion

Significance of sequence patterns

The results presented above indicate that a small number of motifs characterize the cleavage sites in mTPs from both higher and lower eukaryotes (Figure 4). Versions of these patterns can be recognized in about two-thirds of all the mTPs in our collection. In addition, peptides with Arg in position -11 may constitute a functionally distinct group. In fact, malate dehydrogenase is cleaved between positions -10 and -9 rather than -9 and -8 (Grant et al., 1986; Joh et al., 1987). This protein has Arg in position -10 as well as -11, but due to the shifted cleavage


Statistical analysis

In agreement with previous reports, a majority of the mTPs in our collection contained Arg in position -2, -3, -10 or -11 relative to the cleavage site. Since most mTPs had Arg in only one of the positions -2, -3 and -10, the material was subdivided into four non-overlapping samples as follows: R-2, Arg in -2 but not in -3 or -10 (18 sequences); R-3, Arg in -3 but not in -2 or -10 (16 sequences); R-10, Arg in -10 but not in -2 or -3 (17 sequences); and R-none, no Arg in -2, -3 or -10 (9 sequences). Sequences with Arg in -11 were found to be distributed among the other groups, and were not treated as a separate class. All the groups contained sequences from higher as well as lower eukaryotes. Some of the mTPs in our database (9 sequences) had Arg in more than one of the critical positions. These sequences were not included in the statistics.

Each subsample was analyzed separately. The sequences were aligned according to the position of the N-terminus of the mature protein, and the positional distribution of each amino acid was calculated. The statistical significance of non-uniform amino acid distributions was estimated by comparing the observed number of a particular residue in a given position with that expected for a binomial distribution based on the mean frequency of that amino acid. For residues within mTPs, we used frequencies obtained for our total sample of mTPs when the N-terminal methionines had been removed. For residues within the mature parts, we used the frequencies given by von Heijne et al. (1989).
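The grouping and the significance estimate described above can be sketched in Python as follows. This is a minimal illustration, not the authors' program: the function names, the toy sequences and the background frequency are hypothetical, and the use of the upper-tail binomial probability P(X ≥ k) is an assumption, since the text does not state exactly how the observed count was compared with the binomial expectation.

```python
from math import comb

CRITICAL_POSITIONS = (-2, -3, -10)  # positions used to define the four groups


def residue_at(precursor, cleavage_index, position):
    """Return the residue at `position` relative to the cleavage site.

    `cleavage_index` is the 0-based index of the first mature residue (+1).
    There is no position 0: -1 is the last residue of the targeting peptide.
    """
    if position == 0:
        raise ValueError("positions run ..., -2, -1, +1, +2, ...")
    offset = position if position < 0 else position - 1
    index = cleavage_index + offset
    return precursor[index] if 0 <= index < len(precursor) else None


def classify(precursor, cleavage_index):
    """Return 'R-2', 'R-3', 'R-10' or 'R-none'.

    Returns None when Arg occupies more than one critical position,
    since such sequences were left out of the statistics.
    """
    hits = [p for p in CRITICAL_POSITIONS
            if residue_at(precursor, cleavage_index, p) == "R"]
    if not hits:
        return "R-none"
    if len(hits) > 1:
        return None
    return f"R{hits[0]}"


def positional_count(group, residue, position):
    """Count sequences in `group` that carry `residue` at `position`."""
    return sum(residue_at(seq, cut, position) == residue for seq, cut in group)


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); assumed form of the significance test."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Toy data (hypothetical): (precursor sequence, index of first mature residue).
toy_group = [("MLRAFSTAQSLA" + "GKRELTDAV", 12),
             ("MLRSFATLQALS" + "ATPKD", 12)]
observed = positional_count(toy_group, "F", -8)
# 0.05 is a made-up background frequency for Phe, used only for illustration.
p_value = binomial_upper_tail(observed, len(toy_group), 0.05)
```

With real data, the background frequency would be the mean frequency of the residue in the mTP sample (or in mature parts, for positive positions), as described in the text.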

Fig. 3. Positional distribution of Ser residues for the R-3 (a), R-2 (b), R-none (c) and R-10 (d) groups.

Fig. 4. Conserved sequence motifs for the R-none, R-2, R-3 and R-10 groups. Known cleavage sites are indicated by ↓. (↓) indicates that intermediate cleavage has been shown for some sequences. x denotes any amino acid.

site, it may not belong in the R-10 group. However, since it has all the characteristics of the Arg-10 pattern (Phe in -8, Ser in -7 and -5), we have included it in the R-10 group. The other mTPs with Arg-11 have likewise been included in the R-2, R-3, R-10 or R-none groups. At any rate, the existence of rather well-conserved sequence similarities around the cleavage site suggests that one single protease is responsible for the cleavage of mTPs from the R-2, R-3 and, possibly, R-none groups, as well as for the first cleavage of R-10 sequences. Experimental evidence indicates

that this is indeed the case, at least for mammalian mTPs (Hendrick et al., 1989). The highly conserved R-X-Y↓(A/S) pattern of the R-3 peptides raises the possibility that protease I (or MPP+PEP in yeast and Neurospora) does not always cleave the same bond within the recognition motif. Another possibility is that the cleavage occurs at the normal site (i.e. just before the Tyr), but that the resulting N-terminal tyrosine is subsequently trimmed off by an unknown protease. Like the R-3 mTPs, the R-10 group has a preference for


Possibilities to predict cleavage sites

The findings presented above indicate that it should be possible to make rather reliable predictions of the position of the mature N-terminus, at least in certain cases. We have thus developed some simple rules that take the cleavage-site motifs into account, and applied them to our sample of non-homologous mitochondrial peptides, including the ones that have Arg in more than one of the critical positions, but excluding one sequence where only a small number of residues from the mature part was known.

We first looked for perfect matches to the R-X-Y-(S/A) motif in the region between position 12 and position 67 counting from the N-terminus of the precursor (position 12 corresponds to position -3 in the shortest targeting peptide in our sample, whereas 67 corresponds to position +1 in the longest one). In this step, 11 cleavage sites were predicted correctly, together with four false positives.

In the remaining sequences, we next looked for the R-10 motif R-X-F-S in positions 5-60. In one case, the R-X-F-S signal occurred twice in the same sequence; we chose the match where the Arg was positioned nearest to the position which would be correct in a targeting peptide of average length (i.e. position 24). This resulted in an additional five correct predictions and one false positive.

In sequences that did not possess the above motifs, we looked for the transition point between the mTP and the mature protein in terms of negative charges. In most precursors, negatively charged residues (i.e. Glu or Asp) are absent from the targeting region. Therefore, a negative charge followed by closely spaced additional negative charges can be rather safely predicted to lie within the mature part. Along these lines, a Glu or Asp was assumed to belong to the mature part if it was followed by yet another Glu or Asp not too far downstream (a maximum allowed distance of 13 residues proved to be optimal for our sample). Excluding the residue immediately before the first negative charge in the tentative mature part, we then looked upstream for arginines. When an Arg was found, it was assumed to be in position -2. By this criterion, we were able to predict another 15 cleavage sites correctly. Since Arg is enriched in mTPs, the predicted cleavage site is often close to the correct one even for mTPs that do not actually have Arg in position -2.

In summary, the cleavage site was identified correctly in 31 sequences out of 68 (i.e. 45%). In particular, the cleavage site could be predicted with high confidence (75% correct predictions) in sequences that possessed the motifs R-X-Y-(S/A) or R-X-F-S. Altogether, 46 predicted sites (i.e. 65%) were displaced less than five residues from their correct positions.

Implications for protein engineering

Natural mTPs are usually 15-70 (typically 30) residues long and have a distinct amino acid composition (von Heijne, 1986; von Heijne et al., 1989). In particular, they are devoid of negative charges. Many artificial N-terminal peptides with the above properties have been shown to promote mitochondrial import, especially if they have a high potential for forming an amphiphilic α-helix (Horwich et al., 1987; Gavel et al., 1988; von Heijne et al., 1989). Given the sequence motifs described in this paper, it should also be possible to design efficiently recognized cleavage sites in normally uncleaved or artificial mTPs. In particular, the RXY↓(S/A) motif seems to be a good candidate for such experiments.

Acknowledgement

This work was supported by a grant from the Swedish Natural Sciences Research Council to G.von H.
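The three prediction rules described under 'Possibilities to predict cleavage sites' can be sketched in Python as follows. This is an illustrative reading of the text rather than the authors' code. The function names and rule labels are hypothetical, and several details are assumptions: that the whole four-residue motif must lie inside the stated window, that the first R-X-Y-(S/A) match is taken when there are several, that the distance limit of 13 is inclusive, and that the nearest upstream Arg is the one used in the charge-based rule.

```python
import re


def motif_starts(seq, pattern, first, last):
    """0-based start indices of 4-residue matches lying entirely within
    1-based positions first..last of the precursor."""
    stop = min(last, len(seq)) - 3
    return [s for s in range(first - 1, stop)
            if re.fullmatch(pattern, seq[s:s + 4])]


def predict_cleavage(seq, max_gap=13, average_arg_position=24):
    """Return (rule, index) where index is the 0-based position of the
    predicted first mature residue (+1), or None if no rule applies."""
    # Rule 1: R-X-Y-(S/A), Arg in -3, cleavage between Tyr (-1) and S/A (+1).
    hits = motif_starts(seq, r"R.Y[SA]", 12, 67)
    if hits:
        return "R-X-Y-(S/A)", hits[0] + 3

    # Rule 2: R-X-F-S, Arg in -10; with several matches, keep the one whose
    # Arg is nearest to position 24 (1-based).
    hits = motif_starts(seq, r"R.FS", 5, 60)
    if hits:
        arg = min(hits, key=lambda s: abs((s + 1) - average_arg_position))
        return "R-X-F-S", arg + 10

    # Rule 3: the first Glu/Asp followed by another Glu/Asp within max_gap
    # residues is taken to lie in the mature part. Skipping the residue just
    # before it, search upstream for an Arg and place that Arg in position -2.
    acidic = [i for i, aa in enumerate(seq) if aa in "DE"]
    for first_neg, next_neg in zip(acidic, acidic[1:]):
        if next_neg - first_neg <= max_gap:
            for j in range(first_neg - 2, -1, -1):
                if seq[j] == "R":
                    return "negative-charge rule", j + 2
            break
    return None
```

For example, `predict_cleavage("MLSLAQSVLAKRAYSTPLK")` applies the first rule and returns index 14, the Ser that follows the Tyr. The sequence is a made-up toy precursor, not one from the authors' collection.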

References

Allison,D.S. and Schatz,G. (1986) Proc. Natl Acad. Sci. USA, 83, 9011-9015.
Bedwell,D.M., Strobel,S.A., Yun,K., Jongeward,G.D. and Emr,S.D. (1989) Mol. Cell. Biol., 9, 1014-1025.
Endo,T., Shimada,I., Roise,D. and Inagaki,F. (1989) J. Biochem., 106, 396-400.
Epand,R.M., Hui,S.W., Argan,C., Gillespie,L.L. and Shore,G.C. (1986) J. Biol. Chem., 261, 10071-10020.
Gavel,Y., Nilsson,L. and von Heijne,G. (1988) FEBS Lett., 235, 173-177.
Grant,P.M., Tellam,J., May,V.L. and Strauss,A.W. (1986) Nucl. Acids Res., 14, 6053-6066.
Hartl,F.-U., Schmidt,B., Wachter,E., Weiss,H. and Neupert,W. (1986) Cell, 47, 939-951.
Hawlitschek,G., Schneider,H., Schmidt,B., Tropschug,M., Hartl,F.U. and Neupert,W. (1988) Cell, 53, 795-806.
Hendrick,J.P., Hodges,P.E. and Rosenberg,L.E. (1989) Proc. Natl Acad. Sci. USA, 85, 4056-4060.
Horwich,A., Kalousek,F., Fenton,W.A., Furtak,K., Pollock,R.A.M. and Rosenberg,L.E. (1987) J. Cell Biol., 105, 669-677.
Hurt,E.C., Pesold-Hurt,B., Suda,K., Opplinger,W. and Schatz,G. (1985) EMBO J., 4, 2061-2068.
Isaya,G., Kalousek,F., Fenton,W.A. and Rosenberg,L.E. (1989) J. Cell Biol., 109, 55a.
Joh,T., Takeshima,H., Tsuzuki,T., Shimada,K., Tanase,S. and Morino,Y. (1987) Biochemistry, 26, 2515-2520.
Kalousek,F., Hendrick,J.P. and Rosenberg,L.E. (1988) Proc. Natl Acad. Sci. USA, 85, 7536-7540.
Lemire,B.D., Fankhauser,C., Baker,A. and Schatz,G. (1989) J. Biol. Chem., 264, 20206-20215.
Ou,W.-J., Ito,A., Okazaki,H. and Omura,T. (1989) EMBO J., 8, 2605-2612.
Pollock,R.A., Hartl,F.U., Cheng,M.Y., Ostermann,J., Horwich,A. and Neupert,W. (1988) EMBO J., 7, 3493-3500.
Roise,D. and Schatz,G. (1988) J. Biol. Chem., 263, 4509-4511.
Roise,D., Theiler,F., Horvath,S.J., Tomich,J.M., Richards,J.H., Allison,D.S. and Schatz,G. (1988) EMBO J., 7, 649-653.
Rosenberg,L.E., Kalousek,F. and Orsulak,M.D. (1983) Science, 222, 426-428.


bulky amino acids two steps downstream from the Arg, but in this case it is the non-hydroxylated hydrophobic residues Phe, Ile and Leu that are enriched. Since the hydrophobic residue in position -8 becomes the N-terminal amino acid upon cleavage by protease I, it may be a recognition signal for protease II, the enzyme that cleaves the intermediate. Alternatively, it could be a signal that enhances recognition by protease I.

What could be the function of the octapeptide that separates the protease I cleavage site from the mature N-terminus in the R-10 group? Experimental evidence indicates that this spacer region is crucial for correct processing (Isaya et al., 1989). Possibly, some structure at the mature N-terminus of the R-10 proteins interferes with recognition by protease I so that a normal R-2 signal would not work. For example, the targeting peptide of methylmalonyl-CoA mutase (normally processed in one step) is not cleaved when joined to the mature N-terminus of OTC (normally processed in two steps) (Isaya et al., 1989). To address this question, we calculated the amino acid composition of the spacer (positions -7 to -1, i.e. excluding the conserved hydrophobic -8 position) and the amino acid composition of residues 2-8 of the mature parts from all groups (not shown). Two observations were made: (i) the spacer is similar to other regions of mTPs in that it lacks negatively but not positively charged residues; and (ii) the mature part of the R-10 proteins tends to have more positively charged and fewer negatively charged residues than the mature parts of other groups. A speculative possibility is that the positively charged spacer may contribute to the targeting function of the mTP, besides its role in mTP cleavage. The spacer may also act as an adaptor to 'insulate' the protease I cleavage site from a (positively charged?) region in the mature protein that would be incompatible with a one-step cleavage reaction.
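The comparison of the spacer (positions -7 to -1) with residues 2-8 of the mature part can be sketched as a simple charge count. This is a minimal illustration under stated assumptions: the function names and the toy precursor are hypothetical, and counting only Arg and Lys as positive (leaving out His) is an assumption, since the text does not list the residues it treated as positively charged.

```python
POSITIVE = set("KR")  # assumption: His is not counted as positively charged
NEGATIVE = set("DE")


def spacer_and_mature(precursor, cleavage_index):
    """Return (spacer, mature_2_8) for a precursor.

    `cleavage_index` is the 0-based index of the first mature residue (+1).
    The spacer covers positions -7..-1; mature_2_8 covers positions +2..+8.
    """
    if cleavage_index < 7:
        raise ValueError("targeting peptide shorter than seven residues")
    spacer = precursor[cleavage_index - 7:cleavage_index]
    mature_2_8 = precursor[cleavage_index + 1:cleavage_index + 8]
    return spacer, mature_2_8


def charge_counts(segment):
    """Return (number of positive residues, number of negative residues)."""
    positive = sum(aa in POSITIVE for aa in segment)
    negative = sum(aa in NEGATIVE for aa in segment)
    return positive, negative


# Toy R-10 precursor (hypothetical): Arg in -10, Phe in -8.
spacer, mature = spacer_and_mature("MLRAFSTAQSLA" + "GKRELTDAV", 12)
spacer_charges = charge_counts(spacer)
mature_charges = charge_counts(mature)
```

Summing such counts over all sequences in a group, and dividing by the number of residues examined, would give the kind of compositional comparison the text refers to.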

Schmidt,B., Wachter,E., Sebald,W. and Neupert,W. (1984) Eur. J. Biochem., 144, 581-588.
Sztul,E.S., Hendrick,J.P., Kraus,J.P., Wall,D., Kalousek,F. and Rosenberg,L.E. (1987) J. Cell Biol., 105, 2631-2639.
Sztul,E.S., Chu,T.W., Strauss,A.W. and Rosenberg,L.E. (1988) J. Biol. Chem., 263, 12085-12091.
Tropschug,M., Nicholson,D.W., Hartl,F.U., Kohler,H., Pfanner,N., Wachter,E. and Neupert,W. (1988) J. Biol. Chem., 263, 14433-14440.
von Heijne,G. (1986) EMBO J., 5, 1335-1342.
von Heijne,G., Steppuhn,J. and Herrmann,R.G. (1989) Eur. J. Biochem., 180, 535-545.

Received on April 27, 1990; accepted on July 17, 1990
