Thylrrkaid; Membrane pratein; Tapolapy; Pexi~iveinside rule

1,

INTRODUCTOON

The polypepride backbone of most integral membrane proteins has at least one long hydrophobic segment that spans the membrane,. most likely as an a-helix [l]. This applies to proteins from the plasma membrane of both prokaryotes and eukaryotes; however, proteins from th!: outer membrane of Gramnegative bacteria are built on a different set of structural principles, and will not be considered further here [2]. Many integral membrane proteins are polytopic, i.e. have several membrane-spanning segments connected by surface-exposed loops. The topology is a specification of the membrane-embedded domains and the sidedness of the protein. Proteins from different membrane systems appear to have certain topogenic signals in common: statistical studies correlating the transmembrane topology with the amino acid sequence indicate that proteins from the bacterial inner membrane as well as proteins from the plasma membrane of eukaryotes follow a ‘positiveinside rule’ [3,4]. This conclusion was originally based on the observation that positively charged residues (i.e. Lys and Arg) tend to be enriched in non-translocated (i.e. cytoplasmically exposed) as compared to translocated segments of the protein. Recently, substantial experimental support for this notion has been reported [5-131. These results suggest that it is possible to predict the transmembrane orientation of a protein once the putative membrane-spanning segments have been identified by hydroplaobicity analysis: a Correspondence

address: G. von Heijne, Department of Molecular Biology, Karolinska Institute, Center for Biotechnology, NOVUM, S-141 59 Huddinge, Sweden

Published

by Efsevier

Science

Pub!khers

B. V.

topology that is consistent wirh the positive-inside rule should be most gcobablc. The evidence chat the topology of proteins from the bacterial inner membrane and membrane proteins in the secretory pathway of cukaryores follow the positive-inside rule is thus quite solid. However, for lack of sequence and topology data, it has hitherto not been possible to extend the analysis co include proteins of the thyfakoid membrane of chloroplasts. In this paper, we analyse the topological distribution of amino acids in a number of recently published chloroplast protein sequences where all or parts of the topology is known from experimental data, and show that the positive-inside rule applies also to this class of proteins. 2. MATERIALS AND METHODS 2,1, Sequence colfecrion nnd analysis Amino acid sequences of intrinsic thylakoid membrane proteins were collected from the literature. In cases where several closely related proteins were found, only one of them was included in the collection. Hydrophobicity analysis of the sequences was performed using a l9wresidue moving window and the Engelman-Steitz hydrophobicity scale 1141as described previously [3,4]. We found 10 proteins for which experimental data on the transmembrane topology are available (Table 1). In most cases, the mappings have been made by protcolytic digestion and/or antibody binding assays. Some proteins possess phosphorylation sites, and, since ATP does not readily cross the thylakoid membrane, the phosphorylated residues should be exposed to the stroma, where ATP is available [IS]. Furthermore, a number of proteins are homologous to bacterial mem.brane proteins with known topology, In some cases, the transmembrane topology is only partially known from experimental data. We have tried to extend the topology of these proteins using hydrophobicity analysis as described [3,41. This requires that the sequence neighbouring the mapped parts contain6 segments hydrophobic enough (mean hydrophobicity 2: 1.5) to be unambiguously assigned as putative membrane-spanning helices alternating with segments hydrophilic enough (mean hydrophobicity

41

c, PItwirl-cncodbYd prolrins wlthorrt prmqvencc~ 10 kBa phon. [15,31,32) I-39” pho.proWin, spinach’ CFa subunit 111. [21,22,33,34) spinach 23-62

40-58 59-73 1-P 4-22 63-81

Cytochromc b-S59 a-subunit, spinach’ Cytachromc b6. rpinnch

[35-3a1

I-24 25-43 44-$3’0

I39,401

l-33” 34-52 53-87 88-106 107-111 112-130 184-202

Dl protein, spinach’

(32,41,42]

203-21 I” 1_34w 35-53’5 128-141”

109-127” 142-160”

215,2722oa.22

196-214” 273-291’5 292-3532’

44 kDa chloro phyll a apoprotein, spinach2”

L32.431

1-48’5 49-67 68-84 8% 103 104-111 112-130 159-177*

Cytochrome f, pea

144.451

CFo subunit I, spinach

[21,22,46,47]

l-2512’ 252-270 27 I-285”

1-P 9-27 28-16628

42

’ Thts protcDnhrr rksrnrly bran rcyusnc@dOuqqxlhn ef #I., in ptrptr~tian) * Prtxtalyrlr: dil]txlirm ext%rimenlrindkrrla lhr\rit Ir\r]arpcsrriunal thlr pro{& iscrxpdeed IO Ihe biroma, Wer rhc?trnly lrun~n~~rnbr~n~rirren@ment thwr would be car&font wlrlr ihe hydrophsblzlcy pralili: la I tepale~y with en@ mctmbrancxpzrn near rhe NWminur, the cxpred parr Ix probablyIho te&m brrwoenIhr putative membranespurnand Ihc Ct.rerminud ’ including rhc inlllnrerf MIX, whieh 1%par~~rrnxlrA%~srlty ele#vcd ’ Thr.2 1%phesptrorytafcd,Indicatingthai Ihe Wlerminur Ir crxporedw Ihe srorna ’ Thr! rcrtuenecaf this pratcin is 3Wv homologous to the aequcecr al CFe subunit r af E ro/E. In subunit r”, the Wterminua and the C~~crmlrrux are W@xartd IWO% rlr@ mcmbtancand kre thr psriplnsm lo When ;hytaksid mcmbranrsarc cxpaxcd10 trypstn ar VB pr(lfea~, parr ot’ Ihe prorein is digcrtcrt withsuf loaxal rcaerivi~yw rn antibody to a Cterminat epitope. When inside-out verlclexarc expoxcdIO tryprin, reactivity to the antibody IO the C~rcrminnl cpiropc is les. Immunogold labeling also wu$@rstx that the C-rermioua is cxpo~~l ?o the lumen ‘I Trypsin sensitivity of an N~tcrminal epitopc indicates that the N-terminus is exposed IO the %lPomR Ia A luminal orientation of thin rclgmrnt would weeaunt for lhe prcscncc of trypsin-scnaitivc sited on the lumen side I1 Tryprin scnaitiviry of ~1Grcrminal cpiropc indieetca that the C-terminus is exposed ID the stroma ” An antibody ID residues 22-31 binds to the stromrl surface ‘s tn addition to the mapping with antibodies, rhc topology is consistent with proroolytic digestion experiments, sequence homology to the R/ro~~~$~uilornonns~/}~Qtl~ reaction center and cxon-intran boundnrics in the Dl genes of Chlantydorrrorrus and Errglma tn Increased binding of an antibody to residues 59-68 was observed for inside-out vesicles after washing, suggesting a luminnl location of this cpitopr I’ lncrenscd binding of an antibody to residues 97-105 was obscrvcd for inside-out vesicles nftcr washing, suggesting a luminal location of this cpitope ‘” An antibody to residues 127-136 binds to the stromal surface Iv An antibody co residues 165- 174 binds to the lunrinal surface ” An antibody to residues 225-235 binds to the stromal surface ” An antibody to residues 261-270 binds to the stromal surface ” Point mutations at VaL219, Phe-255 and Ser-264 confer herbicide resistance, This suggests that these residues are located on the stromal side *’ An antibody to residues 301-310 binds to the luminal surface 24 Including the first 14 amino acids, which are trimmed off but do not look like a lumen targeting domain ” Thr.15 is phosphorylatcd, indicating that this residue is exposed to the stroma *’ Digestion with cnrboxypcptidase A indicates that the C-terminus is exposed on the stromal side. Proteoiytic digestion experiments indicate that the exposed sequence is rather short *’ There is a putative heme-binding site near the N-terminus. The heme is thought to interact with plastocyanin, a luminal protein. Incubation of disrupted membranes or inside-out vesicles with proteinase K results in loss of hcme-peroxidase activity. Activity is not affected in right-sideout vesicles or intact membranes *’ The sequence of this protein is 19% homologous to the sequence of CFO subunit b of E. co/i. In subunit 6, the N-terminus is translocated across the membrane, whereas a large C-terminal region remains in the cytoplasm 5 0.7) to be unambiguously assigned as extramembraneous. Sequence stretches containing hydrophobic peaks in the ambiguous interval (i.e. mean hydrophobicity between 0.8 and 1.4) cannot be analyzed by this method, since a safe prediction of the number of membrane spans is not possible. Our dataset was prepared using the sequences of the wellcharacterized proteins described above.Those parts of the proteins where the topology had not been mapped by experiments or where it

could not be reliably inferred from hydrophobicity analysis were omitted. The topologies are shown in Table 1. From this dataset, 4 smaller samples were prepared: membranespanning segments with the N-terminus oriented towards the stroma, membrane-spanning segments with the N-terminus oriented towards the lumen, exposed segments facing the strorna, and exposed segments facing the lumen. The membrane-spanning parts were assumed to encompass the most hydrophobic I9-residue stretch of

tr~~~~~rnbr~~~ ~~~rn~nt~ from thr rnuitin~~~~~~n~ thylakoid ncmbrrnei protclns of TarWe I la shcswn in

Pig, la. Therm am na %igalfkmt ~~ff~r~~~~~ bawsen rkr tkylakoid aamplie and previously ~~&~y~ed~~m~~es

of prokwryotie and euharyotic glarma membrane proteins. On cha ocher hand, wlthough the sample i$ small, the amino weid romgositioa of ~i~~~e.~~~~ni~~ thylakoiel mcmbrnne proteinn appear% more similar to that observed for the eorrrspondhq group of cuksryotie proteins than for the prokaryotic ones, with a rather low frequency of rtlanine (-LO%) and Q high frequkney of leucine (-2X%, Fig. lb), Tmnsmembrane sqments with the: N-terminus oriented towards the aroma do not differ significantly from those with the N-terminus oriented towards the lumen (data not shown).

1

o-0 ACDEFGH

11

: I.MNPQRSTVW‘I

(

Amino acid

A~C’B‘E’F~G~II’

I -K‘L’M-N.P

S

T V W Y

Amino acid Fig. la. Amino acid frequencies in the transmembrane segments of multi-spanning membrane proteins. The frequencies for prokaryotic and

eukaryotic proteins are from reference [4]. Fig. lb. Amino acid frequencieg in the transmembrane segments of single-spanning membrane proteins. The frequencies for prokaryotic and eukaryotic proteins are from reference [43. 44

xrrain@t &malrM

$wt#t*~ rhrr ‘~o~~rtv~t~~~t~t~ nrk ’

The amlno eetd ~~mp~sicion 5P rxpcaxcd

mcnta

ftt’wchipt rht! ~~r~~~ difkrr sj~n~~e~ntl~ from t at- cxposed s~$ments Pa&g the tumrn QTabli~ ff),. In partlcular, the number of positively charged residues (Arg and Lp) is enriehcd mare than fwu-fold in rrromal as compared ts luminrrl ioops (ll.9’89 vs 5.4%, F’ < 3 UP). Taken individually, troth Arg (7,S% vs, 3&B, Bz < 0.02) and Lyr (5.4Vu vs. 1.91, P < W-Jr) were significantly enriched in the: ttamal loops, Na similar bias in the distributiran of ne tivrply charged residues is found, with the frequency of Asp 9 Glu being f0.4% in the stromal loops and IO.c)% in the luminnl laops. This is similar to what has previously been found for both bacterial and eukaryotic plasma membrane proteins, 4. BISCU%SIQN The results presented above demonstrate that the distribution of positively charged residues in the cxpos. ed parts of thylakoid membrane proteins has a sidedncss of the same type as has been reported previously for other membrane systems [3,4]: Arg and Lys are less abundant in the luminal segments that have been translocated across the membrane than in the stromal segments that remain non-translocated. However, the bias is not as striking as in bacterial inner membrane proteins, where the difference in the frcquency of Ary and Lys is almost four-fold [31. Still, the

strict alternation of the absolute number of positive charges in translocatcd vs. non-translocatcd loops, Table II Amino acid frequencies Amino acid A C D E F G I-l

I K* L NM : R” S T ?I W Y

in exposed loopy

Stromal 0.06576 0.00454 0.02494 0.07937 0.04989 0.10884 0.01134 0.04535 0.05442’ 0.08390 0,01814 0.03628 0.05442 0.02268 0.07483’ 0.08844 0.06576 OJJ4762 0.02268 0.04082

Luminal 0.07126 O.OCi?OO 0.0475 I 0.06176 0.06888 0.0855 1 0.02375 0.0665 I 0.01900* 0.07601 0.02613 0.07126 0.06176 0.02613 0.03563’ 0.08789 0.04038 0.06413 0.02375 0.04276

Amino acids that differ significantly in frequency between stromal and luminal loops are marked with an asterisk

For newly sequenced thyfakoid membrane proteins where the topology cannot ba inferred from asapetimenm tal data, ow results suggest two complementing strntsgics for predicting the sidodncna QRCC the. hydrophobiciry profile hws been determined. Firstl if the grstein has a presequcnce containing a rhylakoidtargeting signal (i.e. w rigntrl+ptide like segment, [ 16)) it is highly likely that the N.terminus of the mature protein will be luminally exposed since thylakoid.targeting signals arc cleaved by a luminally disposed proteasc [17-191. Second, the positive-inside rule can be applied to predict the topology of the entire molecule, whether or not there is a thylakoid=targeting presequence. The topology that minimizes the number of positively charged residues (Arg and Lys) in the lumen should be chosen (note chae the positive-inside rule does not apply when a translocated segment is very long, since only segments shorter than abc,ut 70 residues appear to have a reduced content of arginines and lysines (3,4], If the protein lacks a thylakoid-targeting sequence, only the positive-inside rule can be used as a guide. There are many bacterial and cukaryotic membrane proteins that are made without a cleavable N-terminal signal pcptide, yet asscmblc into the cytoplasmic membrane with the N-terminus translocating to the ex-

tracytoplasmic side (3,4,20]. Among the thylakoid membrane proteins, CFo subunit III seems to provide such an example: it lacks a thylakoid-targeting sequence but its topology as predicted by homology to its bacterial relative is nevertheless one with the Nterminus protruding into the lumen [21,22], which is also consistent with the positive-inside rule. When analyzing thylakoid membrane proteins lacking a thylakoid-targeting sigrral and with unknown topology, we have again found examples where the positive-inside rule predicts a luminally exposed N-terminus, and other examples where the opposite topology is predicted. For example, for subunit IX of photosystcm I [48] we predict a translocated N-terminus and membrane spans between the residues 3-21 and 63-81. On the other hand, the @subunit of cytochrome b-559 [36,38,49] (predicted to have a membrane span between the residues 19-37) should have its N-terminus exposed to the stroma according to the positive-inside rule. In summary, we have shown that the positive-inside rule holds for all integral membrane proteins (except for bacterial outer membrane proteins, cf. section 1) analyzed so far, whether of prokaryotic, eukaryotic, or 45

Vsn Hcijne, 0, wntl Manull, C. (II)9fl) ProI. En~lnrcr., in press. Van Hcijnr, G. (lQk6) EhlBO J. 5, 3021-3027, Von Hcijnr, G. and Gurcl, Y, (lQ88) Eur. J. Iliochrm, 174, 67 l-678. Van Hsijnr, Ct., Wiskncr, W. and Dalbcy, R.E, (1988) Proe, Nall. Acad. Sci. USA 85, 3363-3366. Van Hcljnc, 0. (I9891 341, JSb-458, Nilaxon, Ia and Ven Weijnr, 0. (1990) 62, 1135-l 141. Hr~cuptlc, M.Y., Flinr, N.. Gough, N.ht. and Dobbcrstrin, B, (1989) J, Cell EM, lOIF, 1127-1236, Park%, G.f>,, Hull, 5.13, and Lamb, R.A. (1989) J, Ccl1 Biol. 109, 2023-2032. SxczcxnaSkorupn, E. and Kempcr, B, (19SQ) J. Cell Biol. IOR, 1237-1243, Sato, T,, Saknguchi, M., hlihnra, K. and Qmttm, T. (1990) EMBG 1. 9, 2391-2397, Boyd, D, and Beckwith, J. (1919) Pmt. Nail, Acad. Sci. USA 86, 9444-Q4ScJ. Yamanc, K., Akiynma, Y.. Iro, K. and Mirushima, 5. (1990) J. Biol. Cltctn. 265, 21166-21171. Enycltnan, D.M. and Stcitr, T-A, (1981) Cell 23, 41 l-422, [I51 Wcsrhoff, P., Farchaus, J,W. and Hcrrmann, R.G. (1986) Curr. Gcnct. II, 16%l6Q. (161 Von Hcijnc, G., Steppulrn, J, and Herrmann, C, (IQ89) Eur, J. Biochem. 180, 535-545. [I71 Hagrmnn, J., Robinson, C,, Smcekcns, S. and W&beck, P. (1086) Nature 324, 567-569. [IS] Kirwin, P.M., Ehlcrficld. P.D. and Robinson, C. (1987) J. Biol. Chem. 262, 16386-16330. 1191 Kirwin, P.M.. Elderfield, P.D., Williams, R.S. and Robinson, C. (1988) J. Biol. Chem. 263, 18128-18132, [20) Von Heijne, G, (1988) Biochitn, Biophys. Acta 947, 307-333. (21) Hudson, GS., Mason, J.G., Holton, T.A., Keller, B., Cox, G.B., Wttitfcld, P.R. and Bottomley, W. (1987) J. Mol. Biol, 196, 283-298. (221 Deckers-Hebestteit, G, and Altendorf, K. (1986) Eur. J. Biothem. 161, 225-23 I.

bl6!hel, it,P. untl Bwwt~, J. (l9B7) PEBS Let;. 212, It%-1011. Ml&cl, H., Hum, D.F., fhabmtowhx, J. and Bcnnc~r, J. (191385 .I. Bioh Chem. 263, I 123-l 130, Ah, J., Winter, P,, Ssbnld. W,, hloxcr, J.G,, Sctrcdcl, R., Westhoff, P, and Hcrrmnnn, R.G. (19113) Gurr. Gcncr, 7, 12Q-l38# Scbaltl, W, and Wachter, E. (1980) FERS LCII. 122, 307-31 Iv Tw, G., Rlnck, M.T., Cramer, WA.. VaIlon, 0. and Bugorad, b. (lYS8) Miochcmixrry 27, 9(175-9080. Hcrrmann, R.G,, Ah, J., Srhillcr, B., Widkcr, W.R# and Cramer, WA. (IUSS) FEBS LCB. 176, 239-244. Widgcr, W.R#, Cramer, WA., Hcrmadron, M.. Meyer, D. and Gullifor, M. (19114) J. Biol. Cttcnr. 259, 3870-3876. Vallon, 0, Tar, G., Cremcr, W,h., Simpson, B,, Haycr. Hansen, G. and Bogarnd, L. (1989) Biochim. Biapltyx. Aetn 975, 132-141. Hcincmeyer, W., Ah, J. nnd Hcrrmann, G. (1984) Curr. Gcncr. 8, 543-549. Szczcpaniak, A. and Cramer, W.A. (1990) J. Biol. Chcnr. 265, 17720-17726. Sayre, R.T., Andcrrson, B. and Boporad, L. (1986) Cell 47, 601-608. Zurawski, G., Bohnerr, H,J., Whitfcld, P.R, and Battomlcy, W. (1982) Proc. Natl. Acad. Sci. USA 79, 7699-7703. Ah, J., Morris, J., Wcatboff, P. and Herrmann, KG. (1984) Curr. Gcncr. 8, 507-606. Wiltey, D,L,, Auffrc!, A.D. and Gray, J.C. (1984) Cell 36, 555-562. Szczepaniak, A., Black, M,T. and Cramer, W,A, (1989) Z. Naturforsch. Sect, C. Biosci. 44, 453-461 I [461 Hennig, J. and Herrmann, R.G. (1986) Mol. Grn. Gcnct. 203, 117-128. (471 Bird, C.R., Kollcr, B., Auffret, AD., Huttly, A,K., Howe, C,J., Dyer, T-A, and Gray, J.C. (1985) EMdO J. 4, 1381-1388. [48] Franzen, L..G., Frank, G., Zuber, H. and Rochaix, J.*,D. (1989) Mol. Gcn. Genet. 219, 137-144. [49] Widger, W.R,, Cramer. W.A., Hermudson, M, and Herrmann, R.G. (1985) FEBS Lett. 191, 186-190.

The 'positive-inside rule' applies to thylakoid membrane proteins.

In integral membrane proteins, regions that span the lipid bilayer alternate with regions that are exposed on either side of the membrane. For protein...
872KB Sizes 0 Downloads 0 Views