Int. J. Peptide Protein Res. 9, 1977, 5-10. Published by Munksgaard, Copenhagen, Denmark. No part may be reproduced by any process without written permission from the author(s).

SYMMETRY PATTERNS IN TRYPSINOGEN

SEMIH ERHAN, LARRY D. GRELLER and BARBARA RASCO

Department of Animal Biology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, U.S.A.

Received 14 April 1976

When the primary structure of bovine trypsinogen is searched for regularities by the method of Greller & Erhan (1974), one finds eight pairs of peptides arranged in a symmetrical pattern along the molecule. These peptides cover 49% of the length of the molecule (112 of the 227 amino acids), and each pair folds in a similar way. This observation is in agreement with the observation that "Trypsin folds into two halves, each of which contains a pseudo-cylindrical arrangement of hydrogen bonds...", Stroud et al. (1971). Thus the above-mentioned method is capable not only of detecting regularities along the primary structure but also of predicting the folding of a protein.

Whether any regularities occur within the primary structure of a protein is a question that emerged almost as soon as the amino acid sequences of proteins became available, and it has remained a point of strong controversy ever since. During our studies on amino acid homologies found among proteins that appear to be ancestrally unrelated, we developed a method whereby a computer performs a sliding match between two proteins, computes the similarity of each vertical amino acid pair, and prints the result (Greller & Erhan, 1974). Using this method, it soon became apparent that many repeating segments occurred along a protein chain which were homologous either to a small peptide or to a "peptide" found within a protein which was used as the "key" (Erhan & Greller, 1974). These repeating segments, which we have named "sub-sequences", also appear when a single protein is matched against itself. When trypsinogen is matched against itself one observes, typically, many repeating sub-sequences; however, 16 of these sub-sequences have something unique about them: eight of them are found along the amino half and eight of them along the carboxyl half of the trypsinogen molecule, and they form eight pairs of homologous peptides. Furthermore, these peptide pairs fold in a similar way, and we believe this method can be used to predict the folding of a protein.

MATERIALS AND METHODS

The matching was performed according to Greller & Erhan (1974), whereby a sliding match is made between two different proteins, between a protein and a small peptide, or of a protein against itself. At each position the similarity score of each vertical amino acid pair is computed and printed. The similarities are scored on a scale of 0 to 9, where 0 indicates no match and 8 and 9 represent identical amino acids, with 9 indicating identical vertical pairs of F, C, W, or Y, according to McLachlan (1971). In the printout 8 is represented by (.) and 9 by (') for easier recognition of the matches we consider significant (Greller & Erhan, 1974), in which about half of the amino acids are identical.
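The sliding match described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' program: the scoring function is a simplified stand-in that reproduces only the two identity scores stated in the text (8 for identical residues, 9 for identical F, C, W, or Y) and returns 0 for every non-identical pair, whereas the McLachlan (1971) scheme also assigns intermediate scores of 1 to 7. The function names and the `min_identical_fraction` parameter are hypothetical.

```python
# Simplified stand-in for the similarity scores: only identities are scored.
SPECIAL_RESIDUES = set("FCWY")  # identical pairs of these residues score 9


def pair_score(a: str, b: str) -> int:
    """Score one vertical amino acid pair (assumption: non-identical pairs score 0)."""
    if a != b:
        return 0
    return 9 if a in SPECIAL_RESIDUES else 8


def score_symbol(score: int) -> str:
    """Printout convention from the text: 8 is shown as '.', 9 as an apostrophe."""
    return {8: ".", 9: "'"}.get(score, str(score))


def match_at(key: str, target: str, offset: int):
    """Score the key against the target window starting at a 0-based offset."""
    window = target[offset:offset + len(key)]
    scores = [pair_score(k, t) for k, t in zip(key, window)]
    return sum(scores), "".join(score_symbol(s) for s in scores)


def sliding_match(key: str, target: str, min_identical_fraction: float = 0.5):
    """Slide the key along the target and keep windows with enough identities.

    Returns (1-based position, cumulative M score, printout string) tuples.
    """
    hits = []
    for offset in range(len(target) - len(key) + 1):
        m_score, printout = match_at(key, target, offset)
        identical = sum(ch in ".'" for ch in printout)
        if identical >= min_identical_fraction * len(key):
            hits.append((offset + 1, m_score, printout))
    return hits


# Toy target containing one peptide from Table 1 (VSWGSG) flanked by alanines.
hits = sliding_match("VSLNSG", "AAVSWGSGAA")
```

Matching a protein against itself amounts to calling `sliding_match` with each window of the protein as the key and the whole protein as the target. Because the stand-in scores omit the intermediate values, the M scores it produces are lower than those reported in the tables.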


The alpha carbon backbones of the homologous peptides were constructed with a wire bender developed by Rubin & Richardson (1972), which is available from Charles Supper Co., Natick, Mass. The bending angles needed for this construction were obtained from a program devised and kindly furnished by Dr. Byron Rubin of the Institute for Cancer Research, Philadelphia, Pa.
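To make concrete what "bending angles" for an alpha carbon wire model involve, the sketch below computes, from consecutive alpha carbon coordinates, the bend angle at each interior atom and the dihedral (twist) angle about each virtual bond. This is a generic geometric calculation offered under the assumption that these are the quantities of interest; it is not Dr. Rubin's program, whose details are not given here, and the function names and toy coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def bend_angle(p0, p1, p2):
    """Angle in degrees at p1 between the virtual bonds p1->p0 and p1->p2."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to [-1, 1] to guard against rounding just outside the acos domain.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def dihedral_angle(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1->p2 virtual bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = _norm(b2) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def backbone_angles(ca_coords):
    """Bend angles (one per interior atom) and dihedrals (one per interior bond)."""
    bends = [bend_angle(*ca_coords[i:i + 3]) for i in range(len(ca_coords) - 2)]
    twists = [dihedral_angle(*ca_coords[i:i + 4]) for i in range(len(ca_coords) - 3)]
    return bends, twists


# Toy coordinates (hypothetical, not trypsinogen data): three mutually
# perpendicular unit-length virtual bonds.
toy_chain = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
bends, twists = backbone_angles(toy_chain)
```

Two peptides can then be compared by listing their bend and twist angles side by side, which is the numerical counterpart of comparing the bent wires by eye.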

Atomic coordinates of trypsinogen were kindly supplied by Drs. R. E. Dickerson and R. M. Stroud of California Institute of Technology.

RESULTS

When trypsinogen is matched against itself one finds the usual occurrence of many repeating

TABLE 1
Homologous peptides found in trypsinogen, their position, sequence, and double-matching probabilities

Position of     Symbol used    Amino acid sequence    M score    Double-matching
the peptide a   for the        of the peptide b       of the     probability
                peptide        Individual scores      match      P(M' > M)

9-18            1              GGYTCGANTV
210-219         11             GVYTKVCNYV
                               .2'.021.1.             47         2.0 x 10^-3

22-27           2              VSLNSG*
197-202         12             VSWGSG
                               ..33..                 38         1.4 x 10^-4

31-35           3              CGGSL
189-193         13             CSGKL
                               '3.3.                  31         4.5 x 10^-4

49-54           4              KSGIQV
182-187         14             DSGGPV
                               3..13.                 31         4.5 x 10^-3

57-60           7              GQDN
175-178         15             GKDS
                               .4.5                   25         9.7 x 10^-? (exponent illegible)

72-76           8              SASKS
152-156         16             SSCKS
                               .42..                  30         2.0 x 10^-3

81-92           9              SYNSNTLNNDIM
130-142         6              TKSSGTSYPDVL
                               515.3.221.56           54         2.5 x 10^-3

101-108         10             SLNSNVAS*
110-117         5              SLPTSCAS
                               ..1541..               43         1.0 x 10^-3

a Position number indicates the position along the protein of the sequence being matched, numbered contiguously from the amino terminus of the protein. Contiguous numbering is used to avoid confusion arising from many different groups introducing gaps into their sequences for various reasons.
b The numbers, periods, and apostrophes below the target sequences represent the individual similarity scores between the amino acids of the two peptides. The M score is the cumulative score over the span length of the key, obtained by summing the individual scores. The double-matching probability gives the probability for such a match to occur by chance.
* The SLNS sequence occurs within both peptides 2 and 10, underscoring the significance of these repeating sequences.
These remarks are also valid for Tables 2 and 3.
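Footnote b states that the M score is the sum of the individual scores, with a period standing for 8 and an apostrophe for 9. The short sketch below decodes each score string of Table 1 and checks it against the printed M score. The function names are hypothetical, and the score strings are transcribed from the table with the stray spaces of the printed copy removed.

```python
def decode_scores(printout: str) -> list:
    """Turn a printed score string into integers ('.' = 8, apostrophe = 9)."""
    values = []
    for ch in printout:
        if ch == ".":
            values.append(8)
        elif ch == "'":
            values.append(9)
        else:
            values.append(int(ch))
    return values


# (first peptide, second peptide, individual scores, printed M score)
TABLE_1 = [
    ("GGYTCGANTV", "GVYTKVCNYV", ".2'.021.1.", 47),
    ("VSLNSG", "VSWGSG", "..33..", 38),
    ("CGGSL", "CSGKL", "'3.3.", 31),
    ("KSGIQV", "DSGGPV", "3..13.", 31),
    ("GQDN", "GKDS", ".4.5", 25),
    ("SASKS", "SSCKS", ".42..", 30),
    ("SYNSNTLNNDIM", "TKSSGTSYPDVL", "515.3.221.56", 54),
    ("SLNSNVAS", "SLPTSCAS", "..1541..", 43),
]


def check_table(rows):
    """Return (peptide, recomputed M, printed M) for every row of the table."""
    return [(first, sum(decode_scores(scores)), printed)
            for first, _second, scores, printed in rows]


results = check_table(TABLE_1)
for peptide, recomputed, printed in results:
    print(f"{peptide:<14} recomputed M = {recomputed:>2}, printed M = {printed:>2}")
```

The same decoding applies to the score strings in Tables 2 and 3.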


sub-sequences with different levels of significance (Table 1). These sub-sequences are not all of the homologies found when trypsinogen is matched against itself; taken together, those homologies cover over 90% of the length of the molecule. The sub-sequences listed were selected for this study because of their interesting distribution along the trypsinogen molecule. They stand out, not because of exceptional statistical significance, but because they are situated along the trypsinogen molecule in a symmetrical fashion (Fig. 1). Thus one sees eight pairs of peptides; eight peptides stretch along the carboxyl half and their respective partners stretch along the amino half of the trypsinogen molecule. We have already discussed the reasons why probabilities of 10^-3 and 10^-4 truly represent significant homologies (Greller & Erhan, 1974; Erhan & Greller, 1974a). Throughout this study all amino acids are numbered contiguously from the N-terminus. Furthermore, these sub-sequences cover nearly half (112, or 49%) of the 227 amino acid length of the molecule.

There are two basic assumptions behind amino acid homology studies among proteins: 1) If the amino acid sequences of two peptides contain similar amino acids, their folding will be similar. 2) If proteins that perform the same reaction or bind the same substrates have active and/or binding site fragments that are homologous, then the 3-dimensional conformation of these fragments is similar.

The first assumption follows directly from the demonstration that folding of a protein is dependent only on its amino acid sequence (Anfinsen et al., 1961). Sequences that are homologous contain similar amino acids, and proteins containing similar sequences in all likelihood will fold similarly. The second assumption is a logical extension of the first, supported by certain observations. We have found very significant homology between active site fragments of trypsinogen and a human Bence Jones protein (Erhan & Greller, 1974b). A comparison of the 3-dimensional folding of trypsinogen active site fragments and the homologous peptides from Bence Jones, on a molecular model of the latter molecule, has demonstrated great similarity of conformation. Preliminary experiments in my laboratory have demonstrated the presence of weak proteolytic activity in a Bence Jones preparation. Since then, Rossmann & Argos have demonstrated amino acid sequence similarities between the heme-binding pockets of globins and cytochrome b5 (personal communication). Tufty & Kretsinger (1975) also find regions in myosin light chain that are homologous to the calcium-binding regions of troponin and parvalbumin.

FIGURE 1
Symmetry pattern found in trypsinogen. a) Blocks represent the relative positions of the homologous sub-sequences found along the protein chain. Trypsinogen is drawn as a line from the N-terminus to the C-terminus. The numbers found above the blocks represent the arbitrary numbers given to the homologies as listed in Table 1. The numbers found below the figure give the contiguous numbering of each amino acid from the N-terminus; each small vertical line represents the 10th amino acid. b) Shows the trypsinogen molecule folded over itself with eight homologous peptides on each side.
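The coverage and symmetry statements in the Results can be checked with simple arithmetic on the Table 1 positions. The sketch below sums the peptide lengths and, as one illustrative measure of symmetry that is not the authors' own, computes for each pair the midpoint between the centers of its two peptides and compares it with the center of the 227-residue chain. Two readings are assumptions: the start of peptide 8 is taken as residue 72 (printed unclearly in the source), and peptide lengths are taken from the listed sequences, so peptide 6 counts as 12 residues although its printed range, 130-142, spans 13 positions.

```python
CHAIN_LENGTH = 227

# (start in amino half, start in carboxyl half, peptide length), from Table 1
PAIRS = [
    (9, 210, 10),
    (22, 197, 6),
    (31, 189, 5),
    (49, 182, 6),
    (57, 175, 4),
    (72, 152, 5),
    (81, 130, 12),
    (101, 110, 8),
]


def coverage(pairs, chain_length):
    """Residues covered by all peptides, and that count as a fraction of the chain."""
    covered = sum(2 * length for _n_start, _c_start, length in pairs)
    return covered, covered / chain_length


def pair_midpoints(pairs):
    """Midpoint between the centers of the two peptides of each pair."""
    midpoints = []
    for n_start, c_start, length in pairs:
        n_center = n_start + (length - 1) / 2
        c_center = c_start + (length - 1) / 2
        midpoints.append((n_center + c_center) / 2)
    return midpoints


covered, fraction = coverage(PAIRS, CHAIN_LENGTH)
midpoints = pair_midpoints(PAIRS)
chain_center = (1 + CHAIN_LENGTH) / 2

print(f"covered residues: {covered} ({fraction:.0%} of {CHAIN_LENGTH})")
for (n_start, c_start, _length), mid in zip(PAIRS, midpoints):
    print(f"pair starting at {n_start:>3} / {c_start:>3}: midpoint {mid:6.1f}, "
          f"offset from chain center {mid - chain_center:+.1f}")
```

Under these assumptions the peptides cover 112 residues, and every pair midpoint falls within five residues of the chain center at position 114, which is one way of expressing the fold-over arrangement drawn in Fig. 1b.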


In order to test this idea, alpha carbon backbones of the eight pairs of homologous peptides were constructed from steel wire and compared. Five of the eight pairs were found to fold similarly.

DISCUSSION

On the basis of homology studies which had demonstrated that a number of homologous peptides occurred repeatedly along a protein molecule, it was suggested that early proteinoids might have been formed by stepwise condensation of primordial peptides and that it was possible to detect these primordial peptides today (Erhan & Greller, 1974c). This idea was supported by theoretical considerations (Simon, 1973). Furthermore, based on the demonstration by Anfinsen et al. (1961) that folding of a protein is dependent only on its amino acid sequence, it was also suggested that since the homologies found represented peptides with similar amino acid sequences, their folding should also be similar. This idea was supported when homologies were found between active site fragments of trypsinogen and a particular Bence Jones protein (Erhan & Greller, 1974b). The homologous segments on the Bence Jones protein were shown to fold in a way similar to the active site of trypsinogen. Preliminary experiments with a Bence Jones protein have demonstrated the existence of weak proteolytic activity. Therefore it is reasonable to expect conformational similarity between homologous peptides. If these homologous sub-sequences are found within the primary structure of a protein, then one can expect to find similar folding along the homologous segments. If, furthermore, the homologous sub-sequences display a symmetry pattern, then it should not be surprising to find that the two halves of the molecule have a "roughly" similar folding. Stroud et al. (1971) have found "... the trypsin molecule to fold up into two halves, each of which contains a pseudo-cylindrical arrangement of hydrogen bonds between adjacent antiparallel extended chains similar to that described by

TABLE 2
Homologous peptides found in chymotrypsinogen A

Position of     Amino acid sequence    M score    Double-matching
the peptide a   of the peptide b       of the     probability
                Individual scores      match      P(M' > M)

16              IVN
234             LVN
                5..                    21         10^-3

53              VTA
231             VTA
                ...                    24         10^-? (exponent illegible)

61              TTS
221             STS
                5..                    21         10^-? (exponent illegible)

74              GSSS
187             GVSS
                .2..                   26         10^-? (exponent illegible)

82              KLKIA
175             KIKDA
                .5.0.                  29         10^-3

112             ASF
158             ASL
                ..5                    21         10^-3

TABLE 3
Homologous peptides found in elastase

Position of     Amino acid sequence    M score    Double-matching
the peptide a   of the peptide         of the     probability
                Individual scores      match      P(M' > M)

11              SWPSQI
207             SFVSRL
                .63.55                 35         10^-3

43              AAHCV
201             AVHGV
                .3.1.                  29         10^-3

73              GVQ
179             GVR
                ..5                    21         10^-3

82              YWNTDDVA
149             YLPTVDYA
                '31.1.3.               41         10^-3

98              RLAQSV
140             QLAQTL
                5...54                 38         10^-4

Blow for α-chymotrypsin...". Similarly, referring to a hydrogen-bonding map of chymotrypsin, Birktoft & Blow (1972) write, "...the pattern of zigzag lines is drawn to emphasize the existence of two folded units in the molecule from residues 27-112 and from residues 133-230.... A newspaper could be inserted, through the molecular model, almost completely bisecting it into two halves...". Hartley & Shotton (1971), too, make the observation, "...one can see that as in chymotrypsin the elastase molecule also appears to be divided into two halves composed of residues 27-127... in the upper left hemisphere and residues 128-230... in the lower right hemisphere...". Since both chymotrypsin and elastase are related to trypsinogen, we have included these proteins in our studies to find out whether similar symmetry patterns could be observed. These studies have yielded six symmetrically situated homologies on chymotrypsinogen A and five on elastase. These results are shown in Tables 2 and 3. Thus the method developed appears capable of suggesting regions along a protein where folding can be expected to be similar, in addition to detecting regularities along the primary structure. When used together with a predictive method such as the one developed by Chou & Fasman (1974a, b), it can be expected to improve the accuracy of predictions.

ACKNOWLEDGMENTS

The authors are indebted to Drs. R. M. Stroud and R. E. Dickerson for making atomic coordinates of trypsin available, to Dr. Byron Rubin for permitting the use of his algorithms to obtain bending angles, and to Dr. J. P. Glusker for letting us use the Byron Bender in her laboratory. Thanks are also due to many colleagues, students, and others too numerous to list individually, for participating in this study.

REFERENCES

Anfinsen, C. B., Haber, E., Sela, M. & White, F. H., Jr. (1961) Proc. Natl. Acad. Sci. U.S. 47, 1309-1315
Birktoft, J. J. & Blow, D. M. (1972) J. Mol. Biol. 68, 187-240
Chou, P. Y. & Fasman, G. D. (1974a) Biochem. 13, 211-222
Chou, P. Y. & Fasman, G. D. (1974b) Biochem. 13, 222-245
Erhan, S. & Greller, L. D. (1974a) Int. J. Pept. Prot. Res. 6, 175-181
Erhan, S. & Greller, L. D. (1974b) Nature (Lond.) 251, 353-355
Greller, L. D. & Erhan, S. (1974) Int. J. Pept. Prot. Res. 6, 165-173
Hartley, B. S. & Shotton, D. M. (1971) in The Enzymes (Boyer, P. D., ed.), 3rd Edn., vol. 3, pp. 323-373, Academic Press, New York
McLachlan, A. D. (1971) J. Mol. Biol. 61, 409-424
Rubin, B. & Richardson, J. S. (1972) Biopolymers 11, 2381-2385
Simon, H. A. (1973) in Hierarchy Theory (Pattee, H. H., ed.), p. 3, George Braziller, New York
Stroud, R. M., Kay, L. M. & Dickerson, R. E. (1971) Cold Spring Harbor Symposium vol. 36, p. 125

Address: Semih Erhan, 2101 Chestnut Street, Philadelphia, Pennsylvania 19103, U.S.A.
