J. Mol. Biol. (1977) 115, 135-175

B-Turns in Proteins PETER

Y. CHOU AND GERALD

II.

FASMAN

Department of Biochemistry, Brandeis Waltham., Mass. 02154. U.S.A,

C/m&de

19i’7, and in reviwd.form

(Received IS March

.C’Gversity

9 Zuy

1977)

TIIC X-ray atomic co-ordinates from 29 proteins of known sequence and structure wtre utilized to elucidate 459 B-turns in regions of chain reversals. Tetrapeptides whose CZC~~,---~C~~+~, distances were below 7 A and not in a helical region were cllaracterized as B-turns. In addition, p-turns were considered to have hydrogen bonding if their computed 0c1)-Nci+3) distances were 5; 3.5 A. The torsion angles of 26 proteins containing 421 B-turns were oxarnined and classified into 11 bend types based on the (4, 4) dihedral angles of t’hr i I- 1 and i + 2 bend residues. The average frequency of p-turns is 32% as compared to the 38% helices and ZOO;, B-sheets in the 29 proteins. The most’ frequently occurring bend residues are Asn, Cys, Asp in the first position, Pro, Ser, Lys in the second position, Asn, Asp, (:ly in the third position, and Trp, Gly, Tyr in tho fourth position. Residues with t,he highest p-turn potential in all four positions are Pro, Gly, Asn, Asp, and Ser lr.ith the most hydrophobic residues (i.e. Val, Ile. a11d Leu) sllowing the lowest btlnd potent,ial. However, in the region just beyond the ,%turns, hydrophobic rc:sidues occur with greater frequency than do hydrophilic residues. An environmental analysis of /-turn neighboring residues sllows that reverse chain folding is stabilized by anti-parallel b-sheets as well as helix -Ilelis and CI-/3 inberactions. Tile b-t)urn potential at the 12 positions adjacent to and including the bend were plotted for the 20 amino acids and showed dramatic positional preferences, which may be classified according to the nature of tile side-clIwins. An examination of tire 27 /?-turns in elastase showed that 21 were found in identical positions as thoso ill cc-chymotrypsin. Howover, only 37 of the 84 bend residues were conserved, indicating that structural similarity may persist, tlespite differences in sequence l~omology. A survey of residues occupying bend types I’, 11’ and III’ showed that (:ly appeared most frequently in the third position in betld types I’ and III’ as wall as in the second position in bend types II’ and 111’. Fourt,een llydrogen-

bonded type> II bends were found wit,hont a Gly at tile third position, to t,he energy calculations. position were also elucidated.

Eight

type

Vl

bends

Tyitlt

11 cis Pro

contrary

at the third

1. Introduction It, is well-known proteins.

Segments

that

a-helices

of the prot,ein

and /?-sheet’s are the major chain

which

are not helical

stabilizing

structures

nor ,&sheet

have

in been

generally designated as random coil or irregular regions. However, it has become increasingly apparent that these so-called random regions of the protein do exhibit regular structural patterns. One of these is the p-turn. a region of the protein involving four

consecutive

residues

where

the polypeptide

chain

folds

back

on itself

by nearly

(Lewis et al., 1971,1973; Kuntz, 1972; Crawford et al., 1973; Chou & Fasman, 1974). It is these chain reversals which give a protein its globularity rather than linearity. 180”

I 36

I’.

Y.

(‘HOI’

.\SI)

(:.

I).

l1’.-1SJlAS

J’enkatachalam (1968) was the first t,o charn&erize thretl types of turns in a t c+rapept,ide where there is a hydrogen bond hotween t’he CO group of t,hr: first, rrsidut and the NH group of the fourth residue?. T\ZU of these turns arc illust,rated in Figure 1. The type I turn has (r,$,$)Z = (-60”, -30”) and (4, #)3 = (-9W, O”), while the type It turn has (4, $)Z = (-SO’, 120”) and (4, JJ)~= (SO”, 0”) for the second and t’hird residue of the tetrapeptide. The type I turn is related to type 11 hy a rotation of 180” in the second peptide unit, so, as seen in Figure 1, the oxygen atom is directed into the plane for type I and out of t’he plane for type II. The type 111 t’urn having (4, #) = (-60”, -30”) for the second and third residues of the bend is equivalent to one turn of the 3,0 helix.

FIG. 1. Two types of /l-turns with the tetrapoptido ctc’ atoms denoted as 1, 1, 3, 4. ,L3-Turnn which have OC1)-NC4) distances 13.6 A and c&,,,-&,~, distances (: 7.0 w are considered to be hydrogen bonded. Bend type I has (4, +)2-( -6O”, -30”) and (4, I/)~--( -BOO, 0’). Bend type II has (4, I,/J)~=(-60”, 120”) and (4, #)3=800, 0”). Adapted from Dickerson et nl. (1971) with permission.

Crawford et al. (1973) found 125 reverse turns in 7 proteins, while Lewis et al. (1973) characterized 135 bends in 8 proteins with known X-ray structure. However, both groups stated that their data set was still too sparse for successful predictive value. Chou & Fasman (1974) increased the data set to 12 proteins containing 165 /J-turns, but these results were tentative, since many of the bends were obtained from stereodiagrams. A more refined p-turn frequency table based on 17 enzymes with 298 bends was published and used in predicting the chain reversals in Eucrepressor (Chou et al., 1975). Further analysis based on 29 proteins revealed 408 p-turns, and these bend frequencies (Chou & Fasman, 1977) were utilized to elucidate the probable bend regions in the histones (Fasman et al., 1976). Since the p-turns are not usually clearly defined in the original X-ray papers as compared to the helical and p-sheet regions, it is the purpose of the present study to provide this structural information. t The notations 1, 2, 3, 4 and i, i+l, positions of the ,%turn.

i+2,

i+3

will bo used interchangeably

to denote the 4

P-TURNS

IN

PROTEINS

137

2. Methods (a) Examination

of proteins

with known

conformation

and sequence

The atomic co-ordinates of 29 proteins were analyzed to elucidate the p-turns: adonylate kinase (Schulz et al., 1974), carbonic anhydrase C (Liljas et al., 1972; B. Strandberg, 1971), chromatium personal communication), carboxypeptidaee A (Q uiocho & Lipscomb, (Birktoft & Blow, 1972), high potential iron protein (Carter et al., 1974), a-chymotrypsin concanavalin A (Reeke et al., 1975; and personal communicat,ion), cytochrome b5 (Mathews cytochrome c (Takano et al., 1973 ; R. Swanson et aZ., 1972; and personal communication), Jt It. E. Dickerson, personal communication), cytochrome c2 (Salemme et al., 1973), cxlastase (Sawyer et al., 1973; and personal communication), ferredoxin (Adman et al., 1973; L. H. Jensell, personal communication), fla\,odoxin (Burnett et al., 1974; M. L. , glycera hemoglobin (Padlan & Love, 1974; E. Lattman, Ludwig, personal communication) personal communication), c( and b-chains of horse deox>-hemoglobin (Perutz et al., 1968; and personal communication), sea lamprey llemoglobin (Hendrickson et al., 1973; and persor~al communication), midgo larva hemoglobin (Huber et al., 1971; and personal cornlnunioatiort), insulin (Blundell et al., 1!)72; and personal communication), lactate personal communication), lysozymt+ dehydrogenase (Adams et al., 19 73 ; M G. Rossmann, (Jmoto et al., 1972; Diamond, 1974 (atomic co-ordinat’e set RSlHA, Brookhaven Protein Data Bank)), carp myogen (Kretsinger & Nockolds, 1973; and personal communication), communication), myoglobin (Wat,son, 1969), papain (Drenth et al., I!)7 1; and personal ribonuclense S (Richards Sr Wyckoff, 1973; and personal communication), rubredoxin nuclease (Cot)ton et al., (Watrrllpaugh et al., 1973; Herriott et al., 1973), st~aphylococcal 1972; and personal communication), subtilisin BPN’ (Alden et al., 1971), thermolysin (Matthews et al., 1974; and personal communication), bovine pancreatic trypsin inhibitor (Deist,nhofer & Steigemann, 1975 ; and personal communication). Furthermore, thf) atomic co-ordinates data deposit,nd at the Brookhaven Protein Dat,a Bank (T. F. Koetzle, persotlal communication) and at the National Institutes of Health Atlas of Macromolecular Structure on Microfiche (R. J. Fcldmann, personal communication) were checked so that, t II bond rrsitlues. The helical and @-sheet regions arc also given to show that they precede aud follow thv regirnw of chain rrvcvd. Adapted from Mathews et rd. (195%) with permission.

Tetrapeptide

Gly-Pro-Gly-Ser Gly-Ser.Gly-Lys Gly-Lys-Gly-Thr Gln-Lys-Tyr-Gly Ser.Thr-Gly-Asp Thr-Gly-Asp-Leu Ser-Ala-Arg-Gly Met-Glu-Lys-Gly Glu-Lys-Gly-Gin Val-Asp-Thr-Ser Ile-Asp-Gly-Tyr Pro-Arg-Glu-Val ilrg-Lys-Ile-Gly Gly-Glu-Thr-Ser Thr-Glu-Pro-Val Glu-Lys-Arg-Gly Gly-Ser-Val-Asp

Gly-Lys-His-Am Pro-Glu-His-Trp Gly-Glu-Arg-Gln Gln-Ser-Pro-Val Asp-Pro-Ser-Leu Tyr-Asp-Gln-Ala Asn-Gly-His-Ala Gly-Gly-Pro-Leu Leu-Asn-Gly-Gln

16- 19 18- 21 20- 23 30- 33 38- 41 39- 42 51- 54 61- 64 62- 65 8P 87 92- 95 96- 99 107-110 133-136 157-160 165-168 177-180

i- 10 12- 15 24- 27 27- 30 40- 43 5ck 53 61- 64 86 83 99-102

1. Aden ylate kinase (porcine)

B-Turn

?

*

*

/3-Turns

2

4.6 4.6 3.7 3.7 5.1 2.5 3.5 7.3 6.6 5.7 5.6 5.2 5.8

5.9

2.6

3.4

om--NW (4

5.4

5.5

7.3 7.3 5.9 5.2 5.6 6.4 5.6 5.0 6.9 4.1 6.3 4.2 5.4 6.1 6.2 5-5 6.3

UCc1,--aCo, (4

-- 103 86 -35 -iY

-80 -170

-64 -34 -22 152 -62 44 -4 -56 -23

(7

59t -29

$2

(“)

analysis

42

located in 29 proteins from X-ray structure

TABLE

-123 -98 - 144 ? -14 -150 -146 ? -81

(“)

$3

-6 12 75 ? -31 25 3 10 20

I I IV VI III IV IIIII I

Bend type

36 47 7 CL 11 29- 32 30- 33 41- 44 56- 59 67- 70 69- 72 7& 73 89- 92 9cb 93 108-111 110-113 123-126 142-145 143-146 144-147 148-151 150-153 153-156 159-162 162-165 169-172

3. Carboxy

107-l 10 108-111 123-126 135-138 152-155 167-170 178-181 198-201 210-213 231-234 234-237 249-252

A (bovine)

Ser-Thr-Am-Thr Thr-Am-Thr-Phe Am-Tyr-Ala-Thr His-Pro-Glu-Leu Pro-Glu-Leu-Val Ser-Tyr-Glu-Gly Gly-Ser-Aw-Arg Gly-Ile-His-Ser His-Ser-Arg-Glu Ser.Arg-Glu-Trp Am-Tyr-Gly-Gln Tyr-Gly-Gin-Am Glu-Ile-Vel-Thr Val-Thr-Am-Pro Asn-Arg-Leu-Trp Asp-Ala-Am-Arg Ala-Asn-Arg-Am Asn-Arg-Asn-Trp Asp-Ale-Gly-Phe Gly-Phe-Gly-Lys Lys-Ala-Gly-Ala Ser-Pro-Cys-Ser Ser-Glu-Thr-Tyr Tyr-Ala-Asn-Ser

peptidase

Thr-Val-Asp-Lys Val-Asp-Lys-Lys Am-Thr-Lys-Tyr Glu-Pro-Asp-Gly Lys-Pro-Gly-Leu Thr-Lys-Gly-Lys Asp-Pro-Arg-Gly Thr-Pro-Pro-Leu Leu-Lys-Glu-Pro Gly-Glu-Gly-Glu Glu-Pro-Glu-Glu Leu-Lys-Am-Arg

* * *

4.7 4.9 6.2 5.6 5.4 4.7 5.3 5.1 5.9 6.5 5.5 5.8 6.2 5.5 5.4 5.3 5.4 6.5 5.3 5.7 6.4 5.1 4.7 5.1

6.1 5.6 5.8 5.9 5.2 6.9 6.8 5.7 6.3 5.3 6.3 6.9

3.3 3.3 4.1 3.2 4.9 2.9 4.4 4.1 2.9 3.6 4.0 5.3 4.6 4.4 3.0 2.7 3.4 5.6 5.8 3.2 3.4 2.7 3.0 3 ’3

5.7 3.4 3.2 4.1 2.9 4.0 3.5 3.2 3.8 3.7 4.3 4.5

W -53 -87 -67 -44 -- 81 -142 - 157t --56 -53 -62 -65 ~~ 80

- 73 -82 -92 -59 -117 -56 -50 -78 -25 - 33 --91

? 61

-34 -17 -4 -11 23 ~ 36 -116 - 35 - 63 -51 129 -35 --51 -33 -13 -40 24 -11 91 132 120 -19 -~44 107

-20 -1 -111 -~ 147

-61 -91

--67

101 49 -33 - 61 -44 75 27

- 1307 52 -50

-82 -99 -96 -117 - 153 -87 -100 -113 -33 -65 96 -93 --131 -112 -- 95 -81 - 142 -89 71 83 136t -85 -79 95

- 100 - 123 4 -58

52 85 -90 -- 100 -67 178t -98

-35 -- 43 106t 72-f 9 24 11 5 29 16 26 -12 21 16

W

8lt -51

-17 13 43 23 -26 -~ 15 46

74 44 -77 44

Y

49 -15 36 50 5 --58 -43

I I I T IV I IV I TIT III II I I I I I IV IV II II II I I II

II I’ I 1 I \ I VI I I IV II’

23

11

6

63667275-

66 69 75 80

23p 26 33- 36 37r 40 38- 41 42- 45 43% 46 46.- 49 53% 56 57 -60

22- 25

382op

4. Chromtrtium

Pro-Ala-Am-Ala Ala-Ala-Asp-Am ilsn-Gln-Asp-Ala Asp-Ala-Thr-Lys Ala-Thr-Lys-Ser Arg-Pro-Gly-Leu Pro-Pro-Glu-Glu Pro-Glu-Glu-Gln His-Cys-Ala-Asp Cys-Ala-Asp-Cys Cys-Gln-Phe-Met Ala-Ala-Gly-Ala Thr-Asp-Glu-Trp Cyn-Gln-Leu-Phc Phe-Pro-Gly-Lys Asn-Val-Asn-Gly Cys-Ala-Ser-Trp

high potential

Tyr-Gly-Tyr-Thr Ile-Pro-Asp-Lys Ser.Leu-Tyr-Gly Ser.Ile-Ile-Thr Ile-Thr-Thr-Ile Ala-Ser-Gly-Gly Ile-Lys-Tyr-Ser Asp-Thr-Gly-Arg Gly-Arg-Tyr-Gly Tyr-Gly-Phe-Leu Pro-Ale-Ser.Gln Ala-Ser.Gln-Ile

206-209 213-216

232-235 242-245 244-247 256253 263-266 273-276 275-278 277-280 282-285 283-286

Tetrapeptide

B-Turn

iron

* * *

*

5.5 5.8 4.8 4.9 5.7 5.0 5.8

5.1

*

* *

*

5.3 5.2 6.1 5.7 4.9 7.5 5.6 5.1 6.2

?

protein

‘!

* *

*

*

5,3 4.8 5.5 4.8 5.5 6.5 6.4 5.6 4.4 4.8 5.9 5.3

TABLE

3.5 4.1) 2.6 3.0 2.7 34

3.1

4.0 2.8 3.0 4.4 2.9 2.6 3.1 3.1

3.1

3.4

4.4 3.7 5.9 2.4 5.4 4.9 4.4 5.7 4.2 2.6 3.4 2.6

m-78 -64 -53 -48 -67 -59 -55 -47 -45 -49 -55 -75 88 -48 -48 -53 .-- 53

- 63 51 -61 -- 61

-1lOi -89 -98

-1151 -63 -104 -36 -97

2 (continued)

18

-11

- 36 -36 -44 144 -25 -35 137 -20

-10

-48

- 13 -167

-18 -33 -26 -29

-7 -43 -12" -41

-10

-51 -58 -62 -67 152

7

-

-104 -85 88 --IO6 -95

88

-98 -88 -122 -67 -108 109 --47 -117 -49 - 118 -.- 72

-70 -61 -128

142t -129

-62 -84 67 -98

-121 -111 -118

43 (7

18 8

3 -7lT -10 4 36 43 -4 2 -14 - 3 --'I

17 99t ~ 13

"4

2 -18 -24 18 2

w

7 -32 32

-25

W

-3

(“)

$3

I r II I 1

I rr

1 I 1 III T II T11 I Tll I

I I I I III II I I I II’ T I

Bend type

c

(bovine)

Pro-Ala-Ile-Gin Ile-Val-Am-Gly Val-Pro-Gly-Ser Gly-Ser-Trp-Pro Trp-Pro-Trp-Gin Asp-Lys-Thr-Gly Am-Glu-Am-Trp Ala-Ala-His-Cys Ala-His-Cys-Gly Thr-Thr-&r-Asp Asp-Gln-Gly-Ser Ser-Ser-Ser-Glu Am-SW-Lys-Tyr Am-Ser.Leu-Thr Ile-Am-Am-Asp Leu-Ser-Thr-Ala Ser-Gln-Thr-Val Ser-Ala-Ser.Asp Ala-Ala-Gly-Thr Leu-Thr-Arg-Tyr Pro-Asp-Arg-Leu Trp-Gly-Thr-Lys Gly-Thr-Lys-Ile Lys-Asp-Ale-Met Ala-Ser-Gly-Val Cys-Met-Gly-Asp Asp-Ser-Gly-Gly Lys-Am-Gly-Ala Ser-Ser-Thr-Cys Ser-Thr-Ser-Thr Arg-Vd-Thr-Ala Val-Thr-Ala-Leu

5. a-Chymotrypsin

47 16- 19 23- 26 2&- 28 27- 30 35- 38 48- 61 65- 58 66 69 61- 64 72- 76 75- 78 91- 94 95 98 99-102 108-111 116-118 126-128 131-134 143-146 152-156 172-175 173-176 177-180 185-188 191-194 194-197 203-206 217-220 221-224 230-233 231-234

* * *

* * *

*

5.2 5.9 4.6 5.5 5.9 5.4 4.7 5.7 6.2 5.8 5.8 5.5 5.7 5.2 5.7 5.9 5.1 5.8 6.0 6.1 6.7 5.6 6.1 6.6 6.4 5.7 5.4 5.4 4.9 6.7 5.7 4.9

4.0 4.4 2.8 3.6 3.0 4.1 4.0 3.3 3.1 3.1 4.0 5.5 3.3 2.8 3.5 4.2 4.3 3.7 3.3 5.6 5.5 3.2 2.9 2.8 6.2 3.1 2.9 2.9 2.9 3.7 3.1 2.7 -61 -104 -60 -61 -56 - 1207 -69 -77 -65 -60 -98 -78 -64 ~- 62 -64 -48 -61 -64 -61 -71 -72 62 ~~-39 -48 -114t -69 -53 45 -56 -87 -59 -40

-50 138 134 -28 -34 2 -37 -36 -6 -17 -8 -25 -33 -8 150 -51 -32 -13 152 -- 25 -38 -132 -32 -24 10 133 144 31 -36 9 -45 -45

-77 65 72 - 109 -75 -75 -84 -65 -72 -87 -81 172 - 102 -114 57 -121 -98 -73 86 - 133 -87 -39 -66 -93 111 110 89 85 -98 -114 -40 -89 84t -22 2 -45 -6 -17 -18 -27 127 10 33 47 152 -45 -39 -12 145 149t -32 --26 -3 28 ..- 27 -11 -41 8 -14 -45 7

113f 31 9

I II II I III I III III III I III VII I I II IV I III II IV I II’ III I I’ II II 1’ I I III I

1725-

20 28

7. Cytochrome

lo13 14- 17 15- 18 28- 31 31- 34 34- 37 43- 46 55- 58 56- 59 67- 70 86- 89 97-100 117-120 134-137 137-140 143-146 147-150 149-152 150-153 160-163 166-169 183-186 203-206 216-219 222-225 226229 227-230 229-232

6. Concanavalin

B-Turn

bean)

Asn-Ser-Lys-Ser Leu-His-Tyr-Lys

bs (calf liver)

Asp-Thr-Tyr-Pro Am-Thr-Asp-& Thr-Asp-Ile-Gly Asp-Ile-Lys-Ser Ser-Val-Arg-Ser Ser.Lys-Lys-Thr Gin-Asp-Gly-Lys Asn-Ser-Val-Asp Ser-Val-Asp-Lys Tyr-Pro-Asn-Ala Pro-Glu-Trp-Val Thr-Gly-Leu-Tyr Ser.Asn-Ser.Thr Ser.Lys-Asp-Gln Gln-Lys-Asp-Leu Gln-Gly-Asp-Ala Thr-Thr-Gly-Thr Gly-Thr-Asp-Gly Thr-Asp-Gly-Asn Ser.Ser.Asn-Gly Glu-Gly-Ser-Ser Glu-Ser-Ser-Ala Asp-Ser.His-Pro Asn-Ile-Asp-&x Pro-Ser-Gly-Ser Thr-Gly-Arg-Leu Gly-Arg-Leu-Leu Leu-Leu-Gly-Leu

A (Jack

Tetrapeptide

*

* * *

5.5 4.7

6.7 5.9 5.7 5.8 6.8 6.9 5.3 5.8 4.4 4.9 5.9 6.1 6.9 4.8 5-9 6.2 6.2 4.1 6.1 5.1 6.9 5.1 5.8 5.2 5.3 5.5 6.1 6.4

TABLE

3.2 2.3

5.4 2.9 3.3 5.3 4.2 5.8 3.0 3.7 3.3 3.2 3.9 6.3 4.7 4.1 4.3 3.4 4.9 3.1 4.1 3.6 5.3 3.3 4.4 2.8 3.5 3.4 3.6 2.9

2

-70 35

-90 -44 -49 - 12ot -62 -80 -60 -61 -41 -66 -46 - 1581 -86 -77 - 101 49 -115f -36 -131 -77 -120 -42 -94 -57 -74 74 -81 20

(continued)

57

-8

-51 -44 -15 107 -36 -32 133 -48 -68 127 -32 -75 -41 -24 -12 -107 138 -62 26 -12 175 -67 -14 -19 120 -137 -9 22

-78 69

- 105 - 106 87 -131 83 -114 52 -100 - 108 -89 66 -81 -69 53

-91 -49 -99 81 -58 -131 76 -41 -81 83 -123 -110 -31 ~ 8”

- 53 4

42 -9 7 36

-8

6W

fw

III III’

I III I II I I II III I II I I III III I II’ II I IV I VII 1 I I II II’ I III’

type

(“)

-15 -53t -26 71t 148-f 6 -68 -32 -33 122t -38 -63 --82t 39 14 -8 26 36 -11 38 21

Bend

(cl3

18 24 29 35 38 42 46 59 74 76 87

SW-Gly-SW-SW Arg-Gin-Asn-Trp Ala-Ala-His-Cys Ala-His-Cys-Val His-Cys-Val-Asp Asn-Leu-Am-Gln His-Pro-Tyr-Trp

Am-Thr-Asp-Asp

36Ap37a 4% 61 55- 58 56- 59 57- 60 72- 75 91- 94

95-

98

Gin-Arg-Asn-Ser Trp-Pro-Ser-Gin

Leu-Ala-Cys-His Asp-Gin-Gly-Gly Asn-Lys-Val-Gly Leu-Phe-Gly-Val Val-Phe-Glu-Am Thr-Ala-Als-His Lys-Asp-Am-Tyr Lya-Ala-Lys-Gly Val-Lys-Asn-Pro Asn-Pro-Lys-Ala Asp-Pro-Lys-Ala

c2 (rhodm,

Leu-His-Gly-Leu Leu-Phe-Gly-Arg Ala-Pro-Gly-Phe Ile-Pro-Gly-Thr

26 30

2327-

10. Elnstuss

1521263235394356717384-

9. Cytochrome

35 38 46 78

3% 354375-

Ala-Gin-C&-His Glu-Lys-Gly-Gly His-Lys-Thr-Gly

c (horse)

18 24 29

8. Cytochronze

152126-

His-Pro-Gly-Gly Cln-Ala-Gly-Gly Ser.Thr-&p-Ala His-Pro-Asp--&p

42 52 67 83

39% 49BP 80-

mbrum)

*

*

? *

*

* * *

? * *

*

* ?

* *

* *

*

* *

5.1 6.0 5.5 4.5 5.6 5.4 5.5 5.1 5.5 5.4

4.4 4.8 6.0 5.6 4.9 5.0 6.4 4.7 4.7 4.8 4.4

5.3 5.2 6.6 5.2 6. I 6.3 5.2

5.9 5.6 5.6 5.3

3.0 3.5 3.1 2.8 2.8

3.2

3,3 3.3 3.1

” ‘Q

5.3 2.6 3.7 4.0 2.8 343 3.6 3.0 3.5 3.6 3.9

3.6 2.6 6.4 2.9 3.2 4.6 2.5

3.1 3.2 3.5 2.7

52 -83 38 -5' -34 -41 -61 -56 -62 --42

162t -41 50 - 119-f ~- %ot -79 -74 R8 19 -54 -69

-107 - 13 -74 - 1117 ~~41 -61 -54

- 71 --~57 75 GO

16 24 -96 -30 -45 -5% - 13 -52 -18 -~ 37

1‘3

-33 110 109 136 -90 -%8 “0 I06 --67 -49

-62 86 -1zet -58 1’70 83 112

-_ 1 128 -~ 18 --- 11

4

-41 -61 -115 -78 - 93 73

-113

9

7 -52 - I3 23 1 -1 -8

-8 -35

55-i 18 4G 57 19 8

- 107 -110 108 - 130 66 123

-~ II:! -115 -61

--34 -11 89 40 70

24 . 24 -81 -17

-.6_”

-11

-7 34 -51 30

-117 110 66 59 -103

-~7" 105 -64 -94 104 61 13q

-97 83 -66 .- 97

1 I II’ r III III I I I I

I II IV II \” I I TT iv III II

I II III I II V 11

I II 111 I

b1P lb18253132394145-

8 17 18 21 28 34 35 42 44 48

Asn-Asp-Ser.Cys Cys-Lys-Pro-Glu Lys-Pro-Glu-Cys Cys-Pro-V&l-Am Gly-Ser.Ile-Tyr Asp-Ala-Asp-Ser Ala-Asp-Ser-Cys Gly-Ser-Cys-Ala Cys-Ala-Ser-Val Cys-Pro-Val-Gly

(Micrococcus

Asp-Val-Ala-Ala Val-Ala-Ala-Gly Asn-Ser.Tyr-Val Arg-Ala-Gly-Thr Ala-Am-Asn-Ser Arg-Thr-Asn-Gly Ser-Ser-Ser-Tyr Trp-Gly-Ser.Thr Lys-Am&r-Met Cys-Gln-Gly-Asp Asp-Ser-Gly-Gly Val-Asn-Gly-Gln Ser-Arg-Leu-Gly Gly-Cys-Asn-Val Val-Thr-Arg-Lys Arg-Val-Ser.Ala Val-Ser-Ala-Tyr

98- 99B” 99%100a 115-118 125-128 131-134 145-1498 170-171a 172-175 177-180 191-194 194-197 203-206 217-219” 219-221A 22lA-224 23@233 231-234

11. Ferredoxin

Tetrapeptide

P-Turn

* *

*

aerogenes)

* * ?

* ? * * *

6.2 5.2 5.8 5.6 5.0 6.4 5.2 5.2 5.4 5.2

5.3 5.4 4.9 5.7 5.7 4.5 5.0 5.8 5.5 5.7 5.7 5.2 5.3 6.3 5.1 5.8 5.5

TABLE

3.3 2.5 3.5 3.6 4.4 3.4 3.2 2.7 3.0 3.7

3.1 3.0 3.0 3.3 3.3 2.9 2.9 3.5 2.9 3.0 3.1 2.9 2.9 4.6 2.9 2.9 2.9

42 (“) -90 -69 -50 -58 -72 -78 -72 64 -50 -53 -44 64 -66 -74 -65 -55 -56

2 (continued)

0 -24 144 144 129 -25 -153 -9 139 145 38 -33 -59 135 -40 -5

-31

*a (“)

lb -5 5

53t

I I I II II II I II’ I II II I’ I I II III I

-69 -95 -91 84 76 61 -77 -68 -89 105 65 75 -82 -90 58 -56 -111

0 -21 -17 -22 21 41 -23 -8 11 -- 25 4 lb 1

type

Bend

43 (7

17- 20 43- 46 49- 52 76- 79 89- 92 113-116

(Clostridium

Vel-Gly-Gly-His Phe-Pro-His-Phe Ser-His-Gly-Ser Leu-Pro-Gly-Ala His-Lys-Leu-Arg Leu-Pro-Am-Asp

(home)

Ala-Gly-Am-Asp His-Pro-Gln-Met Leu-Gly-Asp-Glu His-Lys-Gly-Tyr Ile-Gly-Gly-Lys

(glycera)

Ser-Gly-Thr-Gly Am-V&l-%x-Asp Val-Ser.Asp-V&l Am-Ile-Asp-Glu Ile-Asp-Glu-Leu Leu-Leu-Am-Glu Glu-Asp-Ile-Leu Met-Gly-Asp-Glu Gly-Asp-Glu-Val Glu-Glu-SW-Glu Glu-Ser-Glu-Phe Ser-Thr-Lys-Ile Ile-Ser-Gly-Lgs Tyr-Gly-+rp-bly Gln-Asn-Glu-Pro Pro-Asp-Glu-Ala Asp-Glu-Ala-Glu

14. a-Hemoglobin

18- 21 38- 41 73- 76 9ck 93 118-121

13. Hemoglobin

7- 10 34- 37 35- 38 39- 42 40- 43 43- 46 46- 49 56 59 57% 60 62- 65 63- 66 7P 77 77- 80 88- 91 118-121 121-124 122-125

12. Flavo ldoxin

MP)

*

* * *

*

? 1

*

* ? ?

5.1 5.6 6.0 5.2 6.3 5.7

5.0 6.8 5.2 5.1 6.5

4.9 5.0 5.2 5.7 5.7 5.1 6.6 4.8 5.8 5.9 6.1 6.4 5.5 6.3 6.9 4.7 5.0

2.8 3.5 3.8 3.4 5.4 3.3

2.7 3.2 3.6 4.2 4.2

4.3 2.6 3.3 3.1 3.3 2.8 6.0 2.4 4.5 3.7 5.2 4.1 3.3 6.2 4.6 2.2 4.1

32 -68 -66 -64 ~ 121t -67

52 -54

-4;

0

~~ 70 -53 -66 -63 -17 -66 -89 -1t 66 -66 -84 -72 -69 118 45 gt -94

-111 -10 -10 94 -71 -37

109 45 -41 114t 170

- 47 - 33 - 29 -55 -50 -12 -47 71 63 -24 -36 106 145 ~~~103 133 -48 -28

-- 92 -87 -87 144t -66 -74

84 -92 - 126 59 27

~ 134 -66 ~~ 95 - 17 -119 -98 - 126 66 67 -84 -118 87 84 -71 - 72 -94 - 106

26 -28 -63 -52 -37 -36

42 -53 1181 38 88

16 - 29 3 -50 115t 13 134t 63 68 -36 -26 48 - 26 -26 133t -28 16

II’ I III II III III

II IV 1 III VII

I III I III I I I III’ III+ I I II II IV II’ I I

A 7-10 A12-15 B 7-10 B20-23

18. Insulin

115-118

16- 19 31- 34 38- 41 41- 44 72- 75 73- 76 88- 91 113-116 llP117

? ? *

*

? * ? *

*

A an.d R chains)

Cys-Thr-Ser-Ile Ser-Leu-Tyr-Gin Cys-Gly-&r-His Gly-Glu-Arg-Gly ? ?

*

*

*

Y

* *

larva/erythrocruoril2)

Val-Lys-Gly-Asp Asp-Pro-Ser-Ile Phe-Thr-Gin-Phe Phe-Ala-Gly-Lys Glu-Leu-Pro-Am Leu-Pro-Am-Ile Lys-Pro-Arg-Gly Asp-Phe-Gly-Ala Phe-Gly-Ala-Ala Gly-Ala-Ala-Glu

(midge

Tyr-SW-Am-Tyr Thr-Ser-Thr-Pro Phe-Pro-Lys-Phe Phe-Lys-Gly-Leu Ser-Met-Asp-Asp Lys-Ser.Phe-Gln Ala-Gly-Phe-Glu

(lamprey)

Phe-Asp-Ser-Phe Asp-Ser.Phe-Gly Leu-Asp-Am-Leu Leu-Lys-Gly-Thr Cys-Asp-Lys-Leu Gly-Lys-Asp-Phe

(porcine

17. Hemoglobin

27- 30 43- 46 52- 55 55% 58 87- 90 106-108 131-134

16. Hemoglobin

42- 45 43- 46 78- 81 81- 84 93- 96 119-122

(horse)

Tetrapeptide

15. ,%Hemoglobin

B-Turn

6.6 5.5 5.7 5.1

6.2 6.2 5.0 6.0 6.1 6.5 5.4 6.7 5.9 4.3

4.2 4.5 5.5 6.1 4.4 5.7 4.4

5.8 5.9 5.5 5.1 5.7 5.5

TABLE

5.6 3.6 3.0 3.3

3.0 3.3 2.6 3.3 3.3 7.3 3.1 4.0 5.2 3.4

2.3 2.7 3.3 4.0 2.4 4.0 3.7

3.2 4.9 4.7 2.6 4.1 2.7

2 (continued)

?

-89 -76 84 - 143t

? -77 98 -74

-48 -43 -49 -96

-53

-43 -26 -60 -87 ~~ 16t -90 -114t

-69 -54 -77 -23 -81 -59

7

20 -- 40 - 107 11

5 -19 -15 129 136 -11 -22 166 -- 39 -9

-54 -25 - 82t 120 65 -18 -25

-21 -67 -29 -8 -21

*z (“)

-92 -96

- 141 69

-92 -55 ~~ 100 57 -96 -56 ~ 103 98 ~- 74 - 14ot

-77 -162 -55 98 125 -80 -62

-54 -114 -124 -79 ~ 122 -79

$3 (“1

- 134 -47 -24 -40

9

-Qt

- 15 -59 -26 12 -11 1107 37 -39

2 60 31 16 48 23 -18

- 63 -38 -11

got

IV III TT’ I

I III I II VI I I II T” I

I IV I TT I’ III III

I I III I I

III

(") -21 - 1367

BWd type

*3

19- 23b 30- 33 31p 34 46- 49 55p 58 70- 73 81p 8+fib 8P 87 86- 89 88- 91 102-106b 128-131 131-133b 139-142 156159 163-166 178-181 183-186 184-187 195-198 196-199 202-205 207-210 208-211 212-215 215-218 22&223 237-240 247-250 261-264 278-281 294297 308-311

19. Lactnte

SW-Tyr-Asp-Lys Asp-Ala-Val-Val Ala-Val-Val-Leu Ala-Asp-Glu-Val Met-Glu-Asp-Lys Ser.Leu-Phe-Leu Gly-Lys-Asp-Tyr Asp-Tyr-Ser-Val Ser-Val-Ser-Ala Ser-Ale-Gly-Ser Gln-Gln-Glu-Gly Val-Lys-His-&r Ser-Pro-Asp-Cys Pro-Glu-Leu-Gly Pro-Met-His-Arg Ser-Gly-Cys-Am Glu-Arg-Leu-Gly His-Ser-Cys-Leu Ser-Cys-Leu-Val His-Gly-Asp-Ser Gly-Asp-Ser.Val Val-Trp-Ser.Gly Trp-Asp-Ala-Lys Asp-Ala-Lys-Leu His-Lys-Asp-Val Val-Vel-Asp-Ser Tyr-Glu-Val-Ile Val-Ser-Asn-Pro Val-Ala-Trp-Lys Ile-Met-Lys-Asn Phe-Tyr-Gly-Ile Am-Asp-Gly-Ile Lys-Pro-Asp-Glu

dehydrogenase

(dog&h)

* *

*

*

*

*

*

*

* ‘!

*

* 6.1 5.7 5.9 5.9 6.7 6.0 6.4 6.9 5.5 6.3 7.0 5.3 5.9 6.2 5.9 6.1 5.2 6.0 5.1 7.0 4.9 5.8 4.8 6.0 5.1 6.1 4.8 6.4 6.6 4.8 6.5 5.3 5.4

6.2 3.1 3.8 5.7 4.5 3.9 6.3 5.2 3.0 3.7 6.1 3.2 3.3 4.1 3.4 3.2 4.4 2.9 3.5 6.3 2.4 3.3 5.8 3.2 2.5 4.0 2.3 3.4 5.5 2.9 3.6 2.8 2.9 2t -40 112 -51 -80 -150 91 --57 -- 42 9-f -24 129 -52 14 -26

-8t -76

-51 -60

-- 153 ot -73 --112 -. 54 -81 ~~ 104 - 141 -64 -3 -81 -51 -7 -56 22 -19 88 -11 -38 -29 164 113 96 -29 -58 -74 - 151t 115 4 139 103 -32 -30 75 -75 -~ 40 - 50 -- 139 87 -35

-29 120 - 3 27 -.~ 77

-124 ~~ 73 -99 170 -3x --115 -63 -82 -72 75 1507 -96 -89 67 137t 105 -116 -40 -66 -51 106 -99 91 1xot -44 m-116 89 -42 .- 71 -95 -102 36 -58 8i 3 --41 133 66 29 - 16lt -27 -28 74 -42 1 27 62t 8 -40 -27 -74 -55 115 11 3 103 29 -60 - 35 -26 - 38 34 28 -4 49 -36

IV II’ I VII III I III IV III IV III I I II II II I III III V’ II I IV II’ III I I’ 111 IV I II’ II III

-

Lysozyme

2203551-

21.

Leu-Asp-Am-Tyr Am-Tyr-Arg-Gly Tyr-Arg-Gly-Tyr Ser-Asn-Phe-Am Am-Thr-Gln-Ala Asn-Thr-Asp-Gly Gly-Ile-Leu-Gln Am-Ser-Arg-Trp Ser-Arg-Trp-Trp Asp-Gly-Arg-Thr Thr-Pro-Gly-Ser Asn-Leu-Cys-Asn Ser.Ser-Asp-& Ile-Val-Ser.Asp Val-Ser-Asp-Gly Asp-Gly-Met-Asn Gly-Met-Asn-Ala Met-Am-Ala-Trp Cys-Lys-Gly-Thr Ile-Arg-Gly-Cys

(hen egg-white)

Tetrapeptide

5 23 38 64

Phe-Ala-Gly-Val Ala-Ala-Asp-Se1 Leu-Thr-Ser.Lys Asp-Gin-Asp-Lys

filyogen. (carp muscle)

17- 20 19- 22 20- 23 36- 39 39- 42 46p 49 54- 57 59- 62 60- 63 66- 69 69- 72 74- 77 85- 88 98-101 99-102 103-106 104-107 105-108 115-118 124-127

20.

/3-T=

1 ? *

* *

*

*

*

* * *

4.6 5.7 5.2 5.3

4.9 4.5 5.8 5.5 5.8 5.5 4.7 6.2 5.6 6.9 5.7 6.2 6.9 5.6 6.6 5.5 6.4 5.6 5.8 5.7

TABLE

2.8 3.1 2.6 3.4

3.2 2.9 3.1 3.1 3.4 3.2 3.3 3.9 5.0 5.4 3.3 2.9 4.6 3.3 4.5 3.1 3.4 3.1 3.3 3.5

2

-32 - 36 -34 -45

52 57 -69 -52 -62 -83 -79 61 -53 -62 - 73 -62 -76 57 -74 -72 -38 -67

-64 -63

(continued)

-56 121 -68 -76

123 130 16 39 -17 -37 -31 -7 -28 16 -40 -23 -13 -25 -23 - 148 7 -16 117 141

*z (7

- 139 68 -44 -74

60 52 92 59 -84 -81 - 108 -79 - 127 - 140 -66 -86 -94 -76 -93 -74 -72 -66 76 96

30 54t -36 -1

15 16 13 19 -3 -3 7 -28 -42 19 -30 -15 127-f -23 -6 7 -16 -16 35 -28

I II III I

II II I’ III’ I I I III I IV III I I III I II’ III II1 II II

type

Bend

74 93

Papain

8- 11 19- 22 57- 60 62- 65 82- 85 83- 86 847 87 97-100 114-117 135-138 167-170 178-181 181-184 183-186 195-198 198-201 199-202 201-204

23.

1% 21 43- 46 44- 47 46- 49 76- 79 78- 81 95- 98 119-122 126123

22. Myoglobin

71go-

whale)

Arg-Gin-Lys-Gly Gln-Gly-Ser-Cys Asp-Arg-Arg-Ser Gly-Cys-Asn-Gly Tyr-Arg-Asn-Thr Arg-Asn-Thr-Pro Asn-Thr-Pro-Tyr Ser-Arg-Glu-Lys Gin-Pro-Tyr-Asn Gln-Ala-Ala-Gly Gly-Pro-Asn-Tyr Gly-Thr-Gly-Trp Trp-Gly-Glu-Asn Glu-Asn-Gly-Tyr Asn-Ser-Tyr-Gly Gly-Val-Cys-Gly Val-Cys-Gly-Leu Gly-Leu-Tyr-Thr

(papaya)

Glu-Ala-Asp-Val Phe-Asp-Arg-Phe Asp-Arg-Phe-Lys Phe-Lys-His-Leu Leu-Lys-Lys-Lys Lys-Lys-Gly-His Thr-Lys-His-Lys His-Pro-Gly-Am Pro-Gly-Am-Phe

(sperm

Lys-Ala-Asp-Ala Asp-Ser.Asp-Gly

* *

* * *

* *

*

5.6 6.0 5.2 5-7 6.5 4.8 4.9 5.2 6.7 6.1 5.2 6.1 6.0 4.9 4.8 5.1 5.7 5.5

5.3 5.4 5.6 5.4 5.4 5.3 6.4 5.1 5.9

5.1 4.9

2.8 3.4 3.4 3.9 4.1 4.8 3.9 2.9 4.1 4.9 4.0 3.7 3.5 3.1 3.1 3.1 2.8 2.9

4.4 3.2 3.8 2.5 2.4 3.0 6.3 3.9 3.9

2.6 2.8

-88 58 64 -62 -57 51 -36

-113t

-63 65 -92 -94 - 104 - 117t -82 -47

-71 -61 -67 -38 -41 43 - 138.f -58 -70

-36 -40

7 - 127 -14 51 24 .- 49 8 -50 160 -19 -41 18 -- 137t 29 -38 137 17 -28

-35 -42 -9 -53 -20 45 -65 -61 -47

-37 -66

-104 ~ 100 -68 -~ 180 -117 --- 82 1731 ~- 65 55 -65 ~ 120 -109 -~~105 101 -62 51 81 -102

- 149 -67 - 120 -41 -83 93 -54 -70 -59

-76 -49

- 10 21 -20 50 -49 8 46 -36 32 -35 --34 32 83 -9 -53 17 9

69 -9 -26 - 35 -5 -16 -35 -47 -31

-18 ~- 1 I

I II’ III IV IV T I IIT VII III I I V’ I’ III II I’ I

IV III I III I I’ III III III

III III

Ribonudeme

10 17 18 22 23 28 32 33 37 42 49

Rubredoxin

3192627p 47-

6 22 29 30 50

Ser.Thr-Lys-Lys Asp-Gly-Asp-Thr AMet-Tyr-Lys-Gly Tyr-Lys-Gly-Gln Pro-Lys-Lys-Gly

nuclease

Thr-Val-Cys-Gly Asp-Pro-Glu-Asp Pro-Glu-Asp-Gly Asp-Pro-Asp-Asp Pro-Asp-Asp-Gly Asn-Pro-Gly-Thr Asp-Phe-Lys-Asp Phe-Lys-Asp-Ile Pro-Asp-Asp-Trp Cys-Pro-Leu-Cys Lys-Asp-Glu-Phe

(Clostridium

Ser-Thr-&r-Ala Ser-Asn-Tyr-Cys Asn-Leu-Thr-Lys Thr-Lys-Asp-Arg Cys-Lys-Asn-Gly Lys-Asn-Gly-Gln Ser.Tyr-Ser.Thr Thr-Gly-Ser-Ser Lys-Tyr-Pro-Asn Tyr-Pro-Asn-Cys Gly-Asn-Pro-Tyr

S (bovine)

Tetrapeptide

26. Staphylococcal

714151920252930343946-

25.

16- 19 23- 26 3P 37 36- 39 65- 68 66- 69 75- 78 87- 90 91- 94 92- 95 112-115

24.

@-Turn

* *

? *

* *

* * *

* *

pasteurianum)

*

* * ? 1 ?

? *

5.4 5.0 5.6 5.1 3.4

5.7 4.9 4.9 5.6 5-O 5.9 5.4 5.4 5.8 5.8 5.2

5.2 7.2 5.1 4.6 5.2 6.8 6.4 5.8 5.1 6.0 4.2

3.3 2.5 5.1 3.7 3.0

3.6 6.7 3.0 6.9 3.1 6.6 6.5 3.1 5.8 5.9 3.4

4.5 5.9 5.0 4.1 2.5 8.6 4.6 3.3 2.6 6.8 2.4

42 (“)

1 -4t -121t 64 61

-81 -56 -113 - 107 -35 -59 64 -78 -39 -89 -150

2 (continued)

acCl,--aCW 0 1.0 x IO-* (greater than double the average probability of bend occurrence) are denoted by *. Tetrapoptidos for which 1.0~ 10-4>~~t> 0.75 x lo-* (greater than 1.5 times the averagr bend probability) are denoted by ?. The bend frequencies used in these computations \vere based on 408 fi-turns from 29 proteins (Chou & Fasman, 1977). 1973). Insertions are indirated 36-4, 36B, 36C, 99A, 99B, a The chymotrypsinogen-A numbering is given to the elastase sequence (Shotton & Hartley, 170A, 170B, 217A; there is a deletion at residue 146. b Residues 21, 82 and 104 have not been assigned to any amino acid; residues 132A and 132B are Pro and Asp in lactate dehydrogenase.

Gly-Pro-Cys-Lys Ala-Lys-Ala-Gly Gly-Cys-Arg-Ala Lys-Arg-Asn-Asn

(bovine

inhibitor

16 28 40 44

Trypsin

29.

12253741-

Met-Ser-Asp-Pro Asp-Pro-Ala-Lys Pro-.Ua-Lys-Tyr Tyr-SW-Lys-Arg Thr-Gin-Asp-&n Gin-Asp-Asn-Gly Asp-Asn-Gly-Gly Asn-Gly-Gly-Val Gly-Val-His-& His-Ile-Asn-Ser Thr-His-Tyr-Gly His-Tyr-Gly-Val Gly-Arg-Asp-Lys Thr-Pro-Thr-Ser Gly-Ser.Thr-Sor

205-208 207-210 208-2 11 2 17-220 224-227 “25-228 226-229 227-230 229-232 231-234 249-252 250-253 259-262 276-279 297-300

1 56

P. T.

(‘HOL-

A\,\;11 (:.

I).

I’.iSRIAS

Bends are classified as ideal if all their dihedral angles are \vithin 4 50” from the angles for a particular bend type. As can be seen from Table I. 288 ideal bends wcrc found, and of these 73% vvere hydrogen bonded. Bends that Iravc ouly one am& ($2 A.3 $3, or #a) differing by more t)han 50” from t,he given angles for that bend type are still classified as that’ type, but considered as non-ideal. Bend type IV was categorized by Lewis et ul. (1973) as having two or more angles differing by at least 40” from the most similar bend types I through III’. and incapa’ble of forming hydrogen bonds. The present survey showed that 30 of the 35 type 1V bends do not possess hydrogen bonding. The five exceptions are carboxypeptidase A residues 143 to 146, cytochrome ca residues 71 to 74, glycera hemoglobin residues 38 t’o 41, lamprey hemoglobin residues 43 to 46 and staphylococcal nuclease residues 3 to 6. whose O,,, t’o N,,, distances lvere all below 35 A, and thus capable of hydrogen bonding. The total number of non-ideal bends (including type 1V) in t’he 26 proteins is 133, of which 30 (i.e. 22%) are hydrogen bonded. Lewis et al. (1973) found that deviation in a single dihedral angle by more than 50” is sufficient to break the Co,X0, hydrogen bond in type 1 through t’ype III’ bends. In contrast, the present study revealed that of the 92 non-ideal bends (types 1 t,hrough III’). 23 are hydrogen bonded. (b) /LTurn

frequemies

The p-turn frequencies of the 20 amino acids based on 29 proteins are given in Table 3, where i, i + 1, i + 2, and i + 3 represent t)he four positions of the p-turn. The notations t - 4 to t - 1 and t + 1 to t -t 4 refer, respectively, to the four residues immediately preceding and following the /I-turn along the protein sequence. The terms n, and n,, refer to the total occurrence of each residue in all four positions and the two middle positions of the p-turns, respectively, where p-turn overlap residues were not counted twice. The last column, n, gives the total occurrence of each residue in the 29 proteins. For example, the frequency of Ala in /-?-turns may be calculated by dividing the number of Ala residues in the first (27), second (40), third (18) and fourth (28) position by the total number of Ala residues (n,,,=433) to yield the values fi = 0.062, fa = 0.092, fa = 0.042 and f4 = 0.065. The number of Ala residues in p-turns is n, = 89 and is smaller than nl+n2-tna +n,-113, because overlapping /I-turn Ala residues were not counted twice. That is, if Ala occurred as residue 12 in p-turns between residues 9 and 12 and between 11 and 14, it is counted only once in n,. Likewise, 52 Ala residues were located in the second and third bend positions. The 4, # angles of these two middle residues determine the bend type as elucidated in Table 1. The frequency of Ala residues in p-turns and middle p-turns are, respectively, ft = 89/433=0*206 and ftm = t52/433=0.120. The average frequencies of all residues in /l-turns and middle /3-turns are (f$ =: 1524/4740==0*322 and (f&845/4740==0~178. The conformational potentials of Ala in the p-turn and middle p-turn are obtained by normalization: Pt=ft/(fJ = 0.206/0*322=0.64 and P,,-ft,/(ft,>=0.120/0.178 =0.67. Likewise the frequencies of Ala residues in each bend position when divided by the average frequency (f,) = 0.096 yield the /I-turn positional potentials P,, = 0.65, P,, = 0.96, P,, = 0.43, and Pt4 = 0.67. These normalized p-turn frequencies or p-turn positional potentials are arranged in their hierarchical order in Table 4. The following residues were found with the greatest’ frequency. At position i, Asn (170/,), Cys (17%) and Asp (16%); at p osr‘t’ion i+l, Pro (333/,), Ser (14%) and Lys

p-Y’URN6

IS

PROTEINS

(UO,,); at position i+Z, Asn (21x), Asp (20%) Trp (19?/0), Gly (17%) and Tyr (15%) (Table 3).

157

and Gl,y (207;) ; at position

I-1-3.

Several residues were found to have dramatic positional preferences in the p-turn : Pro in the second (33%) but not in the third position (60,;); Trp in t’he fourth (19%) but not, in the second posit’ion (1 y’); His in the first (15:;;,) but not in the second (5%) or fourth position (57;); Cys in the first (17”~) but not in t,he second position (7%,) ; Lys in the second (13Ob) but not in the first position (59;)) ; and Gln in t,hc fourt,h (12:;) but not in the third position (4Sh). These preferences are no doubt, due t)o sbereochemical consideraeions which give great)or stability to certain residues in a specific posit’ion of the p-turn. Residues with thcl highest &turn pot’ential in all four positions (conformational P, value) are Pro (1.56). Gly (1.54). 4sn (1.51). Asp (143). and Ser (1*35), with the most hydrophobic residues showing t’he lowest bend potential, i.e. Val (0.53), Ile (0.54) and Leu (0,57). This becomes even more evident when only the two middle residues of the p-turn arc’ considered. An examinat’ion of the hierarchical order of the P,, values in Table 4 shows a close resemblance to an inrcrs~~ hvdrophobicity scale (Noznki & Tanford. 1971 : Aboderin, 1971). As a,n estima,te of the precision in the P, valuesi: t’he standard calcula~ted (Hendricks, 1956; Spiegel, 1961) :

error 0 may

IJC

(1) where ft. = + is the frequency of occurrence of each residue in the p-turn, (it) is the average frequency of all residues in the p-turn. and n is the total occurrence of each residue (last column of Tablo 3). Hence, for Aan v&h n = 227, (+JAsn=O.10 (f,)=O.O96), (using f,=O.338, (j”,)=O.322) and (uptl)Asn= 0.26 (using f,=0.070, giving (PJAsn= 1~51~0~10 and (Ptl)Asn= 1.7SkO.26 for a 68.270/, confidence level in the Y, values as shown in Table 4. For a, 95.450,: confidence level, P&2 gPt may be calculated by doubling the number after t,he & sign in Table 4 (e.g.(P,),,,=1.51 1.7810.52). Although the P, values showed a highly significant Itwo. (PtJ*sn= deviation from a random distribution when all four bend positions are considered together, this is not t’he case for several of the residues at individual positions. The standard errors for Met, Trp and Cys a,re noticeably large due to their small respective number of occurrence of n=72, 78, and 94. Hence, statements regarding the bend potentials of these less frequently occurring residues should be considered as tentative, while those P, values with smaller standard errors should be given more weight, (c) P-Turn

neighbors

Whik the &turn involves just four residues, it is possible that neighboring residues before and after the /I-turn may be important for its conformational stability. An examination of the regions adjacent to the p-turns revealed many hydrophobic residues. The following residues occurred with the greatest frequency at the four posit’ions before and after /I-turns. At posit)ion (t-4), Trp, Val, His and Tyr ; at position (f -3), Cys, Trp, Ser and Val : at position (f - 2), His, T”yr and Gly ; at position (t-l). Arg, Trp, His and Tyr; at position (t fl), Pro; Trp, Ile and Gln; at position (tf2). Pro, Cys and Thr; at position (t+3), Cys, ‘l’rp and Asp; at position (t-+4), Trp, ll(s and Cyx. The complete normalized frequencies or p-turn neighbor potentials are arranged in their hierarchical order in Table 5, along with their standard errors,

Lz‘”

LW

Ile

His

Gly

Glu

Gin

CVS

ASP

ASll

Arg

Ala

33 0.092 27 0.078

0.109

0,068 37 0,088 16 0.124 25

16

30 0.069 9 0.063 26 0.115 23 0,083 II 0.117 13 0.081

t--4

33 0.092 30 0.086

0.119 13 0.101 2” 0.096

50

3

23 0.098 54 0.129 17 0.132 24 0.104 35 0.097 25 0.072

0.119 10 0.106 17 0.106

28 0.065 12 0.085 16 0.070 33

t-2

15 0.093 14 0.060 45 0.107 20 0.155 26 0.113 32 0.089 27 0.078

0.149

30 0.069 26 0.183 21 0.093 24 0.087 14

t-1

0.073 50 0.119 19 0.147 15 0.065 23 0.064 18 0.052

0.099

0.081 17

0.130

0.028 45

10

20 0.085 41 0.098 6 0.047 9 0.039

18 0.197I 21 0.093 34 0.123 7 0.074 16

10 0.070 39 0.172 45 0.162 16 0.170 13

40 0.092

i+1

27 O-062

i 18

6 0.026 14 0.039 27 0.078

0.109

0.043 24 0.103 83 0.198 14

0.199 12 0.128 I

47 0.207 55

19 0.134

0.042

i+2

of occurrence of amino acids in the /l-turns

0.149 15 0.093 19 0,081

37 0.085 12 0.085 18 0.079 18 0.065 14

t-3

Frequewy

TABLE

20 0.085 7" 0.171 7 0.054 16 0~070 28 0.078 34 0.098

13 0.138 19 0.118

0.077 22 0.097 24 0,087

11

28 0.065

i+ 3

0.085 30 0.130 31 0.086 36 0.104

11

0.056 35 0.083

13

0,084 26 0.094 10 0.106 20 0.124

19

29 0.067 14 0.099

t+1

20 0.087 33 0.09" 21 0.061

0,107 12 0.093

24 0.106 32 0.116 13 0.138 15 0.093 25 0.107 45

0.108 31 0.089

0.068 26 0.111 30 0.071 9 0.070 14 0.061 39

0.119 15 0.160 11

0.077 24 0.106 33

11

47 0.109

t+3

25 0.107 33 0.079 12 0.093 28 0.122 37 0.103 22 0.063

0.101 11 0.117 17 0.106

8 0.056 17 0.075 28

0.111

48

t+4

based on 29 proteins

13 0.092

34 0.079

t+2

and their environment

66 0.183 106 0.305

0.174

0.485 127 0.458 39 0.415 50 0.311 62 0.265 208 0.495 39 0.302 40

110

89 0,206 48 0.338

52

0.199

0.065 "4 0.067 69

15

0.140

18

31 0.218 6" 0,273 79 0.285 16 0.170 22 (1.137 39 0,167 118 0.281

0.120

433

1.000

234 1~000 4% 1 .noc) 129 1 4ocI '30 1.000 360 1.000 34;

1.000

277 1 .oon 94 1~000 161

1 .uoo ““7 1.000

142

1 .ooo

x

0.111 18 0.106 15 0.085 30 0.082 28 0.100 13 0.167 22 0.120 60 0.140 4.50 0.095

6 0.083 15 0.088 10 0.057 45 0.122 28 0.100 10 0.128 15 0.082 43 0.120 453 0.096

5 0.069 17 0.100 12 0.068 38 0.103 30 0.107 6 0.077 24 0.130 29 0.081 455 0.096 7 0.097 18 0.106 14 0~080 28 0.076 24 0.086 13 0.167 28 0.152 30 0~084 456 0.096 6 0.083 14 0.082 23 0.131 49 0.133 26 0.093 6 0.077 19 0.103 22 0.061 457 0.096 6 7 0.041 58 0.330 52 0.141 33 0.118 1 0.013 14 0.076 19 0.053 457 0.096

0.083

1 0.014 11 0.065 10 0.067 47 0.128 22 0.079 6 0.077 22 0.120 12 0.034 467 0.096 0.06: 14 0.082 13 0.074 38 0.103 25 0.089 15 0.192 27 0.147 26 0.073 457 0.096

7 0.097 17 0.100 26 0.148 34 0.092 31 0.111 11 0.141 21 0.114 35 0.098 466 0.096 O-047 33 0.188 33 0.090 34 0.121 5 0.064 17 0.092 34 0.096 466 0.096

0.106 20 0.114 30 0.082 25 0.089 10 0.128 17 0.092 40 0.112 456 0.096

6

0.06”s0483 8 18

8 0.111 15 0.088 16 0.091 38 0.103 29 0.104 13 0.167 18 0.098 33 0.092 466 0.096

15 0.208 35 0.206 88 0.500 160 0.435 89 0.318 25 0.321 67 0.364 61 0.170 1524 0.322

6 0.083 17 0.100 65 0.369 92 0.250 50 0.179 7 0.090 34 0.185 29 0.081 845 0.178

72 1.000 170 1 .ooo 176 1 .ooo 368 1 .ooo 280 1.000 78 1.000 184 1.000 358 1.000 4740 1.000

The total occurrences of each residue in the lst, Znd, 3rd and 4th position of the /?-turn are represented by i, i+ 1, i-k 2 and i + 3. The numbers of residues at the 4 positions before and after the fi-turn are represented by t- 1, t-2, t-3, t-4 and t+l, t+2, t-+3, t+P, where t-l and t+l are adjacent to the 1st and 4th bend positions, respectively. 1~~ and n,, represent the total occurrence of each residue in all 4 positions and t,he 2 middle posit,ions (i.e. i+ 1, i j-2) of the p-turns, respectively, where overlapping p-turn residues were not count.ed twice. n is the total occurrence of each residue in the 29 proteins. Thr frequencies of occ~~~‘~‘enc~~ of coch residue in the p-turn and the middle 8.t,urn arc, respect,ively, ft-nt/n and f,,, :- We,,,jn, and are givcm below the ~~~ and n,, values. The frequencies of occurrence of each residue in the respective 12 posit,ions of t,he p-turn are obtained in a similar fashion (i.e. ft - ,=(t-4)/n, fi + 1 2 (i; I)/*L, et.c.). The average freyuency of all residues in t.he @-turns is =A\‘,/~%‘= 1524/4740= 0.322. The average frequency of occurrence of a-turns xj/S=. 457/4740-z 0.096, where j- i, i + 1, i + 2, or i + 3. The sums and average frequencies for all the bend and neighboring in the 29 proteins is (f,> positions are given in the bottom 2 rows. Two additional p-turns (carbonic anhydrase C, residues 27 to 30 and midge larva hemoglobin, residues 72 to 75) were located after the compilation of this Table, but are listed in Table 1 and used in the bend type analysis of Table 1.



Sum

V&l

TY~

TOP

Thr

Ser

PI.0

Phe

_ Met

GUY Ty1 Thr Met Phe Gln Trp Glu Arg Ile Leu Ala Val Lya

ASP His SW Pro

Asn cys

pt1

1.78 1.77 1.68 1.53 1.38 1.36 1.23 1.07 0.96 0.86 0.85 0.84 0.80 0.75 0.73 0.68 0.66 0.65 0.64 0.54

.- ~

& 0.26 + 0.40 & 0.23 & 0.32 * 0.18 & 0.26 $ 0.16 k 0.23 i- 0.18 & 0.34 & 0.22 & 0.22 A: 0.31 i- 0.18 & 0.22 3: 0.17 * 0.13 m+.0.12 & 0.13 -!: 0.12

__~.-.-

Pro Ser Lys Arg Asp Thr Gln Gly Asn Ala Glu Met Tyr CYS Val His Phe Ile Leu Trp

__-

3.42 1.47 1.35 1.31 1.27 1.22 1.03 1.01 0.96 0.96 0.89 0.86 0.79 0.77 0.55 0.48 O-43 0.41 0.29 0.13

+ 0.37 + 0.19 + 0.19 * 0.29 5 0.20 + 0.20 * 0.24 -t 0.15 + 0.20 + 0.14 i 0.19 * 0.34 mc 0.20 & 0.28 :k 0.12 & 0.19 + 0.16 * 0.13 * 0.09 & 0.13

~~. ___---~

pt,

Bend positional

Am Asp Gly Arg Ser Cys Tyr His Glu Thr Lys Trp Phc Pro Gln Ala Leu \‘a1 Ile Met

potentials

2.15 2.06 2.05 1.39 1.32 1.32 1.24 1.13 1.06 0.81 0.81 0.80 0.67 0.59 0.45 0.43 0.40 0.35 0.27 0.14

pt3 + 0.28 k 0.25 & 0.20 * 0.30 * 0.18 + 0.36 !- 0.25 = 0.28 i 0.21 & 0.17 + 0.15 f 0.31 + 0.20 I 0.18 J- 0.17 yo.10 i- 0.11 + 0.10 + 0.11 + 0.14

and /l-turn

Trp Gly Tyr cys Gin SW Lys Am Thr Asp Glu Phe Leu Arg Pro Val Ilc Met lb His

1.99 1.78 1.52 1.43 1.22 1.07 1.02 1.01 0.93 0.90 0.89 0.85 0.81 0.80 0.77 0.75 0.7” 0.72 0.67 0.66

---

Pf4

conformational

TABLE 4

$ & * 5 A * + i ; = & x & + E h e = + G

0.46 0.19 0.27 0.37 0.26 0.16 0.17 0.20 0.18 0.18 0.19 0.22 0.15 0.23 0.20 0.14 0.17 0.31 0.12 0.21

parameters

-Pro Gly Asn Asp Ser Cys Tyr Arg Trp Thr Gin Lys His Glu Met Phv Ala Lou Ile Val

1.56 1.54 1.51 1.43 1.35 1.29 1.13 1.05 1.00 0.99 0.97 0.95 0.94 0.82 0.65 0.64 0.64 0.57 0.54 0.53

pt

of 29 proteins

= & + + -i = h + ;: = + & 1 2 < ~j = ‘I II-

0.12 0.08 0.10 0.09 0.08 0.16 0.11 0.12 0.16 0.09 0.11 0.08 0.13 0.09 0.15 0.10 0.06 0.06 0.08 0.06

Pro Asp Gly ~b3Il Ser Arg Lys Tyr Thr q-s GlU His Gln Ala Phc Trp JIct Val LCN Ile

.-

= _ z = .-::

-; 5 .: --~ =

0.77 0.67 0.56 0.50 -Cb4i _: 0.45 .0.37 -= 0.37

IbiX

247 1.60 1.58 1.53 1.40 1.2” 1.12 1 .04 1.00 0.95 0.93

P tm 0~20 0.15 0.12 0.17 0.13 0.19 0.12 0.16 0.13 0~22 0.14 0.17 0.15 0.09 0.13 0.18 0.18 0.08 0.07 0.09

B-TURNS IS PROTEISA

161

upt, as calculated from equation (1) and using th(s frequrncies of the residues adjacent to the p-turns listed in Table 3. That /I-turns act as terminat,ors of helices and p-sheets was observed by Chou $ Fasman (1974) in lysozyme and trypsin inhibitor. Hence, it is not surprising that a great,er abundance of hydrophobic residues (which arc strong M and /3-formcrs) are found adjacent bo the p-turns. As shown in Figure 2. the /3-turns of cytochrome h, arc hounded by interacting x and p-regions: x--p for bends 17-20 and 49-52, p-p for bend 25-28, M--M for bends 39-42 and 6467, and 18-2 for bend 80-83. When a /I-t,urn environmental analysis was performed for the 29 proteins (Table 6), it, \\‘iLs found that K-X, /3-/I and a-/3 interactions increased as one proceeded further away from the p-turn region. For example, there are ten residues at position i that are also part of a C-terminal a-helix, and likewise t’w residues at position i +3 of the /?-turn that arc also part of an N-terminal a-helix, as revealed by X-ray cryst,allography. However, there are 47 a-a residues at posibion t-4 and t+4 (i.e. 4 residws away from both /I-turn ends). Similar increases for /3-p residues are observed when positions (i? i f~3) are compared with (t-4, t-i-4). \vhew the first pair had 18 occurrents and the second pair had 52. Likewise. comparison at these positions for a-/3 and p-a residues showed (2 and 25) and (4 and 25), respectively. Rimultaneously, tlecreasw were observed for coil-coil residues (237 +85). It is also noteworthy that the number of pair\\iae a-coil residues, coil-a residues, p- coil residues and coil-/3 residues remains fairly constant at the regions adjacent to p-turns. However, the increase of ,f?+, a-a, a-p and /3-x residues beyond the p-turn indicates that reverse chain folding is stabilized I)y ant’i-parallel ,&sheets, helixhelis interactions, as well as a-/3 and 8-a int’eractions. Csing t’he normalized ,&turn and boundary frequencies in Tables 4 and 5, t’he /3-t,urn positional potentials at the 12 positions were plotted graphically for all 20 amino acids (shown in Fig. 3). The p-turn positions i to ii3 are represented on the x-axis by 0 to 3, the positions before a’nd after the p-turn t-4 to t-l and t+l to t f4 are represented by -4 to -1 and 4 to 7, rwpect’ively. The p-turn potential is measured on the y-axis, where (P,)=l.O represents the average /?-turn potential at all 12 positions. Residues that occur more frequent Ip than the average at any position will ha,ve their P, values great’er t’han unity or above the broken horizontal lines, The p-turn region is enclosed by the broken vertical lines (i.e. positions i to i+3), to the left of which are the four positions before the ,&turn and to the right of which are the four positions after the p-turn. For legibility, error bars are not included in tho plots of Figure 3, they are given in Tables 4 and 5. Howwer, in the discussion below, t,he standard errors are included to indicate t’he confidence limit of the P, values at the various environmental bend positions. The eight hydrophobic residues (Ala, Ile, Lcu, Met, Yhe, Trp, Tyr and Val) are grouped t,ogether in Figure 3 for comparison. With the exception of Ala, all the hydropholJic residues exhibit a V-shaped trough in the p-turn region, with P, values rising ZltJOW unity at) positions beyond the bend region. All bhe hydrophobic residues have I’, 1 at all 12 positions, except at the second bend position (P;-O-77*0.28), suggwting t,hat disulfide linkages occur frequently at chain reversal regions. While there is remarkable similarity in the cn~i~~onnl~~ntal l)end position profiles of Asp and Asn, this likeness is absent in Gluand

Trp Val His Tyr Cys Am Met Ile Phe Thr Leu Gly Pro Asp Ser Gln Lys Ala Glu Arg

1.76 1.47 1.31 1.26 1.23 1.21 1.17 1.14 1.12 1.06 0.97 0.93 0.90 0.87 0.86 0.85 0,82 0.73 0.72 0.67

pt-4

I * + 5 & & f * -& & + & * & * f & + & -&

0.44 0.19 0.31 O-25 0.35 0.22 0.39 0.22 0.26 0.19 0.16 0.15 o-22 0.17 0.15 0.23 0.15 0.13 0.17 0.22 TY~ Glu Asn Asp Pro

Cys Trp Ser Val Gly His Thr Ilk? Gln Leu Phe Lys Ala Arg Met

Bend positional

1.56 1.34 1.28 1.26 1.26 I.05 1.06 1.00 0.97 0.96 0.92 0.90 0.89 0.88 0.87 0.86 0.85 0.83 0.68 0.59

pt-3 f * * + h & & * f & f & + + & * * h i *

potentials

5

0.38 0.40 0.18 0.18 0.17 0.28 0.19 0.20 0.24 0.16 0.23 0.16 0.14 0.24 0.34 0.21 0.19 0.19 0.16 0.18 TY~ Gly Asp Thr cys Gln Ile Ser Phe Glu Leu Arg Val Trp Lys Asn Met Pro Ala

His

1.37 1.36 1.34 1.24 1.12 1.11 1.10 1.09 1.08 l-04 1.02 1.01 0.88 0.84 0.80 0.75 0.73 0.72 0.71 0.67

pt.-, i rf f k & * f f & i * * & * & f + & & +

adjacent to the ,&turn

TABLE

0.31 0.26 0.17 0.20 0.19 0.33 0.26 0.21 0.17 0.24 0.20 0.16 0.24 0.15 0.31 0.14 0.18 0.31 0.20 0.12

Arg Trp His Tyr cys He Gly Phe Met Gln Am Leu Asp Thr Val Pro Lys Ser ,41a Glu

in 29 proteins

1.90 1.73 1.61 1.68 1.56 1.18 1.11 1.10 1.01 0.97 0.96 0.92 0.90 0.89 0.87 0.83 0.81 0.79 0.72 0.62

pt-I * + 5 & i f & h f j= f h & & * f + + h +

0.34 0.44 0.33 0.28 0.38 0.22 0.16 0.25 0.36 0.24 0.20 0.16 0.18 0.17 0.15 0.21 0.15 0.14 0.13 0.16

P‘-2, f’t-,

the frequrncies

pi-4. p,-,,

malizing

1.54 1.47 1.36 1.29 1.19 1.15 1.11 1.08 1.04 1.02 1.02 l-01 O-98 0.96 0.90 0.89 0.87 0.87 0.70 0.58

* 2 & f j, & f + f & & & + A f i & f & k

0.28 0.41 0.23 0.27 0.24 0.19 0.33 0.17 0.24 0.26 0.16 O-36 0.18 0.16 0.15 0.26 0.19 0.14 0.12 0.16

given

in Table

p,+,

3. The number

and p,+,, pt+2, I’,.,,

Pro Trp Ile Gin Tyr Thr cys Lys Phe Arg Val Met Asp Ser Leu His Am Gly Ala Glu

P t+1

1.95 1.44 1.26 1.20 1.11 1.11 1.10 0.99 0.97 0.97 0.96 0.95 0.96 0.93 0.90 0.82 0.72 0.67 0.63 0.49

c & i + k & & + f + f f + & * & & & * 5

0.31 0.37 0.20 0.20 0.16 0.21 0.21 0.16 0.24 0.27 0.22 0.16 0.25 0.16 0.19 0.13 0.31 0.29 0.13 0.17

cys Trp Asp Pro Val Glu Ala Leu Phe Asn Tyr Lys Thr Met Ser Arg Gly His Gln Ile

1.66 1.33 1.24 1.18 1.16 1.15 1.13 1.13 1.10 1.10 0.96 0.93 0.93 0.87 0.85 O-81 0.74 0.73 0.71 0.63

Pt+3 * & & & & i + & 3 & 5 +h & jy & 5 + * &

0.39 0.39 0.20 0.26 0.17 0.21 0.16 0.17 0.25 0.21 0.22 0.16 0.18 0.34 0.15 0.23 0.13 0.23 0.21 0.16

Trp Ile cys Met Ala Glu Gln Thr Ser Leu Asp Tyr His Val Pro Phe Gly Asn Lys Arg

1.73 1.27 1.22 l-15 1.15 1.11 I.10 1.08 I.07 1.07 1.05 1.02 0.97 0.96 0.94 0.92 0.82 0.78 0.66 0.69

P t+* I * * & f & i f f f & & i f & f & * $i

are, respcctivul~, the bent1 potentials at the 4 positions before and after the + sign denotes o, the standard wror as given in eqn (1).

Pro cys Thr Asp Gly Glu Am Val Gln His Tyr Leu Arg Ser Ile Ala Met Trp Lys Phe

P t+2

after

0.44 0.22 0.34 O-38 0.16 0.21 0.25 0.19 0.16 0.17 0.19 0.23 0.27 0.16 0.23 0.23 0.14 0.18 0.14 0.20 the p-turn

obtained

by nor-



10 11 29 36 47

-



2 7 14 19 25

-

= 1.0 (horizontal broken lines) represents the average p-turn potential at all 12 positions. Note that the P, scale for Pro is expanded as compared to the ot.her amino acids due to the high P,= 3.42 value for Pro at tho i + 1 bend position. Error bars may be obt,ainod from the standard errors, :t a, shown in Tables 4 and 5.

166

I’. Y.

UHOU

AXI)

G. 1). FASMAN

His, which has P, > 1 before and P, < 1 after the p-turn. However, an examination of the standard errors in Table 5 indicates that the (PJHis values should be t’rea,ted with more caution than the (Pt)G,u values, due to the smaller sampling of His (TL=~ 129) as compared to Gly (n=420) in the 29 proteins. On the other hand, an opposite t,rend is noted for Pro, with P, < 1 before and P, > 1 after the bend. Aside from these three cases, no dramatic p-turn directional effect was observed for the ot’her 17 amino acids. Gly and Ser are the only residues having P, > 1 at, all four bend positions i to i+3. Thr may be considered as bend indifferent, except at the second posit,ion where it is a p-turn former with P, = 1.22&0.20. The frequency of occurrence in the p-turn environment according to side-chain groupings is further examined and compared in Figure 4. It will be noted that the charged polar residues (Asp, Glu, Arg, His, Lys) have a higher preference for the third bend position, while the uncharged polar residues (Asn, Gln, Ser, Thr) are found with equal frequency at the first, second and third bend positions. In the regions adjacent to the p-turns, polar residues are found with average frequency, as indicated by P,zl (Fig. 4(a) and (b)). The hydrophobic residues (Ile, Leu, Met, Val) are the strongest B-turn breakers at all four positions, and the deep trough in the P, value at the i+S position (Fig. 4(c)) appears as an inverse curve to that for the charged polar residues (Fig. 4(a)). Tl le aromatic residues (Phe, Trp, Tyr) and Cys

2.0 ,

(Cl

Id)

I

-MO (h)

FIG. 4. The p-turn potential P, at the 12 positions adjacent to and including the bend region for amino acids of similar side-chain groupings. (a) Charged polar (Arg, His, Lys, Asp, Glu); (b) uncharged polar (Am, Gln, Ser, Thr); (c) hydrophobic (Ile, Leu, Met, Val); (d) Cys and aromatic (Phe, Trp, Tyr); (e) bulky side-chains (Trp, Tyr, Arg, Phe, His); (f) small side-chains (Gly, Ala, Ser, Pro); (g) helical farmers (Glu, Met, Ala, Leu, Lys, Phe, Gln, Trp, Ila, Val); (h) p-sheet former8 (Val, 110, Tyr, Phe, Trp, Leu, Cys, Thr, Gln, Met). As in Fig. 3, the positions t-4 to t+4 are represented as -4 to 7 on the z-axis and the 0 to 2 scale on the y-axis represents the P, value,

/?-TURNS

IN

PROTEINS

ltii

behave similarly in that they all have a P, minimum at the i-+1 position (Fig. 3). They also prefer the t - 1 and i+3 bend positions, which are represented by maximum peaks in Figure 4(d). As a group, their P, values fluctuate most widely, almost in a zig-zag fashion when compared with the other groupings in Figure 4. The bulky (Trp, Tyr, Arg, Phe, His) and small (Gly, Ala, Ser, Pro) side-chains grouped according to their molecular weight’s appear to have curves (Fig. 4(e) and (f)) inverse to each other. The bulk-v groups have a maximum P, at f- 1 and a minimum at ifl, while the small residues show the opposite effect at these positions. The frequencies of helical formers (Glu, Met, Ala, Leu, Lys, Phe, Gln, Trp, Ile, Val) and p-sheet formers (Val, lle, Tyr, Phe, Trp, Leu, Cys, Thr, Gln, Met) in the bend environment are compared in Figure 4(g) and (h). As expected, the a and ,%formers are both bend breakers, as seen from their low P, values in the bend region i t#o i+3. However. unlike a-formers, /?-sheet formers have P, > 1 at, all eight positions outside the bend, except’ at t+2 (where P,=O*96), indicating that anti-parallel p-sheet formation occurs more frequently than helix-helix interactions in chain-reversa,l regions. (d) Conservation

of ,8-iurm

An examination of the proteins with homologous sequences in Table 2 revealed that, many of t,he chain reversal regions were conserved, One such comparative analysis is given in Table 7, showing that 21 of the 27 p-turns in elastase were found at sites identical to those in a-chymotoypsin. It can be seen in Table 7 (column 2) that of the 21 conserved p-turns, 14 have the same bend types. It is surprising that of all the conserved p-turns listed in Table 7, only two (bends 55-58 and 194-197) have the same amino acid residues at all four positions. On the other hand, three of the bends (72-75, 91-94 and 99-100) were found to have none of the residues conserved at all four positions. For the 21 p-turns common t’o elastase and chymotrypsin, the numbers of residues conserved at the four bend positions i, i + 1, %+ 2 and i + 3 are 12, 9, 7 and 9, respectively. The conservation of only 37 of the 84 p-turn residues indicates that sequence homology may not plav the dominant role in determining structural similarity in proteins. It is interesting to note that of the five bends in Table 7 involving Arg substitution in elast’ase, four of them also had changes in bend type : Arg +Pro

(P,,=

1*31+3*42)

Arg +Asn

(P,, =0*73 +1.78)

Arg-tSer

(P,=O.73

Arg +Ser

( Pt3= I .39 hl.32)

jl.38)

in the second position in the first position in the first position in the third

position

of 23-26, type I -+I1 ;

of 48-51, type I +I11 ; of 125-128, type II-tIII; of 221 A-224,

type II -+I.

Although the bulkiness of the Arg side-chain may be the contributing cause of these bend type changes, the conformational bend potent,ial at specific positions may also be important. Thus, in the four examples given above, the first three cases showed quite large changes in the Pt values at the position of Arg replacement. While the P,, value did not change much in the last case, at, residue 223, there is a bend potential change at the first position (see Table 7, residue 221A) Val+Ser (P,=O64+1.38), which may cause deviations in the torsion angles in the bend region. The only case where an Arg replacement did not result in bend type change in Table 7 is at p-turn 217-219. Here the Arg-tSer substit’ution caused little bend potential deviation at the second position (P,Z=1.31 +1.47), and the other replacements such as Leu+Thr at

8.turnsa

Rend types b Elas- Chymotaso trypnin

23-26 27-30 48-51 55-58 56-59 72-75 91-94 95-98 99%1008 115-118 125-128 131-134 172-175 177-180 191-194 194-197 203-206 217-2198 221Ap224O 230-233 231-234

I I I III III I I I I I II II II’ I II II I’ I II III I

II III III III III III I I II I III II II’ I II II I’ I I III I

i

Gln/Val TV drg/Asn Ala Ala Am/Asp His/Am Asn Val/Ilo Asn/Ser Arg/Ser Ala Trp Ls’” CYS -4sp Val/Lys SW Val/Ser A% Val

itesiduc at p-turn position” +1 if 2

Arg/ Pro Pro Gln/Glu .-Zla His Leu/Gh~ Proper Thr/Ser Ala/Asn Rer/Gln Ala ASIl/AlFi Gly Asn/Asll Gin/Met ST As11 Arg/Ser Thr Val Srr/Thr

.\sn/Gly Ser/Trp As11 His cys dsn/Gly Tyr/Lys Asp/Lell Ala/Am Tyr/Thr Gly/Ser Asn/Cly Hcr/Thl SW/Ala Gl) Gly Gly Leu/Thr Arg/Ser Ser/Thr Ala

i +3

SIX

Gln Trp (‘ys Val/Gly Gln/Sw Trp/Tyr Asp/Thr Gly/Asp Val Thr/Asp S(,r/Thl Thr/Lys iat Asp Gly Gin/Ala Gly/Cys Lya/Thr Ala Tyr/h1

Identical residues”

1 3 2 4 3 0 0 1 0 1 1 1 2 2 3 4 2 1 1 3 2

* Of the 27 p-turns in elastase, 21 were found in the same position in a-chymotrypsin. b Of the 21 conserved ,&t,urns, 14 of them have identical bend types in elast,ase and a-chymotrypsin. c The first residue entry in each bend position is for rlast,asc, the second residue entry is for a-chymotrypsin in positions where the amino acid is not conserved. No second entry is made in positions where residues are conserved (e.g. residues 55 to 58 and 194 to 197). d The number of ,&turn residues conserved is given in this column. e There are insertions at residues YQA, 99I3, 217A, 221A in the elastase sequence (Shotton & Hartley, 1973).

the third position (P,,=O-40 +0.80) and Gly +Cys at the fourth position (P,,= 1.78 + 1.43) did not change the overall bend potential dramatically, thus preserving the bend type I. An example of a single residue replacement resulting in a bend type change (I +III) is seen at p-turn 27-30, where Ser +Trp ( Pt3= 1.32 +0*80) at the third position. It is easy to postulate that this change was caused by a polar --f hydrophobic and small -f bulky side-chain residue replacement as well as a drop in the bend potential at position i+2. However, added contributory causes may be due to the steric constraints of Trp27 and Pro28 at the first and second bend positions. It is also worthy of note that p-turn 91-94 retained the same bend type despite changes at all four positions: His + Asn (P,1=153+1*38) at the first position, Pro +Ser (P,,=3*42 --f 1.47) at the second position, Tyr+Lys (P,,= 1*24+0.81) at the third position, and Trp+Tyr (P,,=1*99 +1.52) at the fourth position. Although the bend potential appears to drop quite sharply at the second position, the relative change is minor, since Ser has the second highest P,, value (see Table 4). Hence, the tetrapeptide (Asn-Ser-Lys-Thr) in u-chymotrypsin 91-94 possess an ideal type I bend with

fl-‘WRNS

IN

PROTEINS

-33”), (r#, #)3.=(-1020, 10”) as does (His-Pro-Tyr-Trp) 91-94 with (4. #),=(-62’, -18”), (4, $),=(-93‘, -I‘-). (4, $),=(-64”,

169

in elastasr

(e) Qly residues in ,&turna For type 11 bends only, Gly should occupy the t’hird position if the 00,--N,,, hydrogen bond were to remain intact (Venkatachalam, 1968; Crawford et al., 1973). Indeed, this \vas observed for cytochromc c, where the three type II B-turns, residues 21 t)o 24, 35 to 38 and 75 to 78, had Gly at the third posit’ion (Dickerson et al., 1971; see also Table 2). As can be seen from Figure 1, the carbonyl oxygen of the second bend residue is in close proximity to the side-chain (R-group) of the third bend residue. Hence, residues other than Gly at the third position will be energetically unfavorable for t,ype IT bends due to steric hindrance. An examination of Table 2 revealed surprisingly bhat 29 of the 64 type 11 bends did not possess a Gly residue at the third position. Thirteen of these were found to benon-ideal (see footnoted, Table l), including four hydrogen-bonded bends : lactate dehydrogenasc, residues 156 to 159 (Pro-MetHis-Arg) and 163 to 166 (Ser-Gly-Cys-Asn); myogen, residues 20 to 23 (Ala-Ala-AspScr) : and ril)onuclease S. residues 87 to 90 (Thr-Gly-Ser-Ser). Of the 16 ideal type IL bends wit’hout G1,v at t’hc third position, 14 were hydrogen bonded and two were not: sc-ohymotrypsin, residues 16 to 19 (Ile-Val-Asn-Gly). -4.4 A; and flavodoxin, residues 74 to 77 (Ser-Thr-Lys-He), Oo,---No,== 0,1,--WC*,4.1 a (Table 8). It is interesting that these t’wo non-hydrogen-bonded type II bends possess bulky side-chains. Of thr: hydrogen-bonded brands, three have small side-chains at all four positsions (elastase, rtbsidues 131 to 134 (Ala-Asn-Bsn-Ser); glycera hemosubtilisin BPN’, residues 97 to 100 globin. residues 18 to 21 (Ala-Gty-Bsn-Asp); (Gly-Asp-Ala-Gly)) and five at three bend positions (carboxypeptidase A, residues 169 to 172; a-chymotrypsin, residues 99 to 102; concanavalin A, residues 67 to 70; elastase, residues 145 to 149; and lacta’te dehydrogenaae, residues 196 bo 199). Nevertheless, there are six hydrogen-bonded type 11 bends possessing at least two bulky side-chains. In the case of cytochrome c2, residues 56 to 59 (Lys-Ala-Lys-Gly), the small Ala and Gly residues alternate with the large lysine groups; while in lysoZJxl~‘, residues 17 to 20 (Leu-Asp-Asn-Tyr), the sma.11side-chains occupy the middle brnd posit’ions with the large hydrophobic groups at the ends. However, three of the bends have large side-chains at the two middle positions with the small groups on the outside: lysozyme, residues 19 to 22 (Asn-Trr-Arg-Gly); papain, residues 198 to 201 (Gly-Val-Cys-Gly); and staphylococcal nuclease, residues 52 to 55 (Glu-LysTyr-Gly). Finally, elastase, residues 2218 to 224 (JTal-Thr-Arg-Lys), has three large side-chains but still possesses hydrogen bonding. Although the above examples appear to contradict the stereochemical requirement of Gly at the third position in type II bends as a result of energy calculations (Venkatachalam, 1968), it may be that the small angular deviations (see Table 8) from the perfect, type II bend ((4, #)2=(-60”, 120”), (4, $)3=(SO”, 0”)) are such that contact between the side-chain of the third bend residue with the carbonyl oxygen of the second bend residue is avoided. .It would be worthwhile for the crystallographers to re-examine these bend regions for possible side-chain contacts or adjustments in the dihedral angles of the two middle bend residues. One recent theoretical analysis revealed that, while the bend conformation of the tetrapeptide Asp-LysThr-Gty is differtbnt from that of t,he native p-turn in a-chymotrypsin residues 35 t,o

t Bends as ideal.

which

do not

have

any

Carboxypeptidase A a-Chymotrypsin u-Chymotrypsin Conoanavalin A Cytochrome c2 Elastase Elastase Elastase Flavodoxin Glycera hemoglobin Lactate dehydrogenase Lysozyme Lysozyme Papain Staphylococcal nuclease Subtilisin BPN’

Protein

angle

differing

by

169-172 16-19 99-102 67-70 56-59 131-134 145-149 22IA-224 74-77 18-21 196-199 17-20 19-22 198-201 52-55 97-100 more

than

50” from

Tyr-Ala-Asn-Ser Ile-Val-Asn-Gly Ile-Asn-Asn-Asp Tyr-Pro-Asn-Ala Lys-Ala-Lys-Gly Ala-Asn-Am-Ser Arg-Thr-Asn-Gly Val-Thr-Arg-Lys Ser.Thr-Lys-Ile Ala-Gly-Asn-Asp Gly-Asp-Ser.Val Leu-Asp-Asn-Tyr Asn-Tyr-Arg-Gly Gly-Val-Cys-Gly Glu-Lys-Tyr-Gly Gly-Asp-Ala-Gly

Tetrapeptide

the

Ideal type II @-turns without

TABLE

(4, +)2=(

is-2

- 60’,

5.1 5.9 5.7 4.9 4.7 5-7 4.5 5.1 6.4 5.0 4.9 4.9 4.5 5.1 5.4 5.2

Cly at the

8

120’)

and

3.3 4.4 3.5 3.2 3.0 3.3 2.9 2.9 4.1 2.7 2.4 3.2 2.9 3.1 2.8 2.7

positiont

(4, #)3=(800,

-80 -104 -64 -66 -88 -72 -78 -65 -72 -21 -51 -64 -63 --57 -41 -38

96 65 67 83 108 76 61 58 87 84 106 60 52 51 77 83 0”) for b end

107 138 150 127 106 144 129 135 105 109 115 123 130 137 120 145

type

16 31 47 -33 --46 21 41 15 48 42 11 15 16 17 -6 -26 II

~LI’P co~lsidrrrtl

B-TURNS

IN

PROTEINS

171

38, the addition of an extra residue of the native sequence on both sides of the tetramer resulted in closer resemblance to the native structure (Hurwitz & Hopfinger, 1976). Similar theoretical calculations could be performed for the bend tetrapeptides in Table 8 to see whether the hydrogen-bonded type II bends without Gly at the third position are indeed energetically stable. Of the 13 &turns having type I’ bends ((4, $)2=(600, 30”), (4, #)3=(900, O’)), nine were found to have a Gly residue at the third position. It will be noted that the dihedral angles for the third bend position in type I’ and type 11 bends are almost identical, which may account for the predominance of Gly at this position. However, contrary to the conclusions of Venkatachalam (1968) that the type I’ bend can accommodate only a Gly-Gly sequence, not a single case was observed in the 13 type I’ bends. Although eight of the bends possess a small side-chain at’ the second position, there are exceptions where bulky residues occur in hydrogen-bonded t’ype I’ bends: lactate dehydrogenase, residues 220 to 223 (Tyr-Glu-Val-Ile); lysozyme, residues 20 to 23 (Tyr-Arg-Gly-Thr); and papain, residues 199 to 202 (Val-Cys-Gly-Leu). There are 20 type II’ bends, of which 14 have Gly at the second bend position, including five with the Gly-Ser sequence at the middle two positions. There are only two t,ype II’ bends with Gly at the third position: horse u-hemoglobin, residues 17 to 20 (Val-Gly-Gly-His), which links the A and B helices (Perutz et al., 1968), and lactate dehydrogenase, residues 278 to 281 (Phe-Tyr-Gly-Ile), which forms the antiparallel ,&sheets 267-279 and 281-295 (Holbrook et al., 1976). In addition, the third position of the 20 type II’ bends is occupied by three hydrophobic residues and 15 polar residues, of which eight are charged. Two of the type II’ bends contain three bulky hydrophobic side-chains: lactate dehydrogenase, residues 278 to 281 (Phe-Tyr-GlyHe); and carboxypeptidase A, residues 277 to 280 (Tyr-Gly-Phe-Leu). It is interesting to note that these two ,&turns occur at similar positions along the primary sequence of both fairly large-sized proteins. While the hydrogen bonding in the former case may be marginal (O,,,-N,,,=3*6 A), that for the lat’ter is quite stable (O,,,--No,= 2.6 A). This may be due to the positioning of Gly at the second position in carboxypeptidase A residues 277 to 280, rather than at the third position in lact’ate dehydrogenase residues 278 to 281, which seems to favor type II’ bends. Of the 13 type III’ bends observed in Table 2, the third position was found to contain five Gly, three Asn and t’wo Tyr residues, while the second position contained three Gly. three Lys, two Asn and two Asp residues. The tendency of Asn and Tyr to adopt the left-handed helical conformation wax discussed previously (Chou $ Fasman, 1974), so it is not surprising to find these residues in the type III’ bends. It is interesting to note that there are three consecutive type III’ bends in thermolysin, residues 225-230 (Table 2), of which Matthews et al. (1974) have classified residues 226 t(o 229 as a left-handed u-helix. (f) Cis Pro in f9-turns While the present survey of 29 proteins located 58 p-turns with Pro at the second position, only 12 were found with Pro at the third position of the bend. Two of these bends were not classified (adenylate kinase, residues 157 to 160, and ferredoxin, residues 14 to 17), since the torsion angles were not available. Two others were of bend type I (papain, residues 84 to 87) and type III (carbonic anhydrase C, residues 80 to 83). The remaining eight bends belonged to type VI having a cis proline at the i,$-2 or t,hird position, and are summarized in Table 9. With the exception of

t Classified

as bend

type

VI

in Table

Carbonic anhydrase C Carbonic anhydrase C Midge larva hemoglobin Ribonuclease S Ribonuclease S Staphyloccocal nuclease Subtilisin BPN’ Thermolysin

Protein

1.

27-30 198-201 72-75 91-94 112-115 115-118 166-169 49-62

,&TLlrIl

at position

Gln-Ser-Pro-Val Thr-Pro-Pro-Leu Glu-Leu-Pro-Asn Lys-Tyr-Pro-Asn Gly-Asn-Pro-Tyr Tyr-Lys-Pro-Asn Gly-Tyr-Pro-Gly Thr-Lou-Pro-Gly

Tetrapeptide

Cis prolines

TABLET i+2

7.3 5.7 6.1 5.1 4.2 5.6 5.6 6.1

of /3-turnsf

4.6 3.2 3.3 2.6 2.4 2.9 3.1 3.6

-49 -39 -160 -53 -96 -113

-170 ?

152 ? 136 132 105 133 145 156

-96 -89 m-64 -83 -86 -68

? ?

-11 11 162 14 13 -29

? ?

P-TURNS

IN

PROTEINS

173

carbonic anhydrase C, residues 27 to 30, and thermolysin, residues 49 t,o 52, having O,,,--S,,, distances above 3.5 8, the ot,her six bends may be hydrogen bonded. Huber & Steigemann (1974) have also located two cis Pro bends in the BenceJones prot’ein Rei at residues 6 to 9 (Gln-Ser-Pro-Ser) with (4, #),=(-145”, 139”), (4, #)3= (--SF’. 174”) and at residues 93 t’o 96 (Ser-Leu-Pro-Tyr) with (4, 4),=(-SS“, 150”), (4, #)3==(-730, 156“). The aCo,--aCo, distances in both cases are below 6.0 d but there is no hydrogen bonding between O,,, and N,,,. An examination of the dihedral angles of cis Pro bends indicat)es that they approximat’e those for poly(L-Pro) I and II n-it’ll (4. z/)=--83”, 158” a,nd (4, $)=-78”. 149’ (IUPAC-IUB Commission on Biochemical Nomenclature, 1970). However, half of t,he cis Pro bends (Table 9) a,lso have (4, 4) angles similar t,o t,gpe I bends at thf$ third position. Ramachandran & Mitra (1976) have performed energy calculat8ions showing that the tlarb.s-ci.s-tran,.s conformation for a tripeptide can occur only to an extent of O.lq/,, except in the case of the X-Pro sequence, where the order is of 300,1. Hence, additional type VI bends may be elucidat’ed when crystallographers reexamine the chain reversal regions involving Pro residues. It is hoped that the /l-turn computations and compilations presented herein, along with the classification of 11 bend types, will encourage even great)er detailed analysis of thrac: chain-reversal regions in globular proteins. \Vo tllank tile following X-ray crystallographers who generously supplied atomic coordinatca data prior to publication: Drs T. Blundell & D. Hodgkin, F. A. Cotton & E. E. Hazon, .Jr, J. Drenth, W. A. Hendrickson, R. Huber & W. Steigemann, L. H. Jensen, ,J. Krallt. K. H. K&singer, E. Lattman & W. A. Love, M. L. Ludwig, F. S. Mathews, B. \V. Matthews, M. F. Porutz, G. N. Reeke, F. M. Richards, M. G. Rossmann & S. *J. St~c~intlt~l, 1,. Sawy(ar & H. (:. Watson, B. Strandberg, and R. Swanson & R. E. Dickerson. WV also thank Drs T. F. Koetzle and 1%. J. Feldmann for sending us the latest atomic co-ordinate data deposited, respectively, at the Brookhaven Protein Data Bank and at t,hcb National Institutes of Health. \Vo appreciate the hcblp of Mr Sheldon Reis and Mr Tim Hickcy for assistance in computer programming. ‘I’llis research was generously supported in part, by grants from the United States Public, Healt,h Services (GM17533), National Science Foundation (PCM76-21856) and the Americatl (Iancer Society (NP-92E). This is publication no. 1160 from the Graduate Drpartlnent of Biochemistry, Brandeis University, Waltham, Mass. 02154, U.S.A. REFERENCES Aboderin, A. A. (1971). 1st. J. B&hem. 2, 537-544. Adams, M. J., Ford, G. C., Liljas, A. & Rossmann, M. G. (1973). Biochem. Biophys. Res. Commun. 53, 46-51. Adman, E. T., Sieker, L. C. & Jensen, L. H. (1973). J. Biol. Chem. 248, 3987-3996. Alden. R. A., Birktoft, J. J., Kraut, J., Robertus, J. D. & Wright, C. S. (1971). Biochewa. Biophya. Res. Commun. 45, 337~-344. Birktoft, J. J. CL:Blow, D. M. (1972). J. Mol. Hiol. 68, 187.-240. Blundcll, T., Dodson, G., Hodgkin, D. 8.x Mercola, D. (1972). Aduarc.. Protein Chem. 26, 27!& 402. Burnett, R. RI., Darling, G. D., Kendall, D. S., LeQuesne, M. E., Mayhew, 8. G., Smith, IV. W. & Ludwig, M. L. (1974). J. Biol. Chem. 249, 4383.-4392. Carter, C. W., Jr, Kraut, J., Freer. 8. T., Xuong, N.-H., Alden, R. A. & Bart,sch, R. G. (1974). .J. Biol. Chem. 249, 4212-4225. Chou. I’. ‘I’. & Fasman, G. D. (1974). Biochemistry, 13, 222~.245. Chou. P. Y. & Fasman, G. D. (1977). Advan. Enzywbol. 45, in the press. Chou. P. P., Adler, A. J. & Fasman, G. D. (1975). J. 11102. Biol. 96, 29-45.

174

I’. Y.

(‘HO11

AN11

a.

1). I’ASMAN

Cotton, F. A., Bier, C. J., Day, V. W., Hazen, E. E., Jr bz Larsen, S. (1972). C&t ,$‘p&!l Harbor Syrnp. Quark Biol. 36, X&255. Crawford, J. L., Lipscomb, IV’. N. & Schcllman, C. C:. (1973). I’roc. X’at. &ad. Nci.. r’.,S’.A. 70, 538-542. Deisenhofer, J. & Steigemann, W. (1975). Acta Crystallogr. sect. Ii’, 31, 238-250. Diamond, R. (1974). J. Mol. Biol. 82, 371-391. Dickerson, R. E., Takano, ‘I?., Eisenberg, D., Kallai, 0. B., Samson, I,., Cooper, A. & Margoliash, E. (1971). J. l3ioZ. Chem. 246, 1511-1535. Drenth, J., Jansonius, J. N., Koekoek, R. & Wolthers, B. G. (1971). Advan. Protein Chem. 25, 79-116. Fasman, G. D., Chou, I’. Y. & Adler, A. J. (1976). Biophys. J. 16, 1201-1238. Hendricks, W. A. (1956). The Mathematical Theory of Sampling, pp. 66, 106-109, Scarecrow Press, New Brunswick. Hendrickson, W. A., Love, W. E. & Karle, J. (1973). J. iMoZ. Biol. 74, 331-361. Herriott, J. R., Watenpaugh, K. D., Sieker, L. C. & Jensen, L. H. (1973). J. Mol. Biol. 86, 423-432. Holbrook, J. J., Liljas, A., Steindel, S. J. & Rossmann, M. G. (1976). Enzymes, 11, 191292. Huber, R. & Steigemann, W. (1974). PEBS Letters, 48, 235-237. Huber, R., Epp, O., Steigsmann, W. & Formanek, H. (1971). Eur. J. Biochem. 19, 42-50. Hurwitz, F. I. & HopfInger, A. J. (1976). Int. J. Peptide Protein Res. 8, 543-550. Imoto, T., Johnson, L. N., North, A. C. T., Phillips, D. C. Cy:Rupley, J. A. (1972). Enzymes, 7, 666-868. IUPAC-IUB Commission on Biochemical Nomenclature (1970). Biochemistry, 9, 347 l-3479. Kretsinger, R. H. & Nockolds, C. E. (1973). J. Biol. Chem. 248, 3313-3326. Kuntz, I. D. (1972). J. Amer. Chem. Sot. 94, 4009-4012. Lewis, P. N., Momany, F. A. & Scheraga, H. A. (1971). Proc. Nat. Acad. Sci., U.S.A. 68, 2293-2297. Lewis, P. N., Momany, F. A. & Scheraga, H. A. ( 1973). Biochim. Biophys. A&a, 303, 211-229. Liljas, A., Kannan, K. K., Bergsten, P.-C., Waara, I., Fridborg, K., Strandberg, B., Carlbom, U., Jarup, L., Lovgren, S. & Petef, M. (1972). Nature New Biol. 235, 131137. Lim, V. I. (1974). J. Mol. BioZ. 88, 873-894. Mathews, F. S., Argos, P. & Levine, M. (1972). Cold Spring Harbor Symp. Qua&. Biol. 36, 387-395. Matthews, B. W., Weaver, L. H. & Kester, W. R. (1974). J. Biol. Chem. 249, 8030-8044. Nozaki, Y. & Tanford, C. (1971). J. Biol. Chem. 246, 2211-2217. Padlan, E. A. & Love, W. A. (1974). J. Biol. Chem. 249, 4067-4078. Perutz, M. F., Muirhead, H., Cox, J. M. & Goaman, L. C. G. (1968). Nature (London), 219, 131-139. Quiocho, F. A. & Lipscomb, W. N. (1971). Advan. Protein Chem. 25, l-78. Ramachandran, G. N. & Mitra, A. K. (1976). J. Mol. Biol. 107, 85-92. Reeke, G. N., Jr, Becker, J. W. & Edelman, G. M. (1975). J. Biol. Chem. 250, 1525-1547. Richards, F. M. & Wyckoff, H. W. (1973). Atlas of Molecular Structures in BiologyRibonuclease S, Clarendon Press, Oxford. Salemme, F. R., Freer, S. T., Alden, R. A. & Kraut, J. (1973). Biochem. Biophys. Res. Commun. 54, 47-52. Sawyer, L., Shotton, D. M. & Watson, H. C. (1973). Biochem. Biophys. Res. Commun. 53, 944-951. Schiffer, M. & Edmundson, A. B. (1967). Biophys. J. 7, 121-135. Schulz, G. E., Elzinga, M., Marx, F. & Schirmer, R. H. (1974). Nature (London), 256, 120-123. Shotton, D. M. & Hartley, B. S. (1973). Biochem. J. 131, 643-675. Spiegel, M. R. (1961). Theory and Problems ofStatistics, pp. 144, 157-158, Schaum Publishing Co., New York.

,&?-TURNS

IN

PROTEINS

176

Takano, T., Kallai, O., Swanson, K. & Dickerson, R. E. (1973). J. Sol. Chem. 248, 52345255. Venkatachalam, C. M. (1968). Biopolymers, 6, 1425-I 436. Watenpaugh, K. D., Sieker, L. C., Herriott, J. R. & Jcnseu, I,. H. (1973). Acta Crystnllogr. SKI. B, 29, 943-956. \Vatson, H. C. (I 969). Progr. b’tereochem. 4, 299. 333.

Beta-turns in proteins.

J. Mol. Biol. (1977) 115, 135-175 B-Turns in Proteins PETER Y. CHOU AND GERALD II. FASMAN Department of Biochemistry, Brandeis Waltham., Mass. 02...
2MB Sizes 0 Downloads 0 Views