J. Mol. Biol. (1990) 214, 261-279

Crystal Structure of Thermitase at 1.4 Å Resolution

Alexei V. Teplyakov†, Inna P. Kuranova, Emil H. Harutyunyan, Boris K. Vainshtein

Institute of Crystallography, Academy of Sciences of the U.S.S.R., Leninsky pr. 59, Moscow 117333, U.S.S.R.

Cornelius Frömmel, Wolfgang E. Höhne

Institute for Biochemistry, Humboldt University, Hessische Str. 3-4, Berlin 1040, G.D.R.

and Keith S. Wilson

European Molecular Biology Laboratory, Hamburg Outstation, Notkestr. 85, 2000 Hamburg 52, F.R.G.

(Received 7 April 1989; accepted 7 March 1990)

The crystal structure of thermitase, a subtilisin-type serine proteinase from Thermoactinomyces vulgaris, was determined by X-ray diffraction at 1.4 Å resolution. The structure was solved by a combination of molecular and isomorphous replacement. The starting model was that of subtilisin BPN' from the Protein Data Bank, determined at 2.5 Å resolution. The high-resolution refinement was based on data collected using synchrotron radiation with a Fuji image plate as detector. The model of thermitase refined to a conventional R factor of 14.9% and contains 1997 protein atoms, 182 water molecules and two Ca ions. The tertiary structure of thermitase is similar to that of the other subtilisins, although there are some significant differences in detail. Comparison with subtilisin BPN' revealed two major structural differences. The N-terminal region in thermitase, which is absent in subtilisin BPN', forms a number of contacts with the tight Ca2+ binding site and indeed provides the very tight binding of the Ca ion. In thermitase the loop of residues 60 to 65 forms an additional (10th) β-strand of the central β-sheet and the second Ca2+ binding site, which has no equivalent in the subtilisin BPN' structure. The observed differences in the Ca2+ binding and the increased number of ionic and aromatic interactions in thermitase are likely sources of the enhanced stability of thermitase.

1. Introduction

Thermitase (TRMase‡; EC 3.4.21.14) is an extracellular bacterial serine proteinase isolated from Thermoactinomyces vulgaris (Frömmel et al., 1978). The enzyme molecule consists of a single polypeptide chain of 279 amino acid residues (Meloun et al., 1985). TRMase is homologous in its sequence to subtilisins: 42% of residues are identical to subtilisin BPN' (SBT) from Bacillus amyloliquefaciens (Markland & Smith, 1967) and 44% of residues to subtilisin Carlsberg (SBC) from Bacillus licheniformis (Smith et al., 1968). TRMase belongs to the cysteine-containing subgroup of subtilisins, which includes proteinase K from Tritirachium album Limber (Jany & Mayer, 1985), thermomycolin from Malbranchea pulchella (Gaucher & Stevenson, 1976) and the alkaline proteinases from Bacillus thuringiensis (Stepanov et al., 1981) and Bacillus cereus (Chestukhina et al., 1982). All subtilisins make up a well-characterized family of serine proteinases that have been extensively studied both to provide insight into the mechanism and specificity of enzyme catalysis and because of their considerable industrial importance as a protein-degrading component of washing powders. TRMase is more stable against heat denaturation and proteolytic degradation than SBT and SBC, and thus the comparison of their three-dimensional structures could shed light on the molecular basis of stability of these proteins.

To date several X-ray diffraction studies of the subtilisins have been carried out. The first three-dimensional structure of a subtilisin to be reported was that of SBT (Wright et al., 1969; Drenth et al., 1972). The structures of SBT (McPhalen et al., 1985a; Bott et al., 1988), SBC (McPhalen et al., 1985b; Bode et al., 1987) and proteinase K (Betzel et al., 1988a,b,c) have now been refined at high resolution. Extensive protein engineering studies are being carried out on SBT and the crystal structures of site-directed mutants have been reported (Pantoliano et al., 1987, 1988). Recently, the structure of TRMase complexed with eglin-c has been determined in two crystal forms and refined by molecular dynamics techniques to 2.2 Å (Gros et al., 1989a) and 1.98 Å (Dauter et al., 1988; Gros et al., 1989b) resolution (1 Å = 0.1 nm). The solution of the native TRMase structure at 2.5 Å resolution has been published (Teplyakov et al., 1986, 1989). In this paper we describe the determination of the crystal structure of native TRMase by X-ray diffraction methods at 1.4 Å resolution and the comparison of the highly refined structures of TRMase and SBT. For the present comparison the structure of SBT (2SNI; McPhalen & James, 1988) from the Protein Data Bank (Bernstein et al., 1977) was used. The SBT structure was refined to an R factor of 15.4% at 2.1 Å.

† Present address: Laboratory of Chemical Physics, University of Groningen, Nijenborgh 16, 9747 AG Groningen, The Netherlands.

‡ Abbreviations used: TRMase, thermitase; SBT, subtilisin BPN'; SBC, subtilisin Carlsberg; r.m.s., root-mean-square.

0022-2836/90/130261-19 $03.00/0 © 1990 Academic Press Limited

2. Structure Solution and Refinement

(a) Crystallization

Thermitase was isolated and purified as described by Frömmel et al. (1978). Crystals of native TRMase were grown by the vapour diffusion method from 0.5% (w/v) protein solution in 0.1 M-sodium acetate buffer with addition of 2 to 4% (v/v) 2-methyl-2,4-pentanediol and 20 to 25% (w/v) ammonium sulphate. The crystallization buffer was adjusted to pH 5.6 to avoid autolysis, which occurs at alkaline pH. In 3 to 4 months thin plate crystals achieved a maximum size of 0.8 mm × 0.4 mm × 0.1 mm. Crystals belong to the orthorhombic space group P2₁2₁2₁, with unit cell parameters: a = 72.95 Å, b = 64.05 Å, c = 47.55 Å. The asymmetric unit contains 1 protein molecule with Mr = 28,380. The volume/dalton ratio is 1.95 Å³/dalton and lies near the lowest value reported by Matthews (1968) for protein crystals. This corresponds to a solvent content of 38%.

(b) Data collection

Information on the native and isomorphous derivative data sets used in the TRMase structure solution and refinement is summarized in Table 1. The NATI-D data set used in the first stages of the X-ray analysis was collected on a KARD diffractometer with a multiwire area detector (Andrianova et al., 1982) at the Institute of Crystallography, Moscow. Complete data to 1.65 Å resolution were obtained from 2 crystals with different crystallographic axes, namely a and b, oriented parallel to the rotation axis to avoid the "blind" region. Absorption effects were corrected by the method described by North et al. (1968). The radiation damage during data collection was monitored after every 30° of rotation by comparing the intensities of the measured reflections with those of the equivalent reflections in the first 90° of reciprocal space. When processing the data, correction coefficients K and B were calculated for each of these 30° zones by scaling to the equivalent zones in the first 90°. The intensities of the reflections were then corrected according to the formula:

I = I₀ × K × exp(B × sin²θ/λ²).

The largest values of K and B, obtained for the last zone (330° to 360°) of the first crystal, were 1.15 and 0.10 Å², respectively.

The data set for the Hg derivative (TRM-HG in Table 1) was used in the structure solution as described in section (c), below. It was collected on the same diffractometer using a very small crystal with dimensions of approximately 0.3 mm × 0.3 mm × 0.03 mm. Nevertheless the quality of the data was satisfactory up to 2.5 Å. The mean fractional isomorphous difference (Σ||F_PH| - |F_P||/Σ|F_P|) between the native and derivative data was 18.6% in the range of 18.0 to 2.5 Å.

The high resolution data set for native TRMase was collected on the X31 synchrotron beam line at EMBL, Hamburg. The rotation camera was used with a Fuji image plate as detector. The advantages of the image plate are its large dynamic range (from 1 to 16,000 with the present scanner) and high sensitivity over a wide range of wavelengths. Using wavelengths of about 1 Å, when the absorption in protein crystals becomes negligible, it is possible to avoid many of the systematic errors related to the absorption correction. This was especially important for TRMase crystals because they were very thin plates.

Synchrotron data for native TRMase were obtained from one crystal at 4°C and contained 2 parts. The high-resolution data to 1.35 Å (NATI-H in Table 1) were collected in oscillation steps of 1 deg. in the range of 0 to 90° with a wavelength of 0.9 Å. There were many low-angle reflections in these data that were saturated on the image plate scanner. To record these low resolution data without overexposed reflections the same crystal was used to collect the data up to 2.5 Å at a wavelength of 1.49 Å (NATI-L). The 2 data sets were processed using the MOSCO program (Machin et al., 1983). No absorption or decay correction was applied. The 2 data sets were merged with an R-factor (Σ|I - ⟨I⟩|/ΣI) of 12.6% calculated for the 6965 common reflections. The high value of the internal merging R-factor for the NATI-H data set in the resolution shell of 1.40 to 1.35 Å, owing to a large number of weak reflections, caused us to use data to only 1.40 Å in the refinement. The total number of reflections with structure factor amplitude F > σ(F) is 43,794, that is 98.1% of all possible reflections up to 1.4 Å resolution.

The R-factor (Σ||F₁| - |F₂||/Σ||F₁| + |F₂||) between the NATI-H and NATI-D data sets is 8.5% in the resolution range 11.0 to 1.7 Å. These data sets were not merged and only data from the image plate were used in the 1.4 Å refinement.

Table 1
Data collection

Data set                              NATI-D                      TRM-HG                      NATI-H                  NATI-L
Resolution (Å)                        18.0-1.65                   18.0-2.5                    11.0-1.35               11.0-2.5
X-ray source                          Sealed tube: 40 kV × 30 mA  Sealed tube: 40 kV × 30 mA  SR: ZP66GeV, 40-80 mA   SR: ZP66GeV, 40-80 mA
Wavelength (Å)                        1.54                        1.54                        0.9                     1.49
Detector                              Multiwire chamber           Multiwire chamber           Fuji image plate        Fuji image plate
Collimation (mm)                      Diam. 0%                    Diam. 0.4                   0.3 × 0.3               0.3 × 0.3
Crystal to detector distance (mm)     400                         400                         145                     145
Total rotation (deg.)                 360                         180                         90                      90
Oscillation step (deg.)               0.25                        0.25                        1                       2.5
Number of crystals                    2                           1                           1                       1
Total exposure time (h)               64                          40                          21                      4
Total number of reflections           134,024                     16,974                      111,211                 22,867
Reflections with I > σ(I) (%)         68.8                        69.2                        93.2                    96.8
mzu~z                                 0.166                       0.110                       0.182                   0.201
Decay correction (max)                1.15, 1.08                  1.12                        -                       -
Absorption correction (max/min)       1.95, 1.55                  1.45                        -                       -
Number of unique reflections          25,805                      7435                        43,796                  7365
Rmerge‖                               0.115                       0.084                       0.108                   0.110

‖ Rmerge = Σ|I - ⟨I⟩|/ΣI.

(c) Structure solution

The structure of TRMase was solved by a combination of molecular and isomorphous replacement (Teplyakov et al., 1986). The search model was the unrefined SBT structure (1SBT; Alden et al., 1971) from the Protein Data Bank (Bernstein et al., 1977). The orientation of the molecule was determined with the fast rotation function described by Crowther (1972). Figure 1 shows the section of the rotation function that contains the main peak corresponding to the correct orientation of the model. This peak is about 1.4 times stronger than the second peak in the rotation function. The NATI-D data in the range 10.0 to 4.0 Å were used in the calculations. A radius of integration of 32 Å was chosen. This is somewhat less than the mean dimension expected for the molecule.

Figure 1. Section β = 56° of the rotation function of TRMase. The maximum corresponds to the correct orientation of the molecule. The search model was SBT, resolution range 10.0 to 4.0 Å, radius of integration 32 Å.

The second problem in the molecular replacement method is to find the translation vector that brings the oriented model to the correct place in the unit cell. Attempts to solve it directly using the translation function described by Crowther & Blow (1967) were unsuccessful, probably because of the high packing density of the TRMase crystals, but also perhaps because the search model was not a refined structure. We therefore decided to use isomorphous replacement to obtain an initial map in which we could place the model.

A number of heavy-atom reagents were tried and only a few of them showed specific binding to the protein. All derivatives were prepared by soaking native crystals in crystallization buffer containing 0.1 to 5.0 mM-heavy-atom reagent. Specific binding was identified from precession photographs. Several X-ray data sets were collected. However, difference Patterson maps showed a single binding site common for all the heavy atoms tried (Hg, Pb, Tb and Ag). The 3.9 Å resolution electron density map based on the phases calculated from the Hg derivative, with a figure of merit of 0.56, was very noisy and did not allow us to position the BPN' model in the unit cell correctly. Therefore, the translation problem was solved by a combination of molecular and isomorphous replacement.

The use of the mercury derivative was restricted to correctly positioning the model in the cell. We supposed that the Hg ion was covalently bound to the sulphydryl group of Cys75 in the active site of the TRMase molecule. This is the only cysteine residue in TRMase and Hg ions are known to inhibit TRMase (Frömmel et al., 1978). According to the sequence alignment the corresponding residue in the SBT structure was Val68. The search model was placed in the unit cell so that the side-chain centroid of Val68 was at the position of the Hg ion derived from the Patterson map. This led to 8 possible positions of the model according to the 4-fold ambiguity of the rotation function (180° rotation around each axis) in the space group P2₁2₁2₁ and the 2-fold ambiguity of the Patterson function resulting from the lack of enantiomorph definition. Four of these were incompatible with the packing of TRMase molecules in the crystal. The other 4 were treated as possible solutions. The correct variant was selected by crystallographic refinement of the models as rigid bodies using the program CORELS (Sussman et al., 1977). For the correct solution the conventional R-factor (hereinafter defined as Σ||Fo| - |Fc||/Σ|Fo|, where Fo and Fc are the observed and calculated structure amplitudes, respectively) dropped after 5 cycles from 0.545 to 0.502 at 6.5 Å resolution, while for the other solutions it remained greater than 0.54. To check the correctness of the solution the difference Fourier map for the Hg derivative with the model phases was calculated. It showed a single peak in the electron density corresponding to the Hg position derived from the difference Patterson map (Teplyakov et al., 1986).
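The conventional R-factor used throughout this section is simple to evaluate once the observed and calculated amplitudes are on a common scale. The following is a minimal Python sketch of that definition only, not the CORELS implementation; the function name `r_factor` and the amplitude values are hypothetical and are not thermitase data, and the scaling of Fc to Fo is assumed to have been done beforehand.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Hypothetical amplitudes for illustration only (assumed already scaled).
f_obs = [100.0, 50.0, 25.0, 25.0]
f_calc = [90.0, 55.0, 30.0, 20.0]
print(round(r_factor(f_obs, f_calc), 3))
```

In a rigid-body search such as the one described, the same function would be evaluated for each candidate placement and the placements compared by their R values.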

(d) TRMase model building and refinement

The most important stages of the refinement are presented in Table 2. The SBT model containing 1928 atoms was refined as a rigid body with data in gradually increasing resolution ranges. Twelve cycles of the refinement resulted in axial shifts of 0.58, 1.78 and 0.46 Å, a rotation angle of 1.58°, and a root-mean-square (r.m.s.) deviation from the starting co-ordinates of 1.96 Å. Then the model was modified in accordance with the amino acid sequence of TRMase (Fig. 2). This included the deletion of 8 residues from the SBT model. Side-chains were also corrected, but only if the side-chain was smaller in thermitase. This involved only the deletion of atoms. Extra atoms were not added to BPN' side-chains as we were carrying out automatic refinement without rebuilding on the graphics at this stage.

The resulting model containing 1783 atoms was refined using CORELS. Initially the model was divided into 267 rigid groups with each residue treated as a group. Three positional and 3 rotational parameters for each group were allowed to vary. When the resolution was extended to 3.0 Å it became possible to include the dihedral angles as variables in the refinement. When using low and medium resolution data the problem was to maintain a ratio of observations to variables greater than 1. The relatively small number of reflections is exacerbated by the small unit cell resulting from the very close crystal packing of TRMase molecules. The automatic part of the refinement converged after cycle 46 to an R-factor of 0.284 at 3.0 Å resolution.

At this stage "omit" maps were calculated with the coefficients (Fo - Fc) and (2Fo - Fc). Only reflections with F > 3σ were included in the calculation of the maps and the refinement. The fragments of the model under investigation were not included in the phase calculation for the omit maps. Partial improvements of the model based on these maps were followed by cycles of least-squares refinement, and the whole procedure was reiterated. At the end of this procedure many of the missing side-chains and all 3 internal insertions had been introduced into the model. The positions of the N and C termini and of some of the loops on the molecular surface remained unclear until the data included in the refinement were extended to 1.8 Å resolution. After extending the resolution used to 2.2 Å individual isotropic temperature factors were refined.

Starting from cycle 47 the least-squares restrained-parameter procedure of Hendrickson & Konnert (1981) was used and from cycle 87 its fast Fourier modification (Finzel, 1987). Cycles 1 to 186 were performed on a NORD-509 computer and cycles 187 to 316 on a VAXstation 3200. Cycles of refinement were followed by manual improvement of the model using an Evans & Sutherland PS330 interactive graphics system with the program FRODO (Jones, 1978).

Until cycle 240 the diffractometer data set NATI-D was used. At that point the structure had refined to an R-factor of 0.172 at 1.7 Å resolution. Two Ca ions and 174 water molecules were incorporated into the model. The new data obtained on the synchrotron beam were now introduced. These scaled well to the diffractometer data with no indication of significant systematic differences. With the new data the R-factor fell very quickly to a reasonable value, comparable to that which had been obtained for the diffractometer data. There were no significant changes in the protein structure detected during this transition. In contrast the water structure changed a little: some of the molecules disappeared and other new ones appeared. This ambiguity in the localization of some weakly bound water molecules is probably inevitable even at very high resolution (Fujinaga et al., 1985), although we cannot exclude completely the possibility that it reflects small changes in the solvent structure between the particular crystals used for the 2 data collections. During refinement water molecules with B-factors greater than 50 Å² were excluded from the model. The occupancies were set to 1.0 and were not refined.

Table 2
Summary of the refinement

Program                                  CORELS    CORELS    CORELS    CORELS    PROLSQ    PROLSQ    PROLSQ    PROLSQ
Cycle number                             1         12        18        46        88        186       240       316
Resolution (Å)                           10-6.5    6.0-4.0   6.0-4.0   5.0-3.0   5.0-2.5   5.0-2.2   5.0-1.7   5.0-1.4
R = Σ||Fo| - |Fc||/Σ|Fo|                 0.545     0.484     0.282     0.284     0.255     0.243     0.172     0.149
No. of reflections                       368       1227      1221      3579      6733      10,152    20,659    37,446
No. of variables                         6         6         1604      3124      5738      7817      8693      8725
No. of residues                          274       274       267       267       271       277       274       270
No. of protein atoms                     1928      1928      1783      1783      1912      1954      1997      1997
No. of solvent atoms                     0         0         0         0         0         0         176       184
r.m.s. deviation in bond distances (Å)   0.030     0.030     0.187†    0.102†    0.024     0.020     0.018     0.015

† For the C-N peptide bond only; the other bond distances are constrained to the ideal values.
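The parameter count behind the rigid-group stage can be checked with a few lines of arithmetic. The sketch below assumes 6 variables per rigid group (3 positional and 3 rotational, as stated in the text) and takes the reflection and variable counts from Table 2; the function names are illustrative. Table 2 lists 1604 variables for cycle 18, which is 2 more than 267 × 6; treating the extra 2 as overall parameters is an assumption made here, not something the text states.

```python
def rigid_group_variables(n_groups, per_group=6):
    """Positional plus rotational variables: 3 + 3 per rigid group."""
    return n_groups * per_group


def observations_per_variable(n_reflections, n_variables):
    """Ratio of observations (reflections) to refined variables."""
    return n_reflections / n_variables


group_variables = rigid_group_variables(267)
# Reflection and variable counts as listed in Table 2.
ratio_cycle_18 = observations_per_variable(1221, 1604)
ratio_cycle_46 = observations_per_variable(3579, 3124)
print(group_variables, round(ratio_cycle_18, 2), round(ratio_cycle_46, 2))
```

With the Table 2 values the ratio is about 0.76 at cycle 18 and about 1.15 at cycle 46, which illustrates why keeping the ratio above 1 was a concern at low and medium resolution.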

Figure 2. Amino acid sequence alignment of TRMase and SBT based on the structural equivalence of residues as described in Comparison with SBT, section (a). α-Helices and β-sheets are indicated by the bars and numbered according to Kraut (1977). Note that α-helix C consists of 2 regions of the chain. The secondary structural elements of both structures are shown as described in Structure of TRMase.

(e) Final model

After cycle 316 of the refinement no significant improvement in the model could be made. The r.m.s. shift of atomic co-ordinates within the final cycle was 0.010 Å. A difference electron density map with coefficients (Fo - Fc) and phases calculated from the final model was essentially featureless. The agreement between observed and calculated structure factors was good. The R-factor is 0.149 for the 37,446 reflections with F > 3σ in the resolution range 5.0 to 1.4 Å. For all 42,063 measured reflections in this range the R-factor is 0.163. The final model contains 1997 protein atoms, 2 Ca ions and 182 water molecules. The stereochemistry of the model is near to the standard values as can be seen from Table 3.

Table 3
Final refinement statistics

                                         Target σ†   Standard deviation   No. of parameters
Bond distances (Å)
  Bond distances (1-2 neighbours)        0.020       0.015                2041
  Angle distances (1-3 neighbours)       0.025       0.022                2796
  Planar distances (1-4 neighbours)      0.030       0.026                780
Planar groups (Å)                        0.020       0.015                358
Chiral volumes (Å³)                      0.200       0.200                313
Non-bonded contacts (Å)
  Single torsion contacts                0.500       0.160                727
  Multiple torsion contacts              0.500       0.269                845
  Possible hydrogen bonds                0.500       0.157                186
Torsion angles (deg.)
  Planar (peptide ω)                     3.0         2.9                  283
  Staggered (aliphatic χ)                20.0        14.2                 270
  Transverse (aromatic χ2)               30.0        26.9                 28
Thermal factors (Å²)
  Main chain (1-2 neighbours)            4.00        1.85                 1163
  Main chain (1-3 neighbours)            5.00        2.50                 1476
  Side chain (1-2 neighbours)            5.00        3.62                 878
  Side chain (1-3 neighbours)            5.00        4.77                 1320

† The weight on each restraint corresponds to 1/σ².

One indication of the quality of the model is the φ-ψ plot (Ramakrishnan & Ramachandran, 1965). As shown in Fig. 3 there is only 1 non-glycine residue with main-chain torsion angles significantly outside the "allowed" conformational region. It is the active site Asp38 that possesses the same geometry in all subtilisins for which the structure is known (Bode et al., 1987; McPhalen, 1986; Betzel et al., 1988a).

The overall co-ordinate error in the atomic positions has commonly been estimated using the Luzzati (1952) plot (Fig. 4), although it tends to overestimate the error. Another estimation of the co-ordinate error derived from the σA plot of Read (1986) is 0.13 Å. All the 43,794 reflections that were measured were used in this estimation.

For all the atoms in the structure including water molecules the mean temperature factor (B) is 16.5 Å², for the protein atoms only it is 14.7 Å². The maximum values of the B-factors are 43.7 Å² for protein and 60.4 Å² for water molecules. The B-factors averaged over the main-chain atoms are plotted as a function of the residue number in Fig. 5. The absence of significant maxima reflects the fact that all parts of the chain are well-ordered. Only for the side-chains of Lys128, Gln186 and Trp208 is there no good electron density. These 3 residues were included in the final model as serine. The positioning of a Trp side-chain on residue 208 would pose additional difficulties because of the close contacts with the side-chains of Ser207 and Ser222 of the same molecule and Asp205 of the symmetry-related molecule. These distances are largest (CD1 Trp208 ... OD1 Asp105: 3.17 Å; CD1 Trp208 ... OG Ser222: 3.41 Å; and CE3 Trp208 ... CB Ser207: 4.10 Å) when Trp208 has χ1 = -52°, χ2 = -50°, but these values differ considerably from the ideal ones of -60° and -90°, respectively.

On the basis of electron density calculated after cycle 212 (resolution 1.8 Å) one further correction to the published TRMase sequence (Meloun et al., 1985) was introduced: Val199 was replaced by Trp. The electron density for Trp199 is shown in Fig. 6 and gives a solid indication that the residue is indeed Trp.

Two residues in the TRMase model are in the cis-conformation. One of them is Pro172, which possesses the same conformation in all subtilisins with known structure (McPhalen, 1986; Bode et al., 1987; Betzel et al., 1988a). It is involved in the formation of the reverse turn in the Ca2+ binding site, which was found to be occupied by Ca2+ in TRMase, SBT, SBC and proteinase K. The second residue with a cis-peptide bond is Thr215. It forms a reverse turn in the long β-hairpin in the substrate binding region. The same conformation was found for the equivalent threonine residue in SBC (McPhalen, 1986). In the other 2 subtilisins Thr is replaced by Gly with the usual trans-peptide bond, although the conformation of the β-hairpin is conserved.
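A cis-peptide bond such as those reported for Pro172 and Thr215 is recognised from the peptide torsion angle ω (Cα-C-N-Cα), which is near 0° for cis and near 180° for trans. The following Python sketch shows one way to compute ω from four atomic positions and to flag cis bonds. It is an illustration only: the 30° tolerance, the function names and the coordinates are assumptions and are not taken from the refinement described here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def is_cis(ca_i, c_i, n_next, ca_next, tolerance=30.0):
    """True if omega (CA-C-N-CA) lies within `tolerance` degrees of 0."""
    return abs(dihedral_deg(ca_i, c_i, n_next, ca_next)) < tolerance


# Idealised planar geometries (hypothetical coordinates, not from the model).
cis_example = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans_example = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(is_cis(*cis_example), is_cis(*trans_example))
```

Applied to a real model, the four positions would be Cα(i), C(i), N(i+1) and Cα(i+1) for each consecutive residue pair.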

Figure 3. Ramachandran plot for the final model of TRMase. Glycine residues are shown as squares.

Figure 4. Luzzati (1952) plot for the TRMase structure.

Figure 5. Variation in the average main-chain temperature factor along the polypeptide chain.

Figure 6. The electron density in a (2Fo - Fc) omit map at residue 199 calculated with phases from the final model. Side-chain atoms were not included in the calculation of phases. This residue is valine according to the chemical sequence but the X-ray analysis indicates tryptophan.

3. Structure of TRMase

The α-carbon chain of TRMase is shown in Figure 7. The peptide fold of TRMase is similar to the other serine proteinases of the subtilisin family. Thus, the general description first given by Wright et al. (1969) for SBT is, with some modifications, still valid for TRMase.

The core of the TRMase molecule is composed of a highly twisted eight-stranded parallel β-sheet and five α-helices, four of which are aligned antiparallel to the β-strands. The topology of TRMase is schematically represented in Figure 8. The α-helices are labelled by letters from A to H, the β-strands of the parallel β-sheet are numbered from 1 to 8, the antiparallel two-stranded β-sheets from I to V. TRMase can be classified as an α/β protein according to Levitt & Chothia (1976).

Analysis of the main-chain hydrogen bonds and the conformational angles φ and ψ allowed the determination of the elements of the secondary structure in TRMase. The criteria used for the assignment of hydrogen bonds were the following: distances N...O < 3.4 Å, H...O < 2.4 Å; angles N-H...O > 130°, N...O=C > 100°, H...O=C > 100°. The segments of the sequence corresponding to the α-helices and the parallel β-sheet are shown in Figure 2.

Sixty-five residues in TRMase are involved in the β-structure (Table 4). Most of them form the central β-sheet that contains eight parallel and two antiparallel β-strands. The two antiparallel β-bridges are located at both edges of the β-sheet and contain two hydrogen bonds each. The regular conformation of the β-strand 2 is disturbed by the insertion of one residue (Val53 or Gly54) and can be defined as a β-bulge (Richardson et al., 1978). All other residues comprising the β-sheet have torsion angles φ and ψ close to the standard values, derived by Baker & Hubbard (1984) from well-defined protein structures. The average φ angle is -111° and ψ angle is 130° for the 37 residues of the β-sheet. The mean distance between hydrogen-bonded N and O atoms is 2.95 Å. The parallel β-sheet is highly twisted so that the β-strands 2 and 8 at the edges of the sheet run approximately in opposite directions.

All β-structure elements except β-bridge I are also present in SBT (McPhalen & James, 1988). This additional β-strand of the central β-sheet forms part of the Ca2+ binding site 2 (Ca-2) in TRMase. The site is located in a loop that has a different conformation compared to SBT. This structural feature will be discussed below with regard to the Ca2+ binding.

There are eight α-helices in the TRMase structure containing at least two consecutive α-helical turns with 5 → 1 hydrogen bonds. Average conformational parameters for all the α-helices are given in Table 5. Two of the helices in TRMase are interrupted but in different ways. Helix C is built up of two parts as was found for SBT and SBC (McPhalen, 1986). Residues 70 to 80 form a typical α-helix with all parameters close to standard values. Residues 81 to 91 then form a loop projecting away from the helix to the surface of the protein. Residues 92 to 93 form a final turn of the helix C with hydrogen bonds as expected. This loop seems to be an important structural feature as it provides the Ca-1 binding site. Helix F contains the active Ser225, which forms a number of hydrogen bonds in the active centre but does not form the expected 5 → 1 hydrogen bond in the helix because of the Pro residue at position 229. There is also a lack of 5 → 1 hydrogen bonds between residues Met226 → His230 and Ala227 → Val231. Instead there is a 4 → 1 hydrogen bond Ala227 → His230. The amino acid sequence of helix F is highly conserved among the different subtilisins. This helix, together with the central strands of the β-sheet, forms the hydrophobic core of the molecule.

Three α-helices (B, F and H) are followed by 3₁₀-helices each of which has two 3₁₀-turns. They are listed in Table 6 together with the other reverse turns. The turns were classified according to the definition given by Venkatachalam (1968). All reverse turns of type II found in TRMase contain a glycine residue in the third position. There are three π-turns with 6 → 1 hydrogen bonds in the TRMase structure: Pro15 to Ala20, Ala122 to Ala127 and Ala150 to Ser155. All of them are associated with the α-helices (A, D and E) and are followed by 5 → 2 reverse turns.

Figure 7. Stereo view of the α-carbon chain of TRMase along the central helix (F). The positions of the 2 Ca ions are shown: Ca-1 is at the top right and Ca-2 at the bottom right. The N terminus of the chain forms part of the Ca-1 binding site.

Figure 8. Topological scheme for the secondary structural elements of TRMase. α-Helices are shown by circles and are labelled by letters. The squares numbered by Arabic figures represent β-strands of the parallel β-sheet. The squares numbered by Latin figures show antiparallel 2-stranded β-sheets.

Table 4
Elements of β-structure

A. Parallel β-sheet

β-Strand   Residues          No. of H bonds
1          Ala32-Asp38       11
2          Lys51-Phe58       6
3          Ser97-Val103      13
4          Lys128-Leu132     9
5          Val156-Ala161     9
6          Ile179-Thr184     9
7          Val202-Gly206     7
8          Gly269-Val271     2

B. Antiparallel β-sheets

β-Sheet    β-Strand 1        β-Strand 2        No. of H bonds
I          Asp57             Asp62             2
II         Gly136            Tyr171            2
III        Ser183-Asp185     Asn189-Ser191     2
IV         Ile209-Tyr213     Thr217-Leu221     6
V          Asp257            Arg270            2

Table 5
Average parameters of α-helices

α-Helix                        φ† (deg.)   ψ† (deg.)   No. of H bonds   N...O (Å)   N-O-C (deg.)   H...O (Å)   N-H-O (deg.)
A  Tyr13-Ile18                 -58.5       -45.5       2                3.01        162            2.09        152
B  Gln19-Trp24                 -59.5       -@8         2                3.06        158            2.18        147
C  Gly70-Ala80, Thr92-Ala93    -61%        -44.1       9                2.99        156            2.04        158
D  Thr111-Gln125               -59.8       -44.3       11               2.95        156            %oo         160
E  Asn146-Lys153               -61%        -42.3       10               2.98        154            2.03        157
F  Gly223-Ser240‡              -62.8       -39.6       11               2.98        151            2.05        155
G  Ser244-Asn254               -63.9       -43.3       7                2.95        156            2.00        160
H  Asn272-Val277               -63.3       -42.5       2                3.07        151            2.14        154
Total: 94 residues             -61.7       -42.6       54               2.98        154            2.04        157

† First and last residues are not included in the calculation of average φ and ψ.
‡ Only regular part of this helix is included in the calculations.
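The hydrogen-bond criteria quoted in this section (N...O < 3.4 Å, H...O < 2.4 Å, N-H...O > 130°, N...O=C > 100°, H...O=C > 100°) translate directly into a geometric test. The Python sketch below applies them to one donor/acceptor pair. It is a minimal illustration under stated assumptions: hydrogen positions are taken as given (for example, generated with ideal geometry), and the helper `linear_donor` builds a hypothetical test geometry rather than using coordinates from the TRMase model.

```python
import math


def _angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    v1 = [a[i] - vertex[i] for i in range(3)]
    v2 = [b[i] - vertex[i] for i in range(3)]
    cos_t = sum(x * y for x, y in zip(v1, v2)) / (
        math.dist(a, vertex) * math.dist(b, vertex))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def is_hydrogen_bond(n, h, o, c):
    """Apply the five geometric criteria from the text to one N-H...O=C pair."""
    return (math.dist(n, o) < 3.4
            and math.dist(h, o) < 2.4
            and _angle_deg(n, h, o) > 130.0
            and _angle_deg(n, o, c) > 100.0
            and _angle_deg(h, o, c) > 100.0)


def linear_donor(distance_n_o, n_h=1.0, angle_at_o=150.0):
    """Hypothetical geometry: O at the origin, C on -x, N-H pointing at O."""
    t = math.radians(180.0 - angle_at_o)
    u = (math.cos(t), math.sin(t), 0.0)
    n = tuple(distance_n_o * x for x in u)
    h = tuple((distance_n_o - n_h) * x for x in u)
    return n, h, (0.0, 0.0, 0.0), (-1.23, 0.0, 0.0)


print(is_hydrogen_bond(*linear_donor(2.9)), is_hydrogen_bond(*linear_donor(3.6)))
```

The first test geometry has N...O = 2.9 Å and satisfies all five criteria; the second has N...O = 3.6 Å and fails the first distance criterion.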

4. Comparison with SBT

(a) Tertiary structure

The superposition of the TRMase and SBT structures based on all 267 common α-carbon atoms gave an r.m.s. deviation in their positions of 1.35 Å and a maximum deviation of 6.56 Å. Then only those residues were chosen for which the α-carbons lie closer than 2 Å to one another, i.e. just over half the distance between two neighbouring α-carbon atoms. When the remaining 227 α-carbon atoms are overlapped the distances between equivalent α-carbon atoms have an r.m.s. of 0.63 Å and a maximum of 1.47 Å. This superposition of the structures gave the sequence alignment for TRMase and SBT shown in Figure 2 and shows the structurally equivalent residues in the two proteins. The superimposed structures are shown in Figure 9 and the deviations between their α-carbon atoms are presented in Figure 10 as a function of the residue number. As one can see the major differences between the TRMase and SBT structures are related to the

Table 6
Reverse turns

Residues        Type   φ2      ψ2      φ3      ψ3      N...O   N-O-C    H...O   N-H-O
                       (deg.)  (deg.)  (deg.)  (deg.)  (Å)     (deg.)   (Å)     (deg.)
Asp5-Phe8       I      -61     -30     -94      -1     3.15    119      2.21    158
Gln16-Gln19     I      -61     -37     -94       8     3.06    109      2.15    152
Gln22-Asp25     III    -62     -34     -60     -25     3.22    113      2.34    147
Ala23-Ile26     III    -60     -25     -67     -17     2.84    124      1.92    152
Gly29-Ala32     II     -58     123      88      -4     2.99    140      2.04    159
Gln42-His45     I      -62     -21     -99      13     3.10    122      2.14    160
His45-Leu48     I      -51     -30    -100      12     2.86    125      1.89    161
Leu48-Lys51     II     -52     128      91     -13     2.90    150      1.97    155
Ala93-Ala96     I      -51     -36    -105       5     3.23    116      2.27    161
Asp105-Gly108   III    -62     -23     -82     -14     2.98    118      2.05    153
Ala123-Gly126   I      -62     -30     -85       0     3.03    115      2.11    153
Trp151-Gly154   III    -57     -49     -65     -21     2.97    115      2.05    152
Pro172-Tyr175   III    -51     -38     -66     -24     2.98    129      2.07    151
Tyr175-Ala178   I      -54     -39     -83       4     3.05    123      2.07    163
Asp185-Asp188   I      -63     -17    -101      10     3.00    114      2.06    157
Ser191-Ser194   III    -58     -33     -85     -12     3.09    119      2.15    156
Gly223-Met226   III    -64     -21     -63     -30     2.93    112      2.06    144
Ala227-His230   III    -46     -59     -62     -21     3.03    122      2.25    134
Leu238-Gln241   III    -62     -40     -63     -14     3.02    115      2.12    150
Ala239-Gly242   I      -63     -14     -90      -2     2.81    123      1.89    151
Ile259-Thr262   II     -55     137     105     -18     2.99    150      2.07    152
Gly261-Thr264   II     -63     132     100     -16     3.01    137      2.08    154
Lys275-Gln278   III    -60     -46     -69     -17     3.11    112      2.31    137
Ala276-Tyr279   III    -69     -17     -93      -6     ?       112      2.15    153


Figure 9. Superposition of the α-carbon chains of TRMase (thick lines) and SBT (thin lines). Stereo view along axis z. The loop in SBT corresponding to the deletion of 4 residues in TRMase is clearly seen.

insertions and deletions in the sequences, all of which occur at the surface of the molecule. One of the largest, and probably most important, differences occurs at the N terminus where there are seven additional residues in TRMase. The region of the chain including residues Tyr1 to Tyr7, together with Gln12, forms a number of contacts with the Ca-1 binding loop 82 to 90. There are six direct hydrogen bonds between these segments of the chain: Asn4 N...Asn84 OD1 (2.90 Å), Asn4 OD1...Ser86 N (2.80 Å), Asp5 OD2...Asn84 N (2.81 Å), Tyr7 N...Gly88 O (3.05 Å), Gln12 OE1...Ala90 N (2.90 Å) and Gln12 NE2...Ala90 O (3.09 Å). Asn84 ND2 is also hydrogen-bonded to Tyr1 OH through a water bridge. Asp5 replaces Gln2 at the equivalent position in SBT with the side-chain oxygen involved in the co-ordination of Ca-1. The appearance of the second negatively charged group in the Ca-1 site together with the other observed interactions suggests that the N-terminal region gives extra stability to the Ca-1 binding site in TRMase. Probably the tighter binding of Ca2+ in TRMase (Briedigkeit & Frömmel, 1989) is the result of these changes in the Ca-1 environment.
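The two-pass comparison used in this section (deviations over all common α-carbons, then over only those pairs closer than 2 Å) can be sketched as follows. This is a minimal illustration under stated assumptions: the coordinates are hypothetical and already superposed, the function names are invented here, and the least-squares refit that the text applies to the selected atoms is not repeated.

```python
import math


def ca_deviations(coords_a, coords_b):
    """Distances (Å) between equivalent Cα atoms of two superposed chains."""
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rms(values):
    """Root-mean-square of a list of distances."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def select_core(coords_a, coords_b, cutoff=2.0):
    """Keep only atom pairs that lie closer than `cutoff` Å after superposition."""
    pairs = [(a, b) for a, b in zip(coords_a, coords_b) if math.dist(a, b) < cutoff]
    return [a for a, _ in pairs], [b for _, b in pairs]


# Hypothetical, already-superposed coordinates (not taken from the paper).
trm = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
sbt = [(0.3, 0.0, 0.0), (3.8, 0.4, 0.0), (7.6, 0.0, 5.0), (11.4, 0.0, 1.2)]

all_dev = ca_deviations(trm, sbt)
core_a, core_b = select_core(trm, sbt)
core_dev = ca_deviations(core_a, core_b)
print(f"all atoms: rms {rms(all_dev):.2f} Å, max {max(all_dev):.2f} Å")
print(f"core ({len(core_dev)} atoms): rms {rms(core_dev):.2f} Å, max {max(core_dev):.2f} Å")
```

Dropping the outlying pairs before the final fit keeps a few flexible surface loops from dominating the r.m.s. value for the conserved core.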


The N terminus also interacts with other parts of the structure. An α-helical turn Tyr7 to Arg11 together with the helices A and B participates in the formation of the hydrophobic cluster of residues Pro3, Pro6, Pro15, Pro21 and Trp24 between these helices. Residues Tyr1, Tyr7 and Phe8 contribute to the aromatic interactions in TRMase. As can be seen in Figure 10 there are two regions in TRMase deviating from SBT by more than 5 Å. One of them is associated with the deletion of Ser18 from the SBT sequence. It leads to the formation of a 3₁₀-helix at the end of helix B in TRMase instead of the regular α-helix observed in SBT. The other region is related to the loop of residues 59 to 65 in TRMase, which precedes the deletion of Phe58 (SBT). The conformation of this loop will be discussed below in terms of the Ca2+ binding. Another five regions in TRMase and SBT deviate by more than 2 Å. Two of them are associated with the loops 158 to 163 and 237 to 240 in SBT that are absent in TRMase due to the deletions of four and two residues, respectively. Three other regions have insertions in the TRMase sequence: Ala49 to Gly50, Thr83 and Gly261. The part of the chain Ile259 to Thr264 in TRMase forms two consecutive type II


Figure 10. Deviations between equivalent α-carbon atoms of TRMase and SBT. The arrows indicate the positions of deletions in the TRMase sequence. The insertions are shown by gaps.


reverse turns with the peptide bond 261 to 262 participating in both of them. This double-turn structure appears in place of an α-helix in SBT. The insertion of residues Ala49 and Gly50 occurs without destroying the interactions of the neighbouring parts of the chain that are of great importance for the TRMase structure: the side-chain of Asp47 is involved in the co-ordination sphere of Ca-1 and residues Lys51 to Phe58 form the β-strand 2 of the parallel β-sheet. This insertion, like the previous one, results in the formation of a type II reverse turn with a glycine residue in the third position. Residue Thr83 is inserted in the Ca-1 binding loop. The geometry of the Ca-1 binding is described in section (c), below.

(b) Active centre and substrate binding region

All serine proteinases contain a catalytic triad of the same residues: Ser225, His71 and Asp38 in the TRMase sequence. These residues are supported by completely different protein folds in the trypsin and subtilisin families. The overall similarity of the tertiary structures within the subtilisins reflects a common architecture for the active centre and the substrate binding region. The catalytic triad (Fig. 11) is located at the edge of the central β-sheet with Asp38 being the last residue of the β-strand 1. His71 and Ser225 are positioned at the beginning of the α-helices C and F, respectively. It has been suggested that these helices, considered as dipoles, influence the electrostatic state of the catalytic site (Hol, 1985). As mentioned above Asp38 has the conformational angles φ and ψ outside the allowed region in the Ramachandran plot (Fig. 3). Such a conformation of the active Asp has been found in all other subtilisin structures (McPhalen, 1986; Bode et al., 1987; Betzel et al., 1988a). The dihedral angle ω of the peptide bond between Val37 and Asp38 (170°) also deviates significantly from the standard value. The lengths of the hydrogen bonds between the active site residues in TRMase are: His71 NE2...Ser225 OG 2.94 Å; His71 ND1...Asp38 OD1 3.15 Å; His71 ND1...Asp38 OD2 2.66 Å. They indicate that all possible hydrogen bonds between these residues can exist although the orientation of the Ser225 side-chain is not ideal for the hydrogen bond to His71. With the hydrogen positioned at His71 NE2 the angle OG...H-NE2 is 139.4° and with the hydrogen positioned at Ser225 OG it is 148.0°. Ser225 also hydrogen bonds to Ser133, forming a ten-membered ring with the hydrogen bonds Ser225 OG...Ser133 O (3.19 Å) and Ser225 O...Ser133 OG (2.68 Å). The r.m.s. deviation between the TRMase and SBT active site residues is 0.18 Å for the 24 atoms of the catalytic triad. This value was calculated using the superposition computed with the 227 equivalent α-carbons described above. In the absence of a substrate or an inhibitor the active site cleft is highly hydrated. One of the water molecules occupies the oxyanion hole and forms hydrogen bonds to Asn163 ND2 (2.62 Å), Ser225 OG (2.91 Å) and Thr224 OG1 (3.10 Å).

The residues comprising the substrate binding region in subtilisins were identified in a number of X-ray studies of the complexes with synthetic and natural inhibitors (Robertus et al., 1972; Hirono et al., 1984; McPhalen, 1986; Bode et al., 1987). The recent crystal structure analysis of the TRMase-eglin complex (Dauter et al., 1988; Gros et al., 19893) provided direct evidence for the participation of particular residues in the inhibitor binding. All subtilisins possess broad specificity showing some preference for large hydrophobic residues at the P1 site of a substrate (Markland & Smith, 1971). (The notation of Schechter & Berger (1967) is used: residues of substrates are numbered P1, P2, etc. towards the N-terminal direction and P1', P2', etc. in the C-terminal direction from the scissile bond. The complementary sites of the enzyme are numbered S1, S2 and S1', S2', etc.) The substrate binding site of subtilisins includes two segments of the chain (Gly108 to Gly110 and Ser133 to Gly135 in the TRMase sequence), which form a three-stranded antiparallel β-sheet with the substrate chain (Hirono et al., 1984).
The comparison of TRMase with SBT shows that these regions of the chain, as well as those forming the S2'-S4' sites, are highly conserved both in sequence and conformation. The superposition of 30 residues comprising the substrate binding site in TRMase (104-115, 133-139, 160-164, 192-194, 224-226) and

Figure 11. Stereo view of the active centre of TRMase. The position of the mercury atom in the heavy-atom derivative of TRMase is shown. The positions of the protein atoms correspond to the native structure. The short distances between Hg2+ and the side-chain of the active site His71 indicate that the latter should change its position upon Hg2+ binding.


SBT gives an r.m.s. deviation of 0.53 Å for 120 main-chain atoms. Those amino acid substitutions which do occur in the substrate binding region do not disturb the interactions between the residues. For example, the substitution of Leu209 (SBT) by Tyr213 (TRMase) is compensated by the complementary substitution of Tyr217 (SBT) by Leu221 (TRMase). The long twisted β-hairpin 206-223 contains a β-bulge at position 207 which seems to be important for the substrate binding as it is present in all subtilisins and is located in the S2' site near the invariant Phe193 (Bryan et al., 1986; Bode et al., 1987; Bott et al., 1988; Betzel et al., 1988b). The conformation of the β-bulge in TRMase is stabilized by the hydrogen bond Ser208 N...Ser222 OG. In SBT an equivalent hydrogen bond is formed between Ser204 N and Asn218 OD1. The substitution of Asn218 (SBT) by Ser was the only one found in the thermostable SBT variant (Bryan et al., 1986). On the basis of the X-ray structure it was concluded that the slight improvement in the hydrogen bond parameters around this position gives the mutant its enhanced thermal stability. It is of note that this position occupied by Asn in the Bacillus subtilisins is taken by Ser in the more stable enzymes TRMase and proteinase K.

TRMase belongs to the subgroup of subtilisin-type proteinases with a free cysteine near the active centre. Cys75 in TRMase is located at the bottom of the active site cleft. The distances from the sulphur atom of Cys75 to Ser225 O (3.51 Å) and Ser133 OG (3.56 Å) suggest some possibility of weak hydrogen bonding whilst those to Asp38 OD1 (4.03 Å) and His71 ND1 (4.26 Å) are clearly too long to form hydrogen bonds. Although Cys75 is practically inaccessible to solvent it can bind Hg ions both in the crystalline state and in solution (Frömmel et al., 1978; Teplyakov et al., 1986). The position of the mercury

atom is shown in Figure 11. It was derived from the difference electron density map calculated with the coefficients (FpH - Fp), where FpH and Fp are the structure amplitudes for the mercury derivative and native TRMase, respectively. The distance of 1.73 Å between Hg2+ and the cysteine sulphur atom indicates that they are covalently bonded. It is clear that this position of the Hg ion in the active centre influences the catalytic residues causing the complete loss of activity as has been shown by Frömmel et al. (1978). The significance of the presence of the cysteine residue near the active centre with regard to the activity of the enzymes remains unclear.

Knowledge of the precise positions of the atoms in the active site is crucial for an understanding of the mechanism of action and the mode of substrate binding of the enzymes. It will be interesting to compare in detail the refined high-resolution structure of native TRMase with that which has been determined for eglin-inhibited TRMase (Gros et al., 19893).

(c) Calcium binding sites

Calcium ions are known to bind to subtilisins and to stabilize their structure against thermal denaturation and proteolytic degradation (Voordouw et al., 1976). Kretsinger (1976) has suggested that the requirement for calcium ensures that these enzymes will only be active extracellularly, since intracellular Ca2+ concentrations are extremely low. From atomic absorption experiments (Frömmel & Höhne, 1981; Briedigkeit & Frömmel, 1989) it was expected that TRMase would bind three Ca ions, two of them more tightly than in SBT. Based on similar dissociation constants (10⁻⁴ M) and a comparable stabilizing effect Frömmel & Höhne (1981) suggested that the third Ca2+ site in TRMase corresponds to the weakly bound Ca2+ in SBT. The removal of Ca ions from these proteins leads to a significant

Figure 12. The first calcium binding site in TRMase.


Table 7
Co-ordination of calcium in TRMase

First calcium site              Second calcium site
Ligand       Distance (Å)       Ligand       Distance (Å)
Asp5 OD1     2.30               Asp57 OD2    2.41
Asp47 OD1    2.43               Asp60 OD1    2.39
Asp47 OD2    2.59               Asp62 OD1    2.51
Val82 O      2.36               Asp62 OD2    2.69
Asn85 OD1    2.45               Thr64 O      2.32
Thr87 O      2.29               Gln66 OE1    2.49
Ile89 O      2.37               Wat475       2.30

decrease in stability. However, it is impossible to remove all three Ca ions from TRMase without destroying the structure. Based on predicted models of TRMase obtained by model-building and energy minimization techniques Frömmel & Sander (1989) suggested that binding of Ca2+ by TRMase was the main cause of its thermostability.

There are two Ca ions clearly present in the crystal structure of native TRMase. If refined as water molecules they have very small temperature factors and relatively short contacts to neighbouring atoms. When refined as calcium these Ca ions have temperature factors of 19.5 and 31.2 Å². The occupancies were set to 1.0 and were not refined. In the final stages of the refinement no restraints were applied to Ca2+-ligand distances.

The Ca-1 ion lies in the loop of residues 82 to 89, which projects out to the surface from the interrupted helix C. Ca-1 is co-ordinated to seven oxygen atoms that form a pentagonal bipyramid (Fig. 12; Table 7). The equatorial ligands are Asp5 OD1, Asp47 OD1 and OD2, Asn85 OD1 and Ile89 O. The carbonyl oxygen atoms of Val82 and Thr87 are the axial ligands of Ca-1. The co-ordination number (7) and the geometry of this site are typical for the calcium complexes reviewed by Einspahr & Bugg (1984). Ca-1 is completely inaccessible to solvent and taking into account its lower temperature factor (19.5 Å²) it is likely that it is this ion which cannot be removed from the protein by soaking in EDTA. The same Ca2+ binding site was found in SBT

(McPhalen & James, 1988; Bott et al., 1988). The comparison of these sites reveals two structural differences between TRMase and SBT. The insertion of Thr83 in the Ca-1 binding loop of TRMase results in the left-handed helical conformation of the residues Asn84 and Ser86 and a slight rearrangement of the loop. The substitution of Gln2 in SBT by Asp5 in TRMase provides a second negatively charged group in the binding site and may be the main reason for the tighter binding of Ca-1 in TRMase.

The second Ca ion (Ca-2) in TRMase is positioned in the loop of residues 59 to 65 near one edge of the central β-sheet. There is a cluster of three negatively charged residues in this part of the molecule. All of them participate in the Ca-2 binding (Fig. 13; Table 7). The pentagonal bipyramidal co-ordination of the Ca-2 ion includes Asp60 OD1, Asp62 OD1 and OD2, Thr64 O and Gln66 OE1 as equatorial ligands and Asp57 OD2 and a water molecule as axial ligands. The co-ordination of both Ca ions in TRMase can also be described as octahedral with the carboxyl group of one of the Asp residues constituting one ligand.

There is a complex network of hydrogen bonds supporting the fold of the chain around the Ca-2 binding site. In general the architecture of the site can be described in terms of two layers of interactions (Fig. 13). One of them includes main-chain hydrogen bonds between Asp57, Asp62 and Arg102, which form part of the central β-sheet. Another layer includes the interactions between their side-chains and the co-ordination ligands of Ca-2. An interesting feature of this Ca-2 binding site is the presence of the positively charged guanidinium group of Arg102, which forms salt bridges with the Ca-2 ligands Asp57 OD2 (2.97 Å) and Asp60 OD1 (3.15 Å). This observation is discussed below.

The Ca-2 binding site is absent in SBT. Although one can find some sequence homology between TRMase and SBT for this section of the chain the conformation of the Ca-2 binding loop is completely different from that of the corresponding loop in SBT. This loop seems to be one of the most variable parts of the chain in the subtilisins. In SBC it is one

Figure 13. The second calcium binding site in TRMase. Hydrogen bonds and salt bridges are shown by thin lines. Residues 62, 57 and 102 form part of the central β-sheet.


Figure 14. The potential “3rd” calcium binding site in TRMase occupied by a water molecule (shown as a big circle). The co-ordination of this water molecule is the same as for Ca2+ in SBT.

residue shorter than in SBT (McPhalen & James, 1988) and in proteinase K it is absent due to the deletion of three residues (Betzel et al., 19883).

The region in TRMase equivalent to the second Ca2+ binding site in SBT is shown in Figure 14. There is no evidence for a Ca-3 at this position in the TRMase structure. It is occupied by a water molecule, or possibly a Na ion, co-ordinated to the carbonyl oxygens of Ala173, Tyr175 and Ala178, the carboxyl oxygen of Asp201 and two water molecules. All the distances lie in the range 2.80 to 2.97 Å. The observed distorted octahedral geometry of the ligands is very similar to that of SBT (McPhalen & James, 1988). The deviations for the equivalent Ca2+ binding atoms are about 0.2 to 0.3 Å. Hence this is a potential third Ca2+ binding site in TRMase. The crystals used in the structure determination of TRMase were grown from calcium-free solutions and this may be the reason for the lack of a third Ca2+ in the structure. This hypothesis could explain the similar effect of the removal of the weakly bound Ca ions from TRMase and SBT.

As in the Ca-2 binding site, there is a positively charged group, in this case Arg249, which forms hydrogen bonds to the carbonyl oxygens of Ser176 (NH1...O 2.96 Å) and Ala178 (NH1...O 3.23 Å) and a salt bridge to the carboxyl group of Asp201 (NH2...OD2 3.41 Å, NH2...OD1 3.46 Å). The equivalent residue Arg247 occupies the same position in SBT. A possible explanation of the appearance of the positively charged residues near to the Ca2+ binding sites is the necessity of stabilizing the local tertiary structure in the absence of sufficient Ca ions in the environment.

Clearly both the TRMase and SBT Ca ions contribute to the overall stability of the surface regions and improve the thermal stability of the enzymes. They may, in addition, contribute to the stability by reducing the flexibility of the protein and hence its susceptibility to partial unfolding followed by autolysis. The existence of the three potential Ca2+ sites in TRMase and the observed mode of their binding suggest that the Ca ions are one of the major causes of the relatively high thermal stability of TRMase. Recently, it was found that the Ca-3 binding site in the crystal structure of the TRMase-eglin complex is occupied by Ca2+ in the presence of 100 mM-CaCl2 in the crystallization solution (Gros, Kalk & Hol, unpublished results).

(d) Ionic interactions

The importance of ionic interactions for protein stability has been pointed out in several studies based on X-ray structures (Perutz, 1978; Walker et al., 1980). The comparison of TRMase with SBT confirms this suggestion. There are 11 pairs of charged groups forming ionic interactions in TRMase (Table 8). These include 16 residues and both the N and C termini out of 32 charged groups present in the protein. Ionic interactions in SBT involve only three pairs between six residues out of 29 charged groups. Thus, although the total number of charged groups is similar in these proteins many more of them form salt bridges in TRMase. The distances between interacting atoms do not exceed 3.2 Å, except those for the ion pair


Asp201...Arg249 in TRMase and the corresponding one in SBT discussed above. The surprising fact is that this is the only conserved ion pair in the compared proteins although there are 12 positions with a residue possessing the same charge (Lys/Arg and Asp/Glu). One of these conserved residues is the active site Asp38 (Asp32 in SBT). Asp47 in TRMase and the equivalent Asp41 in SBT participate in the Ca-1 binding. Asp62 and Arg102 are located in the Ca-2 site in TRMase. All other conserved charged groups in both proteins are not involved in ionic interactions. This suggests that they are not critical elements in defining the three-dimensional fold of the chain in TRMase and SBT. At the same time the observed difference in the number of ion pairs in TRMase and SBT indicates that their contribution to the overall stability of the proteins is significant.

Table 8
Ionic interactions

A. TRMase
Residue   Atom   Residue   Atom   Distance (Å)
Tyr1      N      Asp25     OD2    2.88
Lys95     NZ     Glu28     OE2    2.73
Arg102    NH1    Asp57     OD2    2.97
Arg102    NH1    Asp60     OD1    3.15
Lys153    NZ     Asp124    OD2    3.19
Arg243    NH1    Tyr279    O      2.87
Arg243    NH2    Tyr279    O      2.77
Arg249    NE     Glu253    OE2    2.83
Arg249    NH2    Glu253    OE2    2.84
Arg249    NH2    Asp201    OD2    3.41
Arg270    NE     Asp188    OD1    2.81
Arg270    NH2    Asp188    OD2    2.78
Arg270    NH1    Asp257    OD1    2.76
Lys275    NZ     Asp257    OD2    2.52

B. SBT
Residue   Atom   Residue   Atom   Distance (Å)
Lys136    NZ     Asp140    OD2    2.99
Lys141    NZ     Glu112    OE1    3.02
Arg247    NH1    Asp197    OD1    3.24

(e) Aromatic interactions

Burley & Petsko (1985) suggested the stabilizing role of closely packed perpendicular ring-ring interactions in proteins and calculated the potential energy of one such interaction to be about 1 kcal/mol (1 cal = 4.184 J). Table 9 shows the ring-ring interactions of Phe, Tyr and Trp in TRMase and SBT. Only those pairs were included that did not contain other atoms between the aromatic rings and had distances between ring centroids of less than 8 Å. There are five clusters of interacting aromatic residues in TRMase distributed over the surface of the molecule. One cluster contains four residues and the others contain three residues each. In total there are 16 residues involved in the aromatic interactions, i.e. two-thirds of the whole number of aromatic residues in TRMase. In contrast to TRMase there are only six residues forming three aromatic pairs in SBT. They represent only one-third of the aromatic residues in the protein. The significantly increased number of aromatic interactions in TRMase (13) compared to SBT (3) suggests that they are another source of the enhanced stability of TRMase. Comparison of the three-dimensional structures of TRMase and SBT reveals only one conserved aromatic pair although there are ten common positions in these proteins occupied by aromatic residues. The invariant pair Tyr171-Tyr175 in TRMase forms part of the cluster that includes also Tyr174 and Trp199. The latter two residues are replaced by Lys169 and Glu195 in SBT with a salt bridge between them (Fig. 15). Thus, both types of interaction contribute to the stabilization of the “Ca-3” binding site.

Table 9
Aromatic interactions: the distances between the ring centres are given

A. TRMase
Cluster   Residues             Distance (Å)
1         Tyr175   Tyr171      5.27
          Tyr171   Tyr174      5.71
          Tyr174   Trp199      5.41
2         Tyr1     Trp24       5.38
          Trp24    Phe8        7.52
3         Tyr7     Tyr210      5.43
          Tyr7     Tyr218      5.31
          Tyr210   Tyr218      6.22
4         Trp56    Phe58       5.80
          Trp56    Tyr121      6.51
          Phe58    Tyr121      5.84
5         Tyr196   Tyr265      5.46
          Tyr265   Trp266      6.33

B. SBT
Residues             Distance (Å)
Tyr91    Trp113      7.33
Tyr167   Tyr171      5.28
Phe261   Tyr262      5.18
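The distance criterion used for Table 9 (ring centroids closer than 8 Å) can be sketched as below. This is only an illustration under stated assumptions: the ring coordinates are idealised hexagons invented for the example, the function names are hypothetical, and the additional requirement that no other atoms lie between the rings is not checked.

```python
import math
from itertools import combinations


def centroid(atoms):
    """Mean position of a list of (x, y, z) ring-atom coordinates."""
    n = len(atoms)
    return tuple(sum(a[i] for a in atoms) / n for i in range(3))


def aromatic_pairs(rings, cutoff=8.0):
    """Return (name1, name2, distance) for ring centroids closer than cutoff Å."""
    centres = {name: centroid(atoms) for name, atoms in rings.items()}
    pairs = []
    for a, b in combinations(sorted(centres), 2):
        d = math.dist(centres[a], centres[b])
        if d < cutoff:
            pairs.append((a, b, round(d, 2)))
    return pairs


def hexagon(cx, cy, cz, r=1.39):
    """Hypothetical planar six-membered ring centred at (cx, cy, cz)."""
    return [(cx + r * math.cos(k * math.pi / 3), cy + r * math.sin(k * math.pi / 3), cz)
            for k in range(6)]


# Three made-up rings: A and B are 5.3 Å apart, C is far from both.
rings = {"RingA": hexagon(0, 0, 0), "RingB": hexagon(5.3, 0, 0), "RingC": hexagon(20, 0, 0)}
print(aromatic_pairs(rings))
```

Grouping the resulting pairs into connected sets would give clusters of the kind listed in Table 9.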

5. Solvent Structure in TRMase Crystals

There are 182 water molecules identified in the electron density map on the surface of the molecule or within its cavities. They cover the protein surface quite evenly except the regions that are involved in the crystal packing contacts and thus are not accessible to solvent. The first hydration shell includes 117 water molecules forming direct hydrogen bonds with protein atoms. Hydrogen bonds were assigned if the O...N or O...O distance was shorter than 3.4 Å. Table 10 shows the distribution of the water molecules relative to the number of hydrogen bonds which they form. Not surprisingly their temperature factors correlate with the number of hydrogen bonds, decreasing with every additional bond. Water molecules are known to play an important


Figure 15. Stereo view of the aromatic cluster in TRMase superimposed on the equivalent residues in SBT. Residues Tyr174 and Trp199 are replaced in SBT by Lys169 and Glu195, which form a salt bridge.

role in the solvation of buried charged and polar residues providing the required hydrogen-bonding environment. In TRMase there are 49 charged and polar residues with an accessible surface less than 40 Å², calculated with the program DSSP (Kabsch & Sander, 1983). Ten of them are involved in ionic interactions including Ca2+ binding and 16 form hydrogen bonds with side-chains of the other polar residues. The remaining 23 residues are co-ordinated to water molecules (17 of them) and/or to main-chain atoms (19 of them).

One of the most interesting features of the solvent structure in TRMase is the water channel of approximately 11 Å in length protruding from the surface near the N terminus to the centre of the protein molecule (Fig. 16). It contains five water molecules that form a number of hydrogen bonds with the surrounding protein atoms. All five have relatively low temperature factors (11.6 to 13.3 Å²). This water channel penetrates the structure and ends at the turn of helix C where it is interrupted to make the Ca-1 binding loop. The most deeply bound water molecules, Wat314 and Wat344, are hydrogen-bonded to the buried polar side-chains of Thr92 and His230, respectively. Similar water channels were found in SBT and SBC (McPhalen, 1986) where water molecules provided a connection between the buried residues His67, Thr71, His226 (in SBT numbering) and bulk solvent.

Of particular functional importance is the role water molecules play in stabilizing the active site residues in the absence of a substrate. There are 11 water molecules in the active site of TRMase. One of them occupies the position of the carbonyl oxygen of the scissile bond of a substrate, the site known as the oxyanion hole (Fig. 11). Another two water molecules are bound deep in the active site cleft and form a hydrogen bond network between Asp38, Ser131 and Ser133: Wat368 is hydrogen-bonded to Val37 O, Asp38 OD1, Ser131 O and Ser133 N; Wat364 to Ser131 OG and Ser133 OG. The former water molecule seems to maintain its position on binding of substrate, at least in the complexes formed between SBT and SBC with protein inhibitors (McPhalen, 1986). There are two more internal water molecules in the active site region, which are present both in the TRMase structure and in the SBC and SBT complexes. One of them lies at the bottom of the P1 pocket and forms hydrogen bonds to Leu134 O, Ala160 O and Ala173 N. Another is hydrogen-bonded to Ile36 O, Asp38 N and Gly40 N and seems to stabilize this loop.

Table 10
The temperature factors of the water molecules for thermitase as a function of the number of hydrogen bonds formed

Number of hydrogen bonds      0      1      2      3      4      5
Number of water molecules     49     66     39     21     6      1
Average B-factor (Å²)         42.2   39.0   31.3   27.6   17.0   17.8
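As a quick consistency check, the entries of Table 10 can be summed and averaged. The sketch below assumes that the B-factors read 42.2, 39.0, 31.3, 27.6, 17.0 and 17.8 Å² (decimal points restored where the scan dropped them); the variable names and the population-weighted mean are illustrative additions, not quantities reported in the text.

```python
# Table 10 values; decimal points in the B-factors are assumed
# (for example "422" in the scanned table is read as 42.2 Å^2).
h_bonds = [0, 1, 2, 3, 4, 5]
n_waters = [49, 66, 39, 21, 6, 1]
mean_b = [42.2, 39.0, 31.3, 27.6, 17.0, 17.8]

total_waters = sum(n_waters)
# Average B-factor over all waters, weighting each class by its population.
weighted_b = sum(n * b for n, b in zip(n_waters, mean_b)) / total_waters
# Waters that form at least one hydrogen bond of any kind.
bonded = total_waters - n_waters[0]

print(f"total waters: {total_waters}")
print(f"waters with at least one hydrogen bond: {bonded}")
print(f"population-weighted mean B: {weighted_b:.1f} Å^2")
```

The total reproduces the 182 water molecules quoted at the start of this section.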

Figure 16. Water channel in TRMase with 5 internal water molecules. The length of the channel is approximately 11 Å.

Water molecules are often necessary for the binding of Ca ions. They can complete the co-ordination sphere of Ca2+ and also support the structure of the Ca2+ binding sites. There is one water molecule involved in the co-ordination of Ca-2 in TRMase (Fig. 13). The potential Ca-3 site in TRMase is highly hydrated (Fig. 14) and the corresponding Ca2+ in SBT binds one water molecule as ligand and in SBC two water molecules (McPhalen & James, 1988).
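The co-ordination data of Table 7 can be summarised in a few lines. The sketch below is illustrative only: the distances are those listed in Table 7 (with decimal points restored where the scan dropped them), while the dictionary layout and the "Wat" prefix test for water ligands are assumptions of this example.

```python
from statistics import mean

# Ca-O distances (Å) as listed in Table 7.
site_1 = {"Asp5 OD1": 2.30, "Asp47 OD1": 2.43, "Asp47 OD2": 2.59, "Val82 O": 2.36,
          "Asn85 OD1": 2.45, "Thr87 O": 2.29, "Ile89 O": 2.37}
site_2 = {"Asp57 OD2": 2.41, "Asp60 OD1": 2.39, "Asp62 OD1": 2.51, "Asp62 OD2": 2.69,
          "Thr64 O": 2.32, "Gln66 OE1": 2.49, "Wat475": 2.30}

for label, site in (("Ca-1", site_1), ("Ca-2", site_2)):
    # Count ligands supplied by solvent rather than by the protein.
    n_water = sum(1 for ligand in site if ligand.startswith("Wat"))
    print(f"{label}: {len(site)} ligands, {n_water} water, "
          f"mean Ca-O {mean(site.values()):.2f} Å")
```

Both sites have seven ligands, and only Ca-2 uses a water molecule to complete its co-ordination sphere, as described above.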

Table 11
Intermolecular hydrogen bonds

Atom in molecule x,y,z      Atom in related molecule      Distance (Å)

x,y,z to x,y,1+z
Gln145 NE2                  Asp105 OD1                    2.91
Ser141 OG                   Asp105 OD1                    3.08
Asn140 ND2                  Ser109 OG                     3.38

x,y,z to 1/2+x,1/2-y,-z
Lys258 O                    Asn85 ND2                     2.91
Ser260 OG                   Thr83 O                       2.95
Gly261 N                    Asn84 O                       2.83
Thr262 O                    Asn4 ND2                      3.23
Thr264 OG1                  Asn4 ND2                      3.22

x,y,z to 1/2+x,1/2-y,z
Ser198 O                    Asn61 ND2                     3.14

x,y,z to 1/2-x,1-y,1/2+z
Thr2 N                      Ser208 OG                     2.96
Thr2 OG1                    Ser220 OG                     2.64
Ser9 OG                     Ser220 N                      3.06

x,y,z to 1/2-x,-y,1/2+z
Asn152 O                    Gln241 NE2                    2.91
Lys153 NZ                   Gln278 O                      2.94


6. Crystal Packing

As noted above TRMase crystals have a very high packing density. Each molecule makes contact with ten symmetry-related molecules. Table 11 contains a list of all direct protein-protein intermolecular hydrogen bonds defined as the contacts between potential donors and acceptors with distances less than 3.4 Å. The symmetry operations between interacting molecules involve the screw rotations around crystallographic axes a and c and pure translations along axis c. Close crystal contacts could, in principle, influence the conformation of the surface regions of the protein molecule, especially the loops protruding into the solvent. Such regions could be detected upon comparison of the structures determined for different crystal forms.
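The 3.4 Å distance criterion for intermolecular hydrogen bonds can be illustrated with a small sketch. Everything specific in it is an assumption made for the example: the atom labels and coordinates are invented, the cell edge of 40 Å is hypothetical, the cell is taken to be orthogonal so that the operation x,y,1+z is a simple Cartesian shift along c, and any N or O atom is treated as a potential donor or acceptor.

```python
import math

CUTOFF = 3.4  # Å, the donor-acceptor distance limit quoted in the text


def hydrogen_bond_candidates(mol_a, mol_b, cutoff=CUTOFF):
    """Pairs of polar atoms (one from each molecule) closer than cutoff Å.

    Each molecule is a list of (label, element, (x, y, z)); only N and O atoms
    are treated as potential donors or acceptors.
    """
    polar_a = [a for a in mol_a if a[1] in ("N", "O")]
    polar_b = [b for b in mol_b if b[1] in ("N", "O")]
    hits = []
    for la, _, xa in polar_a:
        for lb, _, xb in polar_b:
            d = math.dist(xa, xb)
            if d < cutoff:
                hits.append((la, lb, round(d, 2)))
    return hits


def translate(mol, shift):
    """Copy of a molecule moved by a Cartesian shift (a pure lattice translation)."""
    return [(label, el, tuple(c + s for c, s in zip(xyz, shift)))
            for label, el, xyz in mol]


# Hypothetical coordinates and cell edge, for illustration only.
reference = [("Ser OG", "O", (1.0, 2.0, 38.0)),
             ("Gly CA", "C", (2.0, 2.0, 37.0)),
             ("Asp OD1", "O", (1.5, 2.5, 1.0))]
neighbour = translate(reference, (0.0, 0.0, 40.0))  # x, y, 1+z with c = 40 Å (assumed)
print(hydrogen_bond_candidates(reference, neighbour))
```

A screw-related neighbour would be generated in the same way, with the rotation applied to the coordinates before the shift.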

7. Conclusion

The X-ray crystal structure analysis of TRMase shows the similarity of the tertiary structures of TRMase and SBT. The arrangement of residues in the active site and the substrate binding region in TRMase is also very close to SBT, reflecting their similar specificity. The observed difference in stability of these proteins can be explained in terms of the three-dimensional structure as resulting from three possible sources. The Ca ions, two of which are found in the structure of TRMase described here, seem to play an important role in stabilization of the tertiary structure of TRMase. The other two sources of the enhanced stability of TRMase compared to SBT are the ionic and aromatic interactions. The refined structure of TRMase allows us to propose ways to design more stable variants of subtilisin by site-directed mutagenesis.

We thank Dr A. Popov for help in providing the X-ray experiment on the diffractometer KARD in Moscow and


Dr Z. Dauter for help in using the synchrotron beam-line in Hamburg. We thank Dr B. Strokopytov and Dr Ch. Betzel for help at various stages of the refinement. A.V.T. is grateful for a FEBS fellowship for the completion of the structure determination at the EMBL (Hamburg).

Edited by R. Huber
