ANNUAL REVIEWS

Annu. Rev. Biophys. Biophys. Chem. 1991. 20:267-98
Copyright © 1991 by Annual Reviews Inc. All rights reserved

Annu. Rev. Biophys. Biophys. Chem. 1991.20:267-298. Downloaded from www.annualreviews.org by University of California - San Francisco UCSF on 09/12/14. For personal use only.

ELECTROSTATIC ENERGY AND MACROMOLECULAR FUNCTION

Arieh Warshel
Department of Chemistry, University of Southern California, Los Angeles, California 90089-1062

Johan Aqvist
Department of Molecular Biology, Uppsala Biomedical Centre, Box 590, S-75124 Uppsala, Sweden

KEY WORDS: electrostatic effects, dielectric constants of proteins

CONTENTS

PERSPECTIVES AND OVERVIEW
HOW TO CALCULATE ELECTROSTATIC ENERGIES IN PROTEINS AND SOLUTIONS
  Some Basic Relationships
  Macroscopic Approaches
  The Simplified Microscopic PDLD Model
  Scaled Microscopic Models Can Extend the Precision of Electrostatic Calculations
  Microscopic Approaches with All-Atom Solvent Models
WHAT IS THE DIELECTRIC CONSTANT IN PROTEINS?
  The Dielectric Constant of Proteins Depends on Its Definition
STUDIES OF ELECTROSTATIC EFFECTS IN MACROMOLECULES
  Charge-Charge Interactions Between Surface Groups
  Self-Energies or Solvation of Charges in Macromolecules
  Enzymatic Reactions
  Helices Stabilize Charges with Localized Amide Dipoles and not with Macrodipoles
  Other Electrostatic Factors
CONCLUDING REMARKS AND FUTURE DIRECTIONS


PERSPECTIVES AND OVERVIEW

Investigators have frequently pointed out the importance of electrostatic interactions for the structure and function of proteins (e.g. 38, 41, 44, 68, 81, 125). However, the notion that quantitative evaluation of electrostatic energies might provide the best way of actually correlating structure and function (115) is not altogether accepted. Nevertheless, recent progress in calculations of electrostatic energies has opened up new possibilities for exploring the functions of macromolecules.

In considering these advances, it is important to have clear perspectives about the relevance and relative importance of different electrostatic factors. For example, many studies have identified electrostatic energies in macromolecules with the interactions between charged groups (typically considering the effects of surface groups and ionic strength). However, these interactions are quite small and can be reproduced by many models, including incorrect ones. On the other hand, the interactions between charges and their surrounding dipoles (e.g. hydrogen bonds and other polar groups), which are called self-energies, have frequently been omitted from the considerations of electrostatic energies. These contributions are perhaps the most important factors for structure-function correlation in proteins. Calculations of self-energies that are of the order of 100 kcal/mol present a real challenge and a distinct test case for different models, partially because these quantities are quite sensitive to the exact orientation of the protein dipoles.

Some confusion still exists with regard to the relationship between the evaluation of electrostatic energies on the microscopic molecular level and the traditional macroscopic level (which considers the system as a continuum). In particular, it is not widely recognized that the solvation energies evaluated by recent computer-simulation methods are basically the electrostatic self-energies.
Significant problems are also associated with the proper definition of the dielectric constant (or constants) in macromolecules, where it is sometimes not realized that the assumption of a given value of the dielectric constant may actually amount to assuming rather than calculating certain properties of the system.

The present review considers the above issues in the broad context of the connection between microscopic and macroscopic pictures. The microscopic approach is advocated not only because it is based on very clear concepts of energetics, but because it allows one to avoid many of the traps that until very recently hindered meaningful progress of microscopically based studies.

The review is somewhat biased by the view that only quantitative evaluations of electrostatic energies are meaningful, but we nevertheless try to give some general back-of-the-envelope rules that might help those who do not feel comfortable with computer-simulation approaches.

HOW TO CALCULATE ELECTROSTATIC ENERGIES IN PROTEINS AND SOLUTIONS

Some Basic Relationships

To clarify the basic electrostatic concepts on an atomic level, let us consider a collection of atoms in a specified system (say a protein), in which each atom is characterized by a given residual charge (Q_i). The electrostatic energy of our system is given (in kcal/mol) by

$$U = 166 \sum_i Q_i \sum_{j \neq i} \frac{Q_j}{r_{ij}} = \frac{1}{2} \sum_i Q_i V_i, \tag{1}$$

where r_ij is the distance between atoms i and j in Å and thus V_i is the microscopic electrostatic potential on the ith atom. We can also define the microscopic electric field at any point of the system as

$$\boldsymbol{\xi}(\mathbf{r}_i) = -\nabla_i V = 332 \sum_j \frac{Q_j \mathbf{r}_{ij}}{r_{ij}^3}, \tag{2}$$

where ∇_i is the gradient operator with respect to the x, y, and z coordinates of the point r_i, r_ij = r_i − r_j, and V includes the potential from all atoms (unlike V_i of Equation 1). The microscopic field provides probably the most fundamental connection between the microscopic atomic picture and the traditional macroscopic electrostatic theory, which views matter as a dielectric continuum. That is, one can derive (see 125) the relationship

$$\mathbf{E} = \langle \boldsymbol{\xi} \rangle_\tau, \tag{3}$$

where E is the macroscopic electric field in a volume element, τ, of the system. Although we can use the microscopic ξ to evaluate the macroscopic E, we cannot go in the opposite direction and deduce a unique ξ from E. If our system is overall electroneutral, we can describe it as a collection of microscopic dipoles and obtain the corresponding macroscopic polarization P by

$$\mathbf{P} = \langle \boldsymbol{\mu} \rangle, \tag{4}$$

where μ is the sum of the microscopic dipole moments divided by the volume element τ. The quantities E and P are related to each other by

$$\mathbf{E}(\varepsilon - 1) = 4\pi \mathbf{P}, \tag{5}$$

where ε is the macroscopic dielectric constant, which is discussed in detail in a subsequent section. The above equation is, in fact, the most basic definition of the dielectric constant because it directly relates ε to the well-defined microscopic averages of Equations 3 and 4.

The free energy difference associated with moving the charge from vacuum (ε = 1) to a given homogeneous medium is usually called the self-energy (see the section on self-energies below) of the charge (in the given medium) and is given by the Born formula (16)

$$\Delta G = -332\, \frac{Q^2}{2a} \left( 1 - \frac{1}{\varepsilon} \right), \tag{6}$$

where a is the effective radius of the charge. Thus, this self-energy is what chemists refer to as the solvation energy in the particular medium characterized by ε.

Unfortunately, one cannot determine the macroscopic ε in an inhomogeneous medium (e.g. a protein) using any macroscopic concepts. Thus, to proceed we must either use a microscopic approach, in which no ε enters into the formulation, or assume the value of ε and then repeatedly examine the validity of our assumptions. In the next section, we take a closer look at these different strategies.

Macroscopic Approaches
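Before turning to the macroscopic formulations, the pairwise sum of Equation 1 from the preceding subsection can be made concrete with a short sketch. The function names, the three charges, and their coordinates are hypothetical and chosen only for illustration; the constant 332 (twice the 166 of Equation 1) gives energies in kcal/mol for charges in units of e and distances in Å.

```python
import math

COULOMB_KCAL = 332.0  # kcal*Å/(mol*e^2); Equation 1 uses 166 = 332/2

def potentials(charges, coords):
    """V_i = 332 * sum_{j != i} Q_j / r_ij (kcal/mol per unit charge)."""
    v = []
    for i, ri in enumerate(coords):
        vi = 0.0
        for j, rj in enumerate(coords):
            if i != j:
                vi += COULOMB_KCAL * charges[j] / math.dist(ri, rj)
        v.append(vi)
    return v

def electrostatic_energy(charges, coords):
    """U = 1/2 * sum_i Q_i V_i, the second form of Equation 1 (kcal/mol)."""
    v = potentials(charges, coords)
    return 0.5 * sum(q * vi for q, vi in zip(charges, v))

# Hypothetical three-charge system (charges in e, coordinates in Å).
charges = [1.0, -1.0, 0.5]
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
u_total = electrostatic_energy(charges, coords)
```

The factor 1/2 compensates for counting every pair twice, so `u_total` equals the sum of 332 Q_i Q_j / r_ij over the three distinct pairs.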

Anyone facing for the first time the task of calculating electrostatic energies in proteins might find it natural to follow the route laid out in the macroscopic formulation (e.g. 48). This decision would probably be influenced by the fact that all electrostatic textbooks present the detailed and rigorous macroscopic derivations needed, albeit sometimes without a clear warning that one must actually know ε in order to make any meaningful progress. Unfortunately, electrostatic calculations in proteins present problems that involve several chemical phases and several dielectric constants. Furthermore, in many cases one cannot deduce the magnitude of relevant electrostatic energies from available experiments (e.g. the electrostatic contribution to the catalytic effect of an enzyme), and the value of ε cannot be calibrated against experimental data. This problem is probably why the seemingly simple macroscopic treatment did not give a quantitatively correct description of electrostatic energies in proteins until it was mixed with microscopic elements (see below).

To elucidate some of the pitfalls in the macroscopic approach, let us consider a few of its versions. The simplest and the most widely used macroscopic formulation is the Coulomb law,

$$\Delta G = 332\, \frac{Q_1 Q_2}{r\, \varepsilon_{\mathrm{eff}}(r)}. \tag{7}$$

The ε in this equation represents the effect of all the surrounding materials (protein, water, lipid membranes, etc.) on the charges Q_1 and Q_2. In fact, ε_eff simply represents every factor not explicitly included in our model. Thus, ε_eff can be viewed as an effective dielectric constant associated with the particular interaction (thus, the r dependence) rather than with a particular medium, as in Equation 6. If we are dealing with the interaction between two charges in water (which can be considered a homogeneous medium), then ε = 80 is a reasonable approximation, but what about a more complex medium such as that inside a protein? We can, of course, make some assumption concerning the value of ε but must then remember that we are really making assumptions about the properties of the environment surrounding the charges, which may or may not turn out to be correct. Thus, if we use a value of ε_eff = 2 or ε_eff = r for the overall effect of the environment, as was often done in early calculations on proteins (e.g. 32, 61, 107), the resulting energetics are not likely to be entirely relevant for a protein in solution. For instance, an ion pair with 3-Å separation inside a protein would be stabilized by ΔG ≈ −37 kcal/mol for ε_eff = r, while the observed interaction for such a case is usually around −2 kcal/mol or less (125). In this case, one cannot argue that the effect of the surrounding water is missing because the ε in Equation 7 represents the effect of the entire environment. Thus, a choice of ε_eff = 2 amounts to assuming that the combined effect of protein and water is well represented by this value. Therefore, using a low value of ε in conjunction with the simple Coulomb law (Equation 7) for solvated proteins may lead to enormous errors, regardless of the rationale given for such a choice.

The Tanford-Kirkwood (TK) treatment provides a seemingly major improvement over the single dielectric model of Equation 7 (100).
This model, which was introduced before the availability of protein X-ray structures, views the protein as a medium of low dielectric constant (ε = ε_in) surrounded by water in which ε = ε_out (a large dielectric constant). This model appears reasonable as long as the relevant charged groups reside near the surface of the protein (125). However, the model becomes inadequate for charges buried in the interior of a protein. Warshel et al (125, 126) demonstrated this deficiency (Figure 1 of 126). To see this, let us consider the energy of bringing a charge of radius ā from vacuum to the center of a dielectric sphere (ε = ε_in, with radius b) surrounded by a dielectric continuum with ε = ε_out. The corresponding energy is given by

$$\Delta G = -166\, Q^2 \left[ \frac{1}{\bar{a}} \left( 1 - \frac{1}{\varepsilon_{\mathrm{in}}} \right) + \frac{1}{b} \left( \frac{1}{\varepsilon_{\mathrm{in}}} - \frac{1}{\varepsilon_{\mathrm{out}}} \right) \right], \tag{8}$$


which when b → ∞ reduces to the Born formula (Equation 6). Now, let's say that the inner sphere (protein) is nonpolar with ε_in = 2 and a radius of 20 Å and contains a singly charged ionized group with radius ā = 2.0 Å positioned at its center. With a value of ε_out = 80 for the surrounding water, Equation 8 thus gives a self-energy of −46 kcal/mol for the charge. The corresponding self-energy (or solvation energy) in water is given by the Born formula (Equation 6) and amounts to −82 kcal/mol. Interestingly, one also gets the same general result for ion pairs (126), i.e. that the self-energy is considerably higher inside the inner sphere than outside. Hence, buried ionized groups become unstable according to such a model and, to remain ionized, they would be drawn out into the solvent. Likewise, if we generate an ion or an ion pair inside a droplet of benzene (which resembles a nonpolar protein), the droplet explodes when surrounded by water to let the ions escape.

Despite this fact, one nevertheless frequently finds ionized groups inside proteins. The reason for this is not that ions are stable inside nonpolar proteins [as suggested elsewhere (62)] but that protein sites accommodating ions with small radii are always polar (112, 125, 126). In other words, ionized groups in the interior of proteins are always surrounded by local protein dipoles and bound water molecules that provide the favorable interactions required for the groups to be stable (e.g. hydrogen bonds). This explanation, however, does not justify models with a low value of ε_in because the value of this dielectric constant represents the entire environment, inside the radius b, around the charges and by definition includes the effect of polar protein groups. Thus, the above discussion concludes that, at least as far as internally buried charged groups are concerned, the TK model is fundamentally incorrect.
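The numbers quoted in this subsection can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are ours, Equation 7 is taken as the Coulomb law with an assumed effective dielectric constant, and the inputs are the values given in the text (ā = 2.0 Å, b = 20 Å, ε_in = 2, ε_out = 80, and a 3-Å ion pair with ε_eff = r).

```python
def born_energy(q, a, eps):
    """Born self-energy (Equation 6) in kcal/mol; q in e, a in Å."""
    return -166.0 * q**2 / a * (1.0 - 1.0 / eps)

def tk_center_energy(q, a, b, eps_in, eps_out):
    """Equation 8: charge of radius a at the center of a sphere of radius b."""
    return -166.0 * q**2 * ((1.0 / a) * (1.0 - 1.0 / eps_in)
                            + (1.0 / b) * (1.0 / eps_in - 1.0 / eps_out))

def coulomb_energy(q1, q2, r, eps_eff):
    """Equation 7 with an assumed effective dielectric constant."""
    return 332.0 * q1 * q2 / (r * eps_eff)

buried = tk_center_energy(1.0, 2.0, 20.0, 2.0, 80.0)     # about -46 kcal/mol
in_water = born_energy(1.0, 2.0, 80.0)                   # about -82 kcal/mol
ion_pair = coulomb_energy(1.0, -1.0, 3.0, eps_eff=3.0)   # eps_eff = r: about -37 kcal/mol
```

The difference between `buried` and `in_water`, roughly 36 kcal/mol, is the penalty that a nonpolar low-dielectric interior would impose on a buried ion in this model.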
Now, after having recognized that a low value of ε_in yields incorrect values of the energy of charges in the interior of proteins, one may include the permanent local dipoles of the protein as additional explicit charges (rather than as part of ε_in). This inclusion can lead to a more realistic description of the system in which ε_in reduces to represent the effect of the electronic polarizability of the protein atoms and the conformational reorganization in response to the given charges. However, this type of revision means the model cannot really be considered a TK model anymore; e.g. in the same way, one could treat water (for which ε ≈ 80) as a "low-dielectric medium" in which ε_water = 4, include the molecular dipoles explicitly, and still call the model "macroscopic." Furthermore, the local polarity effect does not appear to have been widely recognized by practitioners of the TK model or related approaches (68, 69), and it actually took microscopic calculations (e.g. 112) to identify this key effect and to estimate the corresponding energetics.

One of the drawbacks of the original TK model is also the representation of the protein as an idealized sphere. In particular, this model prevents one from obtaining any quantitative assessment of the energetics of a charge in a realistic protein environment. That is (see Figure 1 of 126), the energy is a very steep function of the distance between the charge and the protein surface, and, in order to account for the actual shape of the protein, one must employ some type of numerical treatment. This point was realized by Warwicker & Watson (130), who used a "discretized continuum" (DC) approach, dividing the surrounding solvent and the protein itself into a three-dimensional grid with a local value of ε (i.e. either ε_in or ε_out) associated with each grid point [a related method was proposed by Orttung (77)]. With such a grid one can use standard finite difference procedures to numerically solve the corresponding Poisson equation (48)

$$\nabla^2 \phi(\mathbf{r}) = -\frac{\rho_f(\mathbf{r})}{\varepsilon(\mathbf{r})}, \tag{9}$$

where φ(r) is the electrostatic potential and ρ_f(r) the free charge density. It is also possible to include the average effect of an electrolyte solution, rather than pure water, by the modified version of Equation 9 known as the Poisson-Boltzmann equation. Interestingly, the DC approaches appeared at a rather late stage of protein electrostatic calculations, which might reflect the conceptual constraints imposed by the macroscopic literature, which is based on ideas formulated long before the emergence of digital computers.

The above DC method and its different variants are now widely used, partially because of the efforts of Honig and coworkers (e.g. 35, 36). However, many recently reported studies using this approach still neglect the role of local protein polarity, assigning low values of the interior dielectric constant without representing any dipoles (e.g. hydrogen bonds) explicitly. In such cases, the interaction between surface groups is reasonably evaluated but the solvation energies of internal groups (i.e. self-energies) are incorrectly modeled. The problems associated with self-energies or solvation energies become much simpler and clearer when dealt with on a microscopic level (see below). However, when the local polarity of the protein is incorporated explicitly into the DC treatment, this model becomes rather similar to microscopic or scaled microscopic approaches, such as the PDLD model described below. Thus, different methods will inevitably converge as the treatment becomes more complete and approaches a realistic physical picture.

The Simplified Microscopic PDLD Model
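As a brief aside on the discretized continuum idea behind Equation 9, the sketch below solves a one-dimensional toy version by finite differences. It is only an illustration under stated assumptions: the equation is taken in the form φ'' = −ρ/ε with a local ε at each grid point, the potential is fixed at both ends, and the function name, grid, and values are hypothetical. Real DC calculations are three-dimensional and treat the dielectric boundary with more care.

```python
def solve_poisson_1d(rho, eps, h, phi_left=0.0, phi_right=0.0):
    """Solve phi'' = -rho/eps at interior grid points with fixed end values.

    rho, eps: lists of values at the interior grid points; h: grid spacing.
    Uses the three-point stencil and the Thomas (tridiagonal) algorithm.
    """
    n = len(rho)
    # Stencil: phi[i-1] - 2*phi[i] + phi[i+1] = -h^2 * rho[i] / eps[i]
    rhs = [-h * h * rho[i] / eps[i] for i in range(n)]
    rhs[0] -= phi_left
    rhs[-1] -= phi_right
    diag = [-2.0] * n
    # Forward elimination (all off-diagonal entries equal 1).
    for i in range(1, n):
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    phi = [0.0] * n
    phi[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = (rhs[i] - phi[i + 1]) / diag[i]
    return phi

# Uniform test case on [0, 1]: rho = 1, eps = 2, nine interior points.
phi_uniform = solve_poisson_1d([1.0] * 9, [2.0] * 9, h=0.1)

# Two-dielectric toy case: low eps in the middle, high eps outside.
eps_grid = [80.0, 80.0, 80.0, 2.0, 2.0, 2.0, 80.0, 80.0, 80.0]
rho_grid = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
phi_two_region = solve_poisson_1d(rho_grid, eps_grid, h=0.1)
```

For the uniform case the exact solution is φ(x) = x(1 − x)/4, which the three-point stencil reproduces at the grid points.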

The first microscopic approach to evaluating electrostatic energies in proteins emerged in the mid 1970s (111, 123) and was prompted by the inability of macroscopic models, with arbitrary dielectric constants, to provide answers to some of the key questions in biochemistry (such as the magnitude of the electrostatic contributions to enzyme catalysis). One of the reasons for choosing a microscopic approach was that the issue of dielectric constants could be avoided altogether. Also, many traps associated with the implementation of a macroscopic model could be eliminated with a microscopic approach that only requires one to properly count all the relevant energy contributions. Of course, such counting necessitates explicit representation of all the atoms of the (protein + solvent) system and the determination of the average orientation of dipoles, which was impractical with the computer power of that time.

The simplified microscopic alternative developed by Warshel and coworkers (92, 123, 125), which we briefly describe below, is referred to as the protein dipoles Langevin dipoles (PDLD) model. This model replaces the average energy of the system with the electrostatic energy evaluated from its average structure. The average protein structure is taken as the X-ray structure (which corresponds to a time-average, although in the crystalline state), sometimes reminimized for different charge configurations. The average solvent polarization, on the other hand, is evaluated using a three-dimensional grid model. In this simplified treatment of the solvent, the water molecules are represented by point dipoles on a grid constructed around the protein (and also allowed to penetrate cavities if such exist). Each dipole is polarized towards the local field resulting from the protein atoms as well as all other solvent dipoles. The polarization of a given solvent dipole in its local field is approximated by a Langevin-type function:

$$\boldsymbol{\mu}_i^L = \mathbf{e}_i\, \mu_0 \left( \coth x_i - \frac{1}{x_i} \right), \qquad x_i = \frac{C' \mu_0 |\boldsymbol{\xi}_i|}{k_B T}, \qquad \boldsymbol{\xi}_i = \boldsymbol{\xi}_i^0 + \boldsymbol{\xi}_i^L, \tag{10}$$

where e_i is a unit vector in the direction of ξ_i, the local field on the ith dipole. The term μ_0 is taken as 1.8 Debye, C' is a parameter, ξ_i^0 is the field from the permanent charge distribution on the ith dipole, and ξ_i^L is the field from its nearest neighbors. The equation for the effective Langevin dipoles, μ_i^L, is solved iteratively as described elsewhere (92). [The unit vector e_i, used by Warshel & Levitt (123), was replaced in some PDLD versions (e.g. 92) with e_i^0, giving faster convergence and results very similar to those obtained with e_i. This useful treatment, which was recently criticized (38), is a reasonable approximation for models that represent permanent dipoles by their projections (see Figure 5 of 125).]

The key idea here is that the distribution function for the average polarization of the solvent follows some given polarization law and that a model that reproduces this law also would reproduce the electrostatic interaction between the solute and the solvent. This idea has been confirmed recently by detailed all-atom simulations (57, 125). The relevant polarization law can be obtained by calibrating the parameter C' and the solute-solvent van der Waals distances for different atom types so that the solvation energies of various ionic species with different charges and radii are reproduced. This rather simple solvent model appears to give quite reliable solvation energies, which probably reflects the fact that the physics of solvation effects can be described by many alternative dipolar models (114). The key for obtaining reliable energetics lies in consistent calibration using observed solvation energies and radial distribution functions and not in the source of the solvent charges or nonbonded potential (9). Thus, deriving solute-solvent interaction potentials from gas-phase quantum mechanical calculations leads at present to unreliable solvation energies unless the parameters are recalibrated using experimental information (53) (in which case the quantum mechanical step becomes unnecessary).

The points raised above might sound strange because we are usually taught that hydrogen bonding properties of different solvents are very special, but the fact that different polar solvents give very similar solvation energies validates these ideas. For instance, the free energy of solvation for a Na+ ion in water and in ammonia is −98 and −96 kcal/mol, respectively. Apparently, the physics of electrostatic interactions involves compensating effects so that the overall solvation energy can turn out the same even if its individual components are quite different.
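The Langevin-type polarization law of Equation 10 can be illustrated for a single solvent dipole in a prescribed local field. The sketch below is a minimal one under stated assumptions: it uses the standard Langevin function coth(x) − 1/x, omits the neighbor fields and the iterative solution described in the text, and sets the calibrated parameter C' to a placeholder value of 1.0. The function names and unit choices (field in kcal/(mol·e·Å), dipoles in e·Å) are ours.

```python
import math

DEBYE_TO_E_ANGSTROM = 0.2082   # 1 Debye expressed in e*Å
KT = 0.596                     # kcal/mol near 300 K

def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with a series form near zero."""
    if abs(x) < 1e-4:
        return x / 3.0
    return 1.0 / math.tanh(x) - 1.0 / x

def langevin_dipole(field, mu0_debye=1.8, c_prime=1.0):
    """Average dipole (e*Å) along `field`, given in kcal/(mol*e*Å).

    c_prime = 1.0 is a placeholder, not a calibrated value.
    """
    mag = math.sqrt(sum(f * f for f in field))
    if mag == 0.0:
        return (0.0, 0.0, 0.0)
    mu0 = mu0_debye * DEBYE_TO_E_ANGSTROM
    x = c_prime * mu0 * mag / KT
    scale = mu0 * langevin(x) / mag
    return tuple(scale * f for f in field)

# Field of a unit charge at 3 Å and at 10 Å along x (332 * Q / r^2).
near = langevin_dipole((332.0 / 3.0**2, 0.0, 0.0))
far = langevin_dipole((332.0 / 10.0**2, 0.0, 0.0))
```

The dipole close to the charge is nearly saturated at μ_0, while the distant one responds more weakly, which is the qualitative behavior the polarization law is meant to capture.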
The compensation between individual solvation components noted above might be one reason why solvation effects can be reproduced even with rather simple continuum models using an adjustable cavity radius (88).

The PDLD model also explicitly includes the effect of the protein electronic polarization by assigning induced dipole moments (e.g. atomic polarizabilities) to all the protein atoms. These dipoles interact both with the permanent charge distribution of the system and with each other, and are therefore also treated by an iterative procedure. Researchers have implemented this type of approach in several subsequent polarizable models (14, 57, 64, 66, 90, 106, 114, 116). In the section on electrostatic effects, we review some applications of the PDLD method and examine its performance in more detail. The PDLD method has been implemented in the POLARIS program, which is part of the MOLARIS molecular modeling package (122).

The PDLD model has been criticized, sometimes because of misunderstandings [e.g. confusing the noniterative induced dipole scaling (123) with the protein dielectric constant (62)]. Criticism was also raised in a recent incorrect assertion that the solvent structure represented by the model is static. These critics failed to realize that the polarization law prescribed by Equation 10 corresponds to a thermal average of the orientations and positions of the solvent permanent and induced dipoles. Evidently, some have difficulty accepting the fact that the average effect of the microscopic polarization can be simulated by dipolar models (e.g. 62). Interestingly, many researchers who accept the continuum model as a satisfactory description overlook the fact that such models are drastic oversimplifications of the microscopic dipolar picture (125).

Scaled Microscopic Models Can Extend the Precision of Electrostatic Calculations

In the macroscopic picture, one considers the force (i.e. the electric field) associated with moving charges from one site to another in the same medium. The integral of this force is the difference between the energy of the system in the initial and final state. In a uniform polar medium, the electrostatic force is rather small because the vacuum field is divided by a large dielectric constant. Hence, the energy of moving charges is very small, usually on the order of 1 kcal/mol, and calculated energies may be very similar to observed ones (at least in terms of the absolute magnitude of the errors). The inherent precision is not so useful when several phases are involved (e.g. protein and solvent). In such cases, in addition to integrating the force, one must also calculate the large change in self-energy that accompanies the movement of the charges from one phase to another. Unfortunately, the error in this contribution may make our precise energy very inaccurate. On the other hand, the microscopic philosophy is based on evaluating absolute solvation energies associated with moving charges from vacuum to their site in the given medium. This approach involves large compensating contributions with opposite signs. Thus, obtaining precise results is quite hard, but here the precision corresponds to the accuracy of the method.

Figure 1 The thermodynamic cycle associated with the scaled microscopic or semimicroscopic electrostatic model described in the section on scaled microscopic models.

A method that increases the precision of the microscopic approach by scaling it to reflect the macroscopic philosophy has been introduced recently (124). In this scaled microscopic or semimicroscopic model, the thermodynamic cycle for calculating the energy of a number of charges in a solvated protein involves the following steps (see Figure 1). (a) We start with the relevant charges (as well as the protein itself) at infinite separation in water and obtain the corresponding hydration energies. These energies are simply calculated with each charge in a bath of Langevin dipoles. (b) The dielectric constant of the outer medium is changed from ε_out to ε'_in, which gives a free energy (ΔG_1) that reflects the corresponding change in the solvation energy of the charges and the protein (we use the notation ε'_in to distinguish it from the ε_in of the TK model; ε'_in denotes the remaining dielectric, i.e. that resulting from induced dipoles and dipolar reorganization when the protein polarity is explicitly treated). This free energy contribution is obtained by scaling the solvation energies of the previous step with the factor (1/ε'_in − 1/ε_out). (c) Thereafter, the charges are brought into their given protein sites, and the corresponding free energy change (ΔG_2) is given by the total charge-charge and charge-dipole interactions divided by ε'_in. (d) The medium surrounding the protein is changed from ε = ε'_in back to water (ε = ε_out). This energy term (ΔG_3) is obtained from the interaction between the protein with its charged groups and the surrounding Langevin dipoles by scaling it with a factor (1/ε'_in − 1/ε_out). Hence, the overall electrostatic free energy of the process is given by

$$\Delta G = \left( -\sum_i \Delta G_{\mathrm{sol}}^{i} + \Delta G_{\mathrm{Lgvn}} \right) \left( \frac{1}{\varepsilon'_{\mathrm{in}}} - \frac{1}{\varepsilon_{\mathrm{out}}} \right) + \frac{V_{QQ} + V_{Q\mu}}{\varepsilon'_{\mathrm{in}}}, \tag{11}$$

where ΔG_sol^i are the solvation energies of each charge in water, and ΔG_Lgvn is the difference in the protein-Langevin dipole interaction energy between the states with the relevant groups charged and uncharged. V_QQ + V_Qμ denotes the total interaction energy, calculated with ε = 1, between the charges themselves and with the protein dipoles.

The model yields very precise results for surface groups using ε'_in ≥ 6. In this case, the observed ΔG is small, as is the calculated one because we scale the results by 1/6. When one deals with charges in the interior of proteins, the accuracy of the model is less impressive. However, the procedure may be quite useful for obtaining upper and lower limits to the actual electrostatic energy by using, e.g. ε'_in = 2 and ε'_in = 20.
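The bookkeeping of steps (b) through (d) can be sketched as a short function. This is an illustration only: the sign convention (transfer of the charges from water to their protein sites, with a desolvation term of opposite sign to the hydration energies) is our reading of the cycle, and the function name and all numerical inputs are hypothetical rather than values from any calculation.

```python
def scaled_microscopic_dg(dg_sol, dg_lgvn, v_qq_plus_v_qmu, eps_in, eps_out=80.0):
    """Sum of the three scaled steps of the semimicroscopic cycle (kcal/mol).

    dg_sol: hydration energies of the individual charges in water.
    dg_lgvn: protein-Langevin dipole interaction difference (charged - uncharged).
    v_qq_plus_v_qmu: charge-charge plus charge-dipole energy computed with eps = 1.
    eps_in: the residual dielectric constant written eps'_in in the text.
    """
    scale = 1.0 / eps_in - 1.0 / eps_out
    dg1 = -sum(dg_sol) * scale            # step (b): water -> eps'_in medium
    dg2 = v_qq_plus_v_qmu / eps_in        # step (c): charges into protein sites
    dg3 = dg_lgvn * scale                 # step (d): surroundings back to water
    return dg1 + dg2 + dg3

# Hypothetical inputs for two charges, bracketing the result with two eps'_in values.
inputs = dict(dg_sol=[-80.0, -75.0], dg_lgvn=-60.0, v_qq_plus_v_qmu=-90.0)
dg_eps2 = scaled_microscopic_dg(eps_in=2.0, **inputs)
dg_eps20 = scaled_microscopic_dg(eps_in=20.0, **inputs)
```

With a small ε'_in the three terms are large and nearly cancel, whereas a large ε'_in scales every term down, which is why the two choices can serve as rough upper and lower bounds.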


Microscopic Approaches with All-Atom Solvent Models

Electrostatic energies in solutions and proteins can be evaluated, at least in principle, by treating all the atoms of the protein and the surrounding solvent explicitly in a direct simulation. Such a brute-force approach does, of course, require a considerable amount of computer time for convergence. The configurational space of the protein-solvent system is rather large, and exploring it was impractical before modern computers were available. Instead, many simplified studies of solvation (e.g. 52, 85) included only a small number of solvent molecules (e.g. four or fewer). However, this strategy did not turn out very well because it did not provide any reliable means for calculating $\Delta G_{sol}$ values. In fact, as argued in the previous section, a better strategy appears to be to include all the different contributing solvation factors (even if some are represented in a simplified way) rather than to treat only a few elements rigorously.

Studies of solvation energies using all-atom solvent models (which represent the totality of the solvent and give reasonable solvation free energies) started quite recently (83, 116) with the adaptation of the free energy perturbation (FEP) method (105) to solvation problems. In this approach, which is sometimes referred to as adiabatic charging, one gradually changes (or mutates) the charges of the solute from zero to their actual value, $Q_0$, by a mapping potential, $\varepsilon_m$ (116; Equation 12), where $\lambda$ is a mapping parameter that controls the charge corresponding to the effective potential $\varepsilon_m$. From a series of molecular dynamics (MD) or Monte Carlo (MC) simulations, corresponding to $n$ discrete charging steps, the overall change in free energy is obtained as

$$\Delta G(Q = 0 \to Q = Q_0) = \sum_{m=0}^{n-1} -RT \ln \left\{ \left\langle \exp\left[ -(\varepsilon_{m+1} - \varepsilon_m)/RT \right] \right\rangle_m \right\}. \qquad (13)$$
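The charging sum of Equation 13 can be illustrated on a toy model. The sketch below is not the protein simulation discussed in the text. It assumes a single harmonic "solvent" coordinate $x$, a linear mapping potential $\varepsilon_m = (1 - \lambda_m)\,\varepsilon(Q{=}0) + \lambda_m\,\varepsilon(Q{=}Q_0)$ (an assumed form), and exact Gaussian sampling in place of MD or MC. The function names, the force constant `k`, and the value of `RT` are illustrative choices. For this model the exact charging free energy is $-Q_0^2/(2k)$, which provides a check on the estimate.

```python
import math
import random

RT = 0.593  # kcal/mol near 298 K (assumed temperature)


def fep_charging(q0, k, n_steps, n_samples, seed=0):
    """Estimate dG(Q=0 -> Q=q0) with the stepwise sum of Equation 13.

    Toy model (an assumption, not the protein system of the text):
    eps_lambda(x) = 0.5*k*x**2 - lambda*q0*x for one solvent coordinate x.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(RT / k)      # thermal width of x on every mapping surface
    d_lam = 1.0 / n_steps
    total = 0.0
    for m in range(n_steps):
        lam = m * d_lam
        mean = lam * q0 / k        # position of the minimum of eps_m
        boltz_sum = 0.0
        for _ in range(n_samples):
            x = rng.gauss(mean, sigma)       # exact Boltzmann sampling on eps_m
            d_eps = -d_lam * q0 * x          # eps_{m+1}(x) - eps_m(x)
            boltz_sum += math.exp(-d_eps / RT)
        total += -RT * math.log(boltz_sum / n_samples)
    return total


def exact_charging(q0, k):
    """Analytic charging free energy of the toy model."""
    return -q0 ** 2 / (2.0 * k)


if __name__ == "__main__":
    estimate = fep_charging(q0=1.0, k=0.025, n_steps=20, n_samples=5000)
    print(f"FEP estimate: {estimate:.2f}  exact: {exact_charging(1.0, 0.025):.2f}")
```

In this toy case the estimate converges easily because neighboring mapping surfaces overlap well. In a real protein-solvent system the same sum is limited by sampling and by the treatment of long-range interactions, as discussed in the text.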

The solvation energy associated with moving from one environment to another is evaluated according to the thermodynamic cycle of Figure 2. This type of cycle was also used in earlier PDLD studies (117), and its importance is reemphasized in many recent FEP studies.

Figure 2 The thermodynamic cycle that describes the energetics of charged groups in solution.

Recently, interest in the FEP method has rapidly increased, and numerous studies of various aspects of solvation energies in solutions have been published (e.g. 9, 15, 40, 45, 54, 57, 67, 83, 95, 108, 116). The convergence of the calculations is satisfactory in many cases.

The use of all-atom models for calculation of solvation energies (i.e. self-energies) in proteins is far more difficult than in solution, chiefly because of the heterogeneous nature of the microenvironment surrounding charges in proteins. The first calculation of this type (which was severely limited by the available computer power) involved the nonphysical mutation Glu35-H + Sub → Glu35$^-$ + Sub-H$^+$ in the active site of lysozyme (118), where Sub denotes the sugar substrate of the system. The calculation examined the energetics of the generated ion pair and its dependence on the pKa difference between the substrate and the acid. A subsequent study evaluated the solvation free energy of ionized acids in bovine pancreatic trypsin inhibitor (BPTI) (129) and gave rather encouraging results. More recent calculations exploiting modern computer power show that one can indeed obtain reasonable and stable results. However, to date only a few attempts to calculate self-energies of charges in proteins have been reported (17, 25, 31, 47, 94, 129). At present, despite initial optimism (129), it appears quite difficult to reduce the FEP errors when evaluating absolute solvation energies in proteins to less than 4% or so (which could correspond to as much as 5 kcal/mol). Overall, the FEP results (e.g. Figure 3) do not appear substantially better than the PDLD results of Russell & Warshel (Figure 9 of 92). In principle, using infinite computer time, many solvent molecules, large cut-off radii, perfected parameters, etc, we will be able to obtain accurate FEP results, but this stage has not yet been reached. In particular, the limitation in computer power still necessitates the use of cut-off radii for many systems, which leads to major problems in treating long-range electrostatic interactions. The need to reduce the system to a finite size is frequently met by using periodic boundary conditions, but this approach is, in fact, not a very good approximation when dealing with the spherical symmetry of solvation problems (see 57). The problem can also be treated using spherical boundary models with surface constraints that force the system to behave as if it were part of an infinite bulk system (17, 113, 114, 116, 125). The cut-off dilemma can also be resolved by the use of nonperiodic Ewald summation (63). Nevertheless, the goal of reaching an accuracy of better than 1 kcal/mol in calculations of solvation free energies in proteins has not yet been reached. Thus, when studying new systems, one should compare estimates from the FEP, PDLD, scaled microscopic, and DC methods and take the deviations between the different models as indicators of the difficulties with modeling the system.

Figure 3 FEP evaluation of the different contributions to the energetics of the ionizable acids in BPTI. (Left) The dependence of the free energy on the charging parameter for Asp3 and Glu7. (Right) The different energy contributions to the total electrostatic energy for the four indicated acids (3, 7, 49, and 50). See References 92 and 128 for related PDLD and FEP studies. (Panel titles: Protein induced dipoles; Protein permanent dipoles; Water + Bulk; Total solvation free energy.)

WHAT IS THE DIELECTRIC CONSTANT IN PROTEINS?

With the background provided by the previous section, we could turn now to review actual calculations of electrostatic energies in proteins. However, because some readers might be more interested in simple back-of-the-envelope considerations than in detailed calculations, we start by searching for some general rules. Knowing such universal rules is, in fact, equivalent to having a universal dielectric constant (or constants) in proteins. As might be suspected from the previous section, we do not believe that such a universal constant can be defined in a general way and recommend microscopic calculations. Nevertheless, once we start to understand the origin and magnitude of a given electrostatic effect, we may benefit from classifying and interpolating our conclusions using some sort of dielectric constant. We therefore discuss below the concept of a dielectric constant of proteins, which has frequently caused confusion.

The Dielectric Constant of Proteins Depends on Its Definition

When searching for a universal dielectric constant for proteins, one is confronted with a major problem not mentioned in electrostatics textbooks: the value of the dielectric constant depends on the property used to define it (125). This issue is not merely semantic but is actually a fundamental aspect of inhomogeneous systems. We consider below several possible ways to define the dielectric constant in proteins, which are summarized in Table 1.

Table 1 Some rules for dielectric constants in proteins (a)

Definition | Finding
Polar = $\varepsilon$ large; nonpolar = $\varepsilon$ small | Protein sites are always polar near ions with small radii.
$\varepsilon(r) = 332\, Q_1 Q_2 / (r\, \Delta G)$ | $\varepsilon$ is large for charge-charge interactions.
$(1 - 1/\varepsilon_B) = -\bar{a}\, \Delta G / (166\, Q^2)$ | Proteins can provide as much solvation as water does for ionized groups with small radii.
$\varepsilon(r) = -332\, Q \mu \cos\theta / (r^2\, \Delta G)$ | For functionally important charge-dipole interactions, $\varepsilon$ can be as small as 4. Such a low value, however, requires relatively fixed dipoles with small reorganization energy.
$\varepsilon = (4\pi P + E)/E$ | $\varepsilon$ = 2-30 depending on site. This microscopic definition of $\varepsilon$ is not so useful for evaluating functional properties.

(a) $\varepsilon(r)$ and $\varepsilon_B$ correspond to effective values that should not be confused with [illegible].

OPERATIONAL DEFINITION IN TERMS OF LOCAL POLARITY When an organic chemist studies a chemical reaction, he or she usually divides solvents into two classes, polar and nonpolar (e.g. water and hydrocarbons, respectively). Polar solvents have large values of $\varepsilon$ and stabilize ionic configurations much more than nonpolar solvents, in which $\varepsilon \approx 2$. Using this type of relationship for defining $\varepsilon$, one finds that protein sites accommodating small charged groups (e.g. ionized acids) are always polar rather than hydrophobic. As discussed in the section on macroscopic approaches, the fact that ionized groups are sometimes found in the interior of proteins cannot be accounted for by TK-type models that treat the protein as a low-dielectric medium surrounded by water. Thus, by the polarity definition, the value of $\varepsilon$ (in the neighborhood of charges, at least) is large.
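The polarity argument can be made quantitative with Born's formula, written here in the kcal/mol and Å units used in Table 1: $\Delta G = -166\,(Q^2/\bar{a})(1 - 1/\varepsilon)$. The sketch below is only an illustration under assumed numbers: the 2 Å effective radius is a hypothetical choice and the function name is not taken from the text.

```python
def born_energy(charge, radius, eps):
    """Born solvation energy in kcal/mol.

    charge: charge in electron units
    radius: effective radius in angstroms
    eps:    dielectric constant of the surrounding medium
    """
    return -166.0 * charge ** 2 / radius * (1.0 - 1.0 / eps)


radius = 2.0  # hypothetical effective radius of a small ion, in angstroms
for label, eps in (("nonpolar", 2.0), ("polar", 80.0)):
    dg = born_energy(1.0, radius, eps)
    print(f"{label:9s} eps = {eps:5.1f}   dG = {dg:7.2f} kcal/mol")
```

For these assumed numbers the ion is about 40 kcal/mol less stable in the nonpolar medium than in the polar one, which is why a site that holds a small ionized group must be polar by this definition.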

THE MACROSCOPIC BULK DEFINITION The customary electrostatic approach considers the entire protein as a bulk mass of uniform dielectric constant using

$$\varepsilon = \frac{4\pi P + E}{E}. \qquad (14)$$

The bulk value of $\varepsilon$ is estimated by applying relatively weak electric fields to, say, a protein powder. In many physical measurements based on such procedures, $\varepsilon$ for the protein as a whole is quite low (39, 80), i.e. about 2 or so. This definition is, however, practically useless when dealing with specific charges in the protein interior and causes considerable confusion. Apparently, the average $\varepsilon$ of a protein does not tell us much about the microscopic stabilization of charges in proteins and predicts that (in contrast to experimental facts) no charges could be stable in the interior.

MICROSCOPIC DEFINITIONS By considering microscopic probes, one may evaluate $\varepsilon$ in proteins in a unique way. In this case, however, the value of $\varepsilon$ actually depends on what region of the protein is used to define it, thus indicating that $\varepsilon$ would be better represented as a spatially varying function. Below, we look at some of the most useful definitions. Let us start with the definition provided by Coulomb's law (Equation 7), which gives for the interaction between two ionized groups separated by a distance, $r$:

$$\varepsilon_{eff}(r) = \frac{332\, Q_1 Q_2}{r\, \Delta G(r)}. \qquad (15)$$

Upon examining the observed interactions between charges in a protein, one can obtain the corresponding $\varepsilon_{eff}(r)$. Apparently, one finds as a rather general rule (89, 125, 126) that the value of $\varepsilon_{eff} \approx 40$ [or the related $\varepsilon(r)$ used elsewhere (70, 126)] gives a good approximation for most ionizable surface groups in proteins and that $\varepsilon_{eff}$ is always larger than 10, even for internally buried groups (see 125). The large $\varepsilon$ results not only from the fact that the protein is surrounded by water but also from the compensation of the polar protein components for the instability associated with charge separation, either by dipole reorientation or by local unfolding with penetration of water molecules (see Figure 27 of 125).

By examining the effective interaction between a single charge and surrounding dipoles, e.g. N-H groups in ion binding sites, one may evaluate the effect of a single dipole (by mutagenesis experiments, for instance) from

$$\varepsilon_{eff}(r) = -\frac{332\, Q \mu \cos\theta}{r^2\, \Delta G(r)}, \qquad (16)$$

where $\mu$ is the given dipole moment in units of electron Ångstroms (e.g. $\approx 0.3$ for the N-H in an amide bond). Here, one finds that $\varepsilon_{eff} \approx 4$ (see Table 1), which reflects the fact that the forces on the protein are smaller than those in charge-charge interactions. On the other hand, for dipole-dipole interactions, which can be estimated from, e.g. the contribution of a single hydrogen bond to the stability of the protein as a whole (3), one will find even lower values of $\varepsilon_{eff}(r)$. Thus, the weaker and more short-ranged the interaction, the lower is the effective value of $\varepsilon$.

Another microscopic definition of $\varepsilon$ can be obtained using the observed self-energy of the given charge and Born's formula (Equation 6). This gives

$$\left(1 - \frac{1}{\varepsilon_B}\right) = -\frac{\bar{a}\, \Delta G}{166\, Q^2}, \qquad (17)$$

where $\varepsilon_B$ again represents an effective value associated with given interactions rather than with a medium. If we calculate self-energies using microscopic models or evaluate them using reliable thermodynamic cycles (125), we find that $\varepsilon_B$ is large for all ionized groups with small effective radii in proteins, thus reflecting a polar local surrounding. The only exceptions are large groups where $\varepsilon$ can be rather small (21). This exception is possible because the charges are delocalized over a large spatial region and the force on the protein is sufficiently small to avoid local unfolding.

If we use the semimicroscopic definition of Equation 8 and explicitly consider the protein dipoles, then we can reproduce observed self-energies with $4 \leq \varepsilon_{in}' \leq 20$. However, with this type of definition, water would also have $\varepsilon_{in}' \approx 4$ rather than 80. Some of the confusion concerning the value of $\varepsilon$ inside proteins seems to arise from the fact that practitioners of macroscopic approaches initially used $\varepsilon_p = \varepsilon_{in} \approx 2$ (for the protein, and frequently even for the combined effect of protein and water) and later included protein dipoles explicitly while implying that the value of $\varepsilon_p$ had not changed. In fact, the definition of $\varepsilon_p$ had changed from $\varepsilon_{in}$ to $\varepsilon_{in}'$, which, as should be evident from the discussion above, is not a matter of semantics but a matter of whether or not polarity is included in the model.

Finally, we can use microscopic simulations to determine the macroscopic $\varepsilon$ in a given region of the protein using Equation 18.
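The effective dielectric constants of Equations 15, 16, and 17 are rearrangements of Coulomb's law, the charge-dipole interaction, and Born's formula, so each can be evaluated directly from an observed $\Delta G$. The sketch below assumes energies in kcal/mol, distances in Å, charges in electron units, and dipoles in electron Å. The function names and all input numbers are hypothetical and were chosen only to show the arithmetic.

```python
def eps_charge_charge(q1, q2, r, dG):
    """Equation 15: effective dielectric from a charge-charge interaction."""
    return 332.0 * q1 * q2 / (r * dG)


def eps_charge_dipole(q, mu, cos_theta, r, dG):
    """Equation 16: effective dielectric from a charge-dipole interaction."""
    return -332.0 * q * mu * cos_theta / (r ** 2 * dG)


def eps_born(q, a, dG):
    """Equation 17: effective Born dielectric from an observed self-energy.

    Valid only while -a*dG/(166*q**2) stays below 1.
    """
    one_minus_inverse = -a * dG / (166.0 * q ** 2)
    return 1.0 / (1.0 - one_minus_inverse)


# Hypothetical observations (not data from the text):
print(eps_charge_charge(1.0, 1.0, r=10.0, dG=0.83))         # two surface charges
print(eps_charge_dipole(1.0, 0.3, 1.0, r=3.0, dG=-2.77))    # ion and one N-H dipole
print(eps_born(1.0, a=2.0, dG=-80.0))                       # small buried ion
```

With these made-up inputs the three definitions give values near 40, 4, and 28, respectively, which illustrates how the same protein can appear to have very different dielectric constants depending on the interaction used as the probe.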

The interesting quantity defined by Equation 18, which is the most unique (but not necessarily the most useful) definition of $\varepsilon$, gives different values at different protein sites and electric fields. Microscopic calculations of $\varepsilon$ are discussed below.

ATTEMPTS TO GET $\varepsilon$ FROM FIRST PRINCIPLES Although the computed value of $\varepsilon$ might not be so useful for energetic calculations, it is interesting to try to evaluate it for different sites in a protein. This problem should not be confused with the much simpler one of evaluating $\varepsilon_{eff}$ from e.g. DC calculations in which $\varepsilon_{in}$ and $\varepsilon_{out}$ are already assumed (e.g. 97). In trying to compute $\varepsilon$ from first principles, one actually seeks to determine $\varepsilon_{in}$ from the (microscopic) response of the system to the microscopic fields. To date, all the reported attempts to evaluate $\varepsilon$ (34, 72) suffer from oversimplifications that seem to underestimate $\varepsilon$ [Gilson & Honig (34) obtained $\varepsilon$ = 2-4, while Nakamura et al (72) obtained $\varepsilon$ = 1-20]. These works were based on normal mode analysis of a protein, or just part of it, in vacuum and did not include the effect of the solvent reaction field (RF) on the average dipole moment of the specified region in the protein. A seemingly small effect from the RF does, however, lead to enormous changes in $\varepsilon$. For instance, if one tries to evaluate $\varepsilon$ of a small volume element of water without including the RF, one obtains a value of $\varepsilon$ in the range of 4 rather than 80 (57) (see Table 2). The effect of the RF, which might not be so widely appreciated, is not associated with the fact that the water around a protein with small $\varepsilon_{in}$ gives a large $\varepsilon_{eff}$ for interacting charges but is connected with the fundamental value of the protein $\varepsilon_{in}$. One can also raise some objections against using normal mode analysis to determine the average polarization. If the normal mode treatment only deals with dihedral angles (72), then a large part of the polarization is missing (e.g. bending of R-O-H angles). Nevertheless, the above studies are quite interesting and represent an important step towards determining $\varepsilon$ from first principles.

Table 2 Microscopic evaluation of the dielectric constant in BPTI (a)

System | Size (Å) | RF | $\varepsilon$ (Equation 18) | $\varepsilon$ (Equation 19)
Water | 7 | + | 80 | 22
Water | 7 |   | 4 | 4
Water | 3 | + | 42 | 10
Asp3 | 7 | + | 24 | 16
Asp3 | 3 | + | 22 | 12
Glu7 | 7 | + | 14 | 15
Glu7 | 3 | + | 20 | 6
Interior | 3 | + | 4 | 2

(a) The calculations based on Equation 18 use a field of 0.004 electron Å$^{-2}$. RF indicates whether or not we include the reaction field in the calculations, while size designates the radius of the spherical region.

As indicated by the above discussion, any attempt to calculate $\varepsilon$ in proteins must be carefully examined by checking whether the method can reproduce a reasonable value of $\varepsilon$ for pure water. Moreover, proper calculations of $\varepsilon$ in polar solvents are far from trivial and have only emerged fairly recently (6, 7, 57, 75). In fact, the inclusion of the solvent reaction field in the framework of periodic boundary conditions is somewhat inconsistent (see 57).

Evaluation of $\varepsilon$ is fundamentally harder in proteins than in a homogeneous solvent like water. That is, in principle one should start from the basic microscopic definition of Equation 18. However, one may try to use the well known formula (59)

$$\frac{(\varepsilon - n^2)(2\varepsilon + n^2)}{\varepsilon (n^2 + 2)^2} = 4\pi \ldots$$
