Multiscale implementation of infinite-swap replica exchange molecular dynamics.

Multiscale implementation of infinite-swap replica exchange molecular dynamics Tang-Qing Yua, Jianfeng Lub,c,d, Cameron F. Abramse, and Eric Vanden-Eijndena,1 a

Courant Institute of Mathematical Sciences, New York University, New York, NY 10012; bDepartment of Mathematics, Duke University, Durham, NC 27708; Department of Physics, Duke University, Durham, NC 27708; dDepartment of Chemistry, Duke University, Durham, NC 27708; and eDepartment of Chemical and Biological Engineering, Drexel University, Philadelphia, PA 19104 c

Edited by Michael L. Klein, Temple University, Philadelphia, PA, and approved August 9, 2016 (received for review March 28, 2016)

Replica exchange molecular dynamics (REMD) is a popular method to accelerate conformational sampling of complex molecular systems. The idea is to run several replicas of the system in parallel at different temperatures that are swapped periodically. These swaps are typically attempted every few MD steps and accepted or rejected according to a Metropolis–Hastings criterion. This guarantees that the joint distribution of the composite system of replicas is the normalized sum of the symmetrized product of the canonical distributions of these replicas at the different temperatures. Here we propose a different implementation of REMD in which (i) the swaps obey a continuous-time Markov jump process implemented via Gillespie’s stochastic simulation algorithm (SSA), which also samples exactly the aforementioned joint distribution and has the advantage of being rejection free, and (ii) this REMD-SSA is combined with the heterogeneous multiscale method to accelerate the rate of the swaps and reach the so-called infinite-swap limit that is known to optimize sampling efficiency. The method is easy to implement and can be trivially parallelized. Here we illustrate its accuracy and efficiency on the examples of alanine dipeptide in vacuum and C-terminal β-hairpin of protein G in explicit solvent. In this latter example, our results indicate that the landscape of the protein is a triple funnel with two folded structures and one misfolded structure that are stabilized by H-bonds. importance sampling

| REMD | SSA | HMM | protein-folding

A

ll-atom molecular dynamics (MD) simulations are increasingly used as a tool to study the structure and function of large biomolecules and other complex molecular systems from first principles. By allowing us to scrutinize these systems at spatiotemporal resolutions that are typically beyond experimental reach, MD simulations are changing the way we understand chemical and biological processes, and their impact in areas such as chemical engineering, synthetic biology, drug design, and so on is bound to increase in the coming years (1–3). However, exploiting the full potential of MD simulations remains a major challenge, primarily due to approximations in all-atom force fields and the limitation of accessible timescales available to standard MD. Force fields continue to be improved (4–7), and hardware developments involving special-purpose high-performance computers (8), high-performance graphics processing units (9), or massively parallel simulations (10) also permit us to enlarge the scope of MD simulations. However, it has been argued (11) that algorithm developments will be the key to access biologically relevant timescales with MD. In particular, there is a great need for methods capable of accelerating the exploration and sampling of the conformational space of complex molecular systems. Among the various techniques that have been introduced to this end in recent years, replica exchange MD (REMD) (12) remains one of the most popular. The reasons for this success are simple: REMD does not require much prior information on the system (in particular, it does not require one to prescribe collective variables beforehand), it is fairly versatile, and it is easy to parallelize. The basic idea behind REMD is to introduce several additional copies, or replicas, of the system that have temperatures higher than the one(s) of interest and are therefore expected to explore its conformational space faster—control parameters other than the temperature can also be used, as long as adjusting their

11744–11749 | PNAS | October 18, 2016 | vol. 113 | no. 42

values permits accelerated sampling. These replicas are then evolved in parallel using standard MD, and their temperatures are periodically swapped. If the swaps are performed in a thermodynamically consistent way, the correct canonical averages at any of these temperatures can be sampled from the pieces of the replica trajectories during which they feel this temperature. In the standard implementation of REMD, this is done by attempting swaps at a predefined frequency and accepting or rejecting them according to a Metropolis–Hastings criterion. This guarantees that the joint distribution of the replicas is the symmetrized product of their canonical distributions at the different temperatures. The frequency of swap attempts is an input parameter in the method, and recent simulations argue that higher frequencies generally provide better sampling (13–17). This was justified rigorously through large deviation theory (18). How to implement high-frequency swapattempt REMD is not obvious, however, because increasing the attempt frequency of the swap also increases the computational cost, sometimes with little gain because this also leads to more rejections. One way to get around this difficulty is to take the socalled infinite-swap limit analytically and simulate the resulting limiting equations. This solution was first advocated in ref. 19 and later revisited in ref. 20. Unfortunately, this solution cannot be implemented directly if the number of temperatures is large, because the limiting equation involves coefficients whose calculation requires computing sum over all possible permutations of the temperatures, and if there are N temperatures, there are of course N! such permutations. Further approximations are therefore required in practice (19, 20). In this paper, we propose a different solution to this problem that involves no approximation. It is based on the observation that REMD can be reformulated in such a way that the temperature swaps satisfy a continuous-time Markov jump process (MJP) rather than a discrete-time Markov chain. This MJP can Significance Efficiently sampling the conformational space of complex molecular systems is a difficult problem that frequently arises in the context of molecular dynamics (MD) simulations. Here we present a modification of replica exchange MD (REMD) that is as simple to implement and parallelize as standard REMD but is shown to perform better in terms of sampling efficiency. The method is used here to investigate the multifunnel structure of the C-terminal β-hairpin of protein G in explicit solvent, which to date has required much longer REMD simulations to produce converged predictions of conformational statistics. In particular, we identify a potentially important folded β structure previously undetected by other importance sampling methods. Author contributions: T.-Q.Y., J.L., and E.V.-E. designed research; T.-Q.Y., J.L., C.F.A., and E.V.-E. performed research; J.L. and E.V.-E. contributed new reagents/analytic tools; T.-Q.Y., C.F.A., and E.V.-E. analyzed data; and T.-Q.Y., J.L., C.F.A., and E.V.-E. wrote the paper. The authors declare no conflict of interest. This article is a PNAS Direct Submission. 1

To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. 1073/pnas.1605089113/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1605089113

ϱðXÞ =

1 X ρ ðx 1 Þ⋯ρσðNÞ ðx N Þ, N! σ σð1Þ

[1]

where the sum is taken over all of the permutation σ of the indices f1; ⋯; Ng [with σðiÞ denoting the index onto which i is mapped by the permutation σ]. The symmetrized density in Eq. 1 can be thought of as the marginal density on the positions X alone of the following joint distribution for X and the permutation σ: ϱðσ; XÞ =

1 ρ ðx 1 Þ⋯ρσðNÞ ðx N Þ. N! σð1Þ

N X

βσðiÞ Vðx i Þ + cst,

where XðsÞ is the solution to Eq. 4 for s ∈ ½t; t′ with σðtÞ = σ fixed. This reformulation of the process can be used to design straightforward variants of Gillespie’s SSA to simulate it—for this reason we will refer to it as REMD with SSA, or REMD-SSA for short. How to implement REMD-SSA with a finite ν is explained in Supporting Information, and how to reach the limit ν → ∞ is dealt with below. In both cases, these implementations are rejection-free and akin to kinetic Monte-Carlo schemes. For any swapping rate ν, the dynamics described in Eqs. 4 and 5 satisfies detailed balance with respect to Eq. 2 and therefore has Eq. 2 as its equilibrium distribution (see Supporting Information for details). As a result, the expectation of an observable A at any temperature 1=βj can be calculated as Z ÆAæj = AðxÞρj ðxÞdx = ≈

x_ j = m−1 pj ; pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p_ j = −β−1 βσðjÞ ∇V x j − γpj + 2γmβ−1 ηj ;

1

[5]

T

N X

0

Aðx i ðtÞÞ1j=σði;tÞ dt,

i=1

x_ j = m−1 pj , pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi p_ j = −Rj ðXÞ∇V x j − γpj + 2γmβ−1 ηj . Here Rj ðXÞ = β−1

X σ

βσ −1 ðjÞ ωX ðσÞ,

[8]

[9]

where σ −1 ðjÞ denotes the index mapped onto j by the permutation σ, is the averaged rescaling parameter of the force, with the average taken with respect to the equilibrium distribution of σ given X: e−βVðX;σÞ ϱðσ; XÞ ωX ðσÞdP −βVðX;σ′Þ = P . σ′ ϱðσ′; XÞ σ′ e

[10]

We note that Eq. 8 is exactly the infinite-swap REMD (ISREMD) formulated in ref. 20. The equilibrium distribution sampled by the limiting equations (Eq. 8) is the mixed distribution ϱðXÞ in Eq. 1. Therefore, the canonical average of A at βj can be estimated by Z AðxÞρj ðxÞdx ÆAæj =

[4]

where ν > 0 is a constant that controls the swapping rate (the higher ν, the more frequently the jumps occur), and the symmetric matrix aσ;σ′ = aσ′;σ indicates whether the permutation σ′ is accessible from σ, that is, aσ;σ′ = 1 if we allow the jump from σ to σ′ and aσ;σ′ = 0 otherwise—below we will Yu et al.

Z

[7]

i=1

Infinite-Swap REMD. It can be shown that the sampling efficiency of REMDSSA increases monotonically with ν (Supporting Information). This suggests one should operate the scheme in the infinite-swap limit ν → ∞, and so let us consider briefly this limit next—how to operate within it in practice will be described in the next section. As ν → ∞, the permutation σ evolves infinitely faster than X, meaning that σ is always at equilibrium conditional on the current value of XðtÞ, and XðtÞ only feels the average effect of σ. In other words, the dynamics of X is captured by the limiting equation

=

where ηj is white noise with mean zero and covariance Æηj ðtÞηTk ðt′Þæ = δj;k δðt − t′ÞId, and ii) the permutation σ is updated via the continuous-time MJP with rate qσ;σ′ ðXÞ with σ′ ≠ σ given by qσ;σ′ ðXÞ = νaσ;σ′ e−2 βðVðX;σ′Þ−VðX;σÞÞ ,

1 T

! Aðx i Þ1j=σðiÞ ϱðσ; XÞdX

where 1j=σðiÞ = 1 if j = σðiÞ and 1j=σðiÞ = 0 otherwise, and similarly for 1j=σði, tÞ: Here σði, tÞ denotes the index onto which i is mapped at time t by the timedependent permutation σðtÞ.

i=1

i) the replica positions evolve via standard MD (using, e.g., Langevin’s thermostat with friction coefficient γ) over the potential (Eq. 3),

Z X X N σ

[3]

where we used β ≡ β1 as reference temperature, and noting that ∇x j VðX; σÞ = β−1 βσðjÞ ∇Vðx j Þ, this amounts to imposing that:

t

σ′′

[2]

Performing temperature swaps is equivalent to evolving the permutation σ concurrently with the replica configurations X in a way that is consistent with Eq. 2. In standard REMD this is done by proposing a new permutation σ′ ≠ σ after a fixed time lag and accepting or rejecting it according to a Metropolis–Hastings criterion. However, it is easy to modify the method and make both X and σ continuous-time Markov processes in which the updates of σ occur at random times. Introducing the symmetrized (mixture) potential VðX; σÞ = −β−1 log ϱðσ; XÞ = β−1

These dynamics can also be reformulated as follows. Given that σðtÞ = σ at time t, the probability that it jumps out of σ for the first time in the infinitesimal time interval ½t′; t′ + dt with t′ > t and reaches σ′ ≠ σ is given by ! X Z t′ exp − qσ;σ′′ ðXðsÞÞds qσ;σ′ ðXðt′ÞÞdt, [6]

Z X N

Aðx i Þηi;j ðXÞϱðXÞdX

[11]

i=1

≈

1 T

Z

T 0

N X

Aðx i ðtÞÞηi;j ðXðtÞÞdt,

i=1

where ηi;j ðXÞ =

X σ

1j=σðiÞ ωX ðσÞ

CHEMISTRY

Methodology REMD-SSA. Let us propose a reformulation of REMD such that (i) the temperature swaps are replaced by swaps in factors multiplying the forces acting on the replicas and (ii) these swaps occur via a continuous-time MJP—the generalization to other control parameters is straightforward and is considered briefly in Using Parameters Other than Temperature. To this end, it is useful to take as a starting point the probability distribution that a REMD simulation is designed to sample. Suppose we use N replicas with positions denoted collectively as X = fx 1 ; ⋯; x N g and let β1 > β2 ⋯ > βN be the N inverse temperatures that are being swapped over these replicas. Denote also ρi ðxÞ = Zβ−1 e−βi V ðxÞ the canonical distribution at inverse temperature βi over the i R atomic potential VðxÞ (with Zβi = e−βi V ðxÞ dx). Then REMD samples the symmetrized equilibrium probability density (18, 19):

restrict ourselves to transposition moves in which two random indices are swapped and for which the total number of accessible states σ′ from σ is NðN − 1Þ=2.

[12]

is the probability that the ith replica is at the jth temperature conditional on the replica positions being at X.

PNAS | October 18, 2016 | vol. 113 | no. 42 | 11745

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

be simulated via Gillespie’s stochastic simulation algorithm (SSA) (21), which is rejection-free: Given the current assignments of the temperatures across the replicas, the method computes directly the time at which the next swap occurs and proceeds with the MD until that time—this is in contrast with standard REMD, in which the swaps are proposed (and rejected or accepted) at fixed time lags. This REMD-SSA can then be combined with multiscale simulations schemes such as the heterogeneous multiscale method (HMM) (22–24) to effectively compute at the infinite-swap limit. Here, we discuss the theoretical and computational aspects of this reformulation of REMD involving a multiscale SSA, termed REMDMSSA for short, and apply it to compute free-energy surfaces (FESs) of alanine dipeptide (AD) and a 16-residue β-hairpin from protein G. In both applications, we observe that REMD-MSSA is more efficient than standard REMD: In particular, the rapid convergence of the free energy calculation in the context of β-hairpin allows one to identify a β structure.

In ISREMD, the permutation σ is no longer a dynamical variable: We only need to evolve XðtÞ according to Eq. 8. Unfortunately, this involves calculating the factor Rj ðXÞ defined in Eq. 9, and the cost of these calculations increases very rapidly with the number of replicas N, because the number of permutations grows as a factorial of N. On top of this, to estimate averages we also need to compute ηi,j ðXÞ, which is costly too. One way to address this problem is to introduce additional approximations, for example, partial swapping (18, 19). Next, we propose an alternative strategy that permits us to reach the infinite-swap limit while working directly with the original dynamics defined in Eqs. 4 and 5, without introducing additional approximations. Implementation: REMD with Multiscale SSA. To simulate the process defined by Eqs. 4 and 5 at large ν, that is, in a regime where the evolution of slow variables X is effectively captured by the limiting equation (Eq. 8), we will use a variant of the HMM (22–24). HMM’s basic idea is to compute on-the-fly the coefficients in the limiting equation (Eq. 8) for the slow variables X, as they are needed to evolve these variables, via short bursts of simulation of the fast process σ—these simulations are performed in what is called the microsolver in HMM, whereas the routine used to evolve the slow variables is referred to as the macrosolver; the scheme is made complete by an estimator specifying how the data generated in the microsolver are used within the macrosolver (i.e., how the unknown coefficients in the limiting equation for X are being computed). In the present context, the HMM scheme we will use works as follows. Choose a time step Δt appropriate for the simulation via MD of the limiting equation (Eq. 8) and pick a frequency ν such that ν 1=Δt (for example ν = 102 =Δt − 103 =Δt). Start the simulation with X 0 = Xð0Þ and σ 0 = σð0Þ, and set t0 = 0. Then for k ≥ 0, use the following three-step iteration algorithm: i) Microsolver: Evolve σ k via SSA from tk to tk+1 dtk + Δt using the rate in Eq. 5, that is: Set σ k;0 = σ k , tk;0 = tk , and for l ≥ 1, do: (a) Compute the lag to the next reaction via τl = −

ln r , qσ k;l−1

[13]

where r is a random number uniformly picked in the interval ð0,1Þ and P qσ≠σ′ ðX k Þ; qσ = σ′≠σ

(b) pick σ k,l with probability

pσ k;l =

qσ k;l−1 ;σ k;l ðX k Þ . qσ k;l−1

[14]

(c) Set tk;l = tk;l−1 + τl and repeat till the first L such that tk;L > tk + Δt; then set σ k+1 = σ k;L and reset τL = tk + Δt − tk;L−1. ii) Estimator: Given the trajectory of σ, estimate ηi,j ðX k Þ and Rj ðX k Þ via

^ηi,j ðX k Þ = =

1 Δt

Z

tk +Δt

1j=σði;sÞ ds tk

L 1 X 1j=σ k;l ðiÞτl ; Δt l=1

[15]

and X βi Z tk +Δt 1j=σði, sÞ ds Δt tk i X = β−1 βi ^ηi;j ðX k Þ.

^ j ðX k Þ = β−1 R

[16]

i

iii) Macrosolver: Evolve X k to X k+1 using one time-step of size Δt in the ^ j ðX k Þ calcuMD integrator for Eq. 4 with Rj ðX k Þ replaced by the factor R lated in the estimator. Then repeat the three steps above for each time step k. We will refer to this scheme as REMD with multiscale SSA, or REMD-MSSA for short. It gives an accurate approximation of the solution to the limiting ^ j ðXÞ and equation (Eq. 8) if ν 1=Δt. Indeed, in this regime the sampled R ^ηi,j ðXÞ will be close to their averaged values in Eqs. 9 and 12. We also note that a different implementation of the microsolver, based on a Metropolis– Hastings algorithm can also be used; this implementation is discussed in

11746 | www.pnas.org/cgi/doi/10.1073/pnas.1605089113

Supporting Information. It is closer to what is done in standard REMD, except that many swaps are attempted at each MD step instead of one swap every few MD steps (i.e., this implementation still permits one to effectively compute in the infinite-swap limit). Like standard REMD, REMD-MSSA is easily parallelizable. For example, the MD simulations of each replica can be performed on different nodes with an additional node used to evolve σ. At each MD step, the energies Vðx j Þ of the N replicas need to be communicated to the node evolving σ, and this node ^ j ðX k Þ. The calculation of sends back to those evolving X k the N values of R averages can be handled separately and requires the N values of Aðx j Þ and the NðN − 1Þ values of ^ηi,j ðX k Þ. A version REMD-MSSA has been implemented into Desmond (v3.4.0.2) (25) and used in the numerical examples presented next. To benchmark the method on a simple example, we also tested REMD-MSSA on a 1D asymmetric double-well potential. These results are reported in Figs. S1–S3 and show that REMD-MSAA combined with HMM permits one to reach the infinite-swap limit and is more efficient than standard REMD.

Numerical Results AD in Vacuum. As a first test of REMD-MSSA, we studied blocked AD (ACE-ALA-CBX) in vacuum under the CHARMM force field with CMAP corrections (26, 27). The free energy profile along the two dihedral angles ϕ and ψ for this system has been mapped (e.g., in ref. 28). We performed tests of REMD-MSSA using either 4 or 12 replicas, using the values of ν specified in simulation details, and running the simulations for 50 ns. We compared these results with those from an ISREMD simulation of 50 ns with four replicas (in which there are only 24 permutations) using the analytic weights in ref. 10—such a calculation with ISREMD is not feasible with 12 replicas because the number of permutations ( ≈ 4.8 × 108) becomes too large to manage. Finally, we compared the results with those from a standard REMD with 12 replicas and swapping rate of 0.5 ps−1. The reconstructed free energy profiles at 300 K are shown in Fig. 1A. The profiles calculated using REMD-MSSA with 4 and 12 replicas (Fig. 1A, Left, Upper and Lower) and ISREMD with four replicas (Fig. 1A, Upper Right) are in close agreement. These profiles also match those obtained with standard REMD (Fig. 1A, Lower Right) and they are in close agreement with those estimated by other enhanced sampling methods under the same force field, including those calculated with 100 ns of metadynamics and 250-ns accelerated MD (28). In particular, we correctly identified the four known local minima [using the same terminology as in Mackerell et al. (27)]: C5 located at ð−154,160Þ, C7eq at ð−81,66Þ, C7ax at ð80, − 63Þ, and αR at ð−99,12Þ. To assess the sampling efficiency of our scheme, we first looked at the times the temperatures of each replicas take to make a round trip between their highest and lowest values; equivalently, this amounts to monitoring the values of σði, tÞ versus time for each replica index i. It has been argued (16, 29–31) that the shorter these round-trip times, the faster the scheme will converge. A similar conclusion was reached in ref. 17, using a quantity closely related to round-trip times: the lifetime, which gives the average time a temperature stays at a given value before being updated to another. In the context of ISREMD, the round-trip times (and the lifetime) should be zero because this method operates in the infinite-swap limit: This means that we can use this test to quantify how close REMD-MSSA is from this limit. In Fig. 1B we show a typical trajectory σði, tÞ from REMD-MSSA and one from standard REMD (with swap rate at 0.5 ps−1) for the case of 12 replicas (the full set of trajectories can be found in Fig. S4). It can be seen that the round-trip times are indeed much shorter in REMD-MSSA than in standard REMD. Similarly, the lifetime in REMD-SSA is roughly 10 as =10−17 s compared with 102 ps =10−10 s for standard REMD. Another more stringent measure of efficiency is the flatness of the temperature distributions at each replica, or equivalently that of σði, tÞ for each replica index i. Indeed, it is known (30, 31) that these distributions should be uniform at equilibrium [i.e., for each i σði, tÞ should spend the same proportion of time at every index value j]. Unlike the round-trip times, which can be short even if the replicas have not equilibrated yet, these distributions can only be flat if the replicas are themselves Yu et al.

C

Fig. 1. (A) Free energy surfaces along the two dihedral angles ϕ and ψ for AD in vacuum at 300 K. (Upper Left) Fifty-nanosecond REMD-MSSA simulation with four replicas. (Upper Right) Fifty-nanosecond ISREMD simulation with four replicas using the analytical weights calculated from Eq. 10. (Lower Left) Fifty-nanosecond REMD-MSSA simulation with 12 replicas. (Lower Right) Fifty-nanosecond standard REMD simulation with 12 replicas and a swapping rate of 0.5 ps−1. All plots have 60 × 60 bins and filtered by standard Gaussian kernel. Same level sets are used in the contour plots. (B) Trajectories σði, tÞ of one replica. (C) Distributions of σði, tÞ for the same replica after 0.8 ns of simulation. In B and C, the orange curve is from REMDMSSA and the blue one is from standard REMD, both with 12 replicas.

at equilibrium. A representative distribution from REMD-MSSA and one from standard REMD, both calculated via time averaging of σði, tÞ over 0.8 ns of simulations, are shown in Fig. 1C (the full set of distributions can also be found in Fig. S5). As can be seen, the distribution from REMD-MSSA is almost uniform and significantly flatter than that from standard REMD, indicating that the former has converged after 0.8 ns, whereas the latter has not. Folding of Protein G β-Hairpin in Explicit Solvent. As a second test we considered a β-hairpin peptide in explicit water, whose folding behavior resembles that of larger protein, and which has been investigated by many techniques, including standard REMD (32– 37). Specifically, we studied the C-terminal fragment of the Ig binding domain B1 of protein G [Protein Data Bank (PDB) ID code 2gb1]. This capped peptide sequence contains 16 residues, 1-Ace-GEWTYDDATKTFTVTE-NMe-16, with 256 atoms. This β-hairpin is known as a hard-to-fold protein. In our REMDMSSA simulations, the initial structures for the replicas are chosen from an unfolded ensemble that is generated from a high-temperature MD run. We then run REMD-MSSA and check how fast folded state emerges and reasonable free energy profiles in various order parameters are obtained. In Fig. S6, we show traces of two order parameters, namely the number of β-strand H-bonds NH (defined in Eq. S18) and the rmsd of the Cα s from the native structure (PDB ID code 2gb1), obtained by monitoring the evolution for a few representative replicas in the REMD-MSSA Yu et al.


CHEMISTRY

B

simulation. These traces indicate that our REMD-MSSA simulation experienced several folding/unfolding events within 50 ns. This is a significant speed-up compared with a bare MD simulation, in which folding event are expected to take place every 500 ns to 1 μs (38). We used the REMD-MSSA simulation data to generate FES along the two order parameters, β-strand H-bonds NH and backbone radius of gyration RG, from a 100-ns REMD-MSSA run (Fig. 2A and Fig. S7). This FES captures the main features present in FES obtained from previous longer REMD simulations (32, 34, 39, 40). Specifically, we observe basins corresponding to conformations that form no β-strand H-bonds, form one or two H-bonds as partially folded β-sheet, and fully β-sheet with more than three H-bonds. We also compare this FES with the one from 120-ns standard REMD starting from the same initial structures (Fig. 2B). It can be seen that the folded basins are populated in REMDMSSA whereas only a few samples are seen in standard REMD. The FES from REMD-MSSA is in better agreement with previous longer simulation studies, indicating that REMD-MSSA can give a more converged FES than standard REMD within a 100-ns run due to higher sampling efficiency. To test convergence, in Fig. 2 C and D we show the trajectories and the distributions of σði, tÞ for one representative replica, for both REMD-MSSA and standard REMD (more representative trajectories and distributions are shown in Figs. S8 and S9). The round-trip times and the lifetime observed in REMD-MSSA are again shorter than those in REMD with swap rate 1 ps−1, with values similar to those reported for AD, and the temperature distributions are also a significantly flatter in REMD-MSSA than in REMD. To measure how flat these distributions are across all of the replicas, P we calculated the relative entropy of each, that is, Sðpi Þ = j pi ðjÞlogpi ðjÞ=pref ðjÞ, where pi ðjÞ is the distribution σði, tÞ and pref ðjÞ = 1=60 is the target uniform distribution: If pi ðjÞ = pref ðjÞ, Sðpi Þ = 0, otherwise it is positive, and the higher the value of Sðpi Þ = 0, the less flat pi ðjÞ is. These relative entropies are shown after ordering from smallest to largest in Fig. 2E: They are significantly smaller for REMD-MSSA than for standard REMD, indicative of faster mixing by the former than by the latter. Remarkably, careful investigation of the FES in Fig. 2 leads us to find a folded state that is sampled in REMD-MSSA but not in standard REMD. For reference, the native state’s structure and H-bond registry are shown in Fig. 3A; the native state is characterized by the β-strand H-bond pairs (E14,T1), (T12,T3), and (D10,T5). We detect three metastable states whose representative structures are depicted in Fig. 3 as insets and labeled as βN , β2, and M. The respective backbone H-bond contact maps of these states are shown in Fig. 3B. From these H-bond registries, we recognize βN as the native state, whereas β2 represents a distinct, nonnative β-hairpin state that is not sampled in our 100 ns of standard REMD, and M a misfolded state. Fig. 3C shows the FES at 270 K in the space of RG and NH, with states βN , β2, and M indicated. In this space, the three states are not well-distinguished, indicating that the (RG, NH) space is likely not optimal for understanding the conformational distribution of the folded states. We show in Fig. 3D the FES at 270 K in the space of two new collective and dCMAP , respectively, built from all samples in variables, dCMAP 2 1 the REMD-MSSA at 270 K for which NH > 2. Here, dCMAP 1 measures the L1 distance from state β2 and dCMAP from state βN 2 (see Supporting Information for their definition). In this space, the three states are easily distinguishable. To the best of our knowledge, the β-sheet state we observe here (β2) has never been detected in any other simulations of this peptide. To verify that β2 is metastable, we launched a 220-ns standard MD simulation at 270 K from a β2 sample. The state remained stable for the duration of that simulation, as can be seen from the driftless traces of several collective variables that feature the conformational structure (Fig. 3E and Fig. S10). We also calculated the average potential energy difference between β2 and βN to be 1.4 ± 7.7 kcal/mol, which indicates they are energetically indistinguishable. Even though state β2 looks similar to βN , its H-bond registry is different, as evidenced by the O–H contact maps in Fig. 3B. Their ,dCMAP ) space. To access β2 from distinction is revealed in the (dCMAP 2 1


A

A

C

B

D

landscape for the folding thermodynamics of this β-hairpin that has at least two distinct and deep minima and a misfolded kinetic trap. Concluding Remarks We have proposed a multiscale variant of REMD in which the swaps are performed via SSA, which effectively permits one to reach the infinite-swap limit known to optimize sampling efficiency. This REMD-MSSA is as simple to implement and parallelize as standard REMD and can be used with any set of temperatures, including sophisticated choices like those advocated, for example, in ref. 29 that would most likely improve the efficiency even more. It can also be used with parameters other than temperature (Figs. S11 and S12). Here its usefulness and efficiency were benchmarked on test cases of various complexity. In particular, in the context of folding the G β-hairpin, our REMD-MSSA method was shown to outperform standard REMD and other enhanced sampling techniques, as evidenced by the fact that it allowed us to identify a folded β structure previously undetected by these other methods. These results indicate that REMD-MSSA can be an efficient and accurate alternative to existing enhanced sampling methods and should be useful in a broad variety of MD applications. Simulation Details

E

Fig. 2. Free energy surfaces of β-hairpin at 270 K along the two order parameters β-strand HB number and backbone radius of gyration RG (A) from 120 ns of standard REMD at swapping rate 1 ps−1 and (B) from 100 ns of REMDMSSA. (C) Trajectories of σði, tÞ for a given replica i. (D) Distributions of σði, tÞ for this replica, from a 36-ns-long run. (E) Relative entropy of the distributions of σði, tÞ (in an ascending order). In C–E the orange curve is for REMD-MSSA and the blue one for standard REMD.

βN , the amino-terminal strand must flip 180° so that its outside O and H can point inside to form a hydrogen bond with H and O on the carboxyl-terminal strand (C-strand). This implies that either state must unfold first to break all H-bonds to get a chance to fold into the other. We indeed observe that β2 is populated predominantly by transitions from the unfolded pool rather than from the βN pool. Another difference between the two folded states is the turn structure. βN has a large π-turn (41) composed of D10, D9, A8, T7, and K6, whereas β2 has a small γ-turn (42) only involving D9, A8, and T7. Previous studies proposed folding of this β-hairpin uses either a “zipper” mechanism (36, 43–46) where hydrogen bonds form sequentially from the turn, or a “hydrophobic collapse” in which the folded state arises from a collapsed globule (32, 33, 35, 39, 47, 48). It is not clear from our simulation results which of these is preferred, but the γ-turn of β2, being so tight, may be unlikely to form before a few H-bonds first lock in a registry between the two strands. We therefore suggest a more detailed mechanistic study based on string method (49, 50) or transition path sampling (51) in the future. In addition to two folded β-sheet states, we observed one extremely stable misfolded state, labeled M. Its contact map and conformation are shown in Fig. 3. This misfolded state is stabilized by three β-strand hydrogen bonds and two salt bridges between the K6 ammonium group and the carboxylates on D10 and D9. Neither folded state displays any such side-chain salt bridging, and we therefore suggest that state M serves as a kinetic trap that slows the acquisition of authentic β-sheet structure, either in the native or secondary conformation. The picture that arises in light of these results is of a funnel-like 11748 | www.pnas.org/cgi/doi/10.1073/pnas.1605089113

AD. We used REMD-MSSA runs with temperatures in geo-

metric progression between 300 K and 1,200 K; other, perhaps more efficient, choices of temperatures could be used as well (29), but this one proved sufficient to our purpose. We used a time step of 2 fs for the Langevin dynamics and chose γ = 5ps−1. The bond lengths between hydrogen and heavy atoms were kept constant via M-SHAKE (52). We choose ν to obtain about 104 jumps of σ on average between two consecutive MD steps. This led to using ν = 10000=Δt with 4 replicas and ν = 500=Δt with 12 replicas—ν must be larger with 4 rather than 12 replicas because the energy gaps between the replicas are larger in the former case, implying that the jump rate in Eq. 5 is smaller. In the

Fig. 3. (A) Native β-hairpin structure with H-bond registry indicated. (B) H–O contact map of for βN , β2, and M; in each map, the horizontal axis indicates the residue number of the amide H and the vertical axis the residue number of the carbonyl O. (C) FES at 270 K in the space of number of β-strand H-bonds NH and the backbone radius of gyration RG. (D) FES at 270 K in the space of two contact-map distances for conformations with more than two β-strand H-bonds. The insets in C and D are representative conformations of the native folded state, βN ; the folded state β2; and a misfolded state, M. The FESs in C and D are built from a 100-ns-long run of REMD-MSSA. (E) Traces of NH and RG from a 220-ns-long bare MD simulation, where the blue is for β2 and the green is for βN . Yu et al.

1. Giberti F, Salvalaglio M, Mazzotti M, Parrinello M (2015) Insight into the nucleation of urea crystals from the melt. Chem Eng Sci 121:51–59. 2. Dror RO, et al. (2013) Structural basis for modulation of a G-protein-coupled receptor by allosteric drugs. Nature 503(7475):295–299. 3. Yu T-Q, Lapelosa M, Vanden-Eijnden E, Abrams CF (2015) Full kinetics of CO entry, internal diffusion, and exit in myoglobin from transition-path theory simulations. J Am Chem Soc 137(8):3041–3050. 4. Freddolino PL, Harrison CB, Liu Y, Schulten K (2010) Challenges in protein folding simulations: Timescale, representation, and analysis. Nat Phys 6(10):751–758. 5. Lindorff-Larsen K, et al. (2010) Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78(8):1950–1958. 6. Robertson MJ, Tirado-Rives J, Jorgensen WL (2015) Improved peptide and protein torsional energetics with the OPLSAA force field. J Chem Theory Comput 11(7):3499–3509. 7. Lopes P, Guvench O, MacKerell J, Alexander D (2015) Current status of protein force fields for molecular dynamics simulations. Molecular Modeling of Proteins, Methods in Molecular Biology, ed Kukol A (Springer, New York), Vol 1215, pp 47–71. 8. Shaw DE, et al. (2010) Atomic-level characterization of the structural dynamics of proteins. Science 330(6002):341–346. 9. Stone JE, et al. (2007) Accelerating molecular modeling applications with graphics processors. J Comput Chem 28(16):2618–2640. 10. Voelz VA, Bowman GR, Beauchamp K, Pande VS (2010) Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1-39). J Am Chem Soc 132(5):1526–1528. 11. Holdren JP, et al. (2010) Designing a digital future: Federally funded research and development in networking and information technology (The President’s Council of Advisors on Science and Technology, Washington, DC). 12. Sugita Y, Okamoto Y (1999) Replica-exchange molecular dynamics method for protein folding. Chem Phys Lett 314(1–2):141–151. 13. Sindhikara D, Meng Y, Roitberg AE (2008) Exchange frequency in replica exchange molecular dynamics. J Chem Phys 128(2):024103. 14. Abraham MJ, Gready JE (2008) Ensuring mixing efficiency of replica-exchange molecular dynamics simulations. J Chem Theory Comput 4(7):1119–1128. 15. Rosta E, Hummer G (2009) Error and efficiency of replica exchange molecular dynamics simulations. J Chem Phys 131(16):165102. 16. Sindhikara DJ, Emerson DJ, Roitberg AE (2010) Exchange often and properly in replica exchange molecular dynamics. J Chem Theory Comput 6(9):2804–2808. 17. Zhang BW, et al. (2016) Simulating replica exchange: Markov state models, proposal schemes, and the infinite swapping limit. J Phys Chem B. 18. Dupuis P, Liu Y, Plattner N, Doll JD (2012) On the infinite swapping limit for parallel tempering. SIAM J Multiscale Model Simul 10:986–1022. 19. Plattner N, et al. (2011) An infinite swapping approach to the rare-event sampling problem. J Chem Phys 135(13):134111. 20. Lu J, Vanden-Eijnden E (2013) Infinite swapping replica exchange molecular dynamics leads to a simple simulation patch using mixture potentials. J Chem Phys 138(8):084105. 21. Gillespie DT (1977) Exact stochastic simulation of coupled chemical reactions. J Phys Chem 81(25):2340–2361. 22. Weinan E, Engquist WEB (2003) The heterogeneous multiscale methods. Commun Math Sci 1(1):87–132. 23. Vanden-Eijnden E (2003) Numerical techniques for multi-scale dynamical systems with stochastic effects. Commun Math Sci 1(2):385–391. 24. Engquist WEB, Li X, Ren W, Vanden-Eijnden E (2007) Heterogeneous multiscale methods: A review. Commun Comput Phys 2(3):367–450. 25. Bowers KJ, et al. (2006) Scalable algorithms for molecular dynamics simulations on commodity clusters. Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (Assoc for Computing Machinery, New York). 26. MacKerell AD, et al. (1998) All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102(18):3586–3616. 27. Mackerell AD, Jr, Feig M, Brooks CL, 3rd (2004) Extending the treatment of backbone energetics in protein force fields: limitations of gas-phase quantum mechanics in reproducing protein conformational distributions in molecular dynamics simulations. J Comput Chem 25(11):1400–1415. 28. Wang Y, Harrison CB, Schulten K, McCammon JA (2011) Implementation of accelerated molecular dynamics in NAMD. Comput Sci Discov 4(1):015002.

Yu et al.

ACKNOWLEDGMENTS. This work was supported by National Science Foundation Grant DMR-1207389 (to C.F.A., E.V.-E., and T.-Q.Y.), National Institutes of Health Grant GM100472, and National Science Foundation Grant DMS1454939 (to J.L.). T.-Q.Y. acknowledges the technical support from Shenglong Wang (New York University).

29. Katzgraber HG, Trebst S, Huse Da, Troyer M (2006) Feedback-optimized parallel tempering Monte Carlo. J Stat Mech Theory Exp 2006, 10.1088/1742-5468/2006/03/P03018. 30. Doll JD, Plattner N, Freeman DL, Liu Y, Dupuis P (2012) Rare-event sampling: Occupation-based performance measures for parallel tempering and infinite swapping Monte Carlo methods. J Chem Phys 137(20):204112. 31. Doll JD, Dupuis P (2015) On performance measures for infinite swapping Monte Carlo methods. J Chem Phys 142(2):024111. 32. Zhou R, Berne BJ, Germain R (2001) The free energy landscape for beta hairpin folding in explicit water. Proc Natl Acad Sci USA 98(26):14931–14936. 33. Nguyen PH, Stock G, Mittag E, Hu CK, Li MS (2005) Free energy landscape and folding mechanism of a beta-hairpin in explicit water: A replica exchange molecular dynamics study. Proteins 61(4):795–808. 34. Bussi G, Gervasio FL, Laio A, Parrinello M (2006) Free-energy landscape for beta hairpin folding from combined parallel tempering and metadynamics. J Am Chem Soc 128(41):13435–13441. 35. Best RB, Mittal J (2010) Balance between alpha and beta structures in ab initio protein folding. J Phys Chem B 114(26):8790–8798. 36. Best RB, Mittal J (2011) Free-energy landscape of the GB1 hairpin in all-atom explicit solvent simulations with different force fields: Similarities and differences. Proteins 79(4):1318–1328. 37. Hatch HW, Stillinger FH, Debenedetti PG (2014) Computational study of the stability of the miniprotein trp-cage, the GB1 β-hairpin, and the AK16 peptide, under negative pressure. J Phys Chem B 118(28):7761–7769. 38. Thukral L, Smith JC, Daidone I (2009) Common folding mechanism of a beta-hairpin peptide via non-native turn formation revealed by unbiased molecular dynamics simulations. J Am Chem Soc 131(50):18147–18152. 39. Bonomi M, Branduardi D, Gervasio FL, Parrinello M (2008) The unfolded ensemble and folding mechanism of the C-terminal GB1 beta-hairpin. J Am Chem Soc 130(42): 13938–13944. 40. Ardevol A, Tribello GA, Ceriotti M, Parrinello M (2015) Probing the unfolded configurations of a β-hairpin using sketch-map. J Chem Theory Comput 11(3):1086–1093. 41. Rajashankar KR, Ramakumar S (1996) Pi-turns in proteins and peptides: Classification, conformation, occurrence, hydration and sequence. Protein Sci 5(5):932–946. 42. Némethy G, Printz MP (1972) The γ turn, a possible folded conformation of the polypeptide chain. Comparison with the β turn. Macromolecules 5:755–758. 43. Muñoz V, Thompson PA, Hofrichter J, Eaton WA (1997) Folding dynamics and mechanism of beta-hairpin formation. Nature 390(6656):196–199. 44. Muñoz V, Henry ER, Hofrichter J, Eaton WA (1998) A statistical mechanical model for beta-hairpin kinetics. Proc Natl Acad Sci USA 95(11):5872–5879. 45. Du D, Zhu Y, Huang CY, Gai F (2004) Understanding the key factors that control the rate of beta-hairpin folding. Proc Natl Acad Sci USA 101(45):15915–15920. 46. Du D, Tucker MJ, Gai F (2006) Understanding the mechanism of beta-hairpin folding via phi-value analysis. Biochemistry 45(8):2668–2678. 47. Klimov DK, Thirumalai D (2000) Mechanisms and kinetics of beta-hairpin formation. Proc Natl Acad Sci USA 97(6):2544–2549. 48. Juraszek J, Bolhuis PG (2009) Effects of a mutation on the folding mechanism of a beta-hairpin. J Phys Chem B 113(50):16184–16196. 49. Weinan E, Ren W, Vanden-Eijnden E (2002) String method for the study of rare events. Phys Rev B 66:52301. 50. Maragliano L, Fischer A, Vanden-Eijnden E, Ciccotti G (2006) String method in collective variables: Minimum free energy paths and isocommittor surfaces. J Chem Phys 125(2):24106. 51. Bolhuis PG, Chandler D, Dellago C, Geissler PL (2002) Transition path sampling: Throwing ropes over rough mountain passes, in the dark. Annu Rev Phys Chem 53:291–318. 52. Lambrakos S, Boris J, Oran E, Chandrasekhar I, Nagumo M (1989) A modified shake algorithm for maintaining rigid bonds in molecular dynamics simulations of large molecules. J Comput Phys 85(2):473–486. 53. Jorgensen WL, Tirado-Rives J (1988) The OPLS [optimized potentials for liquid simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambin. J Am Chem Soc 110(6):1657–1666. 54. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML (1983) Comparison of simple potential functions for simulating liquid water. J Chem Phys 79:926–935. 55. Kofke DA (2002) On the acceptance probability of replica-exchange Monte Carlo trials. J Chem Phys 117(15):6911–6914.


CHEMISTRY

β-Hairpin. We solvated the protein with 1,549 water molecules, neutralized with three Na+ ions, resulting in a total system size of 4,906 atoms and box dimensions of 36.3 × 36.3 × 36.3 Å3. OPLSAA is the force field for protein (53) and TIP3P for water (54). Long-range electrostatics are handled via particel-mesh Ewald summation with a 64 × 64 × 64 mesh. The cutoff for short-range nonbonded interactions was set to 12 Å. The integration time step is 2 fs with the bonds between hydrogen and heavy atoms constrained via M-SHAKE (52). The system is equilibrated at 270 K and 1 atm via 1-ns NPT Langevin dynamics then a 50-ns NVT simulation. REMD-MSSA simulations have 60 replicas

with temperatures ranging from 270 K to 690 K in a geometric progression (55)—here, too, other choices of temperatures could be used (29). The initial structures for these replicas were taken from the 4-ns NVT simulation at 700 K. We used a friction coefficient of 5 ps−1 in the Langevin thermostat and choose ν = 3=Δt to get about 1,000 temperature swaps per MD step. In the standard REMD simulation with 60 replicas under the same temperature setting we use random-neighbor swapping at the rate of 1 ps−1 and the same MD parameters as above. All MD simulations were carried out with Desmond (v3.4.0.2) (25).


standard REMD simulation with 12 replicas under same temperature setting, we use the same MD parameters as above with a swapping rate 0.5 ps−1. All MD simulations were carried out with Desmond (v3.4.0.2) (25).

Ensuring Mixing Efficiency of Replica-Exchange Molecular Dynamics Simulations.

Constant pH Replica Exchange Molecular Dynamics in Explicit Solvent Using Discrete Protonation States: Implementation, Testing, and Validation.

Evaluation of enhanced sampling provided by accelerated molecular dynamics with Hamiltonian replica exchange methods.

Effect of Taiwan mutation (D7H) on structures of amyloid-β peptides: replica exchange molecular dynamics study.

Conformational Transition Map of an RNA GCAA Tetraloop Explored by Replica-Exchange Molecular Dynamics Simulation.

Replica exchange molecular dynamics optimization of tensor network states for quantum many-body systems.

Replica-exchange molecular dynamics simulation of basic fibroblast growth factor adsorption on hydroxyapatite.

Multidimensional umbrella sampling and replica-exchange molecular dynamics simulations for structure prediction of transmembrane helix dimers.

Folding thermodynamics of β-hairpins studied by replica-exchange molecular dynamics simulations.

Sampling of Protein Folding Transitions: Multicanonical Versus Replica Exchange Molecular Dynamics.

Efficient Determination of Relative Entropy Using Combined Temperature and Hamiltonian Replica-Exchange Molecular Dynamics.

Thermodynamic and kinetic stability of the Josephin Domain closed arrangement: evidences from replica exchange molecular dynamics.

Identifying ligand binding sites and poses using GPU-accelerated Hamiltonian replica exchange molecular dynamics.

Prion versus doppel protein misfolding: new insights from replica-exchange molecular dynamics simulations.

Continuous Constant pH Molecular Dynamics in Explicit Solvent with pH-Based Replica Exchange.

Replica exchange molecular dynamics simulations reveal the structural and molecular properties of levan-type fructo-oligosaccharides of various chain lengths.

Internal coordinate molecular dynamics: a foundation for multiscale dynamics.

Effect of pH on the Aggregation of α-syn12 Dimer in Explicit Water by Replica-Exchange Molecular Dynamics Simulation.

Self-assembly and bilayer-micelle transition of fatty acids studied by replica-exchange constant pH molecular dynamics.

Structural diversity and initial oligomerization of PrP106-126 studied by replica-exchange and conventional molecular dynamics simulations.

Replica-exchange molecular dynamics simulations of cellulose solvated in water and in the ionic liquid 1-butyl-3-methylimidazolium chloride.

Hamiltonian replica exchange combined with elastic network analysis to enhance global domain motions in atomistic molecular dynamics simulations.

Alzheimer's Aβ10-40 peptide binds and penetrates DMPC bilayer: an isobaric-isothermal replica exchange molecular dynamics study.

Synaptobrevin Transmembrane Domain Dimerization Studied by Multiscale Molecular Dynamics Simulations.