On the analysis and comparison of conformer-specific essential dynamics upon ligand binding to a protein Marcos Grosso, Adrian Kalstein, Gustavo Parisi, Adrian E. Roitberg, and Sebastian Fernandez-Alberti Citation: The Journal of Chemical Physics 142, 245101 (2015); doi: 10.1063/1.4922925 View online: http://dx.doi.org/10.1063/1.4922925 View Table of Contents: http://scitation.aip.org/content/aip/journal/jcp/142/24?ver=pdfcov Published by the AIP Publishing Articles you may be interested in Dynamics of water around the complex structures formed between the KH domains of far upstream element binding protein and single-stranded DNA molecules J. Chem. Phys. 143, 045106 (2015); 10.1063/1.4927568 BDflex: A method for efficient treatment of molecular flexibility in calculating protein-ligand binding rate constants from Brownian dynamics simulations J. Chem. Phys. 137, 135105 (2012); 10.1063/1.4756913 Conformational dynamics of ATP/Mg:ATP in motor proteins via data mining and molecular simulation J. Chem. Phys. 137, 075101 (2012); 10.1063/1.4739308 Theory and simulation on the kinetics of protein–ligand binding coupled to conformational change J. Chem. Phys. 134, 105101 (2011); 10.1063/1.3561694 Functionally relevant protein motions: Extracting basin-specific collective coordinates from molecular dynamics trajectories J. Chem. Phys. 122, 034904 (2005); 10.1063/1.1830434

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

THE JOURNAL OF CHEMICAL PHYSICS 142, 245101 (2015)

On the analysis and comparison of conformer-specific essential dynamics upon ligand binding to a protein Marcos Grosso,1 Adrian Kalstein,1 Gustavo Parisi,1 Adrian E. Roitberg,2 and Sebastian Fernandez-Alberti1,a) 1 2

Universidad Nacional de Quilmes, Roque Saenz Peña 352, B1876BXD Bernal, Argentina Departments of Physics and Chemistry, University of Florida, Gainesville, Florida 32611, USA

(Received 26 February 2015; accepted 12 June 2015; published online 26 June 2015) The native state of a protein consists of an equilibrium of conformational states on an energy landscape rather than existing as a single static state. The co-existence of conformers with different ligand-affinities in a dynamical equilibrium is the basis for the conformational selection model for ligand binding. In this context, the development of theoretical methods that allow us to analyze not only the structural changes but also changes in the fluctuation patterns between conformers will contribute to elucidate the differential properties acquired upon ligand binding. Molecular dynamics simulations can provide the required information to explore these features. Its use in combination with subsequent essential dynamics analysis allows separating large concerted conformational rearrangements from irrelevant fluctuations. We present a novel procedure to define the size and composition of essential dynamics subspaces associated with ligand-bound and ligand-free conformations. These definitions allow us to compare essential dynamics subspaces between different conformers. Our procedure attempts to emphasize the main similarities and differences between the different essential dynamics in an unbiased way. Essential dynamics subspaces associated to conformational transitions can also be analyzed. As a test case, we study the glutaminase interacting protein (GIP), composed of a single PDZ domain. Both GIP ligand-free state and glutaminase L peptide-bound states are analyzed. Our findings concerning the relative changes in the flexibility pattern upon binding are in good agreement with experimental Nuclear Magnetic Resonance data. C 2015 AIP Publishing LLC. [http://dx.doi.org/10.1063/1.4922925]

I. INTRODUCTION

Proteins are found in a nonuniform distribution of conformer populations in dynamic equilibrium.1 The energy landscape of proteins can be described as an equilibrium of pre-existing populations of states separated by energy barriers that can be overcome by thermal fluctuations. These states correspond to different conformations whose populations follow statistical thermodynamic distributions. The heights of the energy barriers separating the conformations define the timescale of conformational exchange. This emerging view of protein structure and dynamics2–5 provides a theoretical framework to study protein folding funnels,6–8 conformational selection, ligand-binding9–11 allostery, protein evolution,12 and, therefore, the ultimate aim of understanding the mechanism of protein function.13–16 The energy landscape theory validates and reformulates the conformational selection model for ligand-binding17–20 originated from the Monod-Wyman-Changeux model of allostery.21,22 This model postulates a pre-existing equilibrium of conformers with lower (ligand-free (lf)) and higher (ligandbound (lb)) affinities for the ligand. Upon binding, the equilibrium populations of the conformational states shift towards the ligand-bound state. The existence of different ligand-bound a)Author to whom correspondence should be addressed. Electronic mail:

[email protected]

0021-9606/2015/142(24)/245101/12/$30.00

and ligand-free conformations is extensively supported by a large variety of experimental evidence obtained from Xray and cryo-electron microscope images, kinetic studies, single molecule fluorescence, and NMR (Nuclear Magnetic Resonance).23–26 Differential structural and dynamical properties between conformers can have important functional consequences. Equilibrium fluctuations associated to the ligand-free conformation predispose the protein to experience the unboundto-bound conformational change required to bind its ligand. Internal protein motions responsible for the conformational changes upon ligand binding were commonly found to correspond to low-frequency vibrations.27–30 Therefore, elasticnetwork models have proven to be quite successful in characterizing and predicting these large-scale conformational changes.31–35 These models describe protein dynamics taking into account the protein architecture described by the inter-residue contact topology, but neglecting information related to the amino acid sequence and full atomic coordinates.36,37 However, many signaling molecules, like PDZ domains, do not experience such large-scale conformational changes but rather slight structural and dynamic variations in their local secondary structure elements (SSEs).38 It is not always clear if the observed slight variations in structure/dynamics have a significant role in ligand-binding or they are simply the result of noisy fluctuations associated to relative large flexible regions of the protein. Therefore,

142, 245101-1

© 2015 AIP Publishing LLC

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-2

Grosso et al.

a more detailed picture of the underlying dynamics is required. Molecular dynamics (MD) simulations provide a framework for the structural and dynamic interpretation of ligand-binding and the exploration of the conformational energy landscape of proteins.39–42 Because of the complexity of protein motions, results obtained from MD require further analysis to grasp the motions of interest, or to uncover ligandbinding mechanisms. Principal Component Analysis (PCA)43–47 is a useful multivariate statistical method that has been applied to reduce the number of dimensions needed to describe protein dynamics. The method consists of diagonalizing the covariance matrix of atomic fluctuations. The resulting set of PCA modes is sorted according to their contribution to the total fluctuation along the MD simulation. Essential dynamics analysis is based on the hope that a very small set of collective modes dominate the functional dynamics. Therefore, the dynamics in the low-dimensional subspace spanned by these PCA modes is defined as the essential dynamics.48 Most of the published PCA analysis is commonly restricted to the use of the first two or three of low frequency PCA modes, arbitrary neglecting the role of other modes in the dynamics. Herein, we focus on selecting the set of PCA modes that systematically and objectively define the essential dynamics subspace. The size of the subspace is related to the diversity of patterns generated by thermal fluctuations of the protein. Therefore, it is expected that the sizes and compositions of essential subspaces vary according to the relative changes in the flexibility patterns of the different conformers. Besides its usefulness in the analysis and interpretation of MD, PCA can be used to compare the motions of two MD trajectories.49 Mode mixing and/or changes in the PCA ordering can take place while comparing conformer-specific essential dynamics. These features discourage any attempt to compare essential subspaces based on the overlap of individual modes. In this context, the comparison of subspaces, rather than individual modes, seems to be adequate to overcome this task. Previous works have explored different measures to compare mode subspaces.50–54 In the present work, we present a method to consistently define the size and composition of ligand-bound and ligandfree essential subspaces. Our definition of conformer-specific subspaces allows us to retain every PCA mode that participates significantly in any observed structural and dynamics change and to perform a comparative analysis without underestimating minor events. As a result, a comparative analysis of each subspace can be envisaged. Furthermore, in order to identify differential dynamical properties acquired upon ligand binding, we present a procedure to analyze separately the similarities and differences between these subspaces. The procedure allows comparing essential dynamics associated not only with each conformer but also with the conformational transition between them. In this way, the extension through which conformational changes upon ligand binding are included in each conformer-specific essential dynamics can be evaluated. The glutaminase interacting protein (GIP) is studied here as a test case. GIP presents the highly unusual feature, among PDZ domain55 containing proteins, of being composed almost

J. Chem. Phys. 142, 245101 (2015)

exclusively of a single PDZ domain rather than one of many domains as part of a larger protein. GIP is known to interact with a large variety of proteins, including glutaminase L.56 Among the biological function of GIP, we can mention not only the regulation of glutaminase L but also its participation in a variety of different cancer and cell signaling pathways.57–61 Upon binding its partners, GIP presents significant changes in its pattern of relative flexibility. The general trend, with exceptions, is that regions that directly interact with glutaminase L become more rigid, while an increase in flexibility distributed throughout the rest of the protein62 is observed. PDZ domain proteins are small protein-protein interaction modules that contain 80-100 amino acid residues. They form a compact domain primarily composed of one or two α-helices (α1 and α2), and five or six β-strands (β1-β6),63,64 which fold in an overall six-stranded β-sandwich. Proteins containing PDZ domains are often involved in signaling pathways and act as scaffolds in the organization of multimeric complexes. PDZ domains typically recognize specific amino acids in the C-terminal end of peptide motifs or proteins. The C-terminus of the interaction partner binds as an antiparallel β-strand in a groove between β2 and the α2 helix and is capped by the loop β1-β2. Despite their highly conserved overall fold and binding sites,65 they are able to bind to multiple ligands that belong to different classes of peptide motifs.66–68 While most of the binding specificity is determined by a subset of residues that are likely to be located in the binding pocket of the domain, other functional residues are not in direct contact with the ligand.69 The dynamics of PDZ domains upon ligand binding implies correlations over the entire protein structure.70,71 Cooperative and long-range effects also participate in determining the specificity of the ligand-binding process.72,73 These findings suggest that fluctuation and flexibility properties are important determinants of function.74,75 Dynamic properties of PDZ domains have been the focus of several experimental70,76–81 and theoretical previous works.38,69,71,82–91 Normal mode analysis71,90 shows that the conformational change upon ligand-binding can be mainly captured by low-frequency collective domain movements. That is, thermal fluctuations of PDZ domains seem to introduce the structural distortions involved in ligand-binding. Networks that physically link functionally distant sites70,83,77 and present dynamical responses upon ligand-binding, as well as pathways of energetic connectivity,69,76,82,85,92 have been previously observed in a large variety of PDZ domains. These features have been proposed as the basis for allosteric behavior and efficient intramolecular energy conduction.69 In this paper, we present a general and novel procedure to define the size and composition of conformer-specific essential dynamics. We also describe a procedure to compare essential dynamics subspaces. We provide a procedure to explore the extent through which the directions of conformational changes upon ligand binding are included in each conformer-specific essential dynamics. As a test case, we consider MD simulations and PCA of GIP in its ligand-bound and ligand-free conformations. The paper is organized as follows. In Sec. II, we present the methodology to define and compare conformer-specific essential subspaces. In Sec. III, we present and discuss our

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-3

Grosso et al.

results concerning the ligand-binding of a single PDZ domain of GIP. Conclusions are given in Sec. IV. II. THEORETICAL METHODOLOGY A. Molecular dynamics simulations

MD simulations were done for GIP (ligand-free conformation) and the GIP-glutaminase L peptide complex (ligandbound conformation). These were carried out with the AMBER 12 software package.93–96 Initial coordinates for each protein were taken from NMR based structures. The protein databank identifiers are 2L4S (GIP) and 2L4 T (GIPglutaminase L). In the GIP system, a Cl− ion has been added for charge neutralization. Each system was solvated with explicit TIP3P water molecules in a rectangular periodic box large enough to contain the protein and 10 Å of solvent on all sides. The Amber ff99SB force field parameters97 were used for all residues. The standard protonation state at physiological pH was assigned to the ionizable residues. A 1000-step steepestdescent minimization was followed by a 3000-step conjugate gradient minimization applying constraints on atoms of the protein and thereafter, additional 5000-step conjugate gradient minimization without any constraints. The system was then heated for 100 ps to a final temperature of 300 K. During heating, a harmonic restraint of 50.0 (kcal/mol)/Å2 was applied to the protein atoms. The time step was 2 fs and the SHAKE algorithm was employed to constrain bonds involving hydrogen atoms. A cutoff of 10 Å was applied to non-bonded interactions. The system was equilibrated at constant pressure using 10 steps of 20 ps and reducing the restraint by 5.0 (kcal/mol)/Å2 on each step. After the last step with restrains, all restrains were lifted and the system was equilibrated for 100 ns at the constant temperature of 300 K using Andersen barostat and thermostat98 with periodic boundary conditions and particle-mesh Ewald (PME) sums. Once the system has relaxed, a production MD run is performed. Configurations were collected at 1-ps intervals during one equilibrated MD simulation of 400 ns, which were subsequently used to analyze the MD simulations. In order to verify the convergence of our results, production MD simulations have been extended for additional 800 ns and configurations were bunched in 400 ns intervals. B. PCA background

PCA is a well-known and widely used method to analyze the essential dynamics from MD simulations.43–48 For the sake of consistency, we briefly review PCA below. PCA is performed by calculating the covariance matrix C whose elements are defined as

1 K k k q q , (1) Cij = qi q j = k=1 i j K where the sum goes over the K configurations stored during √ an equilibrated MD simulation, qik = mi x ki − ⟨x i ⟩ is the mass-weighted internal displacement of Cartesian coordinate x ki of the ith atom (i = 1, . . . , N; N = number of atoms in the molecule) with mass mi , and the angular brackets represent the average obtained from the K configurations of the cor-

J. Chem. Phys. 142, 245101 (2015)

responding equilibrated MD simulation. The center-of-mass translational and rotational motions are customarily removed by least-squares fitting to a reference structure.48 By diagonalizing the C matrix LT CL = Λ,

(2)

we obtain the elements of Λ representing the relative contribution of each PCA or essential mode (EM) to the overall fluctuation of the molecule, and the eigenvector matrix L containing the 3N orthonormal eigenvectors Qi referred to as PCA modes. The eigenvectors are typically ordered according to descending eigenvalues, with the first PCA mode being the one with major contribution. The PCA modes can be calculated either considering all the atoms N in the molecule or, for instance, only the Cα atoms. In the latter case, the dimensionality of C matrix is reduced to (3Nr × 3Nr ), where Nr is the number of residues of the molecule. This is the option we have used in all the calculations shown in the present work since we have previously confirmed that the fluctuation patterns of Cα atoms are in good agreement with those obtained using all atoms. C. Essential subspaces

Any mass-weighted internal displacement vector qk with elements qik (i = 1, 3Nr ) corresponding to the kth configuration stored during the MD simulation can be projected on the basis set of PCA modes as 3Nr 3Nr (3Nr ( ))  qk = qk · Qi Qi = q kj l j i Qi i=1 i=1 j=1 3Nr = cik Qi , (3) i=1

with cik =

3Nr ( j=1

) q kj l j i .

(4)

In order to evaluate the delocalization of the displacements associated to qk among the different PCA modes, we can define the mode participation number99,100 as (3N 4) −1 r cik . (5) PQk = i=1

The participation number has been originally introduced as a convenient means of describing a measure of the delocalization for a given normal mode vector. In that case, the participation number has the value of 3Nr for a pure translation, and the value of unity for a highly localized mode. Beyond these two extremes, the participation number can be used to define the delocalization at intermediate situations. That is, the participation number represents a measure of the delocalization of the normal mode vector on the basis of the atomic Cartesian coordinates. In the present work, we extend this concept in order to apply it to the delocalization of a displacement vector qk , obtained during the MD simulations, on the basis of PCA modes. The value of PQk , rounded to the nearest higher integer, contains information about the number of PCA modes needed to describe the displacements of qk with respect to the average from the dynamics. Values of PQk ≈ 3Nr mean that qk is completely delocalized over the complete basis set {Qi } i=1,3Nr , that is, all PCA modes are required in order to achieve a good representation of the atomic displacements

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-4

Grosso et al.

J. Chem. Phys. 142, 245101 (2015)

associated to the kth configuration. Values of PQk ≈ 1 correspond to a perfect match between qk and one single Qi , that is, the atomic displacement profile of the kth configuration is coincident with the corresponding profile observed in one single PCA mode. The first PQk modes ordered by index f ik  k 2 in decreasing   values of ci define the minimum subspace of modes Q f k i

k i=1, PQ

required to achieve a good description of

the vector qk . By performing this calculation for K different configurations, we can define the essential subspace S as the subspace composed by the n PCA  modes that belong to at defined for any of the least one of the subspaces Q f k k i

i=1, PQ

K configurations stored during the MD simulation. That is, S recovers every PCA mode that participates significantly in any observed structural and dynamic fluctuations. The ligandbound and ligand-free PCA subspaces are computed separately from each other with their PCA referenced against their own average.

diagonal matrix W with positive or zero elements (the singular values), and the transpose of a K × K orthogonal matrix V, *. +/ *. +/ *. w1 + * + .. D // = .. U // · ..· · · wi · · ·/// · ... VT /// . (7) wK - , , - , - , ′ Thus, the d ijk k elements of matrix D can be expressed as the sum of products of columns of U and rows of VT, with the “weighting factors” being the singular values w j , K ′ d ijk k = wl · uil · v jl . (8) l=1

The set of SVD vectors {ui }i=1,3N r , defined as the columns ′ of U, represents a basis on which the set of dk k vectors is projected, 3Nr ( ′ 3Nr (3Nr ( ′ ) ) ) ′ dk k = dk k · ui ui = d kj k u j i ui i=1 i=1 j=1 3Nr ′ = bki k ui , (9) i=1

with D. Essential subspace for conformational transition

The essential subspace associated with a conformational transition is commonly obtained by diagonalizing the correlation matrix obtained by concatenating the ligand-free and ligand-bond MD simulations. This methodology has been extensively used in previous publications after its introduction by Berendsen et al.101 The essential subspace associated with a conformational transition should contain information on the large variety of possible structural differences between the collected conformations during the ligand-free and ligandbound MD simulations. In this article, we have explored an alternative procedure based on a direct processing of these differences. For this purpose, we have set up vectors difference defined as ′



dk k = qk − qk , k′

(6)

with qk and q being mass-weighted displacement vectors corresponding to the kth and k ′th configurations stored during the ligand-free and ligand-bound MD simulations, respectively. In practice, we peak a random subset of k and k ′ objects from the total k · k ′ set. The correct application of ′ this method requires that both qk and qk must be translated and rotated to a Cartesian frame that coincides with that of an arbitrary reference structure. This structure has been chosen as the average structure obtained in the ligand-free PCA modes ′ calculations. Therefore, the ensemble of dk k vectors holds all possible unbound-to-bound conformational changes. We construct a matrix D of dimension 3Nr × K with columns ′ representing each of the dk k vectors. Matrix D is not a square matrix. So, in order to capture the ′ dominant modes of motion associated to the set of dk k vectors, a Singular Value Decomposition (SVD) of D can be performed. The SVD method has found wide-ranging applications102 as a dimensionality reduction technique to transform the original high-dimensional representation of protein motion into a lower-dimensional representation that captures the dominant modes of motion of the protein. That is, D is written as the product of a 3Nr × K column-orthogonal matrix U, a K × K



bki k =

3Nr ( j=1

) ′ d kj k u j i .

(10)

In analogy with the procedure described in Sec. II D, we eval′ uate the corresponding participation number Puk k and essential subspace SV that harvests every SVD mode that participates significantly in any of the K unbound-to-bound displacements. E. Secondary structure elements

In order to achieve a simplified picture of the pattern of internal fluctuations of a protein, it can be suitable to analyze it in terms of the contributions of the different SSEs. Considering that the PCA modes are expressed in terms of the average massweighted internal displacement vectors as 3Nr Qi = l j i q kj , (11) j=1

we can define the corresponding participation number, expressed in terms of SSE, as ( (3N ) 2) −1 M m 2 SSE PQ i = l ji , (12) m=1

j ∈m

with M being the number of SSE in the protein, and Nm the SSE number of Cα belonging to the mth SSE. Values of PQ ≈M i correspond to PCA modes delocalized among all the SSE of SSE the molecule, and PQ ≈ 1 corresponds to modes localized on i only one SSE. F. Similarities and differences between essential subspaces

The comparison of essential dynamics subspaces for ligand-free, ligand-bound, and conformational transition (see Secs. II C and II D) can be performed according to the geometric approach of Krzanowski.103,104 The method can be used to calculate angles between pairs of PCA modes that belong to the set of two different essential subspaces. These values can be used to quantify the similarity between subspaces.105,106 We define two matrices, A(3Nr × M) and B (3Nr × R), representing two essential subspaces with vector columns

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-5

Grosso et al.

J. Chem. Phys. 142, 245101 (2015)

{ai } i=1, M and {bi } i=1, R containing the set of essential modes selected according to the procedures described previously in Secs. II C and II D. They can be compared by defining the vector projection of each a j onto the set of modes {bi } i=1, R as R  pAB a j · bi bi . (13) j = i=1

 ABThe Gramian matrix G (M × M) of the set of vectors pi i=1, M is calculated as the matrix of inner products, whose entries are given by ( ) AB Gij = pAB . (14) i · pj

m B nor = j

B j − ⟨B⟩ σB

(22)

being B and σ B the average and standard  deviation,  and respectively. The normalized values of Bsim j j=1,3N r  diff  Bj represent the effective similarities and differj=1,3N r ences between both subspaces expressed in terms of atomic displacements. These values can also be expressed in terms of SSEs grouping the atomic values according to their belonging to each SSE. In the same way, Eqs. (19) and (20) can be obtained using vectors {di } i=1, M (Eq. (18)).

The diagonalization of G LTGGLG = ΛG

(15)

allows us to use the eigenvalues of G, {λi } i=1, M , as a measure of the similarity between the two subspaces. The smallest angle between any √ pair of orthogonal modes of A and B is defined as cos−1 λ 1 where λ 1 is the largest eigenvalue of G. Hence, the sum of the eigenvalues of G equals the sum of squares of the cosines of the angles between the two essential subspaces.103 Since all the eigenvalues of G vary between 0 and 1, which correspond to critical angles between 0◦ and 90◦, we can define a measure of the similarity of the two subspaces as M λi AB . (16) ζ = i M If ζ AB ≈ 0, the two subspaces are dissimilar, while ζ AB ≈ 1 indicates that the two essential subspaces share the same orientation. In order to determine the contribution of the original traits to the similarity between subspaces, the eigenvectors of G, {LGi } i=1, M , can be projected onto the subspace of A obtaining the set of vectors {ci } i=1, M defined as ci = ALGi

(17)

III. RESULTS AND DISCUSSION

The GIP57 is composed of a single PDZ55 domain of 124 residues. Figure 1 depicts both lf and lb conformations. Both conformations share the typical six-stranded β-sandwich folding of PDZ domains, that is, two α-helices (α1 and α2) and six β-strands (β1-β6).63,64 GIP also contains an additional short β-hairpin located between canonical strands β1 and β2. It is formed by residues L21-I28 and composed of strands βaβb.108–110 The C-terminal end of glutaminase L binds as an antiparallel β-strand in a groove between β2 and the α2 helix and is capped by the βa- βb hairpin.111 While MD simulations have been performed considering the whole molecule, the C-terminal and N-terminal tails have been excluded from the PCA analysis in order to reduce the noise, we know that does not affect ligand binding, introduced by these highly flexible parts of the protein. According to NMR experiments,62 both regions of GIP are completely unstructured either in the ligand-free or ligand-bound conformations with very few observed NOEs (Nuclear Overhauser Effect) and correspondingly high Root Mean Square Deviations (RMSDs) in the NMR structural ensembles without significant changes during ligand binding. In this way, while we do not consider these

and also onto the subspace of B obtaining the set of vectors {di } i=1, M defined as  di = BBT ci . (18) In this way, the contribution of the different structural components of the molecule to the similarity between subspaces can be identified. Finally, in order to stress similarities and differences between essential subspaces and considering that eigenvalues {λGi } i=1, M are given in decreasing order, that is, representing decreasing similarities between subspaces, we can define n D   Bsim λGi ci cTi jj (19) j = i=1

and Bdiff = j

M i=n D +1

  (1 − λGi ) ci cTi jj,

(20)

where n D is the index defined by Kirkpatrick107 measuring the effective number of dimensions rounded to the nearest higher integer  M λGi nD = . (21) i=1 λG1 Bsim j

and

Bdiff j

can be normalized as

FIG. 1. Cartoon representation of GIP in its (a) ligand-free and (b) complex with the C-terminal end of glutaminase L. The secondary structure elements are also indicated.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-6

Grosso et al.

highly flexible regions in the PCA analysis, any effect that these regions could have on the flexibility pattern of other SSE is included. First, we analyze the size and composition of ligandfree (Slf ) and ligand-bound (Slb) essential subspaces separately defined according to the procedure described in Sec. II C. These mode subspaces are composed by every PCA mode that participates significantly in any fluctuation contained in the  k q k=1, K displacement vectors stored during each of the MD simulations. The participation number PQk (Eq. (5)) measures the number of PCA modes that span the subspace that describes the kth configuration. Figures 2(a) and 2(b) display the distribution of values of PQk (lf ) and PQk (lb) obtained over all the lf and lb configurations, with average values 6.9 ± 3.2 and 8.7 ± 4.2, respectively. The largest values of PQk (lf ) and PQk (lb) are 20 and 26, respectively, representing a reduction of 93% and 91% of their corresponding original configurational spaces. The composition of these subspaces is displayed in Figures 2(c) and 2(d) as the distribution of the relative probabilities that each PCA mode Qi belongs to the essential subspace. As it is expected, these values decrease with the corresponding PCA mode eigenvalues. The maximum dimensions of S(lf ) and S(lb), recovering every PCA mode that participates in the projection of any configuration, are 106 and 133, that correspond to 35% and 44% of the corresponding total (3 × Nr = 303) configurational spaces. These sizes are significantly larger than the small number of modes considered in PCA studies, which attempt to retain only the main features in the flexibility pattern. From this figure, it is clear that only looking at the first and second PCA modes could lead to very wrong conclusions. Our definitions of S(lf) and S(lb) take into account all the structural changes contained in the set of MD snapshots displacement vectors. In order to validate the convergence of our results, either the distribution of values of the mode participation number PkQ or the relative probabilities that each essential mode belongs to the essential subspace

J. Chem. Phys. 142, 245101 (2015)

have been calculated during extended 800 ns MD simulations bunched in 400 ns intervals. The average values of PQk (lf) and PQk (lb) obtained over all the lf and lb configurations among these intervals are 7.0 ± 3.5 and 8.9 ± 4.5, that is, very close to the original values 6.9 ± 3.2 and 8.7 ± 4.2, respectively. In this way, we confirm the convergence of our definition of the size and composition of the essential dynamics. In order to gain insight into the identity of modes included in S(lf ) and S(lb), we have evaluated their participation number SSE PQ expressed in terms of SSE (see Eq. (12)). The distribui SSE tions of the values of PQ for all Qi belonging to S(lf ) and i S(lb) are shown in Figures 3(a) and 3(b), respectively. Each SSE PQ is weighted according to the relative probability that the i corresponding Qi belongs to the essential subspace (values shown in Figures 2(c) and 2(d)). In most cases, less than half of the total SSEs are involved in the essential modes, with the modes of the ligand-free system being more localized than the ones corresponding to the ligand-bound system. This is in agreement with the general trend that, except in the regions of GIP that directly interact with the ligand, the ligand-bound conformation presents a more delocalized flexibility pattern than the ligand-free conformation.62 We now analyze the essential subspace associated to the conformational transition. Figure 4(a) displays the values of ′ the participation number Puk k (see Sec. II D) indicating that only two modes participate significantly in more than 90% of the K unbound-to-bound frames, while 3 modes participate in the remained 10%. In all the cases, the mode with the largest eigenvalue is involved in the essential subspace. Comparing these results with those shown in Figures 2(a) and 2(b), we can appreciate that the conformational change is characterized by a narrower pattern than conformer-specific S(lf ) and S(lb) essential subspaces. Even more, the analysis of its essential modes in terms of SSE (Figure 3(c)) indicates that in most cases only up to six SSEs are involved. As it has been pointed out previously, convergence of these results has been confirmed

FIG. 2. Histogram of values of the k obmode participation number PQ tained over all the (a) ligand-free and (b) ligand-bound configurations. Relative probabilities that each essential mode belongs to the (c) ligand-free and (d) ligand-bound essential subspace.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-7

Grosso et al.

J. Chem. Phys. 142, 245101 (2015)

FIG. 3. Histogram of the values of the SSE, expressed participation number PQ i in terms of SSEs, for all the essential modes belonging to the (a) ligandfree essential, (b) ligand-bound essential, and (c) conformational transition subspace, weighted according to their relative occurrence probabilities.

analyzing results obtained in the extended 800 ns MD simulations bunched in 400 ns intervals. A common technique to study conformational transitions is to concatenate two separate trajectories, corresponding to the dynamics of two different conformational states. It is then useful to compare the results obtained following our procedure to get insight into the conformational changes (described in Sec. II D) with the results obtained by concatenating the ligand-free and ligand-bond MD simulations. The first mode obtained with the two different methods perfectly matches each other (see Figure S1 in the supplementary material).112 This mode corresponds to the vector difference between the average structures of the individual ligand-free and ligandbond MD simulations. Large values of similarities (>0.7) were also obtained for the first five modes. Figure 4(b) displays the distribution of the participation number obtained by

FIG. 4. Histogram of the participation number obtained by projecting the ensemble of vector differences between ligand-free and ligand-bound conformations onto (a) the conformational change essential subspace defined in Secs. II B and II D the essential subspace obtained combining the ligand-free and ligand-bound MD simulations.

projecting the ensemble of vector differences  ′ between ligandfree and ligand-bound conformations, dk k , over SV k,k ′=1, K essential subspace obtained combining the MD simulations. While either the combination of MD simulations or the SVD procedure described in Sec. II D allows a significant reduction of the whole configurational space, the SV essential subspace seems to be more efficient to concentrate the main traits of the conformational change. In order to express our findings concerning the conformational changes in terms of the conformer-specific essential dynamics, Figure 5(a) shows the projection of the ensemble of vector difference  ′between ligand-free and ligand-bound conformations, dk k , on the ligand-free essential k,k ′=1, K modes. Comparing these results with those of Figure 2(a), we observe that more essential modes are required to describe the direction of the conformational transition than those required to describe the ligand-free fluctuations. Thereby, the conformational transitions involve more complex geometry distortions than the ones collected during the ligand-free MD simulations. Nevertheless, only a relatively small number of essential modes are needed to describe it. It is important at this point to test our definition of the conformational change essential subspace using the ensemble of vector difference between ligand-free and ligand-bound conformations. In order to analyze if this construction is affected by fluctuations in each individual conformer MD simulation, we have redefined the vector differences as the difference between randomly selected mass-weighted displacement vectors corresponding to only configurations stored during the ligand-free MD simulations. Figure 5(b) shows the projection of the ensemble of these redefined vectors on the ligand-free essential modes. As we can see, a significantly larger number of modes are required in order to represent these vectors in the basis of ligand-free PCA modes. Therefore, these vectors present higher degrees of mismatching with ligand-free PCA modes than vectors used to construct the conformational change essential subspace (Eq. (6)).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-8

Grosso et al.

J. Chem. Phys. 142, 245101 (2015)

FIG. 5. (a) Histogram of the values of the projection on the ligand-free essential modes of the ensemble of vector differences (a) between ligand-free and ligand-bound conformations and (b) between randomly selected ligandfree conformations. (c) Relative probabilities that each ligand-free essential mode is (c) involved in the conformational change and (d) involved in differences between randomly selected ligand-free conformations.

In order to gain insight into the identity of the ligandfree essential modes involved in the conformational transition, Figure 5(c) displays the distribution of the relative probabilities that each ligand-free PCA mode Qi is involved in this process. In accordance with Figure 2(c), the first ∼35 more collective modes (i.e., lowest frequencies modes) are the ones that contribute the most. The contributions of the other modes present a steep decrease with the corresponding PCA mode eigenvalues. On the contrary, a gradual decrease is observed for vectors involving differences between randomly selected ligand-free conformations (Figure 5(d)). The number of different ligand-free PCA modes that participate at least once in any of these vectors is 243, representing more than 80% of the total space. Therefore, we confirm that displacements

along the individual ligand-free trajectory do not obscure the signal of matching between the ensemble of vector difference between ligand-free and ligand-bound conformations and the most collective ligand-free PCA modes. The S(lf ) and S(lb) essential subspaces hold the main traits of the fluctuations involved in each conformer-specific dynamics, while SV contains information about unboundto-bound conformational change. We have analyzed them in terms of the internal fluctuations of the different SSEs of the protein. Similar values to the ones reported for the SV essential subspace have been obtained using essential modes combining both ligand-free and ligand-bound MD simulations. Table I and Figure 6 display the average contribution of each SSE to essential modes weighted according to the relative

TABLE I. Average contributions of each SSE to ligand-free S(lf), ligand-bound S(lb), and SV conformational transition (lf-lb) essential modes weighted according to their relative occurrence probabilities; RMSD of each SSE with respect to the average structures of the individual ligand-free and ligand-bond MD simulations.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

SSE

Residues

S(lf )

S(bf )

SV

RMSD (Å) (lf )

RMSD (Å) (lb)

β sheet 1 β sheet a β sheet b Loop βb-β2 β sheet 2 Loop β2-β3 β sheet 3 Loop β3-α1 α helix 1 Loop α1-β4 β sheet 4 β sheet 5 Loop β5-α2 α helix 2 Loop α2-β6 β sheet 6

12-19 20-34 25-29 30-32 33-35 36-54 55-58 59-64 65-70 71-74 75-81 82-85 86-89 90-98 99-104 105-112

3.22 13.46 12.65 2.71 2.82 13.78 1.82 6.72 7.01 2.78 2.79 6.62 8.43 4.74 8.00 2.45

3.53 13.52 11.49 3.86 3.21 6.13 1.62 7.48 10.53 2.41 2.16 6.40 7.80 7.65 9.05 3.15

2.44 13.56 12.46 2.51 6.27 19.99 5.11 4.43 5.18 3.54 3.16 4.75 4.70 3.88 5.57 2.46

0.78 1.67 1.62 0.71 0.73 1.71 0.61 1.03 1.07 0.71 0.71 1.00 1.03 0.85 1.10 0.68

0.72 1.54 1.46 0.75 0.73 0.99 0.55 1.01 1.12 0.63 0.62 0.95 1.07 1.12 1.18 0.70

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-9

Grosso et al.

FIG. 6. Contributions of each SSE to (a) ligand-free, (b) ligand-bound, and (c) conformational transition essential modes weighted according to their relative occurrence probabilities. The color scale is from blue (lower values) to red (higher values), and white indicates intermediate contributions.

probabilities that each PCA mode Qi belongs to the essential subspace (Figures 2(c) and 2(d)). It can be noted that the three SSEs mainly involved in the conformational change, indicated in Figure 3(c), correspond to the βa- βb hairpin and the β2- β3 loop. This is in agreement with the previous NMR data analysis performed by Zoetewey et al.62 Both SSEs are among the most flexible regions of the protein. While the contribution of βaβb hairpin during the conformational change exhibits similar values as the ones observed in the ligand-free and ligandbound essential dynamics, the contribution for the β2- β3 loop is significantly larger. This indicates that the β2- β3 loop experiences structural distortions during the conformational change that are larger than intrinsic fluctuations of each conformer. Similar behavior is observed for strands β2 and β3, perhaps as a consequence of their proximity with this loop. Strand β2 forms an antiparallel β-sheet with the glutaminase L peptide and it has been previously reported that it becomes more disordered after binding.62 On the contrary, the whole region covered by β3-α1 loop, α1 helix, β5 strand, β5-α2 loop, α2 helix, and α2- β6 loop seems not to be relevant to the conformational change presenting actually a reduction of their contributions to SV with respect to their contributions to S(lf ) and S(lb). At this point, it is interesting to analyze and compare our results with the corresponding results obtained through the calculation of the RMSD of each SSE with respect to the average structures of the individual ligand-free and ligandbond MD simulations (Table I). Relative S(lf )-S(lb) values are, in general, more evident than the corresponding RMSD(lf )RMSD(lb) values, indicating that the analysis of the main traits of the conformer-specific fluctuations through the

J. Chem. Phys. 142, 245101 (2015)

definition of S(lf ) and S(lb) essential subspaces highlights unique conformer-specific patterns. The main reason for that can be found in our definitions of essential subspaces S that recover every PCA mode that participates significantly in any observed structural distortion, neglecting minor local contributions that can blur up collective patterns. We now compare contributions of each SSE to S(lf ) and S(lb). The contribution of the β2- β3 loop decreases in S(lb) with respect to that of S(lf ). This is in agreement with the few NOEs observed in NMR spectroscopy62 between this loop and the glutaminase L peptide. Zoetewey et al.62 have analyzed changes in generalized order parameters S2, calculated using steady-state 1H-15N NOE intensities and R1 and R2 relaxation rates. They reveal that residues that belong to the β2- β3 loop exhibit a significant decrease in flexibility (increase of S2) upon ligand binding. Our analysis of MD simulations reveals that residues Q43 and P45 ( β2- β3 loop) form inter-SSEs hydrogen bonds with Y56 (strand β3) with 88% and 35% of occupancy during the MD of the ligand-bound conformer that are not observed in the MD of the ligand-free conformer. On the other hand, several SSEs increase their contributions in S(lb) with respect to S(lf ). Among them, α1 helix and α2 helix present the larger increase of contributions. Residues Q92 and A93 (α2 helix) form more probable hydrogen bonds with T89 ( β5-α2 loop) during the MD of the ligand-bound conformer with respect to that of the ligand-free conformer. The increased flexibility of residues H90, D91, and Q92 has been experimentally reported and they have been related to direct hydrogen bonds with the ligand.62 Despite that, neither α1 helix nor α2 helix seems to significantly contribute to the conformational change. The comparison of essential dynamics subspaces, described in Sec. II F, between ligand-free and ligand-bound conformers indicates that both MD simulations share almost 70% of their essential dynamics. The essential dynamics associated to the conformational transition is completely covered (∼96%) by the essential dynamics of each of ligand-free and ligand-bound states. Dynamic fluctuations associated with both conformations account for unbound-to-bound displacements. Furthermore, the values of the effective number of dimensions, n D (Eq. (21)), indicate that similarity between subspaces is spanned through n D/M = 92.8/106 ≈ 87% of the ligand-free and ligand-bound essential dynamics, and almost the totality (31.8/33 = 96%) of the conformational change essential dynamics. In order to identify novel dynamics aspects that are introduced during the conformational transition, we need to stress similarities and differences between essential subspaces. Figures 7(a)–7(d) display the distribution of the values of the m normalized factors, B nor (Eq. (22)), evaluated either for Bsim j j diff or B j (Eqs. (19) and (20)). These values are also mapped onto the structure of free GIP (Figures 8(a) and 8(b)) and onto the structure of the GIP-ligand complex (Figure 8(c)) expressed in terms of SSEs. The values of Bsim j represent the effective similarities between both essential dynamics. Figures 7(a), 7(b) and 8(a) indicate that both ligand-free and ligandbound conformers share a common flexibility pattern with βaβb hairpin as the common highest flexible part of GIP, while sheet β3 the most rigid one. Figures 7(c) and 7(d) and Figures 8(b) and 8(c) stress the main differences between the essential

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-10

Grosso et al.

J. Chem. Phys. 142, 245101 (2015)

FIG. 7. Histogram of the values of the normalized factors B sim obtained by j projecting the eigenvectors of G, onto the ligand-free (a) and ligand-bound diff (b) essential subspaces; (c) B j obtained by projecting the eigenvectors of G onto the ligand-free essential subdiff space; and (d) B j obtained by projecting the eigenvectors of G onto the ligand-bound essential subspace. The color blue corresponds to negative values and red to positive ones.

subspaces. Figure 7(c) and Figure 8(b) have been obtained using Eqs. (17) and (19), that is, by projecting the eigenvectors of G onto the ligand-free essential subspace (Eq. (18)). On the contrary, Figures 7(d) and 8(c) have been obtained

by projecting the eigenvectors of G onto the ligand-bound essential subspace. We can verify that this procedure stresses the decrease in flexibility of β2- β3 loop after ligand binding, previously observed in Table I. A decrease in relative flexibility is also observed for βa- βb hairpin and loops surrounding the α2 helix. Besides, an increase in flexibility upon ligand binding is observed for β1 strand, α1 helix, β4 strand, α2 helix, and β6 strand. At this point, it is interesting to stress that the increase in backbone flexibility in β1, β4, and β6 strands has not been observed in our previous analysis performed in Table I and Figure 6. Nevertheless, this behavior is in perfect agreement with previous NMR experiments performed by Zoetewey et al.62 who show that the decrease in the flexibility of the binding pocket is offset by an increase in flexibility distal to the binding site in the core of the protein, including β1, β4, and β6 strands. Furthermore, values of the RMSD of each SSE with respect to the average structures of the individual ligand-free and ligand-bond MD simulations can neither reproduce these features. The comparison of relative values of either S(lf ) with respect to S(lb) or RMSD(lf ) with respect to RMSD(lb), previously shown in Table I, with our results depicted in Figure 7 emphasizes that our procedure described in Sec. II F allows us to reproduce the general experimental trend, not revealed using standard analysis. The decrease in the flexibility of the binding pocket is offset by an increase in flexibility distal to the binding site in the core of the protein, involving β1, β4, and β6 strands. IV. CONCLUSIONS

FIG. 8. Representation of the relative values of the normalized factors (a) diff B sim j mapped onto the structure of ligand-free conformation, (b) B j mapped diff

onto the structure of ligand-free conformation, and (c) B j mapped onto the structure of ligand-bound conformation. The color scale is from blue (lower values) to red (higher values), and white indicates intermediate values.

We have presented a general and novel procedure to define the size and composition of conformer-specific and conformational transition essential dynamics. The method is unbiased by user choices. We have also described a procedure to compare essential dynamics subspaces. The procedure is easy to implement and allows emphasizing the main similarities and differences between the different essential dynamics, frequently hidden using standard analysis.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-11

Grosso et al.

We were able to explore the extent through which conformational changes upon ligand binding are included in each conformer-specific essential dynamics. We consider that the method is suitable to be applied in a large variety of cases such as the analysis of the effects of mutations on dynamics, design of new drugs that prevent conformational changes upon ligandbinding, and the analysis of conformational transitions induced by changes in cofactor oxidation states. MD simulations and PCA of GIP in its ligand-bound and ligand-free conformations have been considered as a test case. We have found that the sizes of the essential subspaces, required to include every PCA mode that participates significantly in any structural change observed during each of the MD simulations, are larger than the most frequently considered number of modes. The analysis of the essential dynamics subspace associated to conformational transitions indicates that in most cases, mainly the βa- βb hairpin and the β2- β3 loop are involved. Our procedure to emphasize similarities and differences between the conformer-specific fluctuation patterns allows us to identify an increase in backbone flexibility in β1, β4, and β6 strands. These findings are not clearly observed using standard analysis of relative fluctuations. Our findings are in good agreement with the previous NMR data analysis performed by Zoetewey et al.62 The relative changes in the flexibility pattern upon binding are in agreement with the general trend that, except in the regions of GIP that directly interact with the ligand, the ligand-bound conformation is more flexible than the ligand-free conformation. We observed that the conformational transitions involve more complex geometry distortions than the ones collected during the ligand-free MD simulations. The comparison of essential dynamics subspaces for ligand-free, ligand-bound, and conformational transition reveals that the ligand-free and ligand-bound MD simulations share almost 70% of their essential dynamics. Besides, the essential dynamics associated to the conformational transition is completely covered by the essential dynamics of each of ligand-free and ligand-bound states. Dynamic fluctuations associated to both conformations account for unbound-to-bound displacements. ACKNOWLEDGMENTS

This work was partially supported by CONICET and UNQ. We also want to acknowledge the University of Florida Research Computing for providing computational resources and support that have contributed to the research results reported in this publication. 1K.

Gunasekaran, B. Ma, and R. Nussinov, Proteins: Struct., Funct., Bioinf. 57, 433 (2004). 2P. G. Debrunner and H. Frauenfelder, Annu. Rev. Phys. Chem. 33, 283 (1982). 3G. U. Nienhaus, J. D. Müller, B. H. McMahon, and H. Frauenfelder, Phys. D 107, 297 (1997). 4H. Frauenfelder and B. McMahon, Proc. Natl. Acad. Sci. U. S. A. 95, 4795 (1998). 5H. Frauenfelder, S. G. Sligar, and P. G. Wolynes, Science 254, 1598 (1991). 6K. A. Dill and H. S. Chan, Nat. Struct. Biol. 4, 10 (1997). 7P. G. Wolynes, J. N. Onuchic, and D. Thirumalai, Science 267, 1619 (1995). 8K. A. Dill, S. B. Ozkan, M. S. Shell, and T. R. Weikl, Annu. Rev. Biophys. 37, 289 (2008).

J. Chem. Phys. 142, 245101 (2015) 9M.

P. Mims, A. G. Porras, J. S. Olson, R. W. Noble, and J. A. Peterson, J. Biol. Chem. 258, 14219 (1983). 10D. W. Miller and K. A. Dill, Protein Sci. 6, 2166 (1997). 11B. Ma, M. Shatsky, H. J. Wolfson, and R. Nussinov, Protein Sci. 11, 184 (2002). 12E. Juritz, N. Palopoli, S. Fornasari, S. Fernandez Alberti, and G. Parisi, Mol. Biol. Evol. 30, 79 (2013). 13H. Frauenfelder, Proc. Natl. Acad. Sci. U. S. A. 99(Suppl. 1), 2479 (2002). 14C. J. Tsai, S. Kumar, and B. Ma, Protein Sci. 8, 1181 (1999). 15H. Frauenfelder, B. H. McMahon, R. H. Austin, K. Chu, and J. T. Groves, Proc. Natl. Acad. Sci. U. S. A. 98, 2370 (2001). 16P. I. Zhuravlev and G. A. Papoian, “Protein functional landscapes, dynamics, allostery: A tortuous path towards a universal theoretical framework,” Q. Rev. Biophys. 43, 295 (2010). 17S. Kumar, B. Ma, C. J. Tsai, N. Sinha, and R. Nussinov, Protein Sci. 9, 10 (2000). 18P. Csermely, R. Palotai, and R. Nussinov, Trends Biochem. Sci. 35, 539 (2010). 19D. D. Boehr, R. Nussinov, and P. E. Wright, Nat. Chem. Biol. 5, 789 (2009). 20K. Okazaki and S. Takada, Proc. Natl. Acad. Sci. U. S. A. 105, 11182 (2008). 21J. Monod, J. Wyman, and J. P. Changeux, J. Mol. Biol. 12, 88 (1965). 22L. C. James and D. S. Tawfik, Trends Biochem. Sci. 28, 361 (2003). 23O. F. Lange, N. A. Lakomek, C. Farès, G. F. Schröder, K. F. Walter, S. Becker, J. Meiler, H. Grubmüller, C. Griesinger, and B. L. De Groot, Science 320, 1471 (2008). 24Z. Wu, V. Elgart, H. Qian, and J. Xing, J. Phys. Chem. B 113, 12375 (2009). 25T. R. Weikl and C. von Deuster, Proteins 75, 104 (2009). 26S.-R. Tzeng and C. G. Kalodimos, Curr. Opin. Struct. Biol. 21, 62 (2011). 27M. Levitt, C. Sander, and P. S. Stern, J. Mol. Biol. 181, 423 (1985). 28J. Ma, Structure 13, 373 (2005). 29B. Brooks and M. Karplus, Proc. Natl. Acad. Sci. U. S. A. 82, 4995 (1985). 30Q. Cui, G. Li, J. Ma, and M. A. Karplus, J. Mol. Biol. 340, 345 (2004). 31W. G. Krebs, V. Alexandrov, C. A. Wilson, N. Echols, H. Yu, and M. Gerstein, Proteins 48, 682 (2002). 32S. E. Dobbins, V. I. Lesk, and J. E. Sternberg, Proc. Natl. Acad. Sci. U. S. A. 105, 10390 (2008). 33F. Tama and Y. H. Sanejouand, Protein Eng. 14, 1 (2001). 34D. Tobi and I. Bahar, Proc. Natl. Acad. Sci. U. S. A. 102, 18908 (2005). 35H. Wako and S. Endo, Biophys. Chem. 159, 257 (2011). 36D. Ming, Y. Kong, S. J. Wakil, and J. M. Brink, Proc. Natl. Acad. Sci. U. S. A. 99, 7895 (2002). 37D. Ming, Y. Kong, M. A. Lambert, Z. Huang, and J. P. Ma, Proc. Natl. Acad. Sci. U. S. A. 99, 8620 (2002). 38M. Münz, R. Lyngsø, J. Hein, and P. C. Biggin, BMC Bioinf. 11, 188 (2010). 39J. A. McCammon and S. C. Harvey, Dynamics of Proteins and Nucleic Acids (Cambridge University Press, 1987). 40M. Karplus and J. Kuriyan, Proc. Natl. Acad. Sci. U. S. A. 19, 6679 (2005). 41M. Karplus and J. A. McCammon, Nat. Struct. Biol. 9, 646 (2002). 42C. Brooks III, M. Karplus, and B. M. Pettitt, “Proteins: A theoretical perspective of dynamics, structure, and thermodynamics,” in Advances in Chemical Physics, edited by I. Prigogine and S. A. Rice (John Wiley & Sons Ltd., 1987), Vol. 71. 43T. Ichiye and M. Karplus, Proteins 11, 205 (1991). 44T. Horiuchi and N. Go, Proteins 10, 106 (1991). 45D. M. F. Van Aalten, B. L. De Groot, J. B. C. Findlay, H. J. C. Berendsen, and A. Amadei, J. Comput. Chem. 18, 169 (1997). 46I. T. Jolliffe, Principal Component Analysis, 2nd ed. (Springer Series, Springer-Verlag, 2002). 47C. C. David and D. J. Jacobs, in Protein Dynamics, edited by D. R. Livesay (Humana Press, Totowa, NJ, 2014), Vol. 1084, pp. 193–226. 48A. Amadei, A. B. M. Linssen, and H. J. C. Berendsen, Proteins 17, 412 (1993). 49M. Paul, M. Hazra, A. Barman, and S. Hazra, J. Biomol. Struct. Dyn. 32, 928 (2014). 50B. Hess, Phys. Rev. E 62, 8438 (2000). 51B. Hess, Phys. Rev. E 65, 031910 (2002). 52J. D. Faraldo-Gómez, L. R. Forrest, M. Baaden, P. J. Bond, C. Domene, G. Patargias, J. Cuthbertson, and M. S. Sansom, Proteins 57, 783 (2004). 53A. Amadei, M. A. Ceruso, and A. Di Nola, Proteins: Struct., Funct., Bioinf. 36, 419 (1999). 54A. Leo-Macias, P. Lopez-Romero, D. Lupyan, D. Zerbino, and A. R. Ortiz, Biophys. J. 88, 1291 (2005).

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

245101-12 55R.

Grosso et al.

Rousset, S. Fabre, C. Desbois, F. Bantignies, and P. Jalinot, Oncogene 16, 643 (1998). 56L. Olalla, J. C. Aledo, G. Bannenberg, and J. Marquez, FEBS Lett. 488, 116 (2001). 57R. G. Mata and K. Burridge, Trends Cell Biol. 17, 36 (2007). 58C. Perez-Gomez, J. A. Campos-Sandoval, J. F. Alonso, J. A. Segura, E. Manzanares, P. Ruiz-Sanchez, M. E. Gonzalez, J. Marquez, and J. M. Mates, Biochem. J. 386, 535 (2005). 59P. Gao, I. Tchernyshyov, T. C. Chang, Y. S. Lee, K. Kita, T. Ochi, K. I. Zeller, A. M. De Marzo, J. E. Van Eyk, J. T. Mendell et al., Nature 458, 762 (2009). 60F. A. Gallagher, M. I. Kettunen, S. E. Day, M. Lerche, and K. M. Brindle, Magn. Reson. Med. 60, 253 (2008). 61L. Olalla, A. Gutierrez, A. J. Jimenez, J. F. Lopez-Tellez, Z. U. Khan, J. Perez, F. J. Alonso, V. De la Rosa, J. A. Campos-Sandoval, J. A. Segura et al., J. Neurosci. Res. 86, 281 (2008). 62D. L. Zoetewey, M. Ovee, M. Banerjee, R. Bhaskaran, and S. Mohanty, Biochemistry 50, 3528 (2011). 63A. S. Fanning and J. M. Anderson, Curr. Biol. 6, 1385 (1996). 64H.-J. Lee and J. J. Zheng, Cell Commun. Signal 8, 8 (2010). 65T. Chimura, T. Launey, and M. Ito, BMC Genomics 12, 300 (2011). 66M. Sheng and C. Sala, Annu. Rev. Neurosci. 24, 1 (2001). 67M. A. Stiffler, J. R. Chen, V. P. Grantcharova, Y. Lei, D. Fuchs, J. E. Allen, L. A. Zaslavskaia, and G. MacBeath, Science 317, 364 (2007). 68Y. Ivarsson, FEBS Lett. 586, 2638 (2012). 69Y. Kong and M. Karplus, Proteins 74, 145 (2009). 70E. J. Fuentes, C. J. Der, and A. L. Lee, Mol. Biol. 335, 1105 (2004). 71P. De Los Rios, F. Cecconi, A. Pretre, G. Dietler, O. Michielin, F. Piazza, and B. Juanico, Biophys. J. 89, 14 (2005). 72S. Gianni, T. Walma, A. Arcovito, N. Calosci, A. Bellelli, A. Engström, C. Travaglini-Allocatelli, M. Brunori, P. Jemth, and G. W. Vuister, Structure 14, 1801 (2006). 73R. Tonikian, Y. Zhang, S. L. Sazinsky, B. Currell, J.-H. Yeh, B. Reva, H. A. Held, B. A. Appleton, M. Evangelista, Y. Wu et al., PLoS Biol. 6, e239 (2008). 74Z. N. Gerek and S. B. Ozkan, Protein Sci. 19, 914 (2010). 75M. Münz, J. Hein, and P. C. Biggin, PLoS Comput. Biol. 8, e1002749 (2012). 76R. G. Smock and L. M. Gierasch, Science 324, 198 (2009). 77E. J. Fuentes, S. A. Gilmore, R. V. Mauldin, and A. L. Lee, J. Mol. Biol. 364, 337 (2006). 78S. Gianni, A. Engström, M. Larsson, N. Calosci, F. Malatesta, L. Eklund, C. C. Ngang, C. Travaglini-Allocatelli, and P. Jemth, J. Biol. Chem. 280, 34805 (2005). 79R. Bloem, K. Koziol, S. A. Waldauer, B. Buchli, R. Walser, B. Samatanga, I. Jelesarov, and P. Hamm, J. Phys. Chem. B 116, 13705 (2012). 80S. Bhattacharya, J. H. Ju, N. Orlova, J. A. Khajeh, D. Cowburn, and Z. Bu, J. Mol. Biol. 425, 2509 (2013). 81C. M. Petit, J. Zhang, P. J. Sapienza, E. J. Fuentes, and A. L. Lee, Proc. Natl. Acad. Sci. U. S. A. 106, 18249 (2009). 82N. Ota and D. A. Agard, J. Mol. Biol. 351, 345 (2005). 83A. Dhulesia, J. Gsponer, and M. Vendruscolo, J. Am. Chem. Soc. 130, 8931 (2008).

J. Chem. Phys. 142, 245101 (2015) 84S.

Mostarda, D. Gfeller, and F. Rao, PLoS Comput. Biol. 8, e1002429 (2012). 85G. Hultqvist, S. R. Haq, A. S. Punekar, C. N. Chi, Å Engström, A. Bach, K. Strømgaard, M. Selmer, S. Gianni, and P. Jemth, Structure 21, 1193 (2013). 86B. Buchli, S. A. Waldauer, R. Walser, M. L. Donten, R. Pfister, N. Blöchliger, S. Steiner, A. Caflisch, O. Zerbe, and P. Hamm, Proc. Natl. Acad. Sci. U. S. A. 110, 11725 (2013). 87G. Tiwari and D. Mohanty, Plos ONE 8, e71340 (2013). 88K. Sharp and J. J. Skinner, Proteins 361, 347 (2006). 89S. Steiner and A. Caflisch, Proteins 80, 2562 (2012). 90Z. N. Gerek, O. Keskin, and S. B. Ozkan, Proteins 77, 796 (2009). 91N. Basdevant, H. Weinstein, and M. Ceruso, J. Am. Chem. Soc. 128, 12766 (2006). 92S. W. Lockless and R. Ranganathan, Science 286, 295 (1999). 93D. A. Case, T. A. Darden, T. E. Cheatham III, C. L. Simmerling, J. Wang, R. E. Duke, R. Luo, R. C. Walker, W. Zhang, K. M. Merz et al., AMBER 12 (University of California, 2012). 94R. Salomon-Ferrer, A. W. Goetz, D. Poole, S. L. Grand, and R. C. Walker, J. Chem. Theory Comput. 9, 3878 (2013). 95A. W. Goetz, M. J. Williamson, D. Xu, D. Poole, S. L. Grand, and R. C. Walker, J. Chem. Theory Comput. 8, 1542 (2012). 96S. L. Grand, A. W. Goetz, and R. C. Walker, Comput. Phys. Commun. 184, 374 (2013). 97V. Vnak, R. Abel, A. Okur, B. Strockbine, A. Roitberg, and C. Simmerling, Proteins: Struct., Funct., Genet. 65, 712 (2006). 98T. A. Andrea, W. C. Swope, and H. C. Andersen, J. Chem. Phys. 799, 4576 (1983). 99R. J. Bell, P. Dean, and D. C. Hibbins-Butter, J. Phys. C: Solid State Phys. 3, 2111 (1970). 100S. N. Taraskin and S. R. Elliott, Phys. Rev. B 59, 8572 (1999). 101D. M. Van Aalten, A. Amadei, A. B. Linssen, V. G. Eijsink, G. Vriend, and H. J. Berendsen, Proteins 22, 45 (1995). 102M. E. Wall, A. Rechtsteinner, and L. M. Rocha, “Singular value decomposition and principal component analysis,” in A Practical Approach to Microarray Data Analysis, edited by D. P. Berrar, W. Dubitzky, and M. Granzow (Kluwer, Norwell, MA, 2003), pp. 91–109. 103W. J. Krzanowski, J. Am. Stat. Assoc. 74, 703 (1979). 104M. W. Blows, S. F. Chenoweth, and E. Hine, Am. Nat. 163, 329 (2004). 105B. Flury, Common Principal Components and Related Multivariate Models (Wiley, New York, 1988). 106R. D. Cohn, J. Agri. Biol. Environ. Stat. 4, 238 (1999). 107M. Kirkpatrick, Genetica 136, 271 (2009). 108J. Zhang, X. Yan, C. Shi, X. Yang, Y. Guo, C. Tian, J. Long, and Y. Shen, J. Mol. Biol. 384, 255 (2008). 109M. A. Durney, G. Birrane, C. Anklin, A. Soni, and J. A. A. Ladias, J. Biomol. NMR 45, 329 (2009). 110J. Schultz, U. Hoffmuller, G. Krause, J. Ashurst, M. J. Macias, P. Schmieder, J. Schneider-Mergener, and H. Oschkinat, Nat. Struct. Biol. 5, 19 (1998). 111M. Banerjee, C. Huang, J. Marquez, and S. Mohanty, Biochemistry 47, 9208 (2008). 112See supplementary material at http://dx.doi.org/10.1063/1.4922925 for the comparison of Cα isotopic relative displacements.

This article is copyrighted as indicated in the article. Reuse of AIP content is subject to the terms at: http://scitation.aip.org/termsconditions. Downloaded to IP: 128.178.131.113 On: Tue, 29 Sep 2015 07:21:44

On the analysis and comparison of conformer-specific essential dynamics upon ligand binding to a protein.

The native state of a protein consists of an equilibrium of conformational states on an energy landscape rather than existing as a single static state...
6MB Sizes 0 Downloads 9 Views