Progress in Biophysics and Molecular Biology xxx (2015) 1–6


Progress in Biophysics and Molecular Biology journal homepage: www.elsevier.com/locate/pbiomolbio

Progress in studying intrinsically disordered proteins with atomistic simulations

Nathaniel Stanley a, Santiago Esteban-Martín a, b, Gianni De Fabritiis a, c, *

a Computational Biophysics Laboratory (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C/Doctor Aiguader 88, 08003 Barcelona, Spain
b Joint BSC-IRB-CRG Research Programme in Computational Biology, Barcelona Supercomputing Center – BSC, Jordi Girona 29, 08034 Barcelona, Spain
c Institució Catalana de Recerca i Estudis Avançats, Passeig Lluis Companys 23, 08010 Barcelona, Spain

Article info

Article history: Available online xxx

Keywords: Intrinsically disordered proteins; High-throughput molecular dynamics; Markov state models

Abstract

Intrinsically disordered proteins are increasingly the focus of biological research since their significance was acknowledged over a decade ago. Due to their importance in biomolecular interactions, they are found to play key roles in many diseases such as cancers and amyloidoses. However, because they lack stable structure they pose a challenge for many experimental methods that are traditionally used to study proteins. Atomistic molecular dynamics simulations can help get around many of the problems faced by such methods provided appropriate timescales are sampled and underlying empirical force fields are applicable. This review presents recent works that highlight the power and potential of atomistic simulations to transform the investigatory pipeline by providing critical insights into the behavior and interactions of intrinsically disordered proteins. © 2015 Elsevier Ltd. All rights reserved.

Contents

1. Introduction
2. Extensive all-atom simulations of IDPs
2.1. Post-translational modifications and IDPs
2.2. Aggregation of IDPs
3. Discussion
4. Conclusion
Conflict of interest
Acknowledgments
References

Abbreviations: IDP, intrinsically disordered protein; MD, molecular dynamics; HTMD, high-throughput molecular dynamics; MSM, Markov state model; KID, kinase inducible domain; NMR, nuclear magnetic resonance; PRE, paramagnetic relaxation enhancement; PTM, post-translational modification; TICA, time-lagged independent component analysis; Aβ, amyloid beta; hIAPP, human islet amyloid polypeptide; EGFR, epidermal growth factor receptor.
* Corresponding author. Computational Biophysics Laboratory (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C/Doctor Aiguader 88, 08003 Barcelona, Spain. E-mail address: [email protected] (G. De Fabritiis).

1. Introduction

Intrinsically disordered proteins (IDPs) are proteins that lack fixed secondary and tertiary structure. This makes them particularly difficult to characterize by traditional biophysical techniques like X-ray crystallography, which in part led to their being largely ignored for many years. They finally gained widespread acceptance around the turn of the millennium as their prevalence and functional significance became clear (Dunker et al., 2001; Tompa, 2009, 2002; Uversky et al., 2000; Wright and Dyson, 1999).

http://dx.doi.org/10.1016/j.pbiomolbio.2015.03.003 0079-6107/© 2015 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Stanley, N., et al., Progress in studying intrinsically disordered proteins with atomistic simulations, Progress in Biophysics and Molecular Biology (2015), http://dx.doi.org/10.1016/j.pbiomolbio.2015.03.003


Extensive investigation over the last decade has made it clear that IDPs have frequent and important roles in biological processes. Disorder is found in both prokaryotes and eukaryotes, but is more prevalent in eukaryotes, where disordered regions are found in more than 50% of proteins (Pancsa and Tompa, 2012). Disordered regions are enriched in regulatory and signaling proteins, and are less commonly found in proteins responsible for metabolism, biosynthesis or transport. They offer various functional advantages, among them the ability to bind multiple different partners, to form weak but highly specific interactions, and to serve as frequent targets of post-translational modification (Dunker et al., 2002; Dyson, 2011; Dyson and Wright, 2005).

Their prevalence in key regulatory functions in the cell means that they commonly play roles in disease (Uversky et al., 2008). They are found mutated in numerous cancers (Iakoucheva et al., 2002), are unexpectedly common in cardiovascular disease (Cheng et al., 2006a), and are common components of the fibrils of various amyloidoses such as Alzheimer's disease (Lashuel et al., 2002), Parkinson's disease, and diabetes (Höppener et al., 2000). A better understanding of how and why they cause or participate in such diseases is crucial from a therapeutic perspective.

Some of the first clues to the existence of IDPs came from crystal structures with missing sections in their electron density maps (Alber et al., 1983; Lewis et al., 1996; Spolar and Record, 1994), in some cases parts critical to function (Aviles et al., 1978; Muchmore et al., 1996). This, along with data from NMR and CD experiments, led to the creation of databases of disordered regions such as DisProt (Sickmeier et al., 2007) and inspired tools that try to predict disorder from sequence alone, such as PONDR (Romero et al., 1997).
There are now many such predictors (He et al., 2009; Linding et al., 2003), and even meta-predictors (Xue et al., 2010).

Some work has already been done to understand what chemical properties make IDPs disordered, and what static or dynamic structural properties they may have. Most disordered proteins are polyampholytes, enriched in charged residues and depleted in hydrophobic residues (Uversky et al., 2000). The distribution of charge along the sequence determines how collapsed or extended the IDP is (Das and Pappu, 2013). The prevalence and importance of transient metastable structures is unclear, but contact between hydrophobic residues has already been shown to be one mechanism that confers such transient structure (Meng et al., 2013). A database has been created to store structural ensembles of IDPs for future study (Varadi et al., 2013).

Despite all this progress, IDPs are still difficult to study from a biophysics point of view. Generally, the methods used are limited in either their spatial or their time resolution. X-ray crystallography, for example, can give accurate information on atomic positions, but requires that the atoms be stable, or at least transition only slowly between a few positions. Nuclear magnetic resonance (NMR) methods can give general information about residual secondary structure or transiently formed long-range contacts (Eliezer, 2009), as well as the timescale of conformational transitions. However, the information is ensemble averaged and, despite recent advances (Ban et al., 2013; Palmer III, 2014), limitations on the accessible timescale remain. SAXS is particularly well suited to studying the degree of collapse in IDPs, and single-molecule FRET can single out conformations with varying degrees of extension. In short, experimental methods have clear limitations in their ability to give detailed information about the states and transitions of IDPs.
The challenges faced by the methods above stress the need for new approaches. Secondary structural motifs like α-helices and β-hairpins form on the 0.1–10 μs timescale, and even the fastest-folding proteins take multiple microseconds to milliseconds to fold (Lindorff-Larsen et al., 2011; Snow et al., 2002). Meaningful transitions in IDPs will likely occur on similar timescales, so any technique that is to fill this void must be able to identify transitions and metastable states formed on these timescales or longer.

Long-timescale, explicit-solvent molecular dynamics simulations are perhaps just the tool. Molecular dynamics simulations use a classical Newtonian representation of atoms and molecules, with the forces between them encoded in a force field that contains all the chemical specificity (Karplus and McCammon, 2002; Levitt, 2001). Specialized computer hardware now exists that allows one to perform single simulations on millisecond timescales (Ohmura et al., 2014; Shaw et al., 2007). While these are important steps forward, such tools are expensive and difficult to access. New methods pioneered by our group and others, which use off-the-shelf GPU hardware and specialized analyses, mean such investigations are open to a much broader research community.

In the following sections, we highlight the new and important role these tools have had in investigating several disordered proteins (Table 1). While numerous kinds of simulations have been done, we focus on the state of the art by looking at unbiased, all-atom, explicit-solvent simulations that extend into the tens of microseconds and beyond. While many biasing or coarse-graining techniques exist to accelerate such simulations, they often require prior knowledge about a system. We direct the reader to a comprehensive review of such techniques and their limitations (Zuckerman, 2011).

2. Extensive all-atom simulations of IDPs

The first notable study of the disordered state of a protein using extensive molecular dynamics was that of bovine acyl-coenzyme A binding protein (ACBP), performed on the Anton supercomputer (Lindorff-Larsen et al., 2012). It covered 200 μs of simulation time, two orders of magnitude longer than any previous study of an IDP.
As the first study of its kind, the goal was to determine how well the simulations reproduced the extensive, though sparse, NMR data available for the protein. Key to this end was the use of a state-of-the-art force field, CHARMM22*, which balances secondary structural propensity to match experimental data (Piana et al., 2011). Lindorff-Larsen et al. found that the simulations reliably reproduced several different NMR observables, including the helical fraction of each residue, spectral densities and order parameters. The only major discrepancy they found was the radius of gyration, which was significantly more collapsed in the simulations. One of the most interesting findings of the work was a conformation that formed but did not break during the simulations. This highlights the difficulty of using simulations to investigate IDPs or proteins that fold on such timescales. While this will likely remain an issue for simulations in the foreseeable future, it is encouraging that a simulation of that length could describe so much of the protein's behavior.

Working with single long trajectories is convenient, as analysis becomes much more complex when working with many short parallel simulations. One of the biggest issues in properly characterizing the states and motions of an IDP is that conformers may be geometrically close but kinetically distant, and vice versa. Our group struggled with this issue in one of our first works with IDPs, in which we studied the HIV-1 fusion peptide (Venken et al., 2013). We were unable to adequately cluster the data into meaningful states, and therefore could only make general statements about its bulk properties. This issue was resolved with the development of the TICA method (Pérez-Hernández et al., 2013; Schwantes and Pande, 2013), which allows the data to first be projected along its slowest coordinates before being clustered into meaningful states. This makes it possible to simulate IDPs with highly parallel methods such as GPU clusters.
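The idea behind TICA can be illustrated with a minimal NumPy/SciPy sketch. It is not the implementation used in the works cited above: the function names `tica` and `toy_trajectory`, the lag time and the synthetic two-feature trajectory are all assumptions made for illustration. The sketch builds the instantaneous and time-lagged covariance matrices of the input features and solves the resulting generalized eigenvalue problem, so that the leading components are the linear combinations of features that decorrelate most slowly.

```python
import numpy as np
from scipy.linalg import eigh


def tica(X, lag, n_components=1):
    """Project X (n_frames x n_features) onto its slowest linear coordinates."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)                 # remove the mean of every feature
    A, B = X[:-lag], X[lag:]               # frame pairs separated by `lag`
    n = len(A)
    C0 = (A.T @ A + B.T @ B) / (2 * n)     # instantaneous covariance
    Ct = (A.T @ B + B.T @ A) / (2 * n)     # symmetrized time-lagged covariance
    evals, evecs = eigh(Ct, C0)            # solves Ct v = lambda * C0 v
    order = np.argsort(evals)[::-1][:n_components]   # largest eigenvalue = slowest
    return X @ evecs[:, order], evals[order]


def toy_trajectory(n_frames=20000, seed=0):
    """Hypothetical data: a slow two-state switch hidden under large fast noise."""
    rng = np.random.default_rng(seed)
    flips = rng.random(n_frames) < 0.005           # rare state changes
    state = np.cumsum(flips) % 2                   # 0/1 slow variable
    slow = state + 0.1 * rng.standard_normal(n_frames)
    fast = 5.0 * rng.standard_normal(n_frames)     # high variance, no memory
    mix = np.array([[1.0, 0.3], [0.5, 1.0]])       # invertible mixing of features
    return np.column_stack([slow, fast]) @ mix, slow


if __name__ == "__main__":
    X, slow = toy_trajectory()
    Y, evals = tica(X, lag=10)
    # Implied timescale (in frames) of the slowest component.
    print("eigenvalue:", evals[0], "timescale:", -10 / np.log(evals[0]))
    print("correlation with slow variable:", np.corrcoef(Y[:, 0], slow)[0, 1])
```

In this toy system the first component should follow the slow two-state switch even though the fast noise carries most of the variance, which is why clustering on such projections, rather than on raw geometric distances, tends to separate kinetically distinct states.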


Table 1. Summary of disordered regions discussed in this review.

| Name | Abbreviation | Aggregate simulation | Key result | Publication |
|---|---|---|---|---|
| Acyl-coenzyme A binding protein | ACBP | 200 μs | Simulations can reproduce many experimental observables. | (Lindorff-Larsen et al., 2012) |
| Kinase inducible domain | KID | 1.7 ms | Phosphorylation induces a kinetic slowdown that could be important to binding. | (Stanley et al., 2014) |
| Epidermal growth factor receptor | EGFR αC loop | >100 μs | Mutations and PTMs can regulate intrinsic disorder of loops, leading to disease. | (Shan et al., 2013, 2012) |
| Amyloid beta peptide | Aβ | 700 μs | Greater secondary structural propensity of different forms explains greater potential to form fibrils. | (Lin et al., 2012) |
| Human islet amyloid polypeptide | hIAPP | 70 μs | β-hairpin conformer may lead to nucleation of fibril. | (Qiao et al., 2013) |
| HIV-1 fusion peptide | FP | 30 μs | Structure is highly heterogeneous and quickly interconverting; purely geometric clustering inadequate. | (Venken et al., 2013) |

While these first studies were important, the goal of biological research is often to draw lessons that are either broadly or specifically applicable. In the following subsections, we summarize how simulations of IDPs have been used to advance our understanding, grouped by where and how they have been helpful in biological research.

2.1. Post-translational modifications and IDPs

Post-translational modifications are common in intrinsically disordered regions, and several investigations into how they may modulate IDPs have yielded important lessons. In a recent work by our group, we investigated an IDP known as the kinase inducible domain (KID) (Stanley et al., 2014). KID is part of CREB and, when phosphorylated at residue 133, binds to a domain in CBP to initiate transcription. KID was one of the first and most studied IDPs, so an extensive array of experimental data existed with which to compare our simulations. We assessed the kinetics and energetics of the KID peptide in its phosphorylated, non-phosphorylated, and S133E mutant forms. We found that phosphorylation caused a 60-fold slowdown in conformational kinetics between the disordered state and a metastable state believed to be important for binding (Fig. 1). This change was not accompanied by a large change in absolute populations, and the mutation to glutamate could not recapitulate this effect.

While the exchange slowdown observed is of interest by itself, we devised a kinetic model to show that this behavior can have real consequences for binding. A 100-fold slowdown in exchange between a binding-competent state and a non-competent state, for example, can cause a 10-fold change in binding affinity. While the binding affinity of KID to partners like KIX has been attributed largely to specific residue contacts, binding of IDPs is a complex process. The fact that such a mechanism might exist is an important consideration for any IDP that undergoes post-translational modification.

Molecular simulations have also been instrumental in showing how mutation and post-translational modification change the behavior of a disordered loop in EGFR kinase and can lead to cancer. It had previously been reported that EGFR kinase needs to form an asymmetric dimer in order to activate itself. This process is mediated by a loop that is disordered. In recent work, Shan et al. used molecular simulations to show that mutations and phosphorylation of this loop reduce its intrinsic disorder, making this dimerization more probable (Shan et al., 2013, 2012). This explains not only how mutations cause constitutive activation of the signaling pathway, but also how signaling is regulated under normal cellular conditions.

2.2. Aggregation of IDPs

Fig. 1. Schematic illustration of the kinetic modulation of KID peptide by phosphorylation. Phosphorylation of S133 in the KID domain results in a 60-fold change in the kinetic exchange between the disordered ensemble and a metastable state that may be important to binding. Interestingly, this does not result in nearly as stark a change in the populations of the states, only increasing the population of the metastable ordered state from 1% to 5%. Simulations will be instrumental in uncovering such phenomena in the future.
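The numbers quoted in the caption of Fig. 1 can be turned into a small worked example. The sketch below is an illustration only, not the Markov state model of the original study: it assumes a two-state exchange between a disordered state D and an ordered state O, and it interprets the 60-fold figure as the ratio of the overall exchange (relaxation) rates, expressed in arbitrary relative units. The function names `two_state_rates` and `stationary_population` are hypothetical. For two states, the ordered population is p = k_form / (k_form + k_break) and the exchange rate is k_ex = k_form + k_break, so both microscopic rates follow from p and k_ex.

```python
def two_state_rates(p_ordered, k_exchange):
    """Return (k_form, k_break) for D <-> O from population and exchange rate."""
    k_form = p_ordered * k_exchange              # D -> O
    k_break = (1.0 - p_ordered) * k_exchange     # O -> D
    return k_form, k_break


def stationary_population(k_form, k_break):
    """Equilibrium population of the ordered state in a two-state model."""
    return k_form / (k_form + k_break)


if __name__ == "__main__":
    # Assumed inputs: populations from the Fig. 1 caption, exchange rates in
    # relative units (phosphorylated form defined as 1, unphosphorylated as 60).
    kf_u, kb_u = two_state_rates(p_ordered=0.01, k_exchange=60.0)
    kf_p, kb_p = two_state_rates(p_ordered=0.05, k_exchange=1.0)
    print("slowdown of formation rate:", kf_u / kf_p)
    print("slowdown of breaking rate:", kb_u / kb_p)
```

Under these assumptions both the formation and the breaking of the ordered state slow down (by roughly 12-fold and 60-fold, respectively), while the population shifts only from 1% to 5%. This illustrates how a modification can act mainly on kinetics rather than on populations.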

Another area where simulations can play an important role is in studying proteins that aggregate, which are often disordered until they are incorporated into fibrils. A better understanding of this process might help in the design of strategies to treat numerous diseases. Two such studies have already been attempted, one with Aβ (Lin et al., 2012), important in Alzheimer's disease, and the other with hIAPP (Qiao et al., 2013), found in type II diabetes.

Simulations were needed for the Aβ study because the peptide is highly hydrophobic and begins to form fibrils quickly, making experiments challenging. The conformational ensembles of the two most dominant in vivo lengths, Aβ40 and Aβ42, were investigated along with a mutant known as Aβ42-E22K. Using 700 μs of simulation, the authors were able to explain that Aβ42 and the Aβ42-E22K mutant were more likely to aggregate due to increased β-sheet or α-helical secondary structure, respectively.

In the other study, hIAPP was simulated for 70 μs to better understand its conformations and how those might lead to aggregation. The authors found several states with large amounts of β-hairpin secondary structure, which could lead to nucleation of the fibrils. The work also suggests that a conformational selection mechanism is needed for fibril formation, at least in that case but perhaps in many others: it is proposed that the peptide must reach a certain conformer before nucleation can happen. Of course, this still leaves unclear how the fibril extends, which can be an area of future study.

3. Discussion

We are only just beginning to understand IDPs and their importance, so it is worth discussing where molecular simulations of IDPs may be useful in the future. One area in which simulations will undoubtedly be essential is in understanding how IDPs bind to other proteins. Successful simulations have already been performed by our group, in which we bound the short disordered peptide pYEEIþ to the SH2 domain (Giorgino et al., 2012). Interestingly, we were able to fully reproduce the experimental kinetics and energetics. Still, many fundamental questions remain about IDP binding. It has been hotly debated what advantages disorder can confer in protein–protein binding, and while examples exist of both conformational selection and induced fit mechanisms in IDP binding (Onitsuka et al., 2008), whether one is more common or advantageous than the other is unknown. Further investigations into longer peptides with more complex binding profiles are needed.

Finally, while the investigations into aggregating IDPs mentioned above are interesting, they raise the question of whether we can observe the nucleation and growth of fibrils. The ability to do so would not only give us critical insights into their growth, but could also lead to ways to prevent or retard it.

Perhaps one of the most interesting potential uses of studying IDPs with molecular simulations is the hope that we may identify and drug their metastable conformations, thereby activating or deactivating them. The possibility of drugging disordered regions has been discussed at length (Anurag and Dash, 2009; Cheng et al., 2006b; Metallo, 2010), and there are already cases showing it is possible (Krishnan et al., 2014).
While simulations have not yet played an integral part in such a development, some groups have already made successful forays in that direction (Michel and Cuchillo, 2012).

Simulations should be used alongside experiments to help improve and support one another whenever possible. Some recent works, for example, make good strides towards resolving questions about how collapsed IDPs are in solution and in simulation. A comprehensive assessment undertaken recently by Palazzesi et al. (2014) showed substantial heterogeneity in how different force fields approximate disordered ensembles. Skinner et al. (2014) have shown that force fields likely do not accurately describe interactions between water and backbone amide bonds, resulting in a more collapsed form. In a separate work, Best et al. (2014) showed that slight modifications to the protein–water pair interaction can remedy this problem. Finally, Zerze et al. (2014) showed that chromophores attached to a peptide in FRET experiments have only a modest effect on the conformation of an IDP, suggesting they provide accurate and useful information. Taken together, these results clearly show how experimental observations and simulations can help improve and support one another.

It is a near certainty that other deficiencies in the force fields remain to be uncovered. Systematic approaches to parameterization have recently been advanced (Wang et al., 2014), and a potential way to propagate advances such as those mentioned above would be the creation of a consortium to ensure force field accuracy. Long-timescale simulations will help discover such problems and ensure the accuracy of improvements.

With all the recent advances that have made the above works possible, it is reasonable to ask whether there is further room for improvement or whether we are at a temporary plateau. Fortunately, improvements in both hardware and methods are already here.
The Anton supercomputer now has a successor, the Anton 2 (Shaw et al., 2014), which will be capable of performing 85 μs/day of simulation for a ~25,000-atom system. That is ten times faster than the previous model. While the Anton 2 will not be available for general use, its development is important nonetheless, as it makes clear what is possible and will hopefully spur public and private institutions to take action to make such capability more broadly accessible.

Advancements in the methods for performing and analyzing simulations will also make long timescales more accessible by increasing our ability to sample rare states. Markov state models were originally used only after simulations had been performed, to make sense of large numbers of parallel simulations. However, it has been suggested for some time that they could be used between successive rounds of simulation to explore unsampled conformations and ensure adequate sampling of the processes seen (Pande et al., 2010). Our group has successfully applied these methods to the simple case of benzamidine binding to trypsin (Doerr and De Fabritiis, 2014), reducing the required simulation time by an order of magnitude. Other similar methods have also been used successfully to achieve this kind of rate enhancement (Du and Bolhuis, 2015; Noé, 2015). Using such methods for more complex processes will be more challenging, but we believe that they will become an indispensable tool as we explore millisecond timescales and beyond.

4. Conclusion

The examples covered in this review make it clear that MD is becoming an essential tool for studying IDPs. It is now possible to simulate IDPs at atomic scale and on long timescales. As a result, IDP behavior can be studied meaningfully and properly validated against experimental data. This opens the possibility of uncovering previously hidden features of IDPs. Simulations can be used to fully characterize the states and transitions of disordered proteins, and how those may affect processes such as binding or aggregation.
We expect that simulations will be instrumental in developing good strategies to drug IDPs. And as hardware and methods continue to improve, the possibilities will only expand. Intrinsically disordered proteins are only beginning to be understood, and current successes suggest simulations will play a crucial role in their study, from basic biophysics all the way to the development of therapeutics.

Conflict of interest

The authors declare no competing financial interest.

Acknowledgments

We thank the volunteers at GPUGRID.net for contributing computing resources to the work presented here. G.D.F. acknowledges support by the Spanish Ministry of Science and Innovation (ref. BIO2011/27450). S.E.M. is a recipient of a Juan de la Cierva Fellowship.

References

Alber, T., Gilbert, W.A., Ponzi, D.R., Petsko, G.A., 1983. The role of mobility in the substrate binding and catalytic machinery of enzymes. Ciba Found. Symp. 93, 4–24.
Anurag, M., Dash, D., 2009. Unraveling the potential of intrinsically disordered proteins as drug targets: application to Mycobacterium tuberculosis. Mol. Biosyst. 5, 1752–1757. http://dx.doi.org/10.1039/B905518P.
Aviles, F.J., Chapman, G.E., Kneale, G.G., Crane-Robinson, C., Bradbury, E.M., 1978. The conformation of histone H5. Eur. J. Biochem. 88, 363–371. http://dx.doi.org/10.1111/j.1432-1033.1978.tb12457.x.
Ban, D., Sabo, T.M., Griesinger, C., Lee, D., 2013. Measuring dynamic and kinetic information in the previously inaccessible supra-τc window of nanoseconds to microseconds by solution NMR spectroscopy. Molecules 18, 11904–11937. http://dx.doi.org/10.3390/molecules181011904.
Best, R.B., Zheng, W., Mittal, J., 2014. Balanced protein–water interactions improve properties of disordered proteins and non-specific protein association. J. Chem. Theory Comput. http://dx.doi.org/10.1021/ct500569b.
Cheng, Y., LeGall, T., Oldfield, C.J., Dunker, A.K., Uversky, V.N., 2006a. Abundance of intrinsic disorder in protein associated with cardiovascular disease. Biochemistry 45, 10448–10460. http://dx.doi.org/10.1021/bi060981d.
Cheng, Y., LeGall, T., Oldfield, C.J., Mueller, J.P., Van, Y.-Y.J., Romero, P., Cortese, M.S., Uversky, V.N., Dunker, A.K., 2006b. Rational drug design via intrinsically disordered protein. Trends Biotechnol. 24, 435–442. http://dx.doi.org/10.1016/j.tibtech.2006.07.005.
Das, R.K., Pappu, R.V., 2013. Conformations of intrinsically disordered proteins are influenced by linear sequence distributions of oppositely charged residues. Proc. Natl. Acad. Sci. U. S. A. 110, 13392–13397. http://dx.doi.org/10.1073/pnas.1304749110.
Doerr, S., De Fabritiis, G., 2014. On-the-fly learning and sampling of ligand binding by high-throughput molecular simulations. J. Chem. Theory Comput. http://dx.doi.org/10.1021/ct400919u.
Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M., Obradović, Z., 2002. Intrinsic disorder and protein function. Biochemistry 41, 6573–6582. http://dx.doi.org/10.1021/bi012159.
Dunker, A.K., Lawson, J.D., Brown, C.J., Williams, R.M., Romero, P., Oh, J.S., Oldfield, C.J., Campen, A.M., Ratliff, C.M., Hipps, K.W., Ausio, J., Nissen, M.S., Reeves, R., Kang, C., Kissinger, C.R., Bailey, R.W., Griswold, M.D., Chiu, W., Garner, E.C., Obradovic, Z., 2001. Intrinsically disordered protein. J. Mol. Graph. Model. 19, 26–59. http://dx.doi.org/10.1016/S1093-3263(00)00138-8.
Du, W., Bolhuis, P.G., 2015. Equilibrium kinetic network of the villin headpiece in implicit solvent. Biophys. J. 108, 368–378. http://dx.doi.org/10.1016/j.bpj.2014.11.3476.
Dyson, H.J., 2011. Expanding the proteome: disordered and alternatively folded proteins. Q. Rev. Biophys. 44, 467–518. http://dx.doi.org/10.1017/S0033583511000060.
Dyson, H.J., Wright, P.E., 2005. Intrinsically unstructured proteins and their functions. Nat. Rev. Mol. Cell Biol. 6, 197–208. http://dx.doi.org/10.1038/nrm1589.
Eliezer, D., 2009. Biophysical characterization of intrinsically disordered proteins. Curr. Opin. Struct. Biol. 19, 23–30. http://dx.doi.org/10.1016/j.sbi.2008.12.004 (Folding and binding/Protein–nucleic acid interactions).
Giorgino, T., Buch, I., De Fabritiis, G., 2012. Visualizing the induced binding of SH2–phosphopeptide. J. Chem. Theory Comput. 8, 1171–1175. http://dx.doi.org/10.1021/ct300003f.
He, B., Wang, K., Liu, Y., Xue, B., Uversky, V.N., Dunker, A.K., 2009. Predicting intrinsic disorder in proteins: an overview. Cell Res. 19, 929–949. http://dx.doi.org/10.1038/cr.2009.87.
Höppener, J.W.M., Ahrén, B., Lips, C.J.M., 2000. Islet amyloid and type 2 diabetes mellitus. N. Engl. J. Med. 343, 411–419. http://dx.doi.org/10.1056/NEJM200008103430607.
Iakoucheva, L.M., Brown, C.J., Lawson, J.D., Obradović, Z., Dunker, A.K., 2002. Intrinsic disorder in cell-signaling and cancer-associated proteins. J. Mol. Biol. 323, 573–584.
Karplus, M., McCammon, J.A., 2002. Molecular dynamics simulations of biomolecules. Nat. Struct. Mol. Biol. 9, 646–652. http://dx.doi.org/10.1038/nsb0902-646.
Krishnan, N., Koveal, D., Miller, D.H., Xue, B., Akshinthala, S.D., Kragelj, J., Jensen, M.R., Gauss, C.-M., Page, R., Blackledge, M., Muthuswamy, S.K., Peti, W., Tonks, N.K., 2014. Targeting the disordered C terminus of PTP1B with an allosteric inhibitor. Nat. Chem. Biol. http://dx.doi.org/10.1038/nchembio.1528 (advance online publication).
Lashuel, H.A., Hartley, D., Petre, B.M., Walz, T., Lansbury, P.T., 2002. Neurodegenerative disease: amyloid pores from pathogenic mutations. Nature 418, 291. http://dx.doi.org/10.1038/418291a.
Levitt, M., 2001. The birth of computational structural biology. Nat. Struct. Mol. Biol. 8, 392–393. http://dx.doi.org/10.1038/87545.
Lewis, M., Chang, G., Horton, N.C., Kercher, M.A., Pace, H.C., Schumacher, M.A., Brennan, R.G., Lu, P., 1996. Crystal structure of the lactose operon repressor and its complexes with DNA and inducer. Science 271, 1247–1254.
Linding, R., Jensen, L.J., Diella, F., Bork, P., Gibson, T.J., Russell, R.B., 2003. Protein disorder prediction: implications for structural proteomics. Structure 11, 1453–1459. http://dx.doi.org/10.1016/j.str.2003.10.002.
Lindorff-Larsen, K., Piana, S., Dror, R.O., Shaw, D.E., 2011. How fast-folding proteins fold. Science 334, 517–520. http://dx.doi.org/10.1126/science.1208351.
Lindorff-Larsen, K., Trbovic, N., Maragakis, P., Piana, S., Shaw, D.E., 2012. Structure and dynamics of an unfolded protein examined by molecular dynamics simulation. J. Am. Chem. Soc. 134, 3787–3791. http://dx.doi.org/10.1021/ja209931w.
Lin, Y.-S., Bowman, G.R., Beauchamp, K.A., Pande, V.S., 2012. Investigating how peptide length and a pathogenic mutation modify the structural ensemble of amyloid beta monomer. Biophys. J. 102, 315–324. http://dx.doi.org/10.1016/j.bpj.2011.12.002.
Meng, W., Lyle, N., Luan, B., Raleigh, D.P., Pappu, R.V., 2013. Experiments and simulations show how long-range contacts can form in expanded unfolded proteins with negligible secondary structure. Proc. Natl. Acad. Sci. U. S. A. http://dx.doi.org/10.1073/pnas.1216979110.
Metallo, S.J., 2010. Intrinsically disordered proteins are potential drug targets. Curr. Opin. Chem. Biol. 14, 481–488. http://dx.doi.org/10.1016/j.cbpa.2010.06.169.


Michel, J., Cuchillo, R., 2012. The impact of small molecule binding on the energy landscape of the intrinsically disordered protein C-myc. PLoS ONE 7, e41070. http://dx.doi.org/10.1371/journal.pone.0041070.
Muchmore, S.W., Sattler, M., Liang, H., Meadows, R.P., Harlan, J.E., Yoon, H.S., Nettesheim, D., Chang, B.S., Thompson, C.B., Wong, S.L., Ng, S.L., Fesik, S.W., 1996. X-ray and NMR structure of human Bcl-xL, an inhibitor of programmed cell death. Nature 381, 335–341. http://dx.doi.org/10.1038/381335a0.
Noé, F., 2015. Beating the millisecond barrier in molecular dynamics simulations. Biophys. J. 108, 228–229. http://dx.doi.org/10.1016/j.bpj.2014.11.3477.
Ohmura, I., Morimoto, G., Ohno, Y., Hasegawa, A., Taiji, M., 2014. MDGRAPE-4: a special-purpose computer system for molecular dynamics simulations. Philos. Trans. R. Soc. Math. Phys. Eng. Sci. 372, 20130387. http://dx.doi.org/10.1098/rsta.2013.0387.
Onitsuka, M., Kamikubo, H., Yamazaki, Y., Kataoka, M., 2008. Mechanism of induced folding: both folding before binding and binding before folding can be realized in staphylococcal nuclease mutants. Proteins 72, 837–847. http://dx.doi.org/10.1002/prot.21978 (Struct. Funct. Bioinforma.).
Palazzesi, F., Prakash, M.K., Bonomi, M., Barducci, A., 2014. Accuracy of current all-atom force-fields in modeling protein disordered states. J. Chem. Theory Comput. 11, 2–7. http://dx.doi.org/10.1021/ct500718s.
Palmer III, A.G., 2014. Chemical exchange in biomacromolecules: past, present, and future. J. Magn. Reson. 241, 3–17. http://dx.doi.org/10.1016/j.jmr.2014.01.008.
Pancsa, R., Tompa, P., 2012. Structural disorder in eukaryotes. PLoS ONE 7, e34687. http://dx.doi.org/10.1371/journal.pone.0034687.
Pande, V.S., Beauchamp, K., Bowman, G.R., 2010. Everything you wanted to know about Markov state models but were afraid to ask. Methods. http://dx.doi.org/10.1016/j.ymeth.2010.06.002 (San Diego Calif.).
Pérez-Hernández, G., Paul, F., Giorgino, T., De Fabritiis, G., Noé, F., 2013. Identification of slow molecular order parameters for Markov model construction. J. Chem. Phys. 139, 015102-1–015102-13. http://dx.doi.org/10.1063/1.4811489.
Piana, S., Lindorff-Larsen, K., Shaw, D.E., 2011. How robust are protein folding simulations with respect to force field parameterization? Biophys. J. 100, L47–L49. http://dx.doi.org/10.1016/j.bpj.2011.03.051.
Qiao, Q., Bowman, G.R., Huang, X., 2013. Dynamics of an intrinsically disordered protein reveal metastable conformations that potentially seed aggregation. J. Am. Chem. Soc. http://dx.doi.org/10.1021/ja403147m.
Romero, P., Obradovic, Z., Kissinger, C., Villafranca, J.E., Dunker, A.K., 1997. Identifying disordered regions in proteins from amino acid sequence. In: International Conference on Neural Networks, 1997. Presented at the International Conference on Neural Networks, 1997, vol. 1, pp. 90–95. http://dx.doi.org/10.1109/ICNN.1997.611643.
Schwantes, C.R., Pande, V.S., 2013. Improvements in Markov state model construction reveal many non-native interactions in the folding of NTL9. J. Chem. Theory Comput. 9, 2000–2009. http://dx.doi.org/10.1021/ct300878a.
Shan, Y., Arkhipov, A., Kim, E.T., Pan, A.C., Shaw, D.E., 2013. Transitions to catalytically inactive conformations in EGFR kinase. Proc. Natl. Acad. Sci. U. S. A. 110, 7270–7275. http://dx.doi.org/10.1073/pnas.1220843110.
Shan, Y., Eastwood, M.P., Zhang, X., Kim, E.T., Arkhipov, A., Dror, R.O., Jumper, J., Kuriyan, J., Shaw, D.E., 2012. Oncogenic mutations counteract intrinsic disorder in the EGFR kinase and promote receptor dimerization. Cell 149, 860–870. http://dx.doi.org/10.1016/j.cell.2012.02.063.
Shaw, D.E., Deneroff, M.M., Dror, R.O., Kuskin, J.S., Larson, R.H., Salmon, J.K., Young, C., Batson, B., Bowers, K.J., Chao, J.C., Eastwood, M.P., Gagliardo, J., Grossman, J.P., Ho, C.R., Ierardi, D.J., Kolossváry, I., Klepeis, J.L., Layman, T., McLeavey, C., Moraes, M.A., Mueller, R., Priest, E.C., Shan, Y., Spengler, J., Theobald, M., Towles, B., Wang, S.C., 2007. Anton, a special-purpose machine for molecular dynamics simulation. In: Proceedings of the 34th Annual International Symposium on Computer Architecture, ISCA'07. ACM, New York, NY, USA, pp. 1–12. http://dx.doi.org/10.1145/1250662.1250664.
Shaw, D.E., Grossman, J.P., Bank, J.A., Batson, B., Butts, J.A., Chao, J.C., Deneroff, M.M., Dror, R.O., Even, A., Fenton, C.H., Forte, A., Gagliardo, J., Gill, G., Greskamp, B., Ho, C.R., Ierardi, D.J., Iserovich, L., Kuskin, J.S., Larson, R.H., Layman, T., Lee, L.-S., Lerer, A.K., Li, C., Killebrew, D., Mackenzie, K.M., Mok, S.Y.-H., Moraes, M.A., Mueller, R., Nociolo, L.J., Peticolas, J.L., Quan, T., Ramot, D., Salmon, J.K., Scarpazza, D.P., Ben Schafer, U., Siddique, N., Snyder, C.W., Spengler, J., Tang, P.T.P., Theobald, M., Toma, H., Towles, B., Vitale, B., Wang, S.C., Young, C., 2014. Anton 2: raising the bar for performance and programmability in a special-purpose molecular dynamics supercomputer. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC'14. IEEE Press, Piscataway, NJ, USA, pp. 41–53. http://dx.doi.org/10.1109/SC.2014.9.
Sickmeier, M., Hamilton, J.A., LeGall, T., Vacic, V., Cortese, M.S., Tantos, A., Szabo, B., Tompa, P., Chen, J., Uversky, V.N., Obradovic, Z., Dunker, A.K., 2007. DisProt: the database of disordered proteins. Nucl. Acids Res. 35, D786–D793. http://dx.doi.org/10.1093/nar/gkl893.
Skinner, J.J., Yu, W., Gichana, E.K., Baxa, M.C., Hinshaw, J.R., Freed, K.F., Sosnick, T.R., 2014. Benchmarking all-atom simulations using hydrogen exchange. Proc. Natl. Acad. Sci. U. S. A. http://dx.doi.org/10.1073/pnas.1404213111 (201404213).
Snow, C.D., Nguyen, H., Pande, V.S., Gruebele, M., 2002. Absolute comparison of simulated and experimental protein-folding dynamics. Nature 420, 102–106. http://dx.doi.org/10.1038/nature01160.
Spolar, R.S., Record, M.T., 1994. Coupling of local folding to site-specific binding of proteins to DNA. Science 263, 777–784. http://dx.doi.org/10.1126/science.8303294.

Please cite this article in press as: Stanley, N., et al., Progress in studying intrinsically disordered proteins with atomistic simulations, Progress in Biophysics and Molecular Biology (2015), http://dx.doi.org/10.1016/j.pbiomolbio.2015.03.003


Stanley, N., Esteban-Martín, S., De Fabritiis, G., 2014. Kinetic modulation of a disordered protein domain by phosphorylation. Nat. Commun. 5. http://dx.doi.org/10.1038/ncomms6272.
Tompa, P., 2002. Intrinsically unstructured proteins. Trends Biochem. Sci. 27, 527–533. http://dx.doi.org/10.1016/S0968-0004(02)02169-2.
Tompa, P., 2009. Structure and Function of Intrinsically Disordered Proteins. Chapman and Hall/CRC.
Uversky, V.N., Gillespie, J.R., Fink, A.L., 2000. Why are “natively unfolded” proteins unstructured under physiologic conditions? Proteins 41, 415–427. http://dx.doi.org/10.1002/1097-0134(20001115)41:33.0.CO;2-7 (Struct. Funct. Bioinforma.).
Uversky, V.N., Oldfield, C.J., Dunker, A.K., 2008. Intrinsically disordered proteins in human diseases: introducing the D2 concept. Annu. Rev. Biophys. 37, 215–246. http://dx.doi.org/10.1146/annurev.biophys.37.032807.125924.
Varadi, M., Kosol, S., Lebrun, P., Valentini, E., Blackledge, M., Dunker, A.K., Felli, I.C., Forman-Kay, J.D., Kriwacki, R.W., Pierattelli, R., Sussman, J., Svergun, D.I., Uversky, V.N., Vendruscolo, M., Wishart, D., Wright, P.E., Tompa, P., 2013. pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins. Nucl. Acids Res. http://dx.doi.org/10.1093/nar/gkt960 (gkt960).
Venken, T., Voet, A., De Maeyer, M., De Fabritiis, G., Sadiq, S.K., 2013. Rapid conformational fluctuations of disordered HIV-1 fusion peptide in solution. J. Chem. Theory Comput. http://dx.doi.org/10.1021/ct300856r.
Wang, L.-P., Martinez, T.J., Pande, V.S., 2014. Building force fields: an automatic, systematic, and reproducible approach. J. Phys. Chem. Lett. 1885–1891. http://dx.doi.org/10.1021/jz500737m.
Wright, P.E., Dyson, H.J., 1999. Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol. 293, 321–331. http://dx.doi.org/10.1006/jmbi.1999.3110.
Xue, B., Dunbrack, R.L., Williams, R.W., Dunker, A.K., Uversky, V.N., 2010. PONDR-FIT: a meta-predictor of intrinsically disordered amino acids. Biochim. Biophys. Acta BBA - Proteins Proteom. 1804, 996–1010. http://dx.doi.org/10.1016/j.bbapap.2010.01.011.
Zerze, G.H., Best, R.B., Mittal, J., 2014. Modest influence of FRET chromophores on the properties of unfolded proteins. Biophys. J. 107, 1654–1660. http://dx.doi.org/10.1016/j.bpj.2014.07.071.
Zuckerman, D.M., 2011. Equilibrium sampling in biomolecular simulations. Annu. Rev. Biophys. 40, 41–62. http://dx.doi.org/10.1146/annurev-biophys-042910-155255.
