
JPBM929_proof ■ 30 July 2014 ■ 1/10

Progress in Biophysics and Molecular Biology xxx (2014) 1-10

Contents lists available at ScienceDirect

Progress in Biophysics and Molecular Biology journal homepage: www.elsevier.com/locate/pbiomolbio

Original research

An overview of recent advances in structural bioinformatics of protein-protein interactions and a guide to their principles

Govindarajan Sudha a, Ruth Nussinov b,c,**, Narayanaswamy Srinivasan a,*

a Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
b Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, MD 21702, USA
c Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel


Article info

Article history: Available online xxx

Keywords: Protein-protein interactions; Protein-protein complexes; Homomeric proteins; Oligomeric proteins; Oligomers; Structure; Evolution; Interaction; Function; Conformation

Abstract

Rich data bearing on the structural and evolutionary principles of protein-protein interactions are paving the way to a better understanding of the regulation of function in the cell. This is particularly the case when these interactions are considered in the framework of key pathways. Knowledge of the interactions may provide insights into the mechanisms of crucial ‘driver’ mutations in oncogenesis. It also provides the foundation for the design of protein-protein interfaces and of inhibitors that can abrogate their formation or enhance them. The main features to learn from known 3-D structures of protein-protein complexes, and from the extensive literature that analyzes them computationally and experimentally, include the interaction details that permit structure-based drug discovery; the evolution of complexes and their interactions; the consequences of alterations such as post-translational modifications, ligand binding, disease-causing mutations, host-pathogen interactions, oligomerization and aggregation; and the roles of disorder, dynamics, allostery and more in the protein and the cell. This review highlights some of the recent advances in these areas, including design, inhibition and prediction of protein-protein complexes. The field is broad, and much work has been carried out in these areas, making it challenging to cover in its entirety. Much of this work stems from the fast increase in the number of molecules whose structures have been determined experimentally and from the vast increase in computational power. Here we provide a concise overview. © 2014 Elsevier Ltd. All rights reserved.

1. The classical view of protein-protein interactions

Understanding biological systems requires detailed knowledge of cellular events at the molecular level. This level includes the physical interactions between macromolecules such as DNA, RNA and proteins, and between these and their environment, including lipids, ions and second messengers such as cAMP. Here we focus on protein-protein interactions, which are responsible for carrying out diverse processes in living systems. Structural and mechanistic features of protein-protein interactions may be best understood using the three-dimensional structures of the proteins


* Corresponding author. Cancer and Inflammation Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., National Cancer Institute, Frederick, MD 21702, USA. ** Corresponding author. E-mail addresses: [email protected] (G. Sudha), [email protected] (R. Nussinov), [email protected], [email protected] (N. Srinivasan).

and their complexes. The structural database provides rich data, both static crystal structures and solution ensembles determined by NMR. Protein ensembles can also be glimpsed from collections of crystal structures of the same protein in different bound and unbound states and crystal forms. Even though the crystal environment captures only the state favored under specific crystallization conditions, these static structures still provide crucial information on the nature of protein-protein interactions. The vast majority of heterocomplexes with known 3-D structures are heterodimers (Fig. 1). Therefore, there is a need to study the 3-D structures of higher-order heteromers, which often form the functional multiprotein assemblies in the cell. Structural bioinformatics of protein-protein interactions, which deals with the analysis of known 3-D structures, has provided detailed information on the underlying principles of structure, function and dysfunction, and evolution of protein-protein complexes. Proteins that are stable only in a protein-protein complex and remain together throughout their functional lifetime are

http://dx.doi.org/10.1016/j.pbiomolbio.2014.07.004 0079-6107/© 2014 Elsevier Ltd. All rights reserved.

Please cite this article in press as: Sudha, G., et al., An overview of recent advances in structural bioinformatics of protein-protein interactions and a guide to their principles, Progress in Biophysics and Molecular Biology (2014), http://dx.doi.org/10.1016/j.pbiomolbio.2014.07.004


G. Sudha et al. / Progress in Biophysics and Molecular Biology xxx (2014) 1-10

Fig. 1. Distribution of the number of chains in heterocomplexes of known 3-D structure. The histogram shows the distribution of the number of chains in the heterocomplexes of known structure available in the Protein Data Bank (PDB; Berman et al., 2000). The data used to generate this figure correspond to the number of chains in the biological units as presented in the PDB.

termed ‘permanent’ protein-protein complexes. On their own, these proteins are typically disordered; that is, they exist in a range of conformational states, none of which is sufficiently stable to be captured in crystalline form. Complexes whose components interact with their partners for a brief period of time to carry out a specific function, and are stable in their free form, are termed ‘transient’ (Nooren and Thornton, 2003a). On average, the structures and chemical characteristics of interfaces differ between permanent and transient protein-protein complexes (De et al., 2005). The evolution of interfaces was suggested to be slower for permanent protein-protein complexes than for transient complexes (Mintseris and Weng, 2005). Transient protein-protein interfaces show higher residue conservation than the rest of the tertiary structural surface (Choi et al., 2009; Mintseris and Weng, 2005; Valdar and Thornton, 2001). The physicochemical and geometrical characteristics of protein interfaces, which differ from those of the rest of the surface, have been studied extensively (De et al., 2005; Jones and Thornton, 1996; Jones et al., 2000; Lo Conte et al., 1999; Sonavane and Chakrabarti, 2008). Differences in interfacial features have also been observed between permanent and transient protein-protein complexes. Interface size (small interfaces in transient protein-protein complexes versus large interfaces in permanent complexes), area, polarity (polar interfaces in transient protein-protein complexes versus non-polar interfaces in permanent complexes), shape complementarity, conformational changes upon binding, residue interface propensities and residue contacts have served as distinguishing features to predict and classify permanent and transient protein-protein complexes (Ansari and Helms, 2005; Bahadur et al., 2003; Block et al., 2006; De et al., 2005; Jones and Thornton, 1996; Keskin et al., 2008; Levy and Pereira-Leal, 2008; Mintseris and Weng, 2003; Nooren and Thornton, 2003b; Zhu et al., 2006).

A protein-protein interface can be divided into a core, which is buried in the interface, and a rim, which remains accessible to the solvent (Bahadur et al., 2003). Interestingly, the core and the rim differ in their amino acid composition and conservation (Janin et al., 2008). Another important approach to interface

residue classification is based on the contributions to interaction energy. The subset of interface residues that serve as major contributors to binding energy in protein-protein interfaces (>2 kcal/mol) have been termed hot-spot residues (Bogan and Thorn, 1998). Analysis of a large number of 3-D structures of protein-protein complexes revealed that, in general, the structures of homologous protein-protein complexes are conserved (Aloy et al., 2003). However, interfaces of distantly related homologous proteins are usually not topologically equivalent (Rekha et al., 2005). Further detailed analysis showed that the spatial orientations of interacting proteins with respect to each other differ in some homologous protein-protein complexes (Kim et al., 2006). Studies also showed that interactions between proteins could often be predicted successfully if the proteins have high sequence similarity to proteins that are known to interact with each other (Levy and Pereira-Leal, 2008; Mika and Rost, 2006). Other studies showed that structurally similar interfaces can bind proteins with different binding site structures and different functions (Tsai et al., 1996). This is accommodated through conserved interactions at similar interface locations, despite the different partners (Keskin and Nussinov, 2007). Even if the overall structures of the interacting chains are different, interface similarity may exist (Keskin and Nussinov, 2005). While protein-protein interfaces are typically highly specific, there appear to be proteins with ‘promiscuous’ binding characteristics (Schreiber and Keating, 2011). One way to achieve specificity is by utilizing different hotspot residues in the protein interfaces (Gretes et al., 2009). Alternatively, different conformations in the ensemble may be selected (Ma et al., 1999; Tsai et al., 1999a, b).
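The energetic definition of hot-spot residues given above (contribution to binding energy greater than 2 kcal/mol) can be illustrated with a minimal sketch. It assumes the per-residue energies come from something like alanine-scanning mutagenesis; the `AlanineScan` record, the residue labels and the numerical values are all hypothetical and serve only to show the thresholding step.

```python
from dataclasses import dataclass

HOT_SPOT_THRESHOLD = 2.0  # kcal/mol, the cutoff quoted in the text


@dataclass(frozen=True)
class AlanineScan:
    residue: str  # hypothetical label, e.g. "A:TRP104"
    ddg: float    # change in binding free energy on mutation (kcal/mol)


def hot_spots(scans, threshold=HOT_SPOT_THRESHOLD):
    """Return labels of residues whose ddG strictly exceeds the threshold."""
    return [s.residue for s in scans if s.ddg > threshold]


# Made-up values for illustration only; not taken from any experiment.
example = [
    AlanineScan("A:TRP104", 3.1),
    AlanineScan("A:SER62", 0.4),
    AlanineScan("B:ARG43", 2.0),
    AlanineScan("B:TYR29", 2.6),
]
print(hot_spots(example))
```

The comparison is strict, so a residue sitting exactly at 2.0 kcal/mol is not counted; whether the boundary should be inclusive is a choice made here, not something the text specifies.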
Clusters of interacting residues have been observed in protein-protein interfaces, and cooperative interactions between residues in a cluster generate binding affinity and specificity (Reichmann et al., 2005). Below, we briefly discuss recent and emerging views in structural bioinformatics of protein-protein interactions.

2. Recent and emerging views on protein-protein interactions

2.1. Protein-protein complexes are multifaceted

A grasp of the structural and evolutionary principles of protein-protein interactions is essential to understand the roles of proteins in the cell. Degeneracy is observed not only at the level of protein folds but also at the level of protein-protein interface structures. This is due to the structural constraints on the packing of secondary structural elements at the interface and to functional constraints (Gao and Skolnick, 2010). Using available 3-D structures of protein-protein complexes, interfaces have been clustered, and it was proposed that the repertoire of interface structures is limited (Cukuroglu et al., 2014). Surprisingly, however, interfaces are not always conserved in evolutionarily related protein-protein complexes (Zhang et al., 2010), which suggests that interfaces are tuned for specific interactions, which in turn lead to specific cellular pathways. Alternate binding modes in homologous protein-protein complexes have been observed, with the interfaces not entirely topologically equivalent (Fig. 2) (Hamp and Rost, 2012; Kundrotas and Vakser, 2013), and there are examples of proteins that can bind different proteins at nonequivalent locations (Martin, 2010). There seems to be evolutionary ‘plasticity’ in homologous protein-protein interfaces, manifested as different types of interface contacts, especially those involving polar residues (Andreani et al., 2012). ‘Plasticity’ reflects the presence of proteins as conformational ensembles, with different conformations being selected, followed by minor induced-fit optimization (Csermely et al., 2010). At the same time, we also


Fig. 2. An example of homologous protein-protein complexes with different binding modes: (a) Erythropoietin-receptor complex (PDB code: 1eer) and (b) Granulocyte colony-stimulating factor - G-CSF receptor complex (PDB code: 1cd9) are homologous pairs of cytokine-receptor complexes with different binding modes. The cytokines are colored in yellow while the receptors are colored in red. (c) Structural superposition of the homologous cytokines, with the interface residues shown as yellow sticks. The topologically equivalent non-interface residues of the other homologous cytokine are shown as orange sticks. The types of the interface residues are drastically different from those of the topologically equivalent non-interface residues, as shown in the figure. The figures were generated using PyMOL (DeLano, 2002).

observe conserved interactions and interface residues in homologous protein-protein complexes, which is the generally expected situation. Firstly, interface residues can be mutated during evolution without affecting their interaction if the interaction is through main-chain atoms or β-carbons (Talavera et al., 2012). Secondly, it was suggested that a subset of interface residues termed “rigid interface residues” are evolutionarily well conserved and conformationally invariant in the associated and unassociated forms of protein-protein complexes. This could be due to their dual function of forming intra- and inter-protein contacts, as well as contacts with the solvent, which stabilizes each side of the interface even in the unbound form (Swapna et al., 2012a).

Protein-protein interactions evolve. They are highly specific, with the binding affinity required for function. Paralogous proteins that bind to different proteins do so by employing subfamily-specific residues at the interface (Aiello and Caffrey, 2012). Rewiring of interaction specificity has been achieved by replacing some of the interfacial residues with residues from the interface of another protein (Podgornaia et al., 2013). Information on co-evolution between interacting proteins has been used to design mutations that helped to alter the specificity (Cheng et al., 2014). Specificity for the order of assembly of heteromeric multi-protein complexes may be dictated by protein interface size (Marsh et al., 2013), which acts to stabilize the ‘right’ conformer in the interactions. However, factors other than size may also be at play, including specific stabilizing interactions that achieve the same outcome. Clusters of conserved, exposed hydrophobic and charged residues in the uncomplexed form can account for most of the protein-protein binding energy/hotspot residues in the complexed form (Agrawal et al., 2014).

Small-molecule binding sites in protein-protein complexes share the protein-protein interface hotspots. Therefore, binding of small molecules to these hotspots could inhibit the protein-protein interactions (London et al., 2013; Thangudu et al., 2012), which makes the hotspots drug targets. Hot spots are pre-organized in the unbound protein form, presenting reduced mobility (Kozakov et al., 2011; Ozbek et al., 2013). Their conformation in the unbound form resembles the one that they assume in the bound state. Additionally, allosteric druggable hotspot regions can also be used to modulate protein-protein interactions (Ma and Nussinov, 2013).

2.2. Protein-protein interfaces can be altered: PTMs, small molecule binding and mutations

Chemical alterations in protein-protein interfaces can be beneficial or detrimental to protein-protein complex formation.

Such changes at protein-protein interfaces can occur by means of post-translational modifications such as phosphorylation, acetylation and ubiquitination, which either mediate or abrogate protein-protein interactions (Beltrao et al., 2012; Nishi et al., 2011a; Nussinov et al., 2012). Cross-talk between phosphorylation and acetylation can also occur within the same protein interface (Beltrao et al., 2013; van Noort et al., 2012). The modified residues could serve as interaction hotspots (Nishi et al., 2011a). They can also be intrinsically disordered in the phosphorylated or unphosphorylated state for functional regulation (Nishi et al., 2013a). Allosteric modifications away from the protein-protein interface can block or tune functional sites via conformational changes (Nussinov et al., 2012). Even though modified residues are generally found to be evolutionarily more conserved than other interface residues, they need not always be conserved for signaling to take place (Nishi et al., 2011a; Tan et al., 2010).

Small molecules can bind to pockets within or close to the protein-protein interface. This may or may not be due to protein-protein packing. Packing may not be perfect within or at the periphery of the interface, resulting in the formation of pockets for ligand binding. Control of protein-protein interactions, enhanced stability and regulation of function may be achieved through biologically relevant ligand binding pockets (Gao and Skolnick, 2012). Interface residues that interact with the complementary protein as well as with small molecules are generally more conserved than other interface residues and tend to be located at pockets, which is also where interaction hotspots often are (Davis and Sali, 2010; Thangudu et al., 2012; Walter et al., 2013). In the complex, these pockets are often, though not always, filled by the complementary protein chain (Li et al., 2004). Filled pockets result in high packing density, thus making a residue a hot spot. A hot spot is typically a highly packed residue in the interface. Since pockets are often found at the protein-protein interface in the unbound or complexed forms, druggable sites could be identified. Moreover, engaging multiple pockets can be a productive strategy in inhibiting protein-protein interactions (Fuller et al., 2009).

Mutations in protein-protein interfaces can be manifested as disease phenotypes. The structural changes following a mutation could lead to loss of electrostatic interactions, reduction of the hydrophobic effect, formation of steric clashes, changes in conformation and dynamics, and destabilization or over-stabilization (David et al., 2012; Nishi et al., 2013b; Stefl et al., 2013; Teng et al., 2009). Disease-causing mutations of highly conserved residues manifest these detrimental effects (Talavera et al., 2012; Teng et al., 2009). Differences in the mutational microenvironments between cancerous and neutral mutations have also been observed (Engin et al., 2013; Espinosa et al., 2014; Nishi et al., 2013b; Yates and


Sternberg, 2013). Disease mutations can also result in a gain of interactions, leading to changes in partners, aggregation, and changes at post-translational modification sites that cause an order-disorder transition. Proteins whose mutations can provoke cancer may interact with partners through distinct interfaces in multi-interface hubs (Kar et al., 2009). It was also suggested that mutations on different interfaces are more likely to cause different disorders than those on the same interface (Wang et al., 2012).

Alterations in protein-protein interactions are common in infection, owing to the interactions between proteins of the pathogen and the host. Large-scale comparisons of viral protein structures have shown mimicry of host protein structures at various structural levels (Franzosa and Xia, 2011; Garamszegi et al., 2013; Itzhaki, 2011; Segura-Cabrera et al., 2013), which appears to serve as a general strategy for successful host-pathogen interaction. Firstly, domain-domain interactions used in the host are also observed in viruses, showing mimicry of domains for competitive interaction with the host proteins (Itzhaki, 2011). Secondly, the host protein interfaces that are mimicked generally belong to interaction hub proteins and are transient in nature (Franzosa and Xia, 2011). Interestingly, a hepatitis C virus (HCV) protein mimics the interface of a human protein kinase without any gross structural similarity between the host and the viral protein (Sudha et al., 2012). Thirdly, viruses can mimic linear motifs involved in domain-motif interactions. This is feasible for viruses owing to their small genome size and convergent evolution (Gadkari and Srinivasan, 2010; Garamszegi et al., 2013). For example, a human kinase substrate recognition motif is mimicked by an HCV protein (Sudha et al., 2012). Another strategy used by viral proteins is to selectively target host hub proteins in order to disrupt several host processes. Since viral proteins evolve faster than host proteins, there may exist an “evolutionary arms race” at the structural level for survival (Franzosa and Xia, 2011).

2.3. Protein-protein complexes can be ordered and disordered

Function often requires protein self-assembly to form homo-oligomers. Homo-oligomers usually involve permanent interactions; however, there may also exist weak dimers, which are in equilibrium with monomers. Interfaces of weak dimers are loosely packed, resulting in low stability (Dey et al., 2010). The prevalence of symmetric homomers is due to the favorable interaction energy, which leads to structural and functional advantages despite the entropic cost of association from disordered interacting pairs of monomers (Andre et al., 2008). Asymmetric organization is rare yet significant in homomers (Swapna et al., 2012c). The evolutionary route of homomer formation can be identified from interface size (Levy et al., 2008). Insertions or deletions of residues at the interaction interface can be responsible for enabling or disabling homomer formation. These regions have low aggregation propensity, contain a high proportion of polar residues, and have a large interface area, which stabilizes the homomer (Hashimoto and Panchenko, 2010). Short insertions can act as spacers to fill cavities present at the interface, which helps oligomerization (Nishi et al., 2011b). Similar to transient protein-protein complexes, homologous homomers also display conservation of binding mode and oligomeric state (Dayhoff et al., 2010). Nevertheless, closely related homomers with different oligomeric states are also observed. Inter-subunit geometry, interface or allosteric mutations, and functional and stability constraints can all contribute to changes in the oligomeric state (Perica et al., 2012). Homomeric proteins have also been designed artificially to adopt a desired oligomeric state, and these have been used as protein nanomaterials (King et al., 2012).
The dynamics of the quaternary structure can be modulated by ligands, which can be exploited in drug discovery (Jaffe, 2013).

Proteins need not be well ordered: whole proteins or parts of proteins can be intrinsically disordered. Disordered states have been observed to fulfill key functions in the cell. The prevalence of polar interactions and complementary electrostatic potentials helps to achieve high specificity. Their multiple states, coupled with these residues, also permit interactions with multiple specific partners (Wong et al., 2013). Sequence correlations between amino acids of the same type were suggested to be enhanced in structurally disordered proteins (Fong et al., 2009). Interactions in disordered regions are less conserved than those in ordered regions, possibly owing to their higher capacity to interact with multiple partners and rewire interactions (Mosca et al., 2012). Disordered regions are more common in symmetric homodimers than in heterodimers which typically present small interfaces. The symmetric arrangement in homodimers was suggested to help bring the disordered regions close in space, which permits access to the interacting partner and thereby maintains function (Fong et al., 2009). Allosteric mechanisms are also manifested in intrinsically disordered proteins (IDPs) (Ferreon et al., 2013). The binding partners of IDPs may have different folds, or the same fold with low sequence identity (Hsu et al., 2013). Alternatively spliced proteins also tend to be intrinsically disordered; different disordered segments can mediate different interactions, resulting in new functions, especially in signaling pathways (Buljan et al., 2013; Hsu et al., 2013). Tissue-specific exons in humans are generally disordered, with distinct interacting partners in different tissues (Buljan et al., 2012). Proteins belonging to various functional classes, such as protein-protein binding, phosphorylation, acetylation, metal binding, substrate/ligand binding, polymerization and transcriptional activation, can be intrinsically disordered. Function in these proteins could arise from a disorder-to-order transition or from the disordered state itself (Sickmeier et al., 2007; Hsu et al., 2013; Nishi et al., 2013a). Deleterious disease-associated missense mutations can also be located in IDPs (Vacic and Iakoucheva, 2012).

A transition from ordered to unordered assembly in protein-protein complexes causes a transition from function to dysfunction. Unordered assembly or non-specific interaction of proteins can result in aggregation, which is typically detrimental to the cell. Interestingly, interface regions are more prone to aggregation than the exposed surface region, which may be expected given the higher fraction of hydrophobic and aromatic residues in interfaces. However, uncontrolled assembly can be prevented by disulphide bonds and salt bridges, which form specific interactions (Pechmann et al., 2009). Highly abundant proteins generally have less sticky surfaces preventing formation of nonfunctional self-interactions and are poorly conserved (Levy et al., 2012). Aggregation-prone proteins are subjected to differential transcription, translation and degradation control. The concentrations of these proteins are always kept below the critical level required for aggregation (Gsponer and Babu, 2012). Short-lived proteins show a higher aggregation propensity, which is often associated with deposition diseases (De Baets et al., 2011). Generally, aggregation-prone regions are protected by burial in the hydrophobic core. However, mutations and stress can make these regions solvent exposed and lead to the formation of β-structure agglomerates, resulting in a disease phenotype. Enrichment of charged residues in the regions flanking aggregation-prone segments can protect against aggregate formation, with these residues serving as gatekeepers. These factors increase the intrinsic solubility of otherwise aggregating sequences (Beerten et al., 2012a,b).
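The gatekeeper idea described above, charged residues enriched in the regions flanking an aggregation-prone segment, can be sketched as a simple count. This is a toy illustration under stated assumptions, not the method of the cited studies: the function name, the window of three residues, the choice of D/E/K/R as the charged set, and the example peptide are all hypothetical.

```python
CHARGED = set("DEKR")  # Asp, Glu, Lys, Arg; His is left out in this sketch


def flanking_charge_count(sequence, start, end, window=3):
    """Count charged residues within `window` positions on either side of
    the segment sequence[start:end] (0-based, end exclusive)."""
    if not (0 <= start < end <= len(sequence)):
        raise ValueError("segment must lie inside the sequence")
    left = sequence[max(0, start - window):start]
    right = sequence[end:end + window]
    return sum(res in CHARGED for res in left + right)


# Hypothetical peptide: a hydrophobic stretch (VIVIF) flanked by charged residues.
seq = "GSKEVIVIFRDGS"
print(flanking_charge_count(seq, 4, 9))
```

A real analysis would first need a predictor to locate the aggregation-prone segment; here its boundaries are simply supplied by hand.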
Better protection of aggregation-prone regions and greater aggregation gatekeeping ability are observed more often in thermophilic proteins than in mesophilic proteins. Although proteins of the same family from thermophilic and mesophilic organisms are homologous, interestingly


aggregation prone regions appear not to be conserved (Thangakani et al., 2012). Aggregation prone regions have been analyzed in randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins and catalytic residues. The aggregation propensities of monomeric proteins are lower than random sequences (Buck et al., 2013). The likelihood of aggregation is generally not even in artificial sequences (Angyan et al., 2012). Aggregation in antibodies has been prevented by engineering mutations in aggregation prone motifs which has led to enhanced stability in the IgG antibody (Chennamsetty et al., 2009). 2.4. Proteineprotein complexes are dynamic Proteineprotein complexes are dynamic in nature. It is interesting to note that the evolutionary changes in the protein sequence and structures are related to protein flexibility (Marsh and Teichmann, 2014). The mobility of amino acids is inversely proportional to amino acid conservation (Liu and Bahar, 2012). Evolution of functions has been shown to be related to changes in protein dynamics (Lai et al., 2012). Catalytic loop motions help in substrate recognition and binding (Kurkcuoglu et al., 2012). Regions in the enzyme which show co-evolution as well as high mobility are predisposed to be substrate recognition sites. Similarity in the binding pocket dynamics within a protein family is also observed (Lai et al., 2012). Enzymes bound to inhibitors and in unbound form show that the ligands select the conformer that best matches the structural and dynamic properties among the conformers (Bakan and Bahar, 2009). The mobility of amino acids at the dimeric interface is generally lower than other amino acids at the tertiary structural surface (Zen et al., 2010). Conformational loop dynamics has been used to understand binding promiscuity and specificity in Eph-ephrin systems (Nussinov and Ma, 2012). Large conformational changes are often observed upon binding. 
Relative solvent accessible surface area has been used to predict the magnitude of binding-induced conformational changes from single chains or proteineprotein complexes. Large conformational changes may be especially significant in sequences enriched with intrinsically disordered regions, although lack of stable structures in such cases prevents direct comparisons and definitive assessments. Oligomers take advantage of the intrinsic dynamics in the individual subunits. At the same time, oligomerization stiffens the interfacial regions of the subunits and was suggested to provide new cooperative modes (Marcos et al., 2011). Interactions may also entail minor, or even subtle conformational changes, as can be seen from comparisons of corresponding bound and unbound protein structures. In allostery the ligand or protein effector binds to a protein and changes the structure and/or dynamics at a distant site which is often a functional site (Tsai et al., 2008). Even though a metabolite binds the enzyme at a site distant from the catalytic site, its binding is coupled to the active site. Numerous atoms are involved in the interaction between the two sites and the conformational changes in the protein structure (Manley et al., 2013). When surface sites are linked to active sites, they can be preferred locations for emergence of allosteric control and serve as hotspots for interaction (Reynolds et al., 2011). Sometimes substrates are transferred to the enzyme with help from scaffolding proteins through allosteric regulation (Nussinov et al., 2013). Allosteric signals are transmitted through multiple pre-existing pathways. Perturbations at any protein site will shift the pre-existing ensemble of pathways (del Sol et al., 2009). Proteineprotein complexation causes structural changes not only at the proteineprotein interface but also in regions away from the interface eliciting allostery. 
Allostery can be induced by any protein effector, and plays key roles in signaling systems (Kar et al., 2010; Swapna et al., 2012b). Allosteric events take place via the
dynamic conformational ensembles enabling information transfer in signaling protein-protein complexes (Kar et al., 2010; Motlagh et al., 2014). Since tertiary and quaternary scales of motion act interdependently, a global communication network has been developed which integrates the tertiary (residue level) and quaternary (subunit level) structural changes that are both required for allosteric communication. This approach can be used to design allostery in non-allosteric proteins (Daily and Gray, 2009). Allostery can be induced by modulating the amplitude of thermal fluctuations around a mean structure rather than by conformational changes in the structure (McLeish et al., 2013). In the CRP/FNR family of transcription factors, allostery involves only low-frequency dynamics, without a change in structure. Residues involved in allosteric control are conserved. Changes in the low-frequency dynamics correlate with the allosteric effects of ligand binding (Rodgers et al., 2013). All protein conformations pre-exist and the ligand chooses the most favored conformation. Upon ligand binding, a population shift in the conformational ensemble is observed, redistributing the conformational states. Upon binding a selected conformation, optimization of side chain and backbone interactions proceeds following the induced fit mechanism (Boehr et al., 2009). Allosteric transitions have been studied by varying the size and interactions in the allosteric sites to construct a series of energy landscapes corresponding to effector bound and unbound structures. Ligand-induced cooperativity measures how a given site responds to effector binding. These models have been used to reproduce allosteric motion (Weinkam et al., 2012). Allosteric drug discovery, where the drug binds away from the native binding site and modulates the native interactions, holds promise. However, a main challenge is to identify the allosteric hotspots (Ma and Nussinov, 2013).
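The population shift described above can be illustrated with a small thermodynamic calculation. This is a minimal sketch, assuming a two-state ensemble in which each pre-existing conformer has its own free energy and its own dissociation constant for the ligand. The function name `populations` and all numerical values are hypothetical and are chosen only to make the shift visible.

```python
import math

# RT in kcal/mol at 298.15 K.
RT = 1.987e-3 * 298.15

def populations(free_energies, ligand_conc=0.0, kds=None):
    """Equilibrium populations of pre-existing conformers.

    free_energies: relative free energy of each conformer (kcal/mol).
    ligand_conc: free ligand concentration (M).
    kds: dissociation constant of the ligand for each conformer (M),
         or None for the ligand-free ensemble.
    """
    # Boltzmann weight of each conformer in the absence of ligand.
    weights = [math.exp(-g / RT) for g in free_energies]
    if kds is not None:
        # Binding polynomial: the free and ligand-bound forms of a conformer
        # together carry a weight proportional to (1 + [L]/Kd).
        weights = [w * (1.0 + ligand_conc / kd) for w, kd in zip(weights, kds)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical ensemble: conformer B is 1.5 kcal/mol less stable than A,
# but binds the ligand far more tightly.
apo = populations([0.0, 1.5])
holo = populations([0.0, 1.5], ligand_conc=1e-5, kds=[1e-3, 1e-7])
print("ligand-free:", [round(p, 3) for p in apo])
print("with ligand:", [round(p, 3) for p in holo])
```

With these assumed numbers, the minor conformer of the ligand-free ensemble becomes the dominant one once ligand is present. No new conformation is created; only the relative weights change, which is the essence of conformational selection.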
Aberrant allosteric actions can also result in disease conditions (Nussinov and Tsai, 2013).

2.5. Sculpting protein-protein interfaces can be advantageous

Making alterations in protein-protein interfaces, or protein interface design, helps in the generation of protein-protein complexes with interaction specificity and improved affinity (Chen and Keating, 2012). Proteins with desired functions might be designed by means of computational grafting of functional motifs onto a protein scaffold. This involves transplantation of both the backbone and side chains of linear functional motifs onto scaffold proteins (Azoitei et al., 2012). Thermally stable protein scaffolds that mimic viral epitope structure were able to induce potent neutralizing antibodies (Correia et al., 2014). Also, epitopes transplanted into scaffolds can provide good affinity for antibodies (Ofek et al., 2010). Interface design has also generated a pH dependent IgG binding protein. Hotspot interactions are used to design IgG binding proteins which are extremely stable and heat resistant and thus can be used for IgG affinity purification and diagnostic purposes (Strauch et al., 2014). Remodeling of loops near active sites has introduced specific enzyme-substrate interactions, making the redesigned enzyme more active than the native protein (Murphy et al., 2009). An enzyme which is naturally a hexamer has been converted into a homodimer using directed evolution. Mutations in the designed homodimer open up the active site for the new substrate. However, the thermal stability can be compromised (Yip et al., 2011). A designed zinc-mediated protein interface forms a cleft which creates enzyme active sites capable of hydrolysis (Der et al., 2012). An enzyme inhibitor has been designed which binds to the active site of the enzyme. Protein interface design can be based on interaction hotspots, shape complementarity, and residue types, resulting in high affinity binding with the enzyme (Procko et al., 2013).
Another approach in protein interface design is based on α-helix-mediated protein-protein interactions (Azzarito

Please cite this article in press as: Sudha, G., et al., An overview of recent advances in structural bioinformatics of protein-protein interactions and a guide to their principles, Progress in Biophysics and Molecular Biology (2014), http://dx.doi.org/10.1016/j.pbiomolbio.2014.07.004

et al., 2013). Design of interaction specificity in protein-protein interactions has been carried out by rationally rewiring the interaction interface of a Thermotoga maritima two-component system to that of an Escherichia coli two-component system. This shows how individual mutations can contribute to the rewiring of interaction specificity (Podgornaia et al., 2013). Designing many interaction partners simultaneously for a protein can provide insights into multispecificity, showing that there are distinct positions at the interface for affinity and interaction specificity (Fromer and Shifman, 2009). Also, a de novo design of protein interactions based on interaction hotspots within core secondary structural elements and variable loops has been made (Fleishman et al., 2011). Design of an ancestral homodimer that existed millions of years ago has provided insights into the evolution of protein structure and function. Structures of dimeric galectins from ancestral fish were compared with the currently existing galectins. Interestingly, the hydrogen bonding patterns at the dimeric interface and the carbohydrate binding site of the ancestor are different from those of currently existing proteins (Konno et al., 2011). At the same time, it should be noted that interface grafting is challenging. Transplanting a motif from one protein to another does not guarantee that the motif will remain unchanged in its new environment, or that the protein environment will remain unchanged following the grafting. The conformational ensemble will shift. The question is to what extent the shift will alter the prevailing conformation. Currently such predictions are challenging to make.

2.6. Advances in methodology

Some of the recent improvements in the methodology of protein-protein complex inhibition, prediction of interacting proteins, prediction of interface residues, modeling of protein-protein complexes, and protein-protein docking are discussed briefly below (Kastritis et al., 2014; Malhotra et al., 2014; Rodrigues and Bonvin, 2014; Shoemaker et al., 2013). It is a general notion that protein-protein interfaces are "non-druggable" as they tend to be flat, lacking druggable pockets. However, recent studies have challenged this view and have changed the perspective on protein-protein modulation by small molecules (Wells and McClendon, 2007). It has been shown that the contact surfaces are flexible due to amino acid side chain motion and small loop movements. These features have been captured by molecular dynamics, even in the absence of ligands, showing transient opening of binding pockets. There have been several examples showing that optimized small molecules can bind with an affinity comparable to that of the native protein. One such example is the ligand SP4206, which binds close to the receptor-binding region in the IL2-IL2 receptor alpha complex. This ligand is highly specific and a tight binder (Wells and McClendon, 2007). Scoring functions to assess the druggability of protein-protein interfaces using atomic structures have been developed (Basse et al., 2013). Collections of data on small molecules inhibiting protein-protein complexes and their molecular properties can be a good resource for drug design studies (Buck et al., 2013; Higueruelo et al., 2013, 2009). The traditional view of structure-based drug design used only single protein-ligand structures. More recent approaches use information from ensembles of protein conformations generated by molecular dynamics simulations, which reflect the flexibility of proteins during ligand binding, as in the case of HIV protease. This area is important for successful drug design (Meagher and Carlson, 2004).
Yet another interesting discovery is the use of stable and cell-permeable stapled alpha-helical peptides, which have been shown to successfully inhibit the p53-HDMX and NOTCH transactivation complexes (Chang et al., 2013). Previous interface residue prediction methods were dependent on sequence information and structural
features in order to differentiate interface residues from surface residues (Zhou and Qin, 2007). Recently a co-variance method has been adopted to identify the correlation between amino acid positions in interacting proteins using sequence information (Weigt et al., 2009). Evolutionary information derived from homologous domains in proteins with diverse architectures has been used to predict domain-domain interfacial residues (Bhaskara et al., 2013). Prediction of interface residues in low resolution cryoEM assemblies of Dengue viral coat proteins and clathrin vesicular assembly has been carried out based only on Cα atom positions (Gadkari and Srinivasan, 2010, 2012; Gadkari et al., 2009). Prediction of interacting proteins has been achieved in a study where the three-dimensional structural information was enhanced by information on functional sites, with the quality of the prediction comparable to that of high throughput experiments. Application of this approach resulted in high confidence interactions in yeast and human (Zhang et al., 2012). A docking approach has been used to distinguish true binding partners from a pool of non-binders, since docking of true partners tends to yield more favorable models (Wass et al., 2011). Identifying native-like structures by protein-protein docking is challenging. One way to address this is by ranking the docked poses based on structural interface parameters (Malhotra et al., 2014), energy and surface accessibility (Feliu et al., 2011). Attempts to improve protein-protein docking have been made. Modeling a protein-protein structure based on information from 3-D structures of similar complexes is more reliable than blind protein-protein docking.
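Several of the analyses mentioned above work with Cα positions alone, which is often all that a low resolution model provides. As a simple illustration, interface residues between two chains can be listed with a distance cutoff on Cα atoms. This is a generic heuristic and not the prediction method of the cited studies; the function name `interface_residues`, the 8 Å cutoff and the toy coordinates are assumptions made for this sketch.

```python
import numpy as np

def interface_residues(ca_a, ca_b, cutoff=8.0):
    """Indices of residues in chains A and B whose C-alpha atoms lie within `cutoff`.

    ca_a, ca_b: (N, 3) and (M, 3) arrays of C-alpha coordinates in Angstrom.
    cutoff: C-alpha to C-alpha contact distance in Angstrom (illustrative choice).
    """
    a = np.asarray(ca_a, dtype=float)
    b = np.asarray(ca_b, dtype=float)
    # Distance from every residue of chain A to every residue of chain B.
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    close = dist <= cutoff
    # A residue is at the interface if it has at least one partner within the cutoff.
    return np.flatnonzero(close.any(axis=1)), np.flatnonzero(close.any(axis=0))

# Toy coordinates: two residues of chain A sit near the first residue of chain B.
chain_a = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [30.0, 0.0, 0.0]]
chain_b = [[0.0, 6.0, 0.0], [50.0, 0.0, 0.0]]
idx_a, idx_b = interface_residues(chain_a, chain_b)
print("chain A interface indices:", idx_a.tolist())
print("chain B interface indices:", idx_b.tolist())
```

The returned indices are positions in the input arrays, so mapping them back to residue numbers is left to the caller. A list of this kind could serve as a starting point for checking conservation or mobility at a putative interface.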
Structural comparison of the target surface to template protein-protein interfaces followed by application of a docking energy function is based on a similar principle, except that this strategy requires only the interface region rather than the entire structure (Tuncbag et al., 2011). Threading is another approach to model protein-protein complexes based on template identification (Mukherjee and Zhang, 2011). A data-driven approach to docking that uses evolutionarily conserved residues has also been useful (de Vries et al., 2010). Binding-induced conformational changes present a great challenge in modeling protein-protein complexes by docking. A flexible multi-domain docking approach has been devised where a flexible partner is treated as an assembly of domains that are docked simultaneously. The elastic network model predicts the extent of conformational change (Karaca and Bonvin, 2011). Surprisingly, in spite of the limited number of currently available protein-protein complexes, a study showed that templates can be found for complexes representing almost all known protein-protein interactions (Kundrotas et al., 2012). Computational modeling along with experimental information, such as chemical shift perturbation data with residual dipolar couplings, has been used to drive protein-protein docking (van Zundert and Bonvin, 2014). Recently, methods presented in the Critical Assessment of PRedicted Interactions (CAPRI) predicted the positions of water molecules in the protein-protein interface, and estimated relative binding affinities and the effects of point mutations on the stability of designed and native protein-protein interactions (Lensink and Wodak, 2013).

3. Outlook

A wealth of data on various features of protein-protein complexes has been assembled in this review. We believe that there are still many outstanding questions yet to be answered in this vast area. High-throughput protein-protein interaction datasets are growing rapidly.
However, their completeness and the occurrence of false positives are still major concerns. An integrated approach exploiting structural and evolutionary insights of protein-protein interactions may improve the confidence in modeled
protein-protein interactions. Further, it is important to understand why certain proteins do not interact with each other in spite of their being co-localized. Some template-based approaches to modeling protein-protein complexes still rest on the premise that interface residues can be directly extrapolated from the template protein-protein complex to the modeled protein-protein complex. This notion may not always hold due to the high evolutionary plasticity observed in protein interfaces, arguing for further improvements in such template-based modeling methods. Another important aspect needing improvement is the incorporation of protein flexibility. To date, despite many efforts over the years, protein-protein docking methods are still largely rigid, with only a limited extent of flexibility. Since molecular surfaces are flexible, in the absence of additional biochemical data this may hamper accurate and reliable docking. This area is challenging largely due to the high computational costs of sampling protein-protein modeling poses on the atomic scale. Most of the time, the protein-protein complexes being studied extensively constitute part of a large macromolecular assembly in the living system. Cryo-electron microscopy (cryo-EM) has provided low resolution maps of large assemblies. As shown in Fig. 3, EM maps are commonly available in the resolution range of only 8-12 Å. Therefore, fitting the atomic level structures into EM maps is an important step in order to understand the molecular details of interactions in huge molecular assemblies. Further, knowledge of the structural and evolutionary features of protein-protein complexes, such as conservation of interface residues and dynamics, can be applied to improve the fitting of atomic level structures into the cryoEM density maps.
Also, fitting of atomic level structures in cryoEM maps should take into account the conformational changes that occur between the uncomplexed state (available at atomic resolution) and the complexed state (available at low resolution) in the multi-protein assembly. The structural and functional constraints on quaternary structures of proteins, as reflected in their evolution, need to be explored in greater detail. These could ultimately provide a holistic view of the functioning of the protein in the cell. Since evolutionary and structural dynamics are related, their combination can improve the confidence of protein function
prediction. Protein interaction networks from various organisms provide information on hub proteins, and coupling these together can help to better understand the regulation of hub proteins. A key question is how a hub protein with a shared binding site ‘knows’ which partner to bind at any given time (Tsai et al., 2009). Similar approaches can be employed to understand whether annotated steps in a metabolic pathway are sequential or simultaneous, although since these often involve enzymatic reactions, different considerations may apply. Even though several principles guide the design of inhibitors for protein-protein complexes, a combined approach which considers the structural dynamics of the protein-protein interface, interaction hotspots, ‘rigid’ interface residues, post-translational modification sites and allosteric sites could eventually be adopted. This is, however, challenging because of their variability, temporal occurrences and combinations, which can be expected to lead to different interface conformations. Notwithstanding, these may enhance or attenuate the affinity and specificity of the designed molecule. Ultimately, understanding biological processes requires the molecular details of protein-protein interactions. It is of paramount importance due to the diverse applications in translational research in the area of drug discovery, protein interface design, and most of all the fundamental understanding which bears on all of these.

Acknowledgments

We thank lab members for discussions and suggestions. We also thank Prof. Tom Blundell and Dr. Harry Jubb for their critical comments and suggestions. G.S. is supported by a fellowship from the Department of Biotechnology, India. This research is supported by the Department of Biotechnology, Government of India. This project has been funded in whole or in part with Federal funds from the Frederick National Laboratory for Cancer Research, National Institutes of Health, under contract HHSN261200800001E.
This research was supported [in part] by the Intramural Research Program of NIH, Frederick National Lab, Center for Cancer Research. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organizations imply endorsement by the US Government.

Fig. 3. Distribution of resolution of single particle electron microscopy (EM) maps: The histogram shows the distribution of resolution of the single particle electron microscopy (EM) maps available in the Electron Microscopy Data Bank (EMDB; Lawson et al., 2011).

References

Agrawal, N.J., Helk, B., Trout, B.L., 2014. A computational tool to predict the evolutionarily conserved protein-protein interaction hot-spot residues from the structure of the unbound protein. FEBS Lett. 588, 326-333.
Aiello, D., Caffrey, D.R., 2012. Evolution of specific protein-protein interaction sites following gene duplication. J. Mol. Biol. 423, 257-272.
Aloy, P., Ceulemans, H., Stark, A., Russell, R.B., 2003. The relationship between sequence and interaction divergence in proteins. J. Mol. Biol. 332, 989-998.
Andre, I., Strauss, C.E., Kaplan, D.B., Bradley, P., Baker, D., 2008. Emergence of symmetry in homooligomeric biological assemblies. Proc. Natl. Acad. Sci. U. S. A. 105, 16148-16152.
Andreani, J., Faure, G., Guerois, R., 2012. Versatility and invariance in the evolution of homologous heteromeric interfaces. PLoS Comput. Biol. 8, e1002677.
Angyan, A.F., Perczel, A., Gaspari, Z., 2012. Estimating intrinsic structural preferences of de novo emerging random-sequence proteins: is aggregation the main bottleneck? FEBS Lett. 586, 2468-2472.
Ansari, S., Helms, V., 2005. Statistical analysis of predominantly transient protein-protein interfaces. Proteins 61, 344-355.
Azoitei, M.L., Ban, Y.E., Julien, J.P., Bryson, S., Schroeter, A., Kalyuzhniy, O., Porter, J.R., Adachi, Y., Baker, D., Pai, E.F., Schief, W.R., 2012. Computational design of high-affinity epitope scaffolds by backbone grafting of a linear epitope. J. Mol. Biol. 415, 175-192.
Azzarito, V., Long, K., Murphy, N.S., Wilson, A.J., 2013. Inhibition of alpha-helix-mediated protein-protein interactions using designed molecules. Nat. Chem. 5, 161-173.
Bahadur, R.P., Chakrabarti, P., Rodier, F., Janin, J., 2003. Dissecting subunit interfaces in homodimeric proteins. Proteins 53, 708-719.

Bakan, A., Bahar, I., 2009. The intrinsic dynamics of enzymes plays a dominant role in determining the structural changes induced upon inhibitor binding. Proc. Natl. Acad. Sci. U. S. A. 106, 14349-14354.
Basse, M.J., Betzi, S., Bourgeas, R., Bouzidi, S., Chetrit, B., Hamon, V., Morelli, X., Roche, P., 2013. 2P2Idb: a structural database dedicated to orthosteric modulation of protein-protein interactions. Nucleic Acids Res. 41, D824-D827.
Beerten, J., Jonckheere, W., Rudyak, S., Xu, J., Wilkinson, H., De Smet, F., Schymkowitz, J., Rousseau, F., 2012a. Aggregation gatekeepers modulate protein homeostasis of aggregating sequences and affect bacterial fitness. Protein Eng. Des. Sel. 25, 357-366.
Beerten, J., Schymkowitz, J., Rousseau, F., 2012b. Aggregation prone regions and gatekeeping residues in protein sequences. Curr. Top. Med. Chem. 12, 2470-2478.
Beltrao, P., Albanese, V., Kenner, L.R., Swaney, D.L., Burlingame, A., Villen, J., Lim, W.A., Fraser, J.S., Frydman, J., Krogan, N.J., 2012. Systematic functional prioritization of protein posttranslational modifications. Cell 150, 413-425.
Beltrao, P., Bork, P., Krogan, N.J., van Noort, V., 2013. Evolution and functional crosstalk of protein post-translational modifications. Mol. Syst. Biol. 9, 714.
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E., 2000. The protein data bank. Nucleic Acids Res. 28, 235-242.
Bhaskara, R.M., Padhi, A., Srinivasan, N., 2013. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling. Proteins.
Block, P., Paern, J., Hullermeier, E., Sanschagrin, P., Sotriffer, C.A., Klebe, G., 2006. Physicochemical descriptors to discriminate protein-protein interactions in permanent and transient complexes selected by means of machine learning algorithms. Proteins 65, 607-622.
Boehr, D.D., Nussinov, R., Wright, P.E., 2009. The role of dynamic conformational ensembles in biomolecular recognition. Nat. Chem. Biol. 5, 789-796.
Bogan, A.A., Thorn, K.S., 1998. Anatomy of hot spots in protein interfaces. J. Mol. Biol. 280, 1-9.
Buck, P.M., Kumar, S., Singh, S.K., 2013. On the role of aggregation prone regions in protein evolution, stability, and enzymatic catalysis: insights from diverse analyses. PLoS Comput. Biol. 9, e1003291.
Buljan, M., Chalancon, G., Dunker, A.K., Bateman, A., Balaji, S., Fuxreiter, M., Babu, M.M., 2013. Alternative splicing of intrinsically disordered regions and rewiring of protein interactions. Curr. Opin. Struct. Biol. 23, 443-450.
Buljan, M., Chalancon, G., Eustermann, S., Wagner, G.P., Fuxreiter, M., Bateman, A., Babu, M.M., 2012. Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Mol. Cell. 46, 871-883.
Chang, Y.S., Graves, B., Guerlavais, V., Tovar, C., Packman, K., To, K.H., Olson, K.A., Kesavan, K., Gangurde, P., Mukherjee, A., Baker, T., Darlak, K., Elkin, C., Filipovic, Z., Qureshi, F.Z., Cai, H., Berry, P., Feyfant, E., Shi, X.E., Horstick, J., Annis, D.A., Manning, A.M., Fotouhi, N., Nash, H., Vassilev, L.T., Sawyer, T.K., 2013. Stapled alpha-helical peptide drug development: a potent dual inhibitor of MDM2 and MDMX for p53-dependent cancer therapy. Proc. Natl. Acad. Sci. U. S. A. 110, E3445-E3454.
Chen, T.S., Keating, A.E., 2012. Designing specific protein-protein interactions using computation, experimental library screening, or integrated methods. Protein Sci. 21, 949-963.
Cheng, R.R., Morcos, F., Levine, H., Onuchic, J.N., 2014. Toward rationally redesigning bacterial two-component signaling systems using coevolutionary information. Proc. Natl. Acad. Sci. U. S. A. 111, E563-E571.
Chennamsetty, N., Helk, B., Voynov, V., Kayser, V., Trout, B.L., 2009. Aggregation-prone motifs in human immunoglobulin G. J. Mol. Biol. 391, 404-413.
Choi, Y.S., Yang, J.S., Choi, Y., Ryu, S.H., Kim, S., 2009. Evolutionary conservation in multiple faces of protein interaction. Proteins 77, 14-25.
Correia, B.E., Bates, J.T., Loomis, R.J., Baneyx, G., Carrico, C., Jardine, J.G., Rupert, P., Correnti, C., Kalyuzhniy, O., Vittal, V., Connell, M.J., Stevens, E., Schroeter, A., Chen, M., Macpherson, S., Serra, A.M., Adachi, Y., Holmes, M.A., Li, Y., Klevit, R.E., Graham, B.S., Wyatt, R.T., Baker, D., Strong, R.K., Crowe Jr., J.E., Johnson, P.R., Schief, W.R., 2014. Proof of principle for epitope-focused vaccine design. Nature 507, 201-206.
Csermely, P., Palotai, R., Nussinov, R., 2010. Induced fit, conformational selection and independent dynamic segments: an extended view of binding events. Trends Biochem. Sci. 35, 539-546.
Cukuroglu, E., Gursoy, A., Nussinov, R., Keskin, O., 2014. Non-redundant unique interface structures as templates for modeling protein interactions. PLoS One 9, e86738.
Daily, M.D., Gray, J.J., 2009. Allosteric communication occurs via networks of tertiary and quaternary motions in proteins. PLoS Comput. Biol. 5, e1000293.
David, A., Razali, R., Wass, M.N., Sternberg, M.J., 2012. Protein-protein interaction sites are hot spots for disease-associated nonsynonymous SNPs. Hum. Mutat. 33, 359-363.
Davis, F.P., Sali, A., 2010. The overlap of small molecule and protein binding sites within families of protein structures. PLoS Comput. Biol. 6, e1000668.
Dayhoff, J.E., Shoemaker, B.A., Bryant, S.H., Panchenko, A.R., 2010. Evolution of protein binding modes in homooligomers. J. Mol. Biol. 395, 860-870.
De Baets, G., Reumers, J., Delgado Blanco, J., Dopazo, J., Schymkowitz, J., Rousseau, F., 2011. An evolutionary trade-off between protein turnover rate and protein aggregation favors a higher aggregation propensity in fast degrading proteins. PLoS Comput. Biol. 7, e1002090.
De, S., Krishnadev, O., Srinivasan, N., Rekha, N., 2005. Interaction preferences across protein-protein interfaces of obligatory and non-obligatory components are different. BMC Struct. Biol. 5, 15.

de Vries, S.J., van Dijk, M., Bonvin, A.M., 2010. The HADDOCK web server for data-driven biomolecular docking. Nat. Protoc. 5, 883-897.
del Sol, A., Tsai, C.J., Ma, B., Nussinov, R., 2009. The origin of allosteric functional modulation: multiple pre-existing pathways. Structure 17, 1042-1050.
DeLano, W.L., 2002. The PyMOL Molecular Graphics System. DeLano Scientific, Palo Alto, CA.
Der, B.S., Edwards, D.R., Kuhlman, B., 2012. Catalysis by a de novo zinc-mediated protein interface: implications for natural enzyme evolution and rational enzyme engineering. Biochemistry 51, 3933-3940.
Dey, S., Pal, A., Chakrabarti, P., Janin, J., 2010. The subunit interfaces of weakly associated homodimeric proteins. J. Mol. Biol. 398, 146-160.
Engin, H.B., Guney, E., Keskin, O., Oliva, B., Gursoy, A., 2013. Integrating structure to protein-protein interaction networks that drive metastasis to brain and lung in breast cancer. PLoS One 8, e81035.
Espinosa, O., Mitsopoulos, K., Hakas, J., Pearl, F., Zvelebil, M., 2014. Deriving a mutation index of carcinogenicity using protein structure and protein interfaces. PLoS One 9, e84598.
Feliu, E., Aloy, P., Oliva, B., 2011. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking. Protein Sci. 20, 529-541.
Ferreon, A.C., Ferreon, J.C., Wright, P.E., Deniz, A.A., 2013. Modulation of allostery by protein intrinsic disorder. Nature 498, 390-394.
Fleishman, S.J., Corn, J.E., Strauch, E.M., Whitehead, T.A., Karanicolas, J., Baker, D., 2011. Hotspot-centric de novo design of protein binders. J. Mol. Biol. 413, 1047-1062.
Fong, J.H., Shoemaker, B.A., Garbuzynskiy, S.O., Lobanov, M.Y., Galzitskaya, O.V., Panchenko, A.R., 2009. Intrinsic disorder in protein interactions: insights from a comprehensive structural analysis. PLoS Comput. Biol. 5, e1000316.
Franzosa, E.A., Xia, Y., 2011. Structural principles within the human-virus protein-protein interaction network. Proc. Natl. Acad. Sci. U. S. A. 108, 10538-10543.
Fromer, M., Shifman, J.M., 2009. Tradeoff between stability and multispecificity in the design of promiscuous proteins. PLoS Comput. Biol. 5, e1000627.
Fuller, J.C., Burgoyne, N.J., Jackson, R.M., 2009. Predicting druggable binding sites at the protein-protein interface. Drug Discov. Today 14, 155-161.
Gadkari, R.A., Srinivasan, N., 2010. Prediction of protein-protein interactions in dengue virus coat proteins guided by low resolution cryoEM structures. BMC Struct. Biol. 10, 17.
Gadkari, R.A., Srinivasan, N., 2012. Protein-protein interactions in clathrin vesicular assembly: radial distribution of evolutionary constraints in interfaces. PLoS One 7, e31445.
Gadkari, R.A., Varughese, D., Srinivasan, N., 2009. Recognition of interaction interface residues in low-resolution structures of protein assemblies solely from the positions of C(alpha) atoms. PLoS One 4, e4476.
Gao, M., Skolnick, J., 2010. Structural space of protein-protein interfaces is degenerate, close to complete, and highly connected. Proc. Natl. Acad. Sci. U. S. A. 107, 22517-22522.
Gao, M., Skolnick, J., 2012. The distribution of ligand-binding pockets around protein-protein interfaces suggests a general mechanism for pocket formation. Proc. Natl. Acad. Sci. U. S. A. 109, 3784-3789.
Garamszegi, S., Franzosa, E.A., Xia, Y., 2013. Signatures of pleiotropy, economy and convergent evolution in a domain-resolved map of human-virus protein-protein interaction networks. PLoS Pathog. 9, e1003778.
Gretes, M., Lim, D.C., de Castro, L., Jensen, S.E., Kang, S.G., Lee, K.J., Strynadka, N.C., 2009. Insights into positive and negative requirements for protein-protein interactions by crystallographic analysis of the beta-lactamase inhibitory proteins BLIP, BLIP-I, and BLP. J. Mol. Biol. 389, 289-305.
Gsponer, J., Babu, M.M., 2012. Cellular strategies for regulating functional and nonfunctional protein aggregation. Cell. Rep. 2, 1425-1437.
Hamp, T., Rost, B., 2012. Alternative protein-protein interfaces are frequent exceptions. PLoS Comput. Biol. 8, e1002623.
Hashimoto, K., Panchenko, A.R., 2010. Mechanisms of protein oligomerization, the critical role of insertions and deletions in maintaining different oligomeric states. Proc. Natl. Acad. Sci. U. S. A. 107, 20352-20357.
Higueruelo, A.P., Jubb, H., Blundell, T.L., 2013. TIMBAL v2: update of a database holding small molecules modulating protein-protein interactions. Database (Oxford) 2013, bat039.
Higueruelo, A.P., Schreyer, A., Bickerton, G.R., Pitt, W.R., Groom, C.R., Blundell, T.L., 2009. Atomic interactions and profile of small molecules disrupting protein-protein interfaces: the TIMBAL database. Chem. Biol. Drug. Des. 74, 457-467.
Hsu, W.L., Oldfield, C.J., Xue, B., Meng, J., Huang, F., Romero, P., Uversky, V.N., Dunker, A.K., 2013. Exploring the binding diversity of intrinsically disordered proteins involved in one-to-many binding. Protein Sci. 22, 258-273.
Itzhaki, Z., 2011. Domain-domain interactions underlying herpesvirus-human protein-protein interaction networks. PLoS One 6, e21724.
Jaffe, E.K., 2013. Impact of quaternary structure dynamics on allosteric drug discovery. Curr. Top. Med. Chem. 13, 55-63.
Janin, J., Bahadur, R.P., Chakrabarti, P., 2008. Protein-protein interaction and quaternary structure. Q. Rev. Biophys. 41, 133-180.
Jones, S., Marin, A., Thornton, J.M., 2000. Protein domain interfaces: characterization and comparison with oligomeric protein interfaces. Protein Eng. 13, 77-82.
Jones, S., Thornton, J.M., 1996. Principles of protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 93, 13-20.
Kar, G., Gursoy, A., Keskin, O., 2009. Human cancer protein-protein interaction network: a structural perspective. PLoS Comput. Biol. 5, e1000601.

Please cite this article in press as: Sudha, G., et al., An overview of recent advances in structural bioinformatics of proteineprotein interactions and a guide to their principles, Progress in Biophysics and Molecular Biology (2014), http://dx.doi.org/10.1016/j.pbiomolbio.2014.07.004


Kar, G., Keskin, O., Gursoy, A., Nussinov, R., 2010. Allostery and population shift in drug discovery. Curr. Opin. Pharmacol. 10, 715–722.
Karaca, E., Bonvin, A.M., 2011. A multidomain flexible docking approach to deal with large conformational changes in the modeling of biomolecular complexes. Structure 19, 555–565.
Kastritis, P.L., Rodrigues, J.P., Bonvin, A.M., 2014. HADDOCK2P2I: a biophysical model for predicting the binding affinity of protein-protein interaction inhibitors. J. Chem. Inf. Model 54, 826–836.
Keskin, O., Gursoy, A., Ma, B., Nussinov, R., 2008. Principles of protein-protein interactions: what are the preferred ways for proteins to interact? Chem. Rev. 108, 1225–1244.
Keskin, O., Nussinov, R., 2005. Favorable scaffolds: proteins with different sequence, structure and function may associate in similar ways. Protein Eng. Des. Sel. 18, 11–24.
Keskin, O., Nussinov, R., 2007. Similar binding sites and different partners: implications to shared proteins in cellular pathways. Structure 15, 341–354.
Kim, W.K., Henschel, A., Winter, C., Schroeder, M., 2006. The many faces of protein-protein interactions: a compendium of interface geometry. PLoS Comput. Biol. 2, e124.
King, N.P., Sheffler, W., Sawaya, M.R., Vollmar, B.S., Sumida, J.P., Andre, I., Gonen, T., Yeates, T.O., Baker, D., 2012. Computational design of self-assembling protein nanomaterials with atomic level accuracy. Science 336, 1171–1174.
Konno, A., Kitagawa, A., Watanabe, M., Ogawa, T., Shirai, T., 2011. Tracing protein evolution through ancestral structures of fish galectin. Structure 19, 711–721.
Kozakov, D., Hall, D.R., Chuang, G.Y., Cencic, R., Brenke, R., Grove, L.E., Beglov, D., Pelletier, J., Whitty, A., Vajda, S., 2011. Structural conservation of druggable hot spots in protein-protein interfaces. Proc. Natl. Acad. Sci. U. S. A. 108, 13528–13533.
Kundrotas, P.J., Vakser, I.A., 2013. Protein-protein alternative binding modes do not overlap. Protein Sci. 22, 1141–1145.
Kundrotas, P.J., Zhu, Z., Janin, J., Vakser, I.A., 2012. Templates are available to model nearly all complexes of structurally characterized proteins. Proc. Natl. Acad. Sci. U. S. A. 109, 9438–9441.
Kurkcuoglu, Z., Bakan, A., Kocaman, D., Bahar, I., Doruker, P., 2012. Coupling between catalytic loop motions and enzyme global dynamics. PLoS Comput. Biol. 8, e1002705.
Lai, J., Jin, J., Kubelka, J., Liberles, D.A., 2012. A phylogenetic analysis of normal modes evolution in enzymes and its relationship to enzyme function. J. Mol. Biol. 422, 442–459.
Lawson, C.L., Baker, M.L., Best, C., Bi, C., Dougherty, M., Feng, P., van Ginkel, G., Devkota, B., Lagerstedt, I., Ludtke, S.J., Newman, R.H., Oldfield, T.J., Rees, I., Sahni, G., Sala, R., Velankar, S., Warren, J., Westbrook, J.D., Henrick, K., Kleywegt, G.J., Berman, H.M., Chiu, W., 2011. EMDataBank.org: unified data resource for CryoEM. Nucleic Acids Res. 39, D456–D464.
Lensink, M.F., Wodak, S.J., 2013. Docking, scoring, and affinity prediction in CAPRI. Proteins 81, 2082–2095.
Levy, E.D., Boeri Erba, E., Robinson, C.V., Teichmann, S.A., 2008. Assembly reflects evolution of protein complexes. Nature 453, 1262–1265.
Levy, E.D., De, S., Teichmann, S.A., 2012. Cellular crowding imposes global constraints on the chemistry and evolution of proteomes. Proc. Natl. Acad. Sci. U. S. A. 109, 20461–20466.
Levy, E.D., Pereira-Leal, J.B., 2008. Evolution and dynamics of protein interactions and networks. Curr. Opin. Struct. Biol. 18, 349–357.
Li, X., Keskin, O., Ma, B., Nussinov, R., Liang, J., 2004. Protein-protein interactions: hot spots and structurally conserved residues often locate in complemented pockets that pre-organized in the unbound states: implications for docking. J. Mol. Biol. 344, 781–795.
Liu, Y., Bahar, I., 2012. Sequence evolution correlates with structural dynamics. Mol. Biol. Evol. 29, 2253–2263.
Lo Conte, L., Chothia, C., Janin, J., 1999. The atomic structure of protein-protein recognition sites. J. Mol. Biol. 285, 2177–2198.
London, N., Raveh, B., Schueler-Furman, O., 2013. Druggable protein-protein interactions – from hot spots to hot segments. Curr. Opin. Chem. Biol. 17, 952–959.
Ma, B., Kumar, S., Tsai, C.J., Nussinov, R., 1999. Folding funnels and binding mechanisms. Protein Eng. 12, 713–720.
Ma, B., Nussinov, R., 2013. Druggable orthosteric and allosteric hot spots to target protein-protein interactions. Curr. Pharm. Des.
Malhotra, S., Sankar, K., Sowdhamini, R., 2014. Structural interface parameters are discriminatory in recognising near-native poses of protein-protein interactions. PLoS One 9, e80255.
Manley, G., Rivalta, I., Loria, J.P., 2013. Solution NMR and computational methods for understanding protein allostery. J. Phys. Chem. B 117, 3063–3073.
Marcos, E., Crehuet, R., Bahar, I., 2011. Changes in dynamics upon oligomerization regulate substrate binding and allostery in amino acid kinase family members. PLoS Comput. Biol. 7, e1002201.
Marsh, J.A., Hernandez, H., Hall, Z., Ahnert, S.E., Perica, T., Robinson, C.V., Teichmann, S.A., 2013. Protein complexes are under evolutionary selection to assemble via ordered pathways. Cell 153, 461–470.
Marsh, J.A., Teichmann, S.A., 2014. Parallel dynamics and evolution: protein conformational fluctuations and assembly reflect evolutionary changes in sequence and structure. Bioessays 36, 209–218.
Martin, J., 2010. Beauty is in the eye of the beholder: proteins can recognize binding sites of homologous proteins in more than one way. PLoS Comput. Biol. 6, e1000821.


McLeish, T.C., Rodgers, T.L., Wilson, M.R., 2013. Allostery without conformation change: modelling protein dynamics at multiple scales. Phys. Biol. 10, 056004.
Meagher, K.L., Carlson, H.A., 2004. Incorporating protein flexibility in structure-based drug discovery: using HIV-1 protease as a test case. J. Am. Chem. Soc. 126, 13276–13281.
Mika, S., Rost, B., 2006. Protein-protein interactions more conserved within species than across species. PLoS Comput. Biol. 2, e79.
Mintseris, J., Weng, Z., 2003. Atomic contact vectors in protein-protein recognition. Proteins 53, 629–639.
Mintseris, J., Weng, Z., 2005. Structure, function, and evolution of transient and obligate protein-protein interactions. Proc. Natl. Acad. Sci. U. S. A. 102, 10930–10935.
Mosca, R., Pache, R.A., Aloy, P., 2012. The role of structural disorder in the rewiring of protein interactions through evolution. Mol. Cell. Proteomics 11, M111.014969.
Motlagh, H.N., Wrabl, J.O., Li, J., Hilser, V.J., 2014. The ensemble nature of allostery. Nature 508, 331–339.
Mukherjee, S., Zhang, Y., 2011. Protein-protein complex structure predictions by multimeric threading and template recombination. Structure 19, 955–966.
Murphy, P.M., Bolduc, J.M., Gallaher, J.L., Stoddard, B.L., Baker, D., 2009. Alteration of enzyme specificity by computational loop remodeling and design. Proc. Natl. Acad. Sci. U. S. A. 106, 9215–9220.
Nishi, H., Fong, J.H., Chang, C., Teichmann, S.A., Panchenko, A.R., 2013a. Regulation of protein-protein binding by coupling between phosphorylation and intrinsic disorder: analysis of human protein complexes. Mol. Biosyst. 9, 1620–1626.
Nishi, H., Hashimoto, K., Panchenko, A.R., 2011a. Phosphorylation in protein-protein binding: effect on stability and function. Structure 19, 1807–1815.
Nishi, H., Koike, R., Ota, M., 2011b. Cover and spacer insertions: small nonhydrophobic accessories that assist protein oligomerization. Proteins 79, 2372–2379.
Nishi, H., Tyagi, M., Teng, S., Shoemaker, B.A., Hashimoto, K., Alexov, E., Wuchty, S., Panchenko, A.R., 2013b. Cancer missense mutations alter binding properties of proteins and their interaction networks. PLoS One 8, e66273.
Nooren, I.M., Thornton, J.M., 2003a. Diversity of protein-protein interactions. EMBO J. 22, 3486–3492.
Nooren, I.M., Thornton, J.M., 2003b. Structural characterisation and functional significance of transient protein-protein interactions. J. Mol. Biol. 325, 991–1018.
Nussinov, R., Ma, B., 2012. Protein dynamics and conformational selection in bidirectional signal transduction. BMC Biol. 10, 2.
Nussinov, R., Ma, B., Tsai, C.J., 2013. A broad view of scaffolding suggests that scaffolding proteins can actively control regulation and signaling of multienzyme complexes through allostery. Biochim. Biophys. Acta 1834, 820–829.
Nussinov, R., Tsai, C.J., 2013. Allostery in disease and in drug discovery. Cell 153, 293–305.
Nussinov, R., Tsai, C.J., Xin, F., Radivojac, P., 2012. Allosteric post-translational modification codes. Trends Biochem. Sci. 37, 447–455.
Ofek, G., Guenaga, F.J., Schief, W.R., Skinner, J., Baker, D., Wyatt, R., Kwong, P.D., 2010. Elicitation of structure-specific antibodies by epitope scaffolds. Proc. Natl. Acad. Sci. U. S. A. 107, 17880–17887.
Ozbek, P., Soner, S., Haliloglu, T., 2013. Hot spots in a network of functional sites. PLoS One 8, e74320.
Pechmann, S., Levy, E.D., Tartaglia, G.G., Vendruscolo, M., 2009. Physicochemical principles that regulate the competition between functional and dysfunctional association of proteins. Proc. Natl. Acad. Sci. U. S. A. 106, 10159–10164.
Perica, T., Chothia, C., Teichmann, S.A., 2012. Evolution of oligomeric state through geometric coupling of protein interfaces. Proc. Natl. Acad. Sci. U. S. A. 109, 8127–8132.
Podgornaia, A.I., Casino, P., Marina, A., Laub, M.T., 2013. Structural basis of a rationally rewired protein-protein interface critical to bacterial signaling. Structure 21, 1636–1647.
Procko, E., Hedman, R., Hamilton, K., Seetharaman, J., Fleishman, S.J., Su, M., Aramini, J., Kornhaber, G., Hunt, J.F., Tong, L., Montelione, G.T., Baker, D., 2013. Computational design of a protein-based enzyme inhibitor. J. Mol. Biol. 425, 3563–3575.
Reichmann, D., Rahat, O., Albeck, S., Meged, R., Dym, O., Schreiber, G., 2005. The modular architecture of protein-protein binding interfaces. Proc. Natl. Acad. Sci. U. S. A. 102, 57–62.
Rekha, N., Machado, S.M., Narayanan, C., Krupa, A., Srinivasan, N., 2005. Interaction interfaces of protein domains are not topologically equivalent across families within superfamilies: implications for metabolic and signaling pathways. Proteins 58, 339–353.
Reynolds, K.A., McLaughlin, R.N., Ranganathan, R., 2011. Hot spots for allosteric regulation on protein surfaces. Cell 147, 1564–1575.
Rodgers, T.L., Townsend, P.D., Burnell, D., Jones, M.L., Richards, S.A., McLeish, T.C., Pohl, E., Wilson, M.R., Cann, M.J., 2013. Modulation of global low-frequency motions underlies allosteric regulation: demonstration in CRP/FNR family transcription factors. PLoS Biol. 11, e1001651.
Rodrigues, J.P., Bonvin, A.M., 2014. Integrative computational modeling of protein interactions. FEBS J.
Schreiber, G., Keating, A.E., 2011. Protein binding specificity versus promiscuity. Curr. Opin. Struct. Biol. 21, 50–61.
Segura-Cabrera, A., Garcia-Perez, C.A., Guo, X., Rodriguez-Perez, M.A., 2013. A viral-human interactome based on structural motif-domain interactions captures the human infectome. PLoS One 8, e71526.


Shoemaker, B., Wuchty, S., Panchenko, A.R., 2013. Computational large-scale mapping of protein-protein interactions using structural complexes. Curr. Protoc. Protein Sci. 73, Unit 3.9.
Sickmeier, M., Hamilton, J.A., LeGall, T., Vacic, V., Cortese, M.S., Tantos, A., Szabo, B., Tompa, P., Chen, J., Uversky, V.N., Obradovic, Z., Dunker, A.K., 2007. DisProt: the database of disordered proteins. Nucleic Acids Res. 35, D786–D793.
Sonavane, S., Chakrabarti, P., 2008. Cavities and atomic packing in protein structures and interfaces. PLoS Comput. Biol. 4, e1000188.
Stefl, S., Nishi, H., Petukh, M., Panchenko, A.R., Alexov, E., 2013. Molecular mechanisms of disease-causing missense mutations. J. Mol. Biol. 425, 3919–3936.
Strauch, E.M., Fleishman, S.J., Baker, D., 2014. Computational design of a pH-sensitive IgG binding protein. Proc. Natl. Acad. Sci. U. S. A. 111, 675–680.
Sudha, G., Yamunadevi, S., Tyagi, N., Das, S., Srinivasan, N., 2012. Structural and molecular basis of interaction of HCV non-structural protein 5A with human casein kinase 1alpha and PKR. BMC Struct. Biol. 12, 28.
Swapna, L.S., Bhaskara, R.M., Sharma, J., Srinivasan, N., 2012a. Roles of residues in the interface of transient protein-protein complexes before complexation. Sci. Rep. 2, 334.
Swapna, L.S., Mahajan, S., de Brevern, A.G., Srinivasan, N., 2012b. Comparison of tertiary structures of proteins in protein-protein complexes with unbound forms suggests prevalence of allostery in signalling proteins. BMC Struct. Biol. 12, 6.
Swapna, L.S., Srikeerthana, K., Srinivasan, N., 2012c. Extent of structural asymmetry in homodimeric proteins: prevalence and relevance. PLoS One 7, e36688.
Talavera, D., Williams, S.G., Norris, M.G., Robertson, D.L., Lovell, S.C., 2012. Evolvability of yeast protein-protein interaction interfaces. J. Mol. Biol. 419, 387–396.
Tan, C.S., Jorgensen, C., Linding, R., 2010. Roles of “junk phosphorylation” in modulating biomolecular association of phosphorylated proteins? Cell Cycle 9, 1276–1280.
Teng, S., Madej, T., Panchenko, A., Alexov, E., 2009. Modeling effects of human single nucleotide polymorphisms on protein-protein interactions. Biophys. J. 96, 2178–2188.
Thangakani, A.M., Kumar, S., Velmurugan, D., Gromiha, M.S., 2012. How do thermophilic proteins resist aggregation? Proteins 80, 1003–1015.
Thangudu, R.R., Bryant, S.H., Panchenko, A.R., Madej, T., 2012. Modulating protein-protein interactions with small molecules: the importance of binding hotspots. J. Mol. Biol. 415, 443–453.
Tsai, C.J., del Sol, A., Nussinov, R., 2008. Allostery: absence of a change in shape does not imply that allostery is not at play. J. Mol. Biol. 378, 1–11.
Tsai, C.J., Kumar, S., Ma, B., Nussinov, R., 1999a. Folding funnels, binding funnels, and protein function. Protein Sci. 8, 1181–1190.
Tsai, C.J., Lin, S.L., Wolfson, H.J., Nussinov, R., 1996. A dataset of protein-protein interfaces generated with a sequence-order-independent comparison technique. J. Mol. Biol. 260, 604–620.
Tsai, C.J., Ma, B., Nussinov, R., 1999b. Folding and binding cascades: shifts in energy landscapes. Proc. Natl. Acad. Sci. U. S. A. 96, 9970–9972.
Tsai, C.J., Ma, B., Nussinov, R., 2009. Protein-protein interaction networks: how can a hub protein bind so many different partners? Trends Biochem. Sci. 34, 594–600.
Tuncbag, N., Gursoy, A., Nussinov, R., Keskin, O., 2011. Predicting protein-protein interactions on a proteome scale by matching evolutionary and structural similarities at interfaces using PRISM. Nat. Protoc. 6, 1341–1354.

Vacic, V., Iakoucheva, L.M., 2012. Disease mutations in disordered regions – exception to the rule? Mol. Biosyst. 8, 27–32.
Valdar, W.S., Thornton, J.M., 2001. Protein-protein interfaces: analysis of amino acid conservation in homodimers. Proteins 42, 108–124.
van Noort, V., Seebacher, J., Bader, S., Mohammed, S., Vonkova, I., Betts, M.J., Kuhner, S., Kumar, R., Maier, T., O'Flaherty, M., Rybin, V., Schmeisky, A., Yus, E., Stulke, J., Serrano, L., Russell, R.B., Heck, A.J., Bork, P., Gavin, A.C., 2012. Cross-talk between phosphorylation and lysine acetylation in a genome-reduced bacterium. Mol. Syst. Biol. 8, 571.
van Zundert, G.C., Bonvin, A.M., 2014. Modeling protein-protein complexes using the HADDOCK Webserver “Modeling protein complexes with HADDOCK”. Methods Mol. Biol. 1137, 163–179.
Walter, P., Metzger, J., Thiel, C., Helms, V., 2013. Predicting where small molecules bind at protein-protein interfaces. PLoS One 8, e58583.
Wang, X., Wei, X., Thijssen, B., Das, J., Lipkin, S.M., Yu, H., 2012. Three-dimensional reconstruction of protein networks provides insight into human genetic disease. Nat. Biotechnol. 30, 159–164.
Wass, M.N., Fuentes, G., Pons, C., Pazos, F., Valencia, A., 2011. Towards the prediction of protein interaction partners using physical docking. Mol. Syst. Biol. 7, 469.
Weigt, M., White, R.A., Szurmant, H., Hoch, J.A., Hwa, T., 2009. Identification of direct residue contacts in protein-protein interaction by message passing. Proc. Natl. Acad. Sci. U. S. A. 106, 67–72.
Weinkam, P., Pons, J., Sali, A., 2012. Structure-based model of allostery predicts coupling between distant sites. Proc. Natl. Acad. Sci. U. S. A. 109, 4875–4880.
Wells, J.A., McClendon, C.L., 2007. Reaching for high-hanging fruit in drug discovery at protein-protein interfaces. Nature 450, 1001–1009.
Wong, E.T., Na, D., Gsponer, J., 2013. On the importance of polar interactions for complexes containing intrinsically disordered proteins. PLoS Comput. Biol. 9, e1003192.
Yates, C.M., Sternberg, M.J., 2013. The effects of non-synonymous single nucleotide polymorphisms (nsSNPs) on protein-protein interactions. J. Mol. Biol. 425, 3949–3963.
Yip, S.H., Foo, J.L., Schenk, G., Gahan, L.R., Carr, P.D., Ollis, D.L., 2011. Directed evolution combined with rational design increases activity of GpdQ toward a nonphysiological substrate and alters the oligomeric structure of the enzyme. Protein Eng. Des. Sel. 24, 861–872.
Zen, A., Micheletti, C., Keskin, O., Nussinov, R., 2010. Comparing interfacial dynamics in protein-protein complexes: an elastic network approach. BMC Struct. Biol. 10, 26.
Zhang, Q.C., Petrey, D., Deng, L., Qiang, L., Shi, Y., Thu, C.A., Bisikirska, B., Lefebvre, C., Accili, D., Hunter, T., Maniatis, T., Califano, A., Honig, B., 2012. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature 490, 556–560.
Zhang, Q.C., Petrey, D., Norel, R., Honig, B.H., 2010. Protein interface conservation across structure space. Proc. Natl. Acad. Sci. U. S. A. 107, 10896–10901.
Zhou, H.X., Qin, S., 2007. Interaction-site prediction for protein complexes: a critical assessment. Bioinformatics 23, 2203–2209.
Zhu, H., Domingues, F.S., Sommer, I., Lengauer, T., 2006. NOXclass: prediction of protein-protein interaction types. BMC Bioinform. 7, 27.

