CHAPTER EIGHT

Intrinsically Disordered Proteins: Relation to General Model Expressing the Active Role of the Water Environment

Barbara Kalinowska*,†, Mateusz Banach*,†, Leszek Konieczny‡, Damian Marchewka*,†, Irena Roterman*,1

*Department of Bioinformatics and Telemedicine, Medical College, Jagiellonian University, Krakow, Poland
†Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Krakow, Poland
‡Chair of Medical Biochemistry, Medical College, Jagiellonian University, Krakow, Poland
1 Corresponding author: e-mail address: [email protected]

Contents
1. Introduction
2. Definition of the Fuzzy Oil Drop Model
2.1 Observed hydrophobic density distribution
2.2 What is the expected hydrophobic core structure?
2.3 Do real-world proteins actually follow the presented distribution?
2.4 Two ways to determine O/T and O/R values
2.5 Internal force field
3. Results
3.1 Summary of results
3.2 Selected examples
4. Discussion
5. Conclusions
Acknowledgments
References

Abstract

This work discusses the role of unstructured polypeptide chain fragments in shaping the protein's hydrophobic core. Based on the “fuzzy oil drop” model, which assumes an idealized distribution of hydrophobicity density described by the 3D Gaussian, we can determine which fragments make up the core and pinpoint residues whose location conflicts with theoretical predictions. We show that the structural influence of the water environment determines the positions of disordered fragments, leading to the formation of a hydrophobic core overlaid by a hydrophilic mantle. This phenomenon is further described by studying selected proteins which are known to be unstable and contain intrinsically disordered fragments. Their properties are established quantitatively, explaining the causative relation between the protein's structure and function and facilitating further comparative analyses of various structural models.

Advances in Protein Chemistry and Structural Biology, Volume 94. ISSN 1876-1623. http://dx.doi.org/10.1016/B978-0-12-800168-4.00008-1
© 2014 Elsevier Inc. All rights reserved.

1. INTRODUCTION

The existence of loosely packed polypeptide chain fragments suggests mechanisms which counter the natural tendency of proteins to fold. Protein folding usually results in a tightly packed, stable structure which can be described in terms of secondary and supersecondary motifs. While the basic question remains the same (i.e., “how are such structures generated?”), an equally interesting problem is to explain the formation of structures dominated by disordered regions. This phenomenon is related to 2 of 14 problems (specifically, nos. 8 and 9) which, despite 50 years of research aided by the dynamic growth of bioinformatics, have so far eluded solution. These challenges are summarized in Dill and MacCallum (2012):

We know little about the ensembles and functions of intrinsically disordered proteins, even though nearly half of all eukaryotic proteins contain large disordered regions. This is sometimes called the “protein unfolding problem” or “unstructural biology.”

In an overview of research to date into the properties of disordered fragments (Uversky & Dunker, 2010), the authors present a list of phenomena closely related to the presence of such fragments (citation, selected points):
5. Increased interaction (surface) area per residue;
6. A single disordered region may bind to several structurally diverse partners;
7. Many distinct (structured) proteins may bind a single disordered region;
8. Intrinsic disorder provides ability to overcome steric restrictions, enabling larger interaction surfaces in protein-protein and protein-ligand complexes than those obtained with rigid partners;
9. Unstructured regions fold to specific bound conformations according to the template provided by structured partners;
14. The possibility of overlapping binding sites due to extended linear conformation.

This is why the problem of disordered proteins has become the focus of attention of many researchers. Information on known disordered proteins can be found in a specialized database called DisProt (http://www.disprot.org; Vucetic et al., 2005). The database lists proteins (or fragments thereof) whose native form lacks a stable 3D representation. Its content is derived from published experimental data confirming the unstructured nature of such proteins (or fragments) (Abramavicius & Mukamel, 2004; Asplund, Zanni, & Hochstrasser, 2000; Baiz, Peng, Reppert, Jones, & Tokmakoff, 2012). In addition, DisProt contains information on the biological profile of disordered fragments, methods of detecting such fragments, and links to other databases (Sickmeier et al., 2007).

Our analysis in this chapter focuses on a set of proteins (retrieved from the DisProt database on March 15, 2013) whose crystal structures can be found in PDB. Proteins for which PDB data do not provide the 3D structure of the unstructured fragments (where DisProt fragments are unambiguously referenced) have been excluded from analysis.

Along with a presentation of the causes and effects of the presence of disordered fragments in proteins, an important issue raised in this work concerns the lack of a generalized model describing the relation between the protein and its water environment. It seems that interaction with water plays a pivotal role in ensuring that the polypeptide chain folds in a correct fashion. The presence of water also determines which fragments should remain stable and which ones may retain dynamic properties. This is why our discussion of disordered fragments is presented in the context of a formal model expressing the influence of the water environment upon the structure and properties of proteins in living organisms. In this chapter, the abbreviation DisProt is also used to denote the intrinsically disordered fragments identified in the DisProt database.

2. DEFINITION OF THE FUZZY OIL DROP MODEL

The presence and role of the hydrophobic core in tertiary structural stabilization is well described in biochemistry handbooks, although no accurate in silico model has so far been proposed. Hydrophobic interactions are usually accounted for by structural prediction software, for example, in Levitt’s model (Levitt, 1976), where such interactions are modeled in a pairwise fashion, referring to individual protein atoms and individual water molecules (two- or three-atom models) (Urbic & Dill, 2010). In such models, interaction between the protein and its water environment is described and measured in terms of electrostatic components and Lennard-Jones potentials in a pairwise system. As a result, hydrophilic residues preferentially aggregate on the protein’s surface, while hydrophobic residues are expected to remain buried, forming a hydrophobic core (Biancardi, Cammi, Cappelli, Mennucci, & Tomasi, 2012; Kauzmann, 1959; Murphy et al., 2012; Priyakumar, 2012).

In this work, the influence of the water environment, leading to the expected distribution of hydrophobicity (i.e., hydrophobicity density gradient, peaking near the center of the protein body and reaching near-zero values on its surface), is modeled on a global scale, that is, by considering the protein as a whole. The formation of a hydrophobic core appears to be directly related to interactions between polar residues and the water environment, isolating the remaining hydrophobic residues. In the “fuzzy oil drop” model, the notion of a “hydrophobic core” refers to a set of properties which describe the entire molecule, including the aggregation of hydrophobic residues near its center as well as the existence of a hydrophilic “mantle” which remains in contact with water (Yang, Jiao, & Li, 2012). The role of this mantle is to stabilize the core, and thus both elements need to be considered as part of a hydrophobic core as a whole.

The “fuzzy oil drop” model is based on an idealized distribution of hydrophobicity density represented by a 3D Gaussian. Actual (observed) hydrophobicity density, which depends on the placement of each residue in a properly folded chain, can be obtained by tracing interactions between pairs of residues and ascribing to each amino acid a value which reflects its affinity for water. Quantitative comparison of both distributions (theoretical and observed) enables us to determine whether a given protein conforms to the model and contains a well-defined hydrophobic core.

2.1. Observed hydrophobic density distribution

If we assume that hydrophobicity density distribution within the protein molecule results from interactions between side chains, each of which is represented by its so-called effective atom (located at the geometric center of the side chain), then the force of such interactions is given by Levitt’s formula (Levitt, 1976):

$$\widetilde{Ho}_j = \frac{1}{\widetilde{Ho}_{sum}} \sum_{i=1}^{N} \left(H_i^r + H_j^r\right) \begin{cases} 1 - \dfrac{1}{2}\left[7\left(\dfrac{r_{ij}}{c}\right)^2 - 9\left(\dfrac{r_{ij}}{c}\right)^4 + 5\left(\dfrac{r_{ij}}{c}\right)^6 - \left(\dfrac{r_{ij}}{c}\right)^8\right] & \text{for } r_{ij} \le c \\ 0 & \text{for } r_{ij} > c \end{cases}$$

N is the number of amino acids in the protein, $H_i^r$ expresses the hydrophobicity of the ith residue, $r_{ij}$ expresses the distance between two interacting residues (jth effective atom and ith effective atom), while $c$ expresses the cutoff distance for hydrophobic interactions, which is taken as 9.0 Å (following Levitt, 1976). The $\widetilde{Ho}_{sum}$ coefficient, representing the aggregate sum of all components, is needed to normalize the distribution.
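The observed distribution defined by Levitt's formula can be sketched numerically. The Python/NumPy snippet below is only an illustration under stated assumptions, not the authors' implementation: the function name `observed_hydrophobicity`, the array layout, and the requirement that the per-residue hydrophobicity values come from a nonnegative scale are hypothetical choices made here.

```python
import numpy as np

def observed_hydrophobicity(coords, h, cutoff=9.0):
    """Normalized observed hydrophobicity density per residue.

    coords : (N, 3) effective-atom positions in angstroms (assumed input layout)
    h      : (N,) per-residue hydrophobicity values (assumed nonnegative scale)
    cutoff : cutoff distance c in angstroms (9.0 in the text)
    """
    coords = np.asarray(coords, dtype=float)
    h = np.asarray(h, dtype=float)

    # Pairwise distances r_ij between effective atoms.
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.linalg.norm(diff, axis=-1)

    # Distance weight: 1 - 1/2 * (7x - 9x^2 + 5x^3 - x^4) with x = (r/c)^2,
    # set to zero beyond the cutoff.
    x = (r / cutoff) ** 2
    weight = 1.0 - 0.5 * (7 * x - 9 * x**2 + 5 * x**3 - x**4)
    weight = np.where(r <= cutoff, weight, 0.0)

    # Sum (H_i + H_j) * weight over i for every residue j
    # (the i = j term is included, as in the sum over i = 1..N).
    ho = ((h[:, None] + h[None, :]) * weight).sum(axis=0)

    total = ho.sum()
    if total == 0:
        raise ValueError("all contributions are zero; cannot normalize")
    return ho / total  # division by the aggregate sum normalizes the profile
```

Because the weight falls to zero exactly at the cutoff, residues separated by more than 9 Å do not influence each other in this sketch.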

2.2. What is the expected hydrophobic core structure?

The resulting empirical distribution can be compared to a corresponding idealized distribution (Konieczny, Brylinski, & Roterman, 2006). We expect to find the greatest hydrophobicity at the center of the molecule, with hydrophobicity values decreasing along with distance from the center, approaching values close to 0 on the surface. This kind of distribution can be approximated by the 3D Gaussian:

$$\widetilde{Ht}_j = \frac{1}{\widetilde{Ht}_{sum}} \exp\left(\frac{-\left(x_j - \bar{x}\right)^2}{2\sigma_x^2}\right) \exp\left(\frac{-\left(y_j - \bar{y}\right)^2}{2\sigma_y^2}\right) \exp\left(\frac{-\left(z_j - \bar{z}\right)^2}{2\sigma_z^2}\right)$$

The above formula expresses the distribution of probability in an ellipsoid capsule whose dimensions are determined by the values of the $\sigma$ parameters calculated for each of the three cardinal directions. If we place the molecule in such a way that the 3D Gaussian completely encapsulates it (fine-tuning the $\sigma$ values as needed), the value of the function will express the expected (theoretical) distribution of hydrophobicity throughout the protein body, enlarged by an additional 9 Å cutoff distance so as to include the space potentially available for hydrophobic interaction. The overbarred $\bar{x}$, $\bar{y}$, and $\bar{z}$ parameters reflect the placement of the center of the ellipsoid; if this center coincides with the origin of the coordinate system, then all three values are equal to 0. Values of $\sigma$ are calculated as 1/3 of the greatest distance between an effective atom belonging to the molecule and the origin of the system, once the molecule has been oriented in such a way that its greatest breadth coincides with a system axis (for each axis separately). The $1/\widetilde{Ht}_{sum}$ and $1/\widetilde{Ho}_{sum}$ coefficients ensure normalization of both distributions (empirical and theoretical), enabling meaningful comparisons.
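A minimal Python/NumPy sketch of the theoretical distribution follows. It assumes the molecule has already been centered and oriented along the coordinate axes. The helper names `sigma_from_extent` and `theoretical_hydrophobicity` are hypothetical, and the way the 9 Å margin enters the σ values (added to the largest per-axis extent before dividing by 3) is an interpretation made here for illustration, not a statement of the authors' procedure.

```python
import numpy as np

def sigma_from_extent(coords, margin=9.0):
    """Per-axis sigma: one third of the largest |coordinate| along each axis.

    Assumption (ours): the 9 angstrom margin is added to the extent before
    dividing by 3. The molecule is assumed centered at the origin and oriented.
    """
    coords = np.asarray(coords, dtype=float)
    return (np.abs(coords).max(axis=0) + margin) / 3.0

def theoretical_hydrophobicity(coords, sigma, center=(0.0, 0.0, 0.0)):
    """Normalized 3D Gaussian evaluated at each effective-atom position."""
    coords = np.asarray(coords, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = coords - np.asarray(center, dtype=float)
    # Product of three 1D Gaussians equals exp of the summed exponents.
    ht = np.exp(-(d**2 / (2.0 * sigma**2)).sum(axis=1))
    return ht / ht.sum()  # division by the aggregate sum normalizes the profile
```

In this sketch a residue at the center of the ellipsoid always receives the highest theoretical value, and values decay symmetrically along each axis.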

2.3. Do real-world proteins actually follow the presented distribution?

The answer to this question requires quantitative analysis of the similarities/differences between both distributions based on Kullback–Leibler’s entropy criterion (Kullback & Leibler, 1951):

$$D_{KL}\left(p \,\|\, p^0\right) = \sum_{i=1}^{N} p_i \log_2\left(p_i / p_i^0\right)$$

The value of $D_{KL}$ expresses the distance between the empirical ($p$) and target ($p^0$) distributions. In our case, the target distribution is supplied by the 3D Gaussian. According to the definition, $D_{KL}$ is a measure of entropy and thus cannot be interpreted on its own. This is why a second, independent reference distribution is required: one in which the aggregation of hydrophobicity near the center of the molecule is absent. In this so-called unified distribution, each residue is assigned a hydrophobicity density value of 1/N, where N is the number of residues in the polypeptide chain. To simplify matters, we can introduce the following notation:

$$O/T = \sum_{i=1}^{N} O_i \log_2\left(O_i / T_i\right)$$

$$O/R = \sum_{i=1}^{N} O_i \log_2\left(O_i / R_i\right)$$

where O/T is the difference (distance) between the observed and theoretical distributions, while O/R is the corresponding difference (distance) between the observed distribution and a distribution in which each residue carries the same hydrophobicity density value (the hydrophobic core absent). Comparing O/T and O/R profiles reveals the “closeness” between the observed and theoretical distributions for a given protein. A binary predicate can be adopted at this stage: O/T < O/R indicates the presence of a hydrophobic core. Quantitative analysis is also possible, leading to a ranked list of proteins which expresses their adherence to the idealized core model. This measure of “closeness” can be expressed by the following formula:

$$RD = \frac{O/T}{O/T + O/R} \tag{8.1}$$

RD stands for relative distance between the observed and theoretical distributions. Clearly, the lower the value of RD, the more closely a given protein approximates the theoretical optimum. Figure 8.1 provides a graphical depiction of this relationship.
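The O/T, O/R, and RD quantities can be computed directly from two profiles. The Python sketch below is a hedged illustration: the function names `kl_distance` and `relative_distance` are hypothetical, terms with a zero observed value are skipped (the usual convention that 0·log 0 = 0), and the theoretical profile is assumed to be strictly positive, which holds for a Gaussian.

```python
import numpy as np

def kl_distance(p, q):
    """Kullback-Leibler distance sum_i p_i * log2(p_i / q_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p_i = 0 contribute nothing
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def relative_distance(o, t):
    """Return (O/T, O/R, RD) for observed and theoretical profiles."""
    o = np.asarray(o, dtype=float)
    t = np.asarray(t, dtype=float)
    o = o / o.sum()                      # normalize observed profile
    t = t / t.sum()                      # normalize theoretical profile
    r = np.full(len(o), 1.0 / len(o))    # unified profile: 1/N per residue
    o_t = kl_distance(o, t)
    o_r = kl_distance(o, r)
    denom = o_t + o_r
    rd = o_t / denom if denom > 0 else float("nan")  # undefined if O = T = R
    return o_t, o_r, rd

# Example: an observed profile identical to the theoretical one gives RD = 0.
o_t, o_r, rd = relative_distance([0.1, 0.8, 0.1], [0.1, 0.8, 0.1])
```

Since both distances are nonnegative, RD below 0.5 is equivalent to the binary predicate O/T < O/R.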

[Figure 8.1 image: three panels labeled T, O, and R placed along a horizontal axis, “Distance versus idealized distribution,” running from 0 to 1.]

Figure 8.1 Graphical representation of the theoretical versus observed distribution, shown using a linear scale. The horizontal axis expresses the relative distance (RD; cf. Eq. (8.1)) between the idealized 3D Gaussian (leftmost image) and the actual hydrophobicity profile. The right-hand image presents a distribution which does not include a hydrophobic core. The pink dot tagged “O” represents the placement of an arbitrary empirical (observed) distribution which is shown in the central image.

We can also suspect that the status of some residues may have been affected by external factors, such as interaction with ligands or other proteins (Marchewka, Jurkowski, Banach, & Roterman, 2013). As already mentioned, the O/T versus O/R relation is the primary method of determining whether the structure in question contains a well-ordered hydrophobic core. Table 8.3 lists O/T and O/R values for each structural element (complex, chain, or domain) separately. It also provides information regarding the influence of external chains, ligands, or ions. Assessing this influence requires us to calculate O/T and O/R values with and without residues involved in external interactions (as it is assumed that the presence of a ligand may distort hydrophobicity density distribution in the target protein) (Brylinski, Konieczny, & Roterman, 2007). This, in turn, means that T, O, and R values need to be renormalized following elimination of ligand-binding residues, while the other coefficients of the 3D Gaussian remain the same. For each structural unit, the Gaussian is computed separately. Determining O/T and O/R for fragments (parts) of a structural unit calls for normalization of T and O values and subsequent calculation of O/T and O/R, which express the participation of the eliminated fragment in shaping the unit’s hydrophobic core. Values obtained for units from which active residues have been eliminated reflect the degree to which such “external” activity (ligand binding, complexation, SS bonds, etc.) distorts the core. However, proteins belonging to the group of downhill proteins appear to have a hydrophobic core structure highly accordant with the idealized one (Banach, Prymula, Jurkowski, Konieczny, & Roterman, 2012).


2.4. Two ways to determine O/T and O/R values

The values of DKL, both for O/T and O/R, can be calculated for each structural unit separately. Possible units include individual domains, entire chains, and protein complexes. Depending on the size of the unit, the encapsulating “drop” must be appropriately selected by adjusting its σ coefficients (σx, σy, σz). The O/T versus O/R relation (or, correspondingly, the value of RD) identifies the status of each unit with respect to the theoretical hydrophobicity density distribution.

Another way to calculate O/T and O/R (as well as RD), used when trying to identify the structural role of each fragment, is as follows. A specific fragment (for instance, a fragment listed in DisProt) is selected from the hydrophobicity density distribution profile. T and R values representing this fragment are removed from the profile, and the remaining values are again normalized (ensuring that the aggregate total of T and of R remains equal to 1). For this new chain, O/TF and O/RF values are calculated, along with RDF. The F subscript indicates that we are dealing with a truncated (fragmented) molecule. If, following removal of this fragment, the inequality between O/T and O/R flips (e.g., O/T > O/R while O/TF < O/RF), we can conclude that the excised fragment is indeed responsible for the discordance between the given structural unit and its corresponding theoretical representation.

Similar analysis can be performed for the selected fragment itself. Following normalization, T and O values are fed into the O/TFR and O/RFR (as well as RDFR) formulae to determine the hydrophobicity density distribution status of the fragment (FR stands for fragment). This status can be compared to the corresponding status of the entire structural unit, such as a complex (if one exists), a chain, or a domain (if the given complex and/or chain can be subdivided into domains).
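The fragment-based procedure can be sketched as follows. This Python snippet is an illustration under assumptions: the function names are hypothetical, fragment bounds are 0-based and inclusive, and the observed profile is renormalized together with T and R after the fragment is excised.

```python
import numpy as np

def _kl(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def _status(o, t):
    """Return (O/T, O/R, RD) after renormalizing the supplied profiles."""
    o = o / o.sum()
    t = t / t.sum()
    r = np.full(len(o), 1.0 / len(o))  # unified profile for the reduced set
    o_t, o_r = _kl(o, t), _kl(o, r)
    denom = o_t + o_r
    return o_t, o_r, (o_t / denom if denom > 0 else float("nan"))

def status_without_fragment(o, t, start, end):
    """O/TF, O/RF, RDF: the unit with residues start..end (inclusive) removed."""
    o = np.asarray(o, dtype=float)
    t = np.asarray(t, dtype=float)
    keep = np.ones(len(o), dtype=bool)
    keep[start:end + 1] = False
    return _status(o[keep], t[keep])

def status_of_fragment(o, t, start, end):
    """O/TFR, O/RFR, RDFR: the excised fragment treated on its own."""
    o = np.asarray(o, dtype=float)
    t = np.asarray(t, dtype=float)
    return _status(o[start:end + 1], t[start:end + 1])

# Toy profiles: the observed values deviate from theory only in residues 3..4.
t = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
o = np.array([0.1, 0.2, 0.4, 0.05, 0.25])
whole = _status(o, t)
truncated = status_without_fragment(o, t, 3, 4)
fragment = status_of_fragment(o, t, 3, 4)
```

In this toy case, removing the deviating fragment brings O/TF to zero, while the fragment on its own has RDFR above 0.5, which mirrors the reasoning used to attribute discordance to an excised fragment.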
Applying this algorithm to each fragment listed in the DisProt database yields information regarding its status with respect to the “fuzzy oil drop” model. Identification of residues engaged in ligand binding or protein complexation was performed using the PDBSum criteria, which depend on the distance between interacting molecules (Laskowski, 2009). Analysis of protein structures with respect to the “fuzzy oil drop” model revealed that the presence of a cavity (ligand binding, enzymatic cavity, protein–protein interaction area) significantly influences the form of the protein body (Banach, Konieczny, & Roterman, 2012a, 2012b, 2013; Prymula, Jadczyk, & Roterman, 2011). This is why O/T and O/R for the proteins under consideration were also calculated for molecules from which residues engaged in intermolecular interaction had been eliminated.

2.5. Internal force field

Internal interactions in the protein molecule were calculated to enable comparison between the hydrophobicity density distribution and the nonbonding interaction density distribution (Marchewka, Banach, & Roterman, 2011). The following procedure was carried out:
1. the structure of the protein (as listed in PDB) was subject to energy minimization (EM) in order to eliminate steric clashes (e.g., resulting from inclusion of hydrogen atoms, which are not present in the protein’s crystal form);
2. for each amino acid residue (and its constituent atoms), interactions with the remaining part of the molecule were calculated;
3. separate computations were carried out for electrostatic and van der Waals interactions;
4. the Gromacs program was used to perform the energy optimization and crystal relaxation procedure.

All EM calculations were conducted with the use of the Gromacs software package v4.0.3 and the Gromos96 43a1 force field (Berendsen, Postma, van Gunsteren, & Hermans, 1981; Berendsen, van der Spoel, & van Drunen, 1995; Lin & van Gunsteren, 2013; Lindahl, Hess, & van der Spoel, 2001; van der Spoel et al., 2005, 1995). The grouping option was used to tag each residue as a separate “group” interacting with the rest of the protein body. The interaction between each residue and the rest of the molecule was calculated in order to attribute to each residue its interaction with its local surroundings; this interaction was concentrated at the position of the effective atom to enable comparison with the hydrophobicity density distribution. Both electrostatic and vdW interactions were taken into consideration. The electrostatic and vdW interaction densities were normalized so that the Kullback–Leibler distance entropy could be calculated and compared with that of the hydrophobicity density distribution in the protein molecule under consideration (Banach, Marchewka, Piwowar, & Roterman, 2012; Banach, Prymula, et al., 2012).


3. RESULTS

Applied to the set of DisProt proteins, the calculation methods introduced above provide a general overview of the characteristics of structural units and of intrinsically disordered fragments.

3.1. Summary of results

Table 8.1 classifies particular structural units (chains and domains) as accordant or discordant with respect to the “fuzzy oil drop” model. This classification indicates whether a hydrophobic core, as defined in the model, is present.

Table 8.1 Summary of DisProt fragments as they appear in appropriate structural unit, the status

Chain accordant, domain accordant:
2CV4(2–8), 1ECF(472–492), 3GZP(67–78), (125–139), 1ECF(471–492), 2B76(91–158), 1GUA(104–107), 1OLG_A(319–323), 1OS2(110–116), 1RJ7_A(312–317), 1RJ8_A(312–315), 1RRP_A(60–62), 2BZS(228–231), 1L0I(17–36), 1DDS(9–24), (63–72), (116–132), (142–150), 1CWX_A(1–45), 2KOG_A(55–76), 2BRZ(38–45), 2JV4(24–45), 1SS3_A(34–50), 1JK3(110–116), (146–157), 1RXR(169–189), (172–176), (178–187), (181–187), (202–206), 2NLN(42–71), (82–109), 1YYZ(204–222), 1RG7(16–22), 1L0H(258–267), 1KAO(60–63), 1LXL(28–80), 1KAO(62–63), 1AA9(57–64), (58–66), 1CRD(58–66), 121P(59–72), 1GHZ(183–188), (224–227), 1SVA_1(15–89), 1SRY(258–267), 1TBA(11–17)

Chain discordant, domain accordant:
2IS2(495–564), 2WB0(401–406), 1RRP_A(108–109), 1FAQ_A(136–138), 1FAQ_A(185–187)

Chain discordant, domain discordant:
3C66(80–105), 1GME(1–42), 1ECF(73–84), 1SVA_1(90–107), (297–301), (302–330), (331–341), (342–362), 1CRD(30–38), 1HPW_A(35–36), 1EOT_A(1–8), 1HPW_A(35–36), 2JU4(74–87), 1G2S_A(1–9), 1HRT_I(50–65), 1JSU_C(22–34), 1OLG_A(357–360), 1AA9_A(31–39), 1AA9_A(31–39), 1OS2_A(146–157), 2BZS_A(1–4), 2PTL_A(1–17), 1KAO(45–55), (32–36), 1YDV(66–80), (165–178), 2KOG_A(1–35), (36–54), (77–88)


Table 8.1 lists the results obtained for each fragment from the DisProt database. Several cases merit further interpretation. For instance, 2KOG is a membrane protein whose shape is far from globular. It seems that the “fuzzy oil drop” model does not adequately reflect its structural properties. Proteins labeled 2PTL and 1GME are globular but have an outstretched “arm” which causes them to diverge from the model. Many other proteins, such as 1TEW (as well as the small 1EOT protein with two SS bonds), are rich in disulfide bonds which impose additional structural constraints and distort the shape of the molecule in relation to the model.

As an example of the applied methodology, the protein 1QO9 was arbitrarily selected (Harel et al., 2000). The analysis of 1QO9, a hydrolase (E.C.3.1.1.7): acetylcholinesterase from Drosophila melanogaster in complex with two inhibitor proteins, is presented in detail. The status of the 1QO9 chain can be represented by its RD value, which is equal to 0.63. The DisProt fragment (142–173), described in the database as “flexible linkers/spacers,” has an RD value of 0.74. Figure 8.2 (bottom) presents the O/T and O/R profiles calculated using a 20 aa open reading frame. As can be seen, only the beginning and the end of the chain (as well as a single fragment near position 350) satisfy O/T < O/R. The DisProt fragment, tagged light blue, diverges from the theoretical hydrophobicity density distribution model, with the exception of its C-terminal fragment and some short intermediate sequences. The localization of the DisProt fragment, which occupies a rather central position in the protein body of the 1QO9 molecule, is shown in 3D in Fig. 8.3.

Do any other proteins follow the idealized hydrophobicity density distribution? As a matter of fact, yes: such proteins are relatively easy to identify in PDB. The answer to this question depends, however, on the structural unit for which accordance is measured.
The presented work is an attempt at addressing this question in the scope of DisProt fragments. Our study set contains accordant proteins, as well as proteins in which disorganized fragments diverge from the model. As a result, we have decided to perform further analysis for each group separately.

3.2. Selected examples

Individual proteins were selected to illustrate the different statuses that proteins may assume. Proteins with varied structural and functional profiles were chosen in order to verify the applicability of the “fuzzy oil drop” model to DisProt analysis.


Figure 8.2 Visual representation of the 1QO9-A chain profile. Top diagram: values of T (dark blue rhombs), O (pink squares), and R (light blue continuous line), showing to what extent the observed O distribution resembles the R or T distribution. The black bar on the horizontal axis identifies the disordered fragment (according to the DisProt database). Bottom diagram: O/T and O/R profiles calculated for fragments using a 20 aa open reading frame. Light blue marks (and the black fragment of the X-axis) correspond to the DisProt fragment. Both diagrams indicate high similarity between the observed distribution (O) and the random distribution (R), which also applies to the DisProt fragment.
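Windowed O/T and O/R profiles of the kind plotted in the bottom diagram of Figure 8.2 could be produced along the following lines. This Python sketch rests on assumptions that the text does not spell out: the function name `windowed_profile` is hypothetical, the 20-residue frame is slid along the chain one residue at a time, and the O and T values inside each frame are renormalized to sum to 1 before the two distances are computed.

```python
import numpy as np

def _kl(p, q):
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def windowed_profile(o, t, window=20):
    """Return a list of (start_index, O/T, O/R) for each frame position.

    Assumptions (ours): frames advance by one residue, indices are 0-based,
    and every frame is renormalized independently.
    """
    o = np.asarray(o, dtype=float)
    t = np.asarray(t, dtype=float)
    if len(o) != len(t) or len(o) < window:
        raise ValueError("profiles must have equal length >= window")
    rows = []
    for start in range(len(o) - window + 1):
        seg_o = o[start:start + window]
        seg_t = t[start:start + window]
        seg_o = seg_o / seg_o.sum()
        seg_t = seg_t / seg_t.sum()
        seg_r = np.full(window, 1.0 / window)  # unified profile inside the frame
        rows.append((start, _kl(seg_o, seg_t), _kl(seg_o, seg_r)))
    return rows
```

Frames in which the second value is smaller than the third satisfy O/T < O/R locally, which is the condition read off the plot for the chain termini.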

3.2.1 Varied hydrophobicity distribution status of DisProt fragments: 2JU4

2JU4 is the gamma subunit domain of cGMP phosphodiesterase (retinal rod rhodopsin-sensitive cGMP 3´,5´-cyclic phosphodiesterase subunit gamma from Bos taurus (cattle) retina; Song et al., 2008). The domain under consideration does not contain enzymatically active residues. In addition, the crystal structure of this protein contains a number of ligands (six to be exact: three monomers, two dimers, and a trimer designated RCY).


Figure 8.3 The 3D presentation of 1QO9 with DisProt fragment distinguished in red (balls).

The retinal phosphodiesterase (PDE6) inhibitory gamma subunit (PDEgamma) plays a central role in vertebrate phototransduction by alternate interactions with the catalytic alphabeta subunits of PDE6 and the alpha subunit of transducin (alpha(t)). In-depth analysis of its structure (using NMR) suggests a high degree of intrinsic disorder. NMR scans also point to high structural variability of the 24–45 and 74–87 fragments (listed in the DisProt database). This is further confirmed by analysis of 100 candidate structures for 2JU4 in PDB. The structure of 2JU4, from the point of view of Φ and Ψ angles (Ramachandran plot), does not indicate the presence of ordered secondary structural fragments (see the Ramachandran plot in the PDBSum database).

Structural analysis of this domain based on the “fuzzy oil drop” model reveals the presence of a hydrophobic core: the domain satisfies O/T < O/R (0.216 and 0.244, respectively). The intrinsically disordered fragments listed in DisProt (24–45 and 74–87) possess RD values of 0.49 and 0.54, respectively. Analysis of the 1–23 fragment, which is only loosely integrated with the rest of the molecule, produces an RD value of 0.53, which means that this fragment does not contribute to hydrophobic core formation. Following elimination of ligand-binding residues, the remainder of the chain has an RD value of 0.50 (Fig. 8.4). Based on O/T and O/R analysis, the entirety of the polypeptide chain (84 amino acids) seems to exhibit an ordered distribution of hydrophobicity density, approximating the 3D Gaussian. Differences in the status of disordered fragments (24–45, accordant; 74–87, discordant) may indicate variable stability. As one of these fragments, along with the molecule as a whole, exhibits accordance with the model, it can be assumed that 2JU4 does indeed contain a hydrophobic core, as predicted by the “fuzzy oil drop” model.

Regarding structural variability, DisProt mentions that the “function (of this domain) arises via a disorder to order transition,” which seems to be driven by hydrophobic interactions. Plotting the distribution of hydrophobicity density in 2JU4 reveals greater accordance between the theoretical and observed values for the 24–45 fragment than for the 74–87 fragment. Figure 8.5 shows the 3D presentation of 2JU4, highlighting the status of intrinsically disordered fragments. Shades of gray indicate the concentration of hydrophobicity, which is greatest near the center of the ellipsoid and close to zero on its surface. Highlighted fragments also illustrate the “fuzzy oil drop” model, which not only assumes the existence of a hydrophobic core but also predicts a hydrophobicity gradient through which peripheral fragments are capable of shielding the core from entropically disadvantageous contact with water. In this sense, the presented fragments are also consistent with the model. Highly dynamic fragments which undergo significant structural modifications as a result of their function are also capable of reverting to their original ordered form (this is especially true of the 24–45 fragment, shown as red balls in Fig. 8.5).

Figure 8.4 Hydrophobicity density distribution in 2JU4: T, theoretical (pink squares); O, observed (gray triangles); R, unified (no hydrophobic core; horizontal line). Dark blue marks on the horizontal axis denote intrinsically disordered fragments.

Figure 8.5 3D representation of 2JU4 in relation to the “fuzzy oil drop” model (different perspectives). The red fragment (balls) is structurally consistent with the “fuzzy oil drop” model, while the black (sticks) fragment diverges from it. Grayscale saturation increases along with distance from the surface, corresponding to the hydrophobicity density gradient which peaks near the center of the ellipsoid. The ellipses visualize encapsulation of the molecule according to the 3D Gaussian.
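As a quick arithmetic check, the O/T and O/R values quoted for the 2JU4 domain (0.216 and 0.244) can be converted into a relative distance with Eq. (8.1). The short Python snippet below is only a worked example; the helper name `relative_distance` is hypothetical.

```python
def relative_distance(o_t, o_r):
    """RD = O/T divided by (O/T + O/R), as in Eq. (8.1)."""
    return o_t / (o_t + o_r)

# O/T and O/R for the 2JU4 domain, as quoted in the text.
rd_2ju4 = relative_distance(0.216, 0.244)
print(f"RD = {rd_2ju4:.2f}")  # prints "RD = 0.47"
```

A value below 0.5 corresponds to O/T < O/R, that is, to a domain classified as accordant with the model.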

3.2.2 Sample discordant protein: 1YDV The 1YDV homodimer represents an interesting study case. It is a triosephosphate isomerase E.C. 5.3.1.1. (source: Plasmodium falciparum—malaria parasite; Velanker et al., 1997). Each monomer contains two fragments which have been identified as intrinsically disordered. Of particular note is the presence of three enzymatically active residues in one of these fragments. Neither the dimer itself nor any of its disordered fragments (as given by DisProt) exhibit accordance with “fuzzy oil drop” model. However, such

330

Barbara Kalinowska et al.

Table 8.2 RD values for structural units in the 1YDV homodimer DisProt1 Structural unit Complex Chain 66–80

DisProt2 165–178

COMPLEX NO P–P NO ENZYM. CHAIN NO P–P NO ENZYM.

0.69

0.69

0.60

0.75

0.66 0.69

0.66 0.69

0.48 –

– 0.69

0.45

0.63

0.67

0.40 0.45

0.53 –

– 0.50

DisProt lists two intrinsically disordered fragments identified in this chain (identification according to DisProt database). Values printed in boldface indicate accordance with the hydrophobicity density distribution model.

accordance is observed for a single chain (treated as a structural unit) (Table 8.2). While interpreting the results shown in Table 8.2, it should be noted that only the 1YDV chain itself—as a structural unit—remains accordant with the assumed model. Neither DisProt fragment exhibits such accordance, regardless of the structural unit in which it is analyzed. The structure of a single isolated chain represents the structure with hydrophobic core (according to “fuzzy oil drop” model). Figure 8.6 visualizes the localization of DisProt fragments in 1YDV which appear to be localized in the protein–protein interface (homodimer). Comparison of RD values reveals the likely sequence of events involved in forming the presented homodimer. It seems that each chain folds on its own, reaching a hydrophobicity density distribution which is consistent with the model. Elimination of residues involved in enzymatic activity or protein–protein interaction does not alter this behavior. DisProt fragments, treated as elements of the complex as well as of individual chains, do not match the expected hydrophobicity distribution. Regarding DisProt1, elimination of residues involved in protein–protein interactions renders this fragment accordant with the model. However, DisProt2 remains discordant even when all residues involved in p–p interactions and enzymatic activity are eliminated. 3.2.3 2TPI as an example of a protein of highly differentiated structure To further illustrate practical use of the “fuzzy oil drop” model, we will refer to the protein complex composed of E.C.3.4.21.4 hydrolase—Trypsin


Figure 8.6 The 3D presentation of 1YDV dimer with DisProt fragments distinguished by red balls. The position in the protein–protein interface is visualized.

(Z chain) and its inhibitor (I chain) (PDB ID: 2TPI; Walter et al., 1982). This protein is a very interesting study subject in the context of the relation between its secondary structural domains (two of which can be distinguished within the Z chain, which facilitates enzymatic activity) and the hydrophobic core structure. It should be noted that 2TPI is a complex (meaning that the influence of the complexed protein can be studied), contains disulfide bonds (which can affect the core in measurable ways), includes a ligand (i.e., the ILE–VAL dipeptide), contains a mercury ion (likely not associated with biological activity), and, finally, contains disordered fragments in its Z and I chains. The “fuzzy oil drop” model can be applied to various structural units in 2TPI: the complex, chains, or domains. For each of these entities, a separate “drop” is defined by establishing its volume and location. Additionally, the model permits quantitative analysis of the involvement of each fragment in shaping the common hydrophobic core. When determining the input of each fragment (or even of individual residues), a separate “drop” is not necessary; rather, the residues corresponding to the fragment in question are eliminated from O, T, and R calculations, and the resulting profile is renormalized. Computing O/T and O/R yields the status of the remainder of the initial molecule. If the inequality flips (from O/T > O/R to O/T < O/R), we can surmise that the eliminated residues cause the chain to diverge from the model. The presented case study involves a trypsin


inhibitor complex designated 2TPI. Its Z chain is known to contain an enzymatic active site (E.C.3.4.21.4, trypsin, with preferential cleavage at Arg-|-Xaa, Lys-|-Xaa), while the I chain is a basic protease inhibitor (aprotinin). The structure of the resulting complex enables us to study various factors which affect hydrophobicity density distribution. Our analysis focuses on two chains, one of which (labeled Z) comprises two domains: domain 1 (19–27 and 121–233) and domain 2 (28–120 and 233–245). The remaining chain (I) consists of a single domain. Each of the external factors (ligand binding, protein–protein interaction, enzymatic activity, and SS bonds) can be studied separately to determine its influence upon the final structure of 2TPI. 2TPI is listed in the database of protein disorder (http://www.disprot.org, accessed April 20, 2013). Its Z chain includes four disordered fragments: 18–27, 137–147, 179–190, and 209–216. The shared DisProt identifier of these fragments is DP00728. Regardless of the DisProt classification, our analysis of disordered structures also covers the following three fragments in the I chain: 8–18, 24–28, and 35–47. These fragments have been selected on the basis of subjective visual assessment. The status of each disordered fragment in relation to the “fuzzy oil drop” model has been computed for each structural unit (complex, chain, or domain) separately. Results (specifically, the relation between O/T and O/R for individual units) are listed in Table 8.1.

3.2.3.1 Complex

The complex as a whole does not appear to contain a shared hydrophobic core (as evidenced by the O/T > O/R relation) (Table 8.3). The Z chain in the complex has been identified as having significant influence upon the formation of a common hydrophobic core. Its domain 2 is consistent with the theoretical hydrophobicity distribution model. Three out of four disordered fragments also appear consistent with the model (applied to the complex as a whole). Regarding chain I, its disordered fragments seem to match the expected distribution of hydrophobicity throughout the complex. One disordered fragment in chain Z diverges from the model. Eliminating residues involved in external interactions (ligand binding, inhibitor complexation, enzymatic activity, disulfide bonds) does not change the status of the complex, implying that such residues do not affect the common hydrophobic core.
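The elimination procedure used throughout this section (drop the residues in question, renormalize the remaining profile, recompute O/T and O/R) can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that O/T and O/R are Kullback-Leibler divergences (base-2 logarithm) of the observed profile O from the theoretical profile T and from the uniform profile R, and that RD = (O/T) / ((O/T) + (O/R)), which is consistent with the tabulated values. The function names and the seven-residue toy profiles are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence of p from q (base 2); zero terms in p are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def normalize(values):
    """Rescale values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def fod_status(observed, theoretical, excluded=()):
    """Return (O/T, O/R, RD) for the residues left after removing `excluded` indices."""
    dropped = set(excluded)
    keep = [i for i in range(len(observed)) if i not in dropped]
    o = normalize([observed[i] for i in keep])      # renormalized observed profile
    t = normalize([theoretical[i] for i in keep])   # renormalized theoretical profile
    r = [1.0 / len(keep)] * len(keep)               # uniform reference profile
    o_t = kl_divergence(o, t)
    o_r = kl_divergence(o, r)
    return o_t, o_r, o_t / (o_t + o_r)


# Hypothetical toy profiles: residue 0 carries far more hydrophobicity
# than the idealized distribution expects at its position.
T = [0.02, 0.10, 0.25, 0.30, 0.20, 0.10, 0.03]
O = [0.30, 0.07, 0.17, 0.21, 0.14, 0.09, 0.02]

full = fod_status(O, T)
trimmed = fod_status(O, T, excluded=[0])
print("all residues:       O/T > O/R is", full[0] > full[1])
print("residue 0 removed:  O/T < O/R is", trimmed[0] < trimmed[1])
```

In this toy case the inequality flips once residue 0 is removed, which is the signal the text interprets as the eliminated residues causing the divergence from the model.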


Table 8.3 O/T and O/R values calculated for each fragment under the assumption that the structural unit (in the sense of the “fuzzy oil drop” model) is the entire complex of 2TPI

Struc/Func | Residues excluded (or residues present) | O/T | O/R | RD
COMPLEX | | 0.199 | 0.154 | 0.56
NO LIGAND | Z: 19, 142–144, 156–159, 187–189, 194, 221 | 0.206 | 0.159 | 0.56
NO ENZYM. | Z: 57, 102, 193, 195, 196 | 0.199 | 0.154 | 0.56
NO P–P | Z: 39–42, 57, 151, 189–193, 195, 214–216, 226. I: 11–19, 36–39 | 0.198 | 0.156 | 0.56
NO ION | Z: 145, 146, 191, 220 | 0.202 | 0.153 | 0.57
NO SS bonds | Z: 22;157, 42;58, 128;232, 136;201, 168;182. I: 5;55, 14;38, 30;51 | 0.180 | 0.142 | 0.56
NO L-P-P-I | Z: 19, 142–144, 156–159, 187–189, 194, 221 / 39–42, 57, 151, 189–193, 195, 214–216, 226 / 145, 146, 191, 220. I: 5;55, 14;38, 30;51 | 0.209 | 0.158 | 0.57
NO L-P-PI-SS | Z: 19, 142–144, 156–159, 187–189, 194, 221 / 39–42, 57, 151, 189–193, 195, 214–216, 226 / 145, 146, 191, 220 / 22;157, 42;58, 128;232, 136;201, 168;182. I: 5;55, 14;38, 30;51 | 0.183 | 0.149 | 0.55
NO DisProt1 | 19–27 | 0.213 | 0.158 | 0.57
CHAIN Z | 19–245 | 0.135 | 0.141 | 0.49
CHAIN I | 2–58 | 0.460 | 0.209 | 0.69
DOMAIN 1 (Z) | 19–27, 121–233 | 0.138 | 0.136 | 0.50
DOMAIN 2 (Z) | 28–120, 233–245 | 0.130 | 0.141 | 0.48
Z DisProt 1 | 19–27 | 0.093 | 0.078 | 0.54
Z DisProt 2 | 137–147 | 0.021 | 0.083 | 0.20
Z DisProt 3 | 179–190 | 0.149 | 0.177 | 0.46
Z DisProt 4 | 209–216 | 0.021 | 0.039 | 0.35
I Disorder | 8–18 | 0.089 | 0.193 | 0.31
I Disorder | 24–28 | 0.125 | 0.153 | 0.45
I Disorder | 35–47 | 0.157 | 0.345 | 0.31

Rows labeled “NO XX” list O/T and O/R values for a common “oil drop” encapsulating the entire complex, without residues involved in activity XX. Chains, domains, and fragments labeled “DisProt” or “Disorder” are characterized by O/T and O/R values which reflect their contribution to the common hydrophobic core for the entire complex. Values listed in boldface represent hydrophobicity density distribution accordant with the theoretical model. It should be noted that the discussed fragments in chain I (disordered) are selected on the basis of subjective visual analysis (not present in the DisProt database). The numbers of residues in italics represent residues present in calculation.
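As a quick arithmetic check on the table above, the RD column can be recomputed from the O/T and O/R columns. The sketch below assumes RD = (O/T) / ((O/T) + (O/R)); this definition is not restated in this section, but it reproduces the printed values to within rounding. The row labels and the helper names are illustrative only.

```python
def relative_distance(o_t, o_r):
    """RD as the share of O/T in the sum O/T + O/R (assumed definition)."""
    return o_t / (o_t + o_r)


def is_accordant(o_t, o_r):
    """A unit is treated as accordant with the model when O/T < O/R."""
    return o_t < o_r


# (O/T, O/R, printed RD) copied from selected rows of Table 8.3.
ROWS_8_3 = {
    "COMPLEX": (0.199, 0.154, 0.56),
    "CHAIN Z": (0.135, 0.141, 0.49),
    "CHAIN I": (0.460, 0.209, 0.69),
    "Z DisProt 1 (19-27)": (0.093, 0.078, 0.54),
    "Z DisProt 2 (137-147)": (0.021, 0.083, 0.20),
}

for label, (o_t, o_r, printed) in ROWS_8_3.items():
    rd = relative_distance(o_t, o_r)
    verdict = "accordant" if is_accordant(o_t, o_r) else "discordant"
    print(f"{label:24s} RD = {rd:.2f} (printed {printed:.2f}), {verdict}")
```

Note that RD < 0.5 is equivalent to O/T < O/R, so the RD column and the inequality used in the text carry the same information.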

3.2.3.2 Chains

Chain Z treated as a separate structural unit satisfies O/T > O/R, which, according to the model, indicates the lack of a hydrophobic core. In contrast, when analyzed as part of the complex, both Z chain domains remain accordant with the model (Table 8.4). Elimination of residues involved in enzymatic activity and inhibitor complexation renders the remainder of the chain accordant with the model. Other types of interactions (with ligands and ions, as well as participation in disulfide bonds) do not result in significant deformations within chain Z. Regarding chain Z, three out of four disordered fragments assume conformations consistent with the expected hydrophobicity density gradient. Similar to the entire complex, the first disordered fragment remains discordant.

Chain I: The I chain, treated as a separate structural unit, also does not appear to contain a well-ordered hydrophobic core since it satisfies O/T > O/R (note, however, that eliminating residues involved in enzyme complexation renders the chain accordant with the model). Of the three fragments arbitrarily deemed disordered, only one exhibits the expected hydrophobicity density gradient (as long as the I chain is treated as a separate structural unit).

3.2.3.3 Domains

The summary characteristics of domains present in 2TPI are shown in Table 8.5 together with the calculations of domains deprived of residues


Table 8.4 O/T and O/R values calculated for each chain of 2TPI individually

Struc/Func | Residues excluded (or residues present) | O/T | O/R | RD
CHAIN Z | 19–245 | 0.141 | 0.139 | 0.50
NO LIGAND | 19, 138, 140, 142, 143, 144, 156, 157, 158, 187, 188, 189, 194, 221 | 0.144 | 0.144 | 0.50
NO ENZYMATIC | 57, 102, 193, 195, 196 | 0.138 | 0.140 | 0.49
NO P–P | 39–42, 57, 97, 99, 151, 189–195, 214–216, 226 | 0.137 | 0.142 | 0.49
NO ION | 145, 146, 191, 220 | 0.142 | 0.139 | 0.50
NO SS bonds | 22;157, 42;58, 128;232, 136;201, 168;182 | 0.137 | 0.131 | 0.51
DOMAIN 1 | 19–27, 121–232 | 0.121 | 0.146 | 0.45
DOMAIN 2 | 28–120, 233–245 | 0.111 | 0.146 | 0.43
NO–DisProt | All residues of domain 2 with residues recognized as DisProt eliminated | 0.142 | 0.141 | 0.50
DisProt 1 | 19–27 | 0.087 | 0.078 | 0.53
DisProt 2 | 137–147 | 0.060 | 0.083 | 0.42
DisProt 3 | 179–190 | 0.152 | 0.183 | 0.45
DisProt 4 | 209–216 | 0.021 | 0.056 | 0.27
CHAIN I | 2–58 | 0.254 | 0.216 | 0.54
NO P–P | 9, 11–19, 36–39 | 0.207 | 0.214 | 0.49
NO SS | 5;55, 14;38, 30;51 | 0.231 | 0.187 | 0.55
Disorder 1 | 8–18 | 0.409 | 0.238 | 0.63
Disorder 2 | 24–28 | 0.099 | 0.154 | 0.39
Disorder 3 | 35–47 | 0.487 | 0.339 | 0.59

Rows labeled “NO XX” list O/T and O/R values for an “oil drop” encapsulating the chain, without residues involved in activity XX. Chains, domains, and fragments labeled “DisProt” or “Disorder” are characterized by O/T and O/R values which reflect their contribution to the common hydrophobic core for the given chain. Values listed in boldface represent hydrophobicity density distribution accordant with the theoretical model.


engaged in chain–chain interaction, ligand binding, enzymatic activity, and ion binding. The influence of Cys residues engaged in disulfide bond formation is also discussed (Table 8.5). Analysis of individual domains in the Z chain (where each domain is treated as a separate structural unit for the purposes of drop encapsulation) reveals a high concentration of hydrophobicity density near the center of each domain. Hydrophobicity density decreases with distance from the center, reaching near-zero values on the surface, as implied by the O/T < O/R relationship. Elimination of residues responsible for external interactions does not alter the overall status of the domain (in fact, it produces a distribution which matches theoretical values even more closely). Figure 8.7 provides a graphical representation of the 2TPI complex.
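The behavior described above (density highest at the center of the domain and near zero on the surface) is what a 3D Gaussian centered on the structural unit produces. The following sketch evaluates such a theoretical profile at a few residue positions. It is an illustration under stated assumptions: the unit is assumed to be already centered at the origin and aligned with the coordinate axes, the three dispersions are passed in as free parameters (how they are derived from the size of the unit is defined in the model section and not reproduced here), and the function name and coordinates are hypothetical.

```python
import math


def theoretical_profile(coords, sigmas):
    """Normalized 3D Gaussian values at residue positions (unit centered at the origin)."""
    sx, sy, sz = sigmas
    raw = [
        math.exp(-(x * x) / (2 * sx * sx)
                 - (y * y) / (2 * sy * sy)
                 - (z * z) / (2 * sz * sz))
        for x, y, z in coords
    ]
    total = sum(raw)
    return [value / total for value in raw]  # values sum to 1 over the unit


# Hypothetical residue positions (in angstroms): center, one sigma out, three sigma out.
positions = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
profile = theoretical_profile(positions, sigmas=(5.0, 5.0, 5.0))
for position, value in zip(positions, profile):
    print(position, round(value, 4))
```

At three dispersions from the center the expected value falls to roughly one percent of the central value, which corresponds to the near-zero surface hydrophobicity mentioned in the text.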

3.2.3.4 Suggested order of events in the formation of the complex

By analyzing the status of individual structural units (complex, chain, or domain), we can speculate about several stages in the formation of the 2TPI complex (enzyme + inhibitor). Assuming that the polypeptide chain, when placed in an aqueous environment, exhibits a natural tendency to form a hydrophobic core shielded by a hydrophilic mantle, we propose that:
1. The structures which emerge first are those characterized by structural stability owing to the presence of a hydrophobic core. The structural status of three (out of four) disordered fragments seems to be accordant with the status of the entire domain. Elimination of residues involved in external interactions does not affect this status.
2. The second stage involves formation (folding) of the protein chain, with participation of individual domains. This produces a specific catalytic pocket, as the participating domains do not significantly alter their shape and therefore do not form a shared hydrophobic core. The role of disordered fragments does not change at this stage.
3. Targeted external interactions lead to creation of an enzyme–inhibitor complex. This joint structure also does not contain a shared hydrophobic core.
The role of disulfide bonds is noteworthy, as their elimination does not affect hydrophobicity density distribution in 2TPI. The DisProt1 fragment does not follow the theoretical hydrophobicity density distribution model in any structural unit. It appears to be strongly

Table 8.5 O/T and O/R values calculated for each domain individually

Unit | Struc/Func | Residues excluded (or residues present) | O/T | O/R | RD
DOMAIN 1 | CHAIN (Z) | 19–27, 121–233 | 0.121 | 0.146 | 0.45
DOMAIN 1 | NO LIGAND | 19, 138, 142–144, 156–158, 187–189, 194, 221 | 0.115 | 0.150 | 0.43
DOMAIN 1 | NO ENZYMATIC | 193, 195, 196 | 0.124 | 0.149 | 0.45
DOMAIN 1 | NO P–P | 151, 189–195, 214–216, 226 | 0.107 | 0.154 | 0.41
DOMAIN 1 | NO ION | 145, 146, 191, 220 | 0.125 | 0.144 | 0.46
DOMAIN 1 | NO SS bonds | 22;157, 128;232, 136;201, 168;182 | 0.116 | 0.134 | 0.46
DOMAIN 1 | NODisProt | | 0.124 | 0.147 | 0.46
DOMAIN 1 | DisProt 1 | 19–27 | 0.224 | 0.159 | 0.58
DOMAIN 1 | DisProt 2 | 137–147 | 0.037 | 0.095 | 0.28
DOMAIN 1 | DisProt 3 | 179–190 | 0.131 | 0.211 | 0.38
DOMAIN 1 | DisProt 4 | 209–216 | 0.044 | 0.055 | 0.44
DOMAIN 2 | CHAIN (Z) | 28–120, 233–245 | 0.111 | 0.146 | 0.43
DOMAIN 2 | NO ENZYMATIC | 57, 102 | 0.112 | 0.148 | 0.43
DOMAIN 2 | NO P–P | 36–42, 57, 97, 99 | 0.110 | 0.145 | 0.43
DOMAIN 2 | NO SS bonds | 42;58 | 0.114 | 0.142 | 0.44

Rows labeled “NO XX” list O/T and O/R values for an “oil drop” encapsulating the domain, without residues involved in activity XX. Chains, domains, and fragments labeled “DisProt” (identified according to DisProt database) or “Disorder” (subjectively recognized) are characterized by O/T and O/R values which reflect their contribution to the common hydrophobic core for the given domain. Values listed in boldface represent hydrophobicity density distribution accordant with the theoretical model. The residues present in calculation are given in italics.
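Because the same DisProt fragments appear in Tables 8.3–8.5, their status can be compared across structural units. The sketch below collects the tabulated (O/T, O/R) pairs for the four Z chain fragments and reports the units in which each fragment is accordant (O/T < O/R). The dictionary layout and function name are hypothetical; the numbers are copied from the tables.

```python
# (O/T, O/R) for each DisProt fragment of the Z chain, keyed by the structural
# unit used to define the "drop": complex (Table 8.3), chain (Table 8.4),
# domain (Table 8.5).
FRAGMENT_STATUS = {
    "DisProt 1 (19-27)": {
        "complex": (0.093, 0.078), "chain": (0.087, 0.078), "domain": (0.224, 0.159)},
    "DisProt 2 (137-147)": {
        "complex": (0.021, 0.083), "chain": (0.060, 0.083), "domain": (0.037, 0.095)},
    "DisProt 3 (179-190)": {
        "complex": (0.149, 0.177), "chain": (0.152, 0.183), "domain": (0.131, 0.211)},
    "DisProt 4 (209-216)": {
        "complex": (0.021, 0.039), "chain": (0.021, 0.056), "domain": (0.044, 0.055)},
}


def accordant_units(per_unit):
    """Names of the structural units in which the fragment satisfies O/T < O/R."""
    return [unit for unit, (o_t, o_r) in per_unit.items() if o_t < o_r]


for fragment, per_unit in FRAGMENT_STATUS.items():
    units = accordant_units(per_unit)
    print(fragment, "accordant in:", ", ".join(units) if units else "no unit")

# Fragments that diverge from the model whichever unit is chosen.
never_accordant = [name for name, per_unit in FRAGMENT_STATUS.items()
                   if not accordant_units(per_unit)]
```

With the tabulated values, only the first fragment is discordant in every unit, while the other three are accordant in the complex, the chain, and the domain alike.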


[Figure 8.7: three panels (A, B, C). In each panel, structural units are placed along a horizontal axis labeled “Distance versus idealized distribution”, which runs between T and R.]

Figure 8.7 Status of individual structural units in the 2TPI complex. (A) The Z chain (pink dot on the axis) and its structure following elimination of residues responsible for ligand binding (L), enzymatic activity (E), complexation of the I chain (P), and forming disulfide bonds (SS), as well as domains 1 and 2, respectively. (B) Participation of disordered fragments in generating a distribution accordant with the model (numbering matches Table 8.5). The Z chain has been highlighted by a pink dot. (C) Participation of structural fragments in generating the “oil drop” for the entire complex. Labels: a chain name followed by DP indicates a disordered fragment, as listed in Table 8.5. Numbers followed by D indicate Z chain domains. Equation (8.1) has been applied to construct the summary chart presented in this figure.

implicated in structural changes associated with the protein’s biological role, as stated in the DisProt database (based on experimental data).

3.2.3.5 Internal force field

The hydrophobic core structure (which is understood as an area characterized by higher than average hydrophobicity density) could potentially result from dense packing of residues near the center of the molecule. In order to determine the relationship between residue packing and hydrophobicity density distribution, we will refer to other types of internal interactions, specifically, electrostatic and van der Waals forces.


Calculating O/T and O/R values provides a basis upon which to determine the distribution of internal force fields within the molecule. All electrostatic and van der Waals interactions were normalized (ensuring that their aggregate sum was equal to 1.0 in each case). O/T and O/R values were established in a similar manner to hydrophobicity density computations. The 2TPI molecule was used as a representative case study. Neither electrostatic nor van der Waals interactions are accordant with the theoretical model for any structural unit. This, however, should come as no surprise as the hydrophobicity density distribution in 2TPI also diverges from theoretical expectations. Regarding hydrophobicity density, the Z chain exhibits a well-defined hydrophobic core, whereas no such core can be distinguished in the scope of nonbinding interactions. High RD values (0.72 and 0.85 for electrostatic and van der Waals forces, respectively) suggest that neither interaction is significantly concentrated in the center of the domain. Similar results of electrostatic and van der Waals density calculations were obtained for a larger group of proteins, including proteins which contain a “fuzzy oil drop”-compliant hydrophobic core. This suggests that the “fuzzy oil drop” model applies exclusively to hydrophobicity density distribution. The observation may be of practical importance for in silico structure prediction algorithms. While pairwise optimization does not treat any part of the molecule preferentially (and thus tends to produce uniform distributions of the optimized quantity), hydrophobic interactions instead require a holistic model which differentiates parts of the protein body with respect to their placement.
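The link between uniform distributions and high RD values can be illustrated with a toy calculation. The sketch below normalizes per-residue interaction energies so that they sum to 1 and compares the result with a centrally peaked profile T and a uniform profile R. It rests on several assumptions that are not stated in the text: O/T and O/R are taken to be Kullback-Leibler divergences (base 2), RD = (O/T) / ((O/T) + (O/R)), and signed energies are converted to magnitudes before normalization. All names and numbers are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence of p from q (base 2); zero terms in p are skipped."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def rd_for_interaction(energies, theoretical):
    """RD of a normalized per-residue interaction profile against T and uniform R."""
    magnitudes = [abs(e) for e in energies]   # assumption: magnitudes, sign dropped
    total = sum(magnitudes)
    observed = [m / total for m in magnitudes]          # aggregate sum equals 1.0
    uniform = [1.0 / len(observed)] * len(observed)
    o_t = kl_divergence(observed, theoretical)
    o_r = kl_divergence(observed, uniform)
    return o_t / (o_t + o_r)


# Hypothetical six-residue unit with a centrally peaked theoretical profile.
T = [0.05, 0.15, 0.30, 0.30, 0.15, 0.05]

# Nearly equal per-residue energies, as pairwise optimization tends to produce.
flat_energies = [-4.1, -3.9, -4.0, -4.2, -3.8, -4.0]
# Energies that happen to follow the theoretical profile exactly.
peaked_energies = [-0.5, -1.5, -3.0, -3.0, -1.5, -0.5]

rd_flat = rd_for_interaction(flat_energies, T)
rd_peaked = rd_for_interaction(peaked_energies, T)
print("near-uniform profile: RD close to 1 ->", rd_flat > 0.9)
print("peaked profile:       RD below 0.5  ->", rd_peaked < 0.5)
```

A nearly uniform profile sits next to R on the T-to-R scale, which is the situation the high RD values reported for electrostatic and van der Waals interactions point to.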

4. DISCUSSION

Can a protein contain a hydrophobic core which is in perfect accordance with the theoretical profile, with all hydrophobic residues concentrated near the center and all hydrophilic residues exposed on the surface? Such a protein would possess two key characteristics: it would be highly water soluble (which is somewhat expected) but also incapable of interacting with other types of molecules, given its high preference for contact with water. Analysis of numerous proteins reveals that most of them depart from the idealized hydrophobicity density distribution. Areas where such departures are concentrated often mediate the protein’s biological activity. For example, ligand-binding pockets are typically associated with


hydrophobicity deficiencies (Brylinski, Kochanczyk, Broniatowska, & Roterman, 2007; Brylinski, Prymula, et al., 2007) while protein complexation sites frequently exhibit excess hydrophobicity. As a result, distortions in the hydrophobicity density distribution throughout the protein body appear to be a critical factor in ensuring that the protein remains biologically active—whether dissolved in water or confined to a hydrophobic environment (e.g., cellular membranes). Do any proteins follow the theoretical model with high accuracy? Yes, several such proteins can be found in databases. This group includes antifreeze (class II) proteins, which counteract growth of ice crystals and prevent them from damaging cellular structures. The biological role of antifreeze proteins requires them to be water soluble. In addition, they should not form clusters and instead need to be uniformly distributed in the aqueous environment, which explains their adherence to the “fuzzy oil drop” model (Banach, Prymula, et al., 2012). Another group which also exhibits high accordance with the theoretical model comprises the so-called fast-folding (downhill) proteins (Roterman, Konieczny, Jurkowski, Prymula, & Banach, 2011). In their case, high reversibility of folding (as determined experimentally) indicates that the structural influence of water is sufficient to guide the process in such a way as to ensure aggregation of hydrophobic residues near the center of the protein with simultaneous exposure of hydrophilic residues on the surface. The role of structural fragments identified as disordered seems to be related to the goal of stabilizing the molecule (or complex) as a whole. They should therefore be studied in the context of the structure in which they participate. Algorithms which attempt to assemble the protein’s native form from individual structural “building blocks” are accurate only in selected cases, as determined by the CASP initiative (Dill & MacCallum, 2012). 
“Bottom-up” models which seek minimal free energy conformations for short fragments have since been abandoned in favor of more holistic energy optimization approaches. The “fuzzy oil drop” model interprets the molecule as a whole, rather than a sum of its (locally optimized) parts. According to the model, optimization of hydrophobic interactions should not be performed in a pairwise fashion (in contrast to electrostatic and van der Waals interactions). Instead, hydrophobic forces should act upon the molecule as a whole, leading to formation of a hydrophobic core. In fact, pairwise optimization usually results


in a uniform distribution of the optimized quantity as it implicitly assumes that each part of the molecule tends to a local optimum regardless of its placement. In the “fuzzy oil drop” model, the concept of a “hydrophobic core” goes beyond aggregation of hydrophobic residues at the center of the molecule. In order to remain stable, the core must be shielded by a layer which prevents contact with water. This layer is characterized by a hydrophobicity gradient, resulting in near-zero hydrophobicity on the surface. The interplay between the core and its hydrophilic mantle ensures stability in a water environment. However, few molecules are expected to follow the idealized hydrophobicity density profile with perfect accuracy: such molecules would exhibit perfect solubility (owing to the unbroken mantle) but would also be unable to interact with any external molecules such as ligands, ions, substrates, or other proteins. It seems that by reaching an equilibrium between perfect adherence to the model and localized discrepancies, the protein can remain both soluble and capable of interactions. Such localized discrepancies express the link between hydrophobicity density distribution and the protein’s biological function. Variations in the status of polypeptide chain fragments point to their participation in stabilization (and perhaps also function) of the target protein, requiring dynamic adaptation to changing environmental conditions (such as the presence of a ligand). This model is further detailed in Hartman et al. (2013), where the authors propose certain assumptions with regard to disordered fragments. Analysis of fragments characterized by loose packing can be likened to tracing the “history” of the folding process. 
If the proposed model is correct and accurately reflects the tendency of the protein to develop a hydrophobic core as a result of interactions with water, we can determine the sequence of events involved in the folding process by studying the adherence of specific structural units to theoretical predictions. The role of disordered fragments in this process, as well as in structural stabilization of individual structural units (complexes, chains, or domains), seems clear. The presence of a water environment—which is ubiquitous in biological systems—seems to play an active part in shaping protein structures. This work proposes a method of analyzing the effects of this environment with regard to individual fragments of the polypeptide chain which undergoes folding. The “fuzzy oil drop” model reflects the global influence of water upon the function of proteins, forcing us to rethink the way in which protein


structure is defined. Traditional approaches view the protein as a geometric shape whose form can be described in terms of secondary and supersecondary structural motifs. By adopting additional criteria (not related to geometry), it becomes evident that the distribution of hydrophobicity density may substantially influence the protein in ways unrelated to its secondary, tertiary, or quaternary structure (Shanmugham et al., 2012; Vugmeyster et al., 2011; Wood et al., 2013; Zhang et al., 2012). As already mentioned, a protein chain exhibiting perfect adherence to the “fuzzy oil drop” model (as described by the 3D Gaussian) would be incapable of interacting with any molecules other than water. Local deformations in the hydrophobicity density distribution facilitate contact with ligands, substrates, and other complexation targets. The protein’s native structure, associated with its intended biological activity, is determined by juxtaposing two processes: optimization of internal interactions (producing secondary and supersecondary structural motifs) and environmental effects which lead to internalization of hydrophobic residues, along with exposure of hydrophilic residues on the surface. Both processes are somewhat contradictory: internal free energy optimization may counteract the entropic effects of water. Achieving a proper balance of these forces (expressed by ionic potentials or pH values) is necessary if the protein is to assume a stable and active form. For this reason, models based solely on internal free energy optimization do not produce adequate results, that is, their predictions are a poor match for experimentally determined structures. Unfolding (denaturation)—especially if reversible—is not a direct consequence of the denaturing agent (such as urea) acting upon the protein body. Instead, the agent modifies the structural properties of water and alters the way in which the polypeptide chain interacts with its environment. 
Once the agent is removed, the environment reverts to a state which promotes protein folding as predicted by the “fuzzy oil drop” model. The presence of other environmental factors may also affect the final distribution of hydrophobicity density throughout the protein body, ensuring high biological specificity. The same mechanism acts upon parts of the polypeptide chain, which—in addition to highly ordered fragments (described by secondary structural motifs)—also contain disordered fragments. The presented example suggests that lack of secondary structural ordering may be a direct consequence of another form of ordering, related to global hydrophobicity density distribution. Such “disordered” fragments, therefore, reflect the balance between local (internal free energy) and global (fuzzy oil drop) optimization.


The two-step model proposed to simulate protein folding in silico (the first step solely backbone dependent, the second environment dependent) (Roterman et al., 2011) seems to be supported by the experimental observations reported in Chung and Tokmakoff (2008).

5. CONCLUSIONS

Analysis of fragments listed in the DisProt database (http://www.disprot.org) suggests that their involvement in shaping the hydrophobic core in the protein (complex, chain, or domain) depends on their status vis-à-vis the “fuzzy oil drop” model. If the given structural unit is not accordant with theoretical predictions, the corresponding DisProt fragment usually also diverges from the model. It should also be noted that accordance depends critically upon selection of the structural unit. Tables 8.3–8.5, which present the results for 2TPI, list individual fragments of a sample protein along with their status within various structural units. For example, if the fragment is accordant within an individual domain but not within the entire chain, the given domain probably underwent folding on its own. Chains composed of distinct domains usually do not possess unified hydrophobic cores. Plotting RD values for various structural units appears to accurately reflect the relations between these units and may enable a holistic interpretation of the folding process. Accordance of DisProt fragments with the “fuzzy oil drop” model may be related to the stability of crystal structures; it therefore appears that structures derived from NMR measurements are better suited to such studies. DisProt fragments listed as discordant in Table 8.1 usually belong to proteins which, as a whole, do not possess regular hydrophobic cores. Such proteins are generally less stable than those in which a unified hydrophobic core (as predicted by the “fuzzy oil drop” model) can be found. Synergy between disordered regions and structured domains increases the functional versatility of proteins and strengthens their interaction networks (Babu, Kriwacki, & Pappu, 2012; Mészáros, Dosztányi, Magyar, & Simon, 2014; Mészáros, Dosztányi, & Simon, 2012). 
Structural accordance of DisProt fragments may be correlated with the protein’s ability to revert to a stable conformation following function-related deformations. Proteins in which neither DisProt fragments nor any structural units (domains, chains, or complexes) conform to the model remain something of a mystery (at least in terms of their structural preferences).


ACKNOWLEDGMENTS

This work was made possible by the Jagiellonian University Medical College Grant No. K/ZDS/001531. We would also like to thank Piotr Nowakowski and Anna Zaremba-Śmietańska for their technical and editorial assistance.

REFERENCES

Abramavicius, D., & Mukamel, S. (2004). Many-body approaches for simulating coherent nonlinear spectroscopies of electronic and vibrational excitons. Chemical Reviews, 104, 2073–2098.
Asplund, M. C., Zanni, M. T., & Hochstrasser, R. M. (2000). Two-dimensional infrared spectroscopy of peptides by phase-controlled femtosecond vibrational photon echoes. Proceedings of the National Academy of Sciences of the United States of America, 97, 8219–8224.
Babu, M. M., Kriwacki, R. W., & Pappu, R. V. (2012). Versatility from protein disorder. Science, 337, 1460–1461.
Baiz, C. R., Peng, C. S., Reppert, M. E., Jones, K. C., & Tokmakoff, A. (2012). Coherent two-dimensional infrared spectroscopy: Quantitative analysis of protein secondary structure in solution. Analyst, 137, 1793–1799.
Banach, M., Konieczny, L., & Roterman, I. (2012a). Ligand-binding-site recognition. In I. Roterman-Konieczna (Ed.), Protein folding in silico (pp. 78–94). Oxford, Cambridge, Philadelphia, New Delhi: Woodhead Publishing.
Banach, M., Konieczny, L., & Roterman, I. (2012b). Use of the “fuzzy oil drop” model to identify the complexation area in protein homodimers. In I. Roterman-Konieczna (Ed.), Protein folding in silico (pp. 95–122). Oxford, Cambridge, Philadelphia, New Delhi: Woodhead Publishing.
Banach, M., Konieczny, L., & Roterman, I. (2013). Can the structure of hydrophobic core determine the complexation area? In I. Roterman-Konieczna (Ed.), Identification of ligand binding site and protein–protein interaction area (pp. 41–54). Dordrecht, Heidelberg, New York, London: Springer.
Banach, M., Marchewka, D., Piwowar, M., & Roterman, I. (2012). The divergence entropy characterizing the internal force field in proteins. In I. Roterman-Konieczna (Ed.), Protein folding in silico (pp. 55–78). Oxford, Cambridge, Philadelphia, New Delhi: Woodhead Publishing.
Banach, M., Prymula, K., Jurkowski, W., Konieczny, L., & Roterman, I. (2012). Fuzzy oil drop model to interpret the structure of antifreeze proteins and their mutants. Journal of Molecular Modeling, 18(1), 229–237.
Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., & Hermans, J. (1981). Interaction models for water in relation to protein hydration. In B. Pullman (Ed.), Intermolecular forces (pp. 331–342). Dordrecht: Reidel Publishing Company.
Berendsen, H. J., van der Spoel, D., & van Drunen, R. (1995). GROMACS: A message-passing parallel molecular dynamics implementation. Computer Physics Communications, 91, 43–56.
Biancardi, A., Cammi, R., Cappelli, C., Mennucci, B., & Tomasi, J. (2012). Modelling vibrational coupling in DNA oligomers: A computational strategy combining QM and continuum solvation models. Theoretical Chemistry Accounts, 131, 1157.
Brylinski, M., Kochanczyk, M., Broniatowska, E., & Roterman, I. (2007). Localization of ligand binding site in proteins identified in silico. Journal of Molecular Modeling, 13, 665–675.
Brylinski, M., Konieczny, L., & Roterman, I. (2007). Is the protein folding an aim-oriented process? Human haemoglobin as example. International Journal of Bioinformatics Research and Applications, 3(2), 234–260.


Brylinski, M., Prymula, K., Jurkowski, W., Kochańczyk, M., Stawowczyk, E., Konieczny, L., et al. (2007). Prediction of functional sites based on the fuzzy oil drop model. PLoS Computational Biology, 3, e94.
Chung, H. S., & Tokmakoff, A. (2008). Temperature-dependent downhill unfolding of ubiquitin. I. Nanosecond-to-millisecond resolved nonlinear infrared spectroscopy. Proteins, 72, 474–487.
Dill, K. A., & MacCallum, J. L. (2012). The protein-folding problem, 50 years on. Science, 338, 1042–1046.
Harel, M., Kryger, G., Rosenberry, T. L., Mallender, W. D., Lewis, T., Fletcher, R. J., et al. (2000). Three-dimensional structures of Drosophila melanogaster acetylcholinesterase and of its complexes with two potent inhibitors. Protein Science: A Publication of the Protein Society, 9, 1063–1072.
Hartman, E., Wang, Z., Zhang, Q., Roy, K., Chanfreau, G., & Feigon, J. (2013). Intrinsic dynamics of an extended hydrophobic core in the S. cerevisiae RNase III dsRBD contributes to recognition of specific RNA binding sites. Journal of Molecular Biology, 425(3), 546–562.
http://www.disprot.org.
Kauzmann, W. (1959). Some factors in the interpretation of protein denaturation. Advances in Protein Chemistry, 14, 1–63.
Konieczny, L., Brylinski, M., & Roterman, I. (2006). Gauss function based model of hydrophobicity density in proteins. In Silico Biology, 6(1–2), 15–22.
Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics, 22, 79–86.
Laskowski, R. A. (2009). PDBsum new things. Nucleic Acids Research, 37, D355–D359.
Levitt, M. (1976). A simplified representation of protein conformations for rapid simulation of protein folding. Journal of Molecular Biology, 104, 59–107.
Lin, Z., & van Gunsteren, W. F. (2013). On the choice of a reference state for one-step perturbation calculations between polar and nonpolar molecules in a polar environment. Journal of Computational Chemistry, 34(5), 387–393.
Lindahl, E., Hess, B., & van der Spoel, D. (2001). GROMACS 3.0: A package for molecular simulation and trajectory analysis. Journal of Molecular Modeling, 7, 306–317.
Marchewka, D., Banach, M., & Roterman, I. (2011). Internal force field in proteins seen by divergence entropy. Bioinformation, 6(8), 300–302.
Marchewka, D., Jurkowski, W., Banach, M., & Roterman, I. (2013). Prediction of protein–protein binding interfaces. In I. Roterman-Konieczna (Ed.), Identification of ligand binding site and protein–protein interaction area (pp. 105–134). Dordrecht, Heidelberg, New York, London: Springer Verlag.
Mészáros, B., Dosztányi, Z., Magyar, C., & Simon, I. (2014). Bioinformatical approaches to unstructured/disordered proteins and their interactions. In A. Liwo (Ed.), Computational methods to study the structure and dynamics of biomolecules and biomolecular processes: from bioinformatics to molecular quantum mechanics. Springer Series in Bio-/Neuroinformatics, Vol. 1, Springer.
Mészáros, B., Dosztányi, Z., & Simon, I. (2012). Disordered binding regions and linear motifs: Bridging the gap between two models of molecular recognition. PLoS One, 7(10), e46829.
Murphy, G. S., Mills, J. L., Miley, M. J., Machius, M., Szyperski, T., & Kuhlman, B. (2012). Increasing sequence diversity with flexible backbone protein design: The complete redesign of a protein hydrophobic core. Structure, 20(6), 1086–1096.
Priyakumar, U. D. (2012). Role of hydrophobic core on the thermal stability of proteins - molecular dynamics simulations on a single point mutant of sso7d. Journal of Biomolecular Structure and Dynamics, 29(5), 1–11.
Prymula, K., Jadczyk, T., & Roterman, I. (2011). Catalytic residues in hydrolases: Analysis of methods designed for ligand-binding site prediction. Journal of Computer-Aided Molecular Design, 25(2), 117–133.


Roterman, I., Konieczny, L., Jurkowski, W., Prymula, K., & Banach, M. (2011). Two-intermediate model to characterize the structure of fast-folding proteins. Journal of Theoretical Biology, 283(1), 60–70.
Shanmugham, A., Bakayan, A., Völler, P., Grosveld, J., Lill, H., & Bollen, Y. J. (2012). The hydrophobic core of twin-arginine signal sequences orchestrates specific binding to Tat-pathway related chaperones. PLoS One, 7(3), e34159.
Sickmeier, M., Hamilton, J. A., LeGall, T., Vacic, V., Cortese, M. S., Tantos, A., et al. (2007). DisProt: The database of disordered proteins. Nucleic Acids Research, 35(Database issue), D786–D793, Epub 2006 Dec 1.
Song, J., Guo, L. W., Muradov, H., Artemyev, N. O., Ruoho, A. E., & Markley, J. L. (2008). Intrinsically disordered gamma-subunit of cGMP phosphodiesterase encodes functionally relevant transient secondary and tertiary structure. Proceedings of the National Academy of Sciences of the United States of America, 105, 1505–1510.
Urbic, T., & Dill, K. A. (2010). A statistical mechanical theory for a two-dimensional model of water. Journal of Chemical Physics, 132, 224507.
Uversky, V. N., & Dunker, A. K. (2010). Understanding protein non-folding. Biochimica et Biophysica Acta, 1804, 1231–1264.
van der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, A. E., & Berendsen, H. J. (2005). GROMACS: Fast, flexible, and free. Journal of Computational Chemistry, 26, 1701–1718.
van der Spoel, D., Lindahl, E., Hess, B., van Buuren, A. R., Apol, E., & Berendsen, H. J. (1995). Gromacs User Manual version 3.3.
Velanker, S. S., Ray, S. S., Gokhale, R. S., Suma, S., Balaram, H., Balaram, P., et al. (1997). Triosephosphate isomerase from Plasmodium falciparum: The crystal structure provides insights into antimalarial drug design. Structure, 5(6), 751–761.
Vucetic, S., Obradovic, Z., Vacic, V., Radivojac, P., Peng, K., Iakoucheva, L. M., et al. (2005). DisProt: A database of protein disorder. Bioinformatics, 21(1), 137–140.
Vugmeyster, L., Ostrovsky, D., Khadjinova, A., Ellden, J., Hoatson, G. L., & Vold, R. L. (2011). Slow motions in the hydrophobic core of chicken villin headpiece subdomain and their contributions to configurational entropy and heat capacity from solid-state deuteron NMR measurements. Biochemistry, 50(49), 10637–10646.
Walter, J., Steigemann, W., Singh, T. P., Bartunik, H., Bode, W., & Huber, R. (1982). On the disordered activation domain in trypsinogen. Chemical labelling and low-temperature crystallography. Acta Crystallographica Section B, 38, 1462–1472.
Wood, K., Gallat, F. X., Otten, R., van Heel, A. J., Lethier, M., van Eijck, L., et al. (2013). Protein surface and core dynamics show concerted hydration-dependent activation. Angewandte Chemie International Edition in English, 52(2), 665–668.
Yang, H., Jiao, X., & Li, S. (2012). Hydrophobic core-hydrophilic shell-structured catalysts: A general strategy for improving the reaction rate in water. Chemical Communications (Cambridge, England), 48(91), 11217–11219.
Zhang, X., Tan, Y., Zhao, R., Chu, B., Tan, C., & Jiang, Y. (2012). Site-directed mutagenesis study of the Ile140 in conserved hydrophobic core of Bcl-x(L). Protein and Peptide Letters, 19(9), 991–996.
