Research Article Received: 30 October 2014

Revised: 28 January 2015

Accepted: 28 January 2015

Published online in Wiley Online Library: 17 March 2015

(wileyonlinelibrary.com) DOI 10.1002/psc.2765

One short cysteine-rich sequence pattern – two different disulfide-bonded structures – a molecular dynamics simulation study

Sonja A. Dames a,b*

The nematocyst walls of Hydra are formed by proteins containing small cysteine-rich domains (CRDs) of ~25 amino acids. The first CRD of nematocyst outer wall antigen (NW1) and the C-terminal CRD of minicollagen-1 (Mcol1C) contain six cysteines at identical sequence positions but adopt different disulfide-bonded structures. NW1 shows the disulfide connectivities C2-C14/C6-C19/C10-C18 and Mcol1C C2-C18/C6-C14/C10-C19. To analyze whether both show structural preferences in the open, non-disulfide-bonded form that explain the formation of either disulfide connectivity pattern, molecular dynamics (MD) simulations at different temperatures were performed. In the 100-ns MD simulations at 283 K, NW1 maintained a rather compact fold that is stabilized by specific hydrogen bonds. The Mcol1C structure fluctuated more overall but also stayed rather compact most of the time. The analysis of the backbone Φ/ψ angles indicated different turn propensities for NW1 and Mcol1C, most of which can be explained by published data on the influence of different amino acid side chains on the local backbone conformation. Whereas a folded precursor mechanism may be considered for NW1, Mcol1C may fold according to the quasi-stochastic folding model involving disulfide bond reshuffling and conformational changes, locking the native disulfide conformations. The study further demonstrates the power of MD simulations to detect local structural preferences in rather dynamic systems such as the open, non-disulfide-bonded forms of NW1 and Mcol1C, which complement published information from NMR backbone residual dipolar couplings. Because the backbone structural preferences encoded by the amino acid sequence embedding the cysteines influence which disulfide connectivities are formed, the data are of general interest for a better understanding of oxidative folding and the design of disulfide-stabilized therapeutics. Copyright © 2015 European Peptide Society and John Wiley & Sons, Ltd.

Additional supporting information may be found in the online version of this article at the publisher’s web site.

Keywords: cysteine-rich domain; disulfide bond pattern; hydra; minicollagen-1; molecular dynamics simulations; NOWA; oxidative folding; GROMOS

Introduction


Nematocyst outer wall antigen (NOWA) and minicollagens are proteins of the nematocyst, which are ‘explosive’ organelles found in Hydra, jellyfish, and other cnidarians. Both contain small CRDs of about 25 amino acids with characteristic intramolecular disulfide connectivities [1,2]. Formation of the extremely stable capsule wall, which withstands extreme osmotic pressures, most likely involves reshuffling of intramolecular to intermolecular disulfide bridges between NOWA and minicollagens to result in a stable, disulfide-linked matrix layer [1,3]. NOWA encompasses 774 residues (Figure 1A). The N-terminal region consists of a signal peptide, a sperm-coating protein domain, and a C-type lectin domain. The C-terminal half contains the cysteine-rich octarepeat domain with eight consecutive CRDs [1,4]. Minicollagens such as minicollagen-1 are about 150 residues long [3]. Following the cleavage of an N-terminal signal and propeptide region, they have a rather symmetrical buildup consisting of a central collagen region that is flanked at the N-terminal and C-terminal ends by a proline-rich linker and a CRD (Figure 1A) [3,5]. The CRDs have six cysteines at equivalent sequence positions (C2, C6, C10, C14, C18, C19 in Figure 1B); however, depending on the exact sequence, they can form two different disulfide connectivity patterns (C2-C14/C6-C19/C10-C18 or C2-C18/C6-C14/C10-C19, Figure 1B) [4,6]. Because it was

J. Pept. Sci. 2015; 21: 480–494

formerly assumed that amino acid sequences sharing the same cysteine sequence pattern also form the same disulfide connectivity pattern [7], this was a rather surprising finding. As illustrated by the structures of the first CRD of NOWA (residues 468–492 = NW1) and the C-terminal one of minicollagen-1 (residues 126–149 = Mcol1C), the different disulfide patterns are embedded in overall different tertiary folds (Figure 1C) [4,6,8]. The N-terminal CRD of minicollagen-1 (residues 32–52 = Mcol1N) and the other CRDs of NOWA share the same disulfide connectivity pattern as NW1 and a highly similar fold [7,8]. The N-terminal half of the NW1 CRD fold

* Correspondence to: Sonja A. Dames, Biomolecular NMR Spectroscopy, Department of Chemistry, Technische Universität München, Lichtenbergstr. 4, 85747 Garching, Germany. E-mail: [email protected]

a Biomolecular NMR Spectroscopy, Department of Chemistry, Technische Universität München, 85747 Garching, Germany

b Helmholtz Zentrum München, Germany and Institute of Structural Biology, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany

Abbreviations: CRD, cysteine-rich domain; SI, supplementary information; HSQC, heteronuclear single quantum coherence experiment; MD, molecular dynamics; Mcol1C, residues 126–149 of Hydra minicollagen-1; NOWA, nematocyst outer wall antigen; NW1, residues 468–492 of Hydra NOWA; RDC, residual dipolar coupling.


Figure 1. Domain organization of NOWA and minicollagen-1 and sequence and structure information for a selected CRD of each. A: NOWA contains besides a signal peptide a sperm-coating protein domain, a C-type lectin domain, and the cysteine-rich octarepeat domain. The amino acid sequence and the determined NMR structure of the first of the eight CRDs, referred to as NW1, are shown in the top plot in B and the left plot in C, respectively. Minicollagen-1 contains at its N-terminus a signal peptide followed by a propeptide. The central collagen region is on each end flanked by a polyproline region (PPII) and a CRD. The sequence of the C-terminal CRD, which is referred to as Mcol1C, is shown at the bottom plot of B and its structure in the right plot of C. B and C: despite the same cysteine-rich sequence pattern, NW1 and Mcol1C adopt different disulfide connectivity patterns and different three-dimensional folds. The cysteine connectivities and the PDB-ids are indicated. Whereas the peptide bond to P12 adopts a cis conformation in NW1, it is trans in Mcol1C. All structure pictures were made with MolMol [15] and POV-Ray (www.povray.org).


(Figure 1C, left side) consists of a canonical βII turn between the first two cysteines (C2-C6) and a short 3₁₀ helix formed by two consecutive βIII turns (S7-Y11) containing the third cysteine (C10). In the central region, a βVIa turn (Y11-C14) with P12 in cis conformation continues into a γ-turn (E13-K15), thereby positioning the fourth cysteine (C14). The C-terminal two cysteines (C18, C19) are located in a single α-helical turn (K15-C19) [8]. In the Mcol1C fold (Figure 1C, right side), the region after the first cysteine (C2) containing the second cysteine (C6) forms an N-terminal α-helix (V5-Q9). The third cysteine (C10) is part of an inverse γ-turn (Q9-V11). P12 in the following βI-turn (V11-C14) containing the fourth cysteine adopts in this case a trans conformation. The C-terminal region consists of a βIII-turn (P15-C18) and a positively charged stretch following the last cysteine (C19) [4,6]. Refolding of isolated reduced NW1 or Mcol1C in a redox buffer results to a large extent in the respective native disulfide connectivity pattern [4–6]. This suggested that the formation of either disulfide connectivity pattern is determined by the residues embedding the cysteines in the sequence [5]. Only in the case of Mcol1N is the efficiency of oxidative folding to the correctly disulfide-bonded conformer significantly increased if a 12-residue N-terminal propeptide is present (= P-Mcol1N, residues 20–52) [7]. To identify a rank order of the stability of the disulfide bonds in P-Mcol1N and Mcol1C, partial reduction experiments were performed [7]. For Mcol1C, this resulted in a variety of products. Based on additional NMR data, only the fully oxidized form appears folded, whereas partially reduced forms resulted in spectra typical for unstructured peptides. This suggested that in partially oxidized forms, only low conformational preferences exist [7]. Similar experiments with P-Mcol1N resulted only in fully oxidized and fully reduced species. Thus, while disulfide bonds in the C-terminal CRD form independently, those in the N-terminal domain appear to form cooperatively [7]. Based on these observations, the authors considered a prefolded precursor mechanism involving a succession of conformational searches unlikely and favored a quasi-stochastic mechanism where the proximity rule may restrict the statistical population of intermediate disulfide isomers [7]. A major difference between NW1 and Mcol1N versus Mcol1C is that the peptide bond preceding the central conserved proline (P12 in Figure 1B) between the third and fourth cysteine (C10, C14) occupies a cis conformation in the first two but a trans one in the latter (Figure 1C) [6,8]. In line with this, Mcol1C with this proline in trans folds rather fast and cooperatively to the respective disulfide isomer, whereas the folding of P-Mcol1N, due to the presence of the cis proline, proceeds in two phases, a fast one from the oxidative folding of the present cis population and a slower one involving a trans to cis isomerization

[5]. The role of the cis or trans conformation of this central proline for folding into the correctly disulfide-linked isomer was further evaluated from folding studies of mutants in which the central proline was replaced by (4R)-fluoroproline or (4S)-fluoroproline, which favor the trans or cis conformation, respectively. These experiments suggested that the sequence-encoded folding information provided by the residues embedding the cysteines in P-Mcol1N with (4R)-fluoroproline and (4S)-fluoroproline at position 24 (= 12 in Figure 1B) is sufficient to fold both into identical disulfide-linked structures [5]. Thus, the presence of a cis peptide bond preceding P12 does not determine the adopted disulfide connectivity pattern. For NW1, it was predicted that mutating lysine 21 to proline (K21P = K15P according to the numbering used in this study, see Figure 1B) and glycine 11 to valine (G11V = G5V in Figure 1B) should strongly influence the disulfide connectivity pattern preference. Based on HPLC and NMR data, the double mutant NW1-G11V/K21P (= G5V/K15P in Figure 1B) adopts the same disulfide bond pattern as Mcol1C to 95% and that of wild-type NW1 to only 5% [9]. Because both NW1 and Mcol1C are devoid of a hydrophobic core or salt bridges, these data further suggested that the turn propensities of the non-cysteine residues must play a role in the formation of the two different disulfide connectivity patterns [9]. To better understand how the different residues embedding the six disulfide bond-forming cysteines in NW1 and Mcol1C (Figure 1C) influence the respective backbone structural preferences in the open, non-disulfide-bonded state, and thereby which disulfide connectivity pattern can be preferentially adopted, MD simulations at different temperatures were performed. Most of the observed backbone structural preferences at ambient temperatures can be explained by published data on the influence of the different amino acid side chains on backbone structural preferences and the resulting turn propensities [10–14].
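To put the two observed connectivities in context, the short Python sketch below enumerates every way in which six cysteines can be paired into three disulfide bonds. It is an illustration only and not part of the published analysis; the names `pairings`, `fmt`, and `NATIVE` are hypothetical, and only the cysteine positions and the two native patterns are taken from the text.

```python
# Cysteine positions in the numbering of Figure 1B (from the text).
CYS_POSITIONS = (2, 6, 10, 14, 18, 19)

# Native connectivities reported in the text for the two CRDs.
NATIVE = {
    "NW1": ((2, 14), (6, 19), (10, 18)),
    "Mcol1C": ((2, 18), (6, 14), (10, 19)),
}


def pairings(positions):
    """Yield every complete pairing (perfect matching) of the positions."""
    if not positions:
        yield ()
        return
    first, rest = positions[0], positions[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield ((first, partner),) + tail


def fmt(pattern):
    """Format a pattern in the notation used in the text, e.g. C2-C14/..."""
    return "/".join(f"C{a}-C{b}" for a, b in pattern)


all_patterns = list(pairings(CYS_POSITIONS))

# Bonds that occur in both native patterns (none for NW1 versus Mcol1C).
shared_bonds = set(NATIVE["NW1"]) & set(NATIVE["Mcol1C"])

if __name__ == "__main__":
    for pattern in all_patterns:
        labels = [name for name, nat in NATIVE.items() if nat == pattern]
        print(fmt(pattern), *labels)
```

Six cysteines allow 5 × 3 × 1 = 15 fully oxidized isomers, and the two native patterns do not share a single disulfide bond. This underlines why the local backbone preferences of the open form matter for selecting one isomer.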

Materials and Methods

Software Used for the Simulations and Their Analysis

All MD simulations were performed using MD++ 0.2.3 of the GROMOS05 package and analyzed using GROMOS++ 0.2.4 (www.gromos.net). The starting structures were prepared and the MD runs analyzed on a MacBook; the simulations were run on a Linux cluster. Visualization of the structures was done using MOLMOL [15]. Structure pictures were made with POV-Ray (www.povray.org).

Setup of the MD Simulations


Overall, the MD simulations of NW1 and Mcol1C with open disulfide bonds followed the approach described in Ref. [16] for analyzing disulfide shuffling in bovine α-lactalbumin. All MD simulations were performed with the GROMOS biomolecular simulation software package [17] using the GROMOS 53A6 force field [18,19]. To determine whether structural preferences exist that can explain the formation of either disulfide pattern, all six cysteine side chains were simulated as neutral side chains that do not carry a partial charge and that were accordingly not constrained by disulfide bond, angle, or dihedral angle potential terms [16]. Initial coordinates were taken from the published NMR structures of the first CRD of NOWA from Hydra (residues 468–492 = NW1, PDB-ID 2HM3, Uniprot-ID Q8IT70) [8] and the C-terminal CRD of minicollagen-1 from Hydra (residues 126–149 = Mcol1C, PDB-ID 1SOP, Uniprot-ID Q00484) [6]. The pdb files were converted to


g96 format with the pdb2g96 GROMOS++ program. Hydrogens were added using the gch GROMOS++ program. Following a short minimization of the generated starting structure in vacuo, the proteins were solvated in cubic boxes using the GROMOS simbox program. The parameters used were pbc = r (rectangular → cubic box), minwall = 0.8 (minimum solute-wall distance), and thresh = 0.23 (minimum solute-solvent distance). The resulting cubic boxes contained 3161 simple point charge water molecules for NW1 and 1891 for Mcol1C [17,20]. The box dimensions in nm were 4.66 × 4.66 × 4.66 for NW1 and 3.94 × 3.94 × 3.94 for Mcol1C. Finally, counterions were added for each side chain charge by replacing the equivalent number of water molecules (NW1: 2 Na⁺, 1 Cl⁻; Mcol1C: 3 Cl⁻). SI Figure S1 shows pictures of the resulting solvated starting structures. Because the starting structure of NW1 is more extended than that of Mcol1C, the solvent box generated by the simbox program of GROMOS is larger for NW1 than for Mcol1C. Rectangular periodic boundary conditions were applied. In total, five MD runs were performed for each CRD: a 5-ns run at 283 K for which the structures had been equilibrated to 300 K; two 100-ns runs at 283 K, one with the structures equilibrated to 283 K and the other to 300 K; and two 100-ns runs at 313 K, one with the structures equilibrated to 313 K and the other to 323 K. Using two different equilibration temperatures makes it possible to obtain slightly different starting conditions for two runs at the same simulation temperature. This helps, for example, to identify sampling problems and to assess how reproducible the results from different runs under otherwise identical conditions are. The equilibration process was performed as follows. Initial velocities were randomly generated from a Maxwell–Boltzmann distribution at the respective starting temperature (283/5 = 56.6 or 300/5 = 60 or 313/5 = 62.6 or 323/5 = 64.6 K).
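The starting temperatures and the velocity assignment just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not the GROMOS implementation: the function names, the use of Python's `random` module, and the carbon-like test mass are hypothetical, and the kinetic temperature is computed without correcting for constrained degrees of freedom.

```python
import math
import random

K_B = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1

# Target equilibration temperatures used in the study (K).
EQUILIBRATION_TEMPS_K = (283.0, 300.0, 313.0, 323.0)


def start_temperature(target_k, divisor=5):
    """Starting temperature: one fifth of the target, as quoted in the text."""
    return target_k / divisor


def sample_velocity(mass_u, temp_k, rng):
    """Draw one velocity (nm/ps) from a Maxwell-Boltzmann distribution.

    Each Cartesian component is Gaussian with sigma = sqrt(kB*T/m).
    With kB in kJ mol^-1 K^-1 and m in u, sigma comes out in nm/ps.
    """
    sigma = math.sqrt(K_B * temp_k / mass_u)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))


def kinetic_temperature(masses_u, velocities):
    """Temperature from 2*E_kin = 3*N*kB*T (no constraint correction)."""
    twice_ekin = sum(
        m * (vx * vx + vy * vy + vz * vz)
        for m, (vx, vy, vz) in zip(masses_u, velocities)
    )
    return twice_ekin / (3 * len(masses_u) * K_B)


if __name__ == "__main__":
    rng = random.Random(2015)  # fixed seed, chosen arbitrarily
    for target in EQUILIBRATION_TEMPS_K:
        t0 = start_temperature(target)
        masses = [12.011] * 5000  # hypothetical carbon-like atoms
        vels = [sample_velocity(m, t0, rng) for m in masses]
        print(f"target {target:.0f} K, start {t0:.1f} K, "
              f"sampled {kinetic_temperature(masses, vels):.1f} K")
```

Changing the seed or the starting temperature gives a different but statistically equivalent set of velocities, which is the same idea as using two equilibration temperatures to obtain slightly different starting conditions.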
The target equilibration temperature was reached after five 20-ps simulation periods. During the first four periods, the solute coordinates were restrained to their positions in the starting structures by a harmonic energy potential term with a force constant of 2.5 × 10⁴ kJ mol⁻¹ nm⁻². During the last equilibration step needed to reach the final target equilibration temperature, the position restraints were removed. During the following 5- or 100-ns production MD simulations, the temperature and atmospheric pressure were kept constant using a weak coupling approach [21] with relaxation times τ_T = 0.1 ps and τ_p = 0.5 ps and an isothermal compressibility of 4.575 × 10⁻⁴ (kJ mol⁻¹ nm⁻³)⁻¹. Bond lengths were constrained by the SHAKE algorithm [22,23]. Nonbonded interactions were calculated based on a pairlist that was updated every fifth time step using a cutoff distance of 0.8 nm. The interactions between the atoms of charged groups were also calculated within 1.4 nm. Outside the 1.4-nm sphere, the influence of the dielectric medium was accounted for by a reaction-field force corresponding to a relative dielectric permittivity ε of 61 [24]. During the 5-ns simulations, the coordinates and energy terms were written out every 5 ps and during the 100-ns simulations every 10 ps.

Analysis of the Molecular Dynamics Simulations

Analysis of the MD simulations included the calculation of the atom-positional root-mean-square deviation (RMSD) and root-mean-square fluctuation (RMSF) of the coordinates for a certain time step with respect to the starting structure. Using the GROMOS analysis tools, hydrogen bonds were defined to be present if the hydrogen-acceptor distance was less than 0.25 nm and the donor-hydrogen-acceptor angle was at least 135°. In addition, the presence of hydrogen bonds and the secondary structure content


were also determined for the structures present at the selected time points using the respective tools in the visualization software MOLMOL [15]. In MOLMOL, the cutoff for the hydrogen-acceptor distance was only 0.24 nm, and the donor-hydrogen-acceptor angle had to be at least 145°. To estimate the ability to form disulfide bonds, the sulfur–sulfur distances between cysteines were determined using the tser program of GROMOS. The percent of time that the sulfur–sulfur distance between a pair of cysteines is
