This article was downloaded by: [Pennsylvania State University] On: 04 December 2014, At: 23:31 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biomolecular Structure and Dynamics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tbsd20

Tryptophan to Glycine mutation in the position 116 leads to protein aggregation and decreases the stability of the LITAF protein

Chundi Vinay Kumar, Rayapadi G. Swetha, Sudha Ramaiah & Anand Anbarasu

Bioinformatics Division, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu, India
Accepted author version posted online: 23 Sep 2014. Published online: 13 Oct 2014.

To cite this article: Chundi Vinay Kumar, Rayapadi G. Swetha, Sudha Ramaiah & Anand Anbarasu (2014): Tryptophan to Glycine mutation in the position 116 leads to protein aggregation and decreases the stability of the LITAF protein, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2014.968211 To link to this article: http://dx.doi.org/10.1080/07391102.2014.968211


Journal of Biomolecular Structure and Dynamics, 2014 http://dx.doi.org/10.1080/07391102.2014.968211

Tryptophan to Glycine mutation in the position 116 leads to protein aggregation and decreases the stability of the LITAF protein
Chundi Vinay Kumar, Rayapadi G. Swetha, Sudha Ramaiah and Anand Anbarasu*
Bioinformatics Division, School of Biosciences and Technology, VIT University, Vellore 632014, Tamil Nadu, India
Communicated by Ramaswamy H. Sarma

(Received 12 February 2014; accepted 18 September 2014)

Mutations in the gene encoding the vesicle lipopolysaccharide-induced tumor necrosis factor (LITAF) protein cause Charcot–Marie–Tooth type 1C (CMT1C) disease, a neurological disorder. The LITAF gene maps to chromosome 16, at cytogenetic location 16p13. CMT1C-linked small integral membrane protein of lysosome/late endosome mutants are loss-of-function mutants that act in a dominant negative manner to impair endosomal trafficking, leading to prolonged extracellular signal-regulated kinases 1/2 signaling downstream of ErbB activation. The W116G mutation in LITAF decreases the stability of the protein and also disrupts the function of the gene. We analyzed the single nucleotide polymorphism (SNP) results of 28 nsSNPs obtained from dbSNP. We also carried out multiple molecular dynamics simulations of 200 ns and obtained root-mean-square deviation, root-mean-square fluctuation, radius of gyration, solvent-accessible surface area, H-bond, and principal component analysis results to compare the stability of the wild type and the mutant. The protein was then checked for aggregation, and the results showed loss of helix. The loss of helix leads to the instability of the protein.

Keywords: Charcot–Marie–Tooth disease; lipopolysaccharide-induced TNF factor protein; SNP analysis; molecular dynamics simulations; protein aggregation

*Corresponding author. Email: [email protected]
© 2014 Taylor & Francis

1. Introduction
Charcot–Marie–Tooth disease (CMT) is one of the most common inherited neurological disorders, with a prevalence estimated at 1/2500 (Skre, 1974; Street et al., 2003). It is clinically characterized by distal weakness and atrophy of the limb muscles, mild sensory loss, and absence of tendon reflexes. CMT has been divided into two main types on the basis of electrophysiological properties and histopathology: the demyelinating type, consisting of CMT1 (autosomal dominant form) and CMT4 (autosomal recessive form), and the axonal type, CMT2 (both autosomal dominant and autosomal recessive forms) (Buchthal & Behse, 1977; Dyck & Lambert, 1968; Harding & Thomas, 1980). CMT1 and CMT4 can be distinguished from the axonal forms by reduced nerve conduction velocities (NCVs) of less than 38 m/s for the median motor nerve, segmental de- and remyelination, and onion bulb formation. CMT2 shows normal or slightly reduced NCVs, and the nerve pathology reveals axonal loss and regenerative sprouting. Four loci responsible for autosomal dominant CMT1 have been mapped, and all corresponding causative genes have been identified. CMT1A is mainly associated with a 1.5 Mb duplication on chromosome 17p11.2-12 (Lupski et al., 1991; Raeymaekers et al., 1991) with a gene-dosage effect for the peripheral myelin protein 22 gene (Matsunami et al., 1992; Patel et al., 1992; Timmerman et al., 1992; Warner, Roa, & Lupski, 1996). CMT1B is caused by point mutations in the myelin protein zero (MPZ) gene at 1q21.3-q23 (Lupski, 1998). Charcot–Marie–Tooth type 1C (CMT1C) is linked to chromosome 16p13.1-p12.3 and is associated with mutations in the lipopolysaccharide-induced tumor necrosis factor (LITAF) gene, also known as the “small integral membrane protein of lysosome/late endosome (SIMPLE) gene” (Street et al., 2003). CMT1D, mapped to chromosome 10, is associated with mutations in the early growth response 2 gene, also known as “Krox-20” (Warner et al., 1998). SIMPLE is a ubiquitously expressed, 161-amino acid protein of unknown function (Moriwaki et al., 2001; Street et al., 2003). To date, eight distinct point mutations in SIMPLE have been identified as genetic defects causing dominantly inherited CMT1C (Campbell et al., 2004; Gerding, Koetting, Epplen, & Neusch, 2009; Latour et al., 2006; Saifi et al., 2005; Street et al., 2003). Endocytic trafficking is crucial to the function and survival of all eukaryotic cells. Cell surface receptors are endocytosed upon ligand binding and then they are
targeted to the early endosome. Once they arrive at the early endosome, the endocytosed receptors are usually either recycled to the cell surface or sorted to intralumenal vesicles of multivesicular bodies for delivery to the lysosome for degradation (Katzmann, Odorizzi, & Emr, 2002). Ligand-induced lysosomal degradation of cell surface receptors is a major mechanism that attenuates signaling of activated receptors (Katzmann et al., 2002; Waterman & Yarden, 2001). Strong evidence indicates that the endosomal sorting complex required for transport (ESCRT) machinery, composed of the ESCRT-0, -I, -II, and -III complexes, plays a central role in the endosomal sorting of internalized cell surface receptors to the lysosomal pathway (Henne, Buchkovich, & Emr, 2011; Roxrud, Stenmark, & Malerød, 2010). SIMPLE functions with the ESCRT machinery in the control of endosome-to-lysosome trafficking and signaling attenuation. The CMT1C-linked SIMPLE mutants are loss-of-function mutants that act in a dominant negative manner to impair endosomal trafficking, leading to prolonged extracellular signal-regulated kinases 1/2 signaling downstream of ErbB activation (Lee, Chin, & Li, 2012). The mutation W116G causes the SIMPLE protein to be unstable (Lee, Olzmann, Chin, & Li, 2011). Several researchers have reported on the effectiveness of methods for identifying deleterious and disease-related mutations; predicting pathogenic nsSNPs in relation to their functionally and structurally damaging properties is therefore important (Carvalho et al., 2007, 2009; Goldgar et al., 2004; Karchin, 2009; Vinay Kumar, Kumar, Swetha, Ramaiah, & Anbarasu, 2014). In this study, we focused on the prioritization of pathogenic alleles in the LITAF gene and their structural consequences at the molecular level. The deleterious disease-associated nsSNPs from the available single nucleotide polymorphism (SNP) data-sets obtained from the NCBI dbSNP database were sorted using standard protocols. Alleles with a very high probability of structural damage may lead to a major loss of LITAF functionality. SNPs for which all the tools employed indicated pathogenicity were considered strong candidates for structural analysis. SNPs with intermediate pathogenicity implications were excluded from our study. The SNP analysis was followed by multiple molecular dynamics (MD) simulations.

2. Materials and method
2.1. Data-set
Data on the human LITAF gene were collected from OMIM (Amberger, Bocchini, Scott, & Hamosh, 2009) and Entrez Gene on the National Center for Biotechnology Information (NCBI) website. The SNP information for LITAF was obtained from the dbSNP database (Sherry et al., 2001). The amino acid sequence of the protein was retrieved from the UniProt database (UniProt ID: Q99732).

2.2. Disease-associated SNP prediction
A single nucleotide polymorphism may have deleterious consequences for the 3D structure of a protein and hence may be associated with disease. Here we used several servers, namely SIFT (Kumar, Henikoff, & Ng, 2009), PolyPhen 2.0 (Adzhubei et al., 2010), I-Mutant 3.0 (Capriotti, Fariselli, Rossi, & Casadio, 2008), SNP&GO (Calabrese, Capriotti, Fariselli, Martelli, & Casadio, 2009), and MutPred (Li et al., 2009), to examine the disease-associated nsSNPs in the LITAF protein-coding region. The SIFT server uses a homology-based approach to classify amino acid substitutions. A substitution with a prediction score ≤.05 was considered deleterious; otherwise it was considered tolerated (Kumar et al., 2009). The PolyPhen 2.0 server combines sequence- and structure-based attributes and uses a naïve Bayesian classifier to assess the impact of an amino acid substitution. Outputs classified as probably damaging or possibly damaging (≥.5) were treated as functionally significant, and outputs classified as benign (<.5) were treated as tolerated (Adzhubei et al., 2010). We used the sequence-based version of I-Mutant 3.0, which classifies predictions into three classes: neutral mutation (−.5 ≤ DDG ≤ .5 kcal/mol), large decrease (DDG < −.5 kcal/mol), and large increase (DDG > .5 kcal/mol). I-Mutant predicts the free energy change (DDG) as the difference between the unfolding Gibbs free energy change of the mutant and that of the wild-type protein (kcal/mol) (Capriotti et al., 2008). The nsSNPs jointly predicted to be deleterious and damaging by these three servers were considered further. We then used the SNP&GO and MutPred tools to examine the disease-associated nsSNPs. The data sources for SNP&GO include protein sequence, evolutionary information, and functions as encoded in Gene Ontology terms (Calabrese et al., 2009).
MutPred is a web-based tool used mainly to predict the molecular changes associated with amino acid variants. This server uses SIFT, PSI-BLAST, and Pfam profiles along with structural disorder prediction algorithms, including TMHMM, MARCOIL, I-Mutant 2.0, B-factor prediction, and DisProt. Combining the predictions of all the servers increases the accuracy of prediction, and the most disease-associated mutations are finally filtered.
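The score cut-offs described in this section can be written as small decision rules. The sketch below is illustrative only, and its function names are hypothetical. The SIFT rule follows the labels in Table 1, where scores of .04 and below are marked deleterious and scores of .08 and above are marked tolerated. The PolyPhen cut-off of .5 is an assumption taken from the damaging range (.5–1.000) quoted in Section 3.2, and the I-Mutant classes are an assumed reading of the three-class scheme with boundaries at ±.5 kcal/mol.

```python
def sift_call(score: float) -> str:
    """SIFT-style call: scores at or below 0.05 are deleterious (assumed cut-off)."""
    return "Deleterious" if score <= 0.05 else "Tolerated"


def polyphen_is_damaging(score: float) -> bool:
    """Damaging if the score lies in the .5-1.000 range quoted in Section 3.2."""
    return score >= 0.5


def imutant_class(ddg: float) -> str:
    """Three-class scheme on DDG in kcal/mol (assumed boundaries at -0.5 and +0.5)."""
    if ddg < -0.5:
        return "Large decrease"
    if ddg > 0.5:
        return "Large increase"
    return "Neutral"
```

For example, the Table 1 values for W116G (SIFT 0, PolyPhen 1, DDG −1.57) would be labelled deleterious, damaging, and large decrease under these rules.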


2.3. Molecular dynamics and simulation
The simulation of the wild-type and mutant LITAF proteins was performed using GROMACS 4.5.5 (Hess, Kutzner, van der Spoel, & Lindahl, 2008) with the Gromos96 53a6 force field (Gunsteren et al., 1996; Oostenbrink, Villa, Mark, & Gunsteren, 2004). The structures were solvated in a simple-point-charge (SPC) water box (Berendsen, Postma, Gunsteren, & Hermans, 1981) with a dimension of 52.0 Å. At physiological pH the protein was positively charged; to make the simulation system electrically neutral, counter ions (Cl− or Na+) were added. Energy minimization was done for 1000 steps using the steepest descent method. After minimization, the MD simulation was carried out in three steps: heating, equilibration, and production. Equilibration used the NVT ensemble (constant number of particles, volume, and temperature) followed by the NPT ensemble (constant number of particles, pressure, and temperature), which was run for 1000 ps at 300 K; the reference temperature and pressure were 300 K and 1.0 atm (Berendsen et al., 1981). The production simulation was carried out at 300 K for 200 ns for both the wild type and the mutant of the LITAF protein. All covalent bonds were constrained using the Linear Constraint Solver (LINCS) algorithm (Hess, Bekker, Berendsen, & Fraaije, 1997). The electrostatic interactions were treated using the Particle Mesh Ewald method (Essmann et al., 1995). The cutoff radii for Coulomb and van der Waals interactions were set to 10.0 and 14.0 Å, respectively. The MD trajectories were saved every 2.0 ps. The potential of each trajectory produced by the MD simulations was analyzed. The trajectories were analyzed using the g_rms, g_rmsf, g_hbond, and g_gyrate utilities of GROMACS (Van Der Spoel et al., 2005) to obtain the root-mean-square deviation (RMSD), root-mean-square fluctuation (RMSF), number of H-bonds, and radius of gyration (Rg).
The kinetic, potential, and total energies, pressure, and temperature were computed as a function of simulation time to check whether the systems obeyed the NVT or NPT ensemble throughout the simulation. The number of hydrogen bonds was calculated to understand the differences in protein stability. All graphs were generated using the Xmgrace tool (Turner, 2005). Essential dynamics (ED) (Amadei, Linssen, & Berendsen, 1993) was performed for all the trajectories according to principal component analysis (PCA). The first two eigenvectors (principal components PC1 and PC2) with the largest eigenvalues were used to make a 2D projection for each of the independent trajectories. For the simulations of both wild-type and mutant LITAF, Cα atoms were included in the definition of the covariance matrices for the protein. Both protein structures were also submitted to online structure analysis tools. The secondary structure prediction server (Klose, Wallace, & Robert, 2010) gives a clear picture of the percentage of protein aggregation in the secondary structure. The PDBsum server (Laskowski, 1997) gives a pictorial representation of the secondary structures. These two servers give reliable results.
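The trajectory measures named in this section (RMSD, RMSF, and Rg) have simple definitions, which the following minimal sketch spells out in plain Python. It is not the GROMACS implementation: the function names are hypothetical, coordinates are assumed to be lists of (x, y, z) tuples in nm, and every frame is assumed to have already been superimposed on the reference structure (the least-squares fitting step is omitted here).

```python
import math


def rmsd(coords, ref):
    """Root-mean-square deviation between one frame and a reference structure."""
    sq = sum((a - b) ** 2 for p, q in zip(coords, ref) for a, b in zip(p, q))
    return math.sqrt(sq / len(ref))


def rmsf(frames):
    """Per-atom fluctuation about the time-averaged position.

    frames[t][i] is the (x, y, z) position of atom i in frame t.
    """
    n_frames, n_atoms = len(frames), len(frames[0])
    result = []
    for i in range(n_atoms):
        mean = [sum(f[i][k] for f in frames) / n_frames for k in range(3)]
        msd = sum((f[i][k] - mean[k]) ** 2
                  for f in frames for k in range(3)) / n_frames
        result.append(math.sqrt(msd))
    return result


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of the atoms from their center of mass."""
    total = sum(masses)
    com = [sum(m * p[k] for m, p in zip(masses, coords)) / total
           for k in range(3)]
    msd = sum(m * sum((p[k] - com[k]) ** 2 for k in range(3))
              for m, p in zip(masses, coords)) / total
    return math.sqrt(msd)
```

RMSD gives one value per frame (a curve over time), RMSF gives one value per atom or residue (a curve over the sequence), and Rg gives one value per frame that reflects the overall compactness of the structure.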

3. Results
3.1. Protein modeling and validation
The protein structure was modeled using MODELLER (Eswar et al., 2006) and validated using ProSA. The validation shows the structure to be of X-ray quality (Figure 1).

Figure 1. Protein structure validation of the LITAF protein using ProSA server.

3.2. Prediction of deleterious nsSNPs using SIFT, PolyPhen, and I-Mutant
Out of the 28 input polymorphisms, 14 mutations (V104M, P17L, N30Y, P135T, V81M, C137Y, T49M, V144M, L122V, W116G, T115N, G112S, D89Y, and P91H) are predicted to be deleterious, with a tolerance index ≤.05 (Table 1). Out of these 14 mutations, P17L, P135T, C137Y, W116G, G112S, and P91H are reported to be highly deleterious, with a SIFT score of .00 (Table 1). We further subjected all 28 mutations to the PolyPhen server. Based on the results from this server, 19 nsSNPs were found to be “damaging” (.5–1.000) to the protein structure and function, and the remaining 9 nsSNPs are characterized as benign. Among these 19 damaging nsSNPs, 9 SNPs (R90C, P36S, P135T, V81M, R90H, C137Y, W116G, D89Y, and P91H) were reported to be highly deleterious, with a PolyPhen score of 1.000 (Table 1). In total, 12 mutations are identified as deleterious and damaging by both SIFT and PolyPhen 2.0 (Table 1), indicating strong agreement between the prediction methodologies implemented by these two servers. SIFT and PolyPhen are said to perform better than other in silico tools in identifying functional nsSNPs (Thusberg & Vihinen, 2009). The agreement between SIFT and PolyPhen in our results is consistent with the reported reliability of these tools (Hicks, Wheeler, Plon, & Kimmel, 2011). All the nsSNPs submitted to PolyPhen 2.0 and SIFT were also analyzed using the I-Mutant 3.0 server. Twenty-one mutations are predicted by I-Mutant 3.0 to decrease the stability of the protein structure. The remaining 7 mutations showed increased stability of the structure. Thus, out of the 28 input polymorphisms, 10 nsSNPs (V104M, P135T, V81M, C137Y, V144M, L122V, W116G, T115N, G112S, and P91H) are predicted to be deleterious as well as damaging by the SIFT, PolyPhen 2.0, and I-Mutant servers.

Table 1. nsSNPs in the LITAF gene analyzed by three computational methods: SIFT, PolyPhen 2.0, and I-Mutant 3.0.

                        SIFT                  PolyPhen 2.0               I-Mutant 3.0
SNP ID        Mutation  Score  Prediction     PSIC   Prediction          DDG     Stability
rs375665454   R90C      .18    Tolerated      1      Probably damaging   −.75    Decrease
rs375202318   P39S      .52    Tolerated      1      Probably damaging   −1.56   Decrease
rs373445989   V104M     .03    Deleterious    .903   Possibly damaging   −.43    Decrease
rs372309415   P17L      0      Deleterious    .995   Probably damaging   −.11    Increase
rs371808299   T44P      .09    Tolerated      .037   Benign              −.15    Increase
rs371334679   V76M      .13    Tolerated      .816   Possibly damaging   −.77    Decrease
rs368574479   N30Y      .04    Deleterious    0      Benign              .24     Increase
rs281865135   P135T     0      Deleterious    1      Probably damaging   −1.13   Decrease
rs281865134   A111G     .26    Tolerated      .826   Possibly damaging   −1.23   Decrease
rs201653834   V81M      .01    Deleterious    1      Probably damaging   −1.12   Decrease
rs201512884   A129G     .35    Tolerated      .01    Benign              −.88    Decrease
rs201352515   V29G      .18    Tolerated      .18    Benign              −2.08   Decrease
rs201283647   K101R     .29    Tolerated      .92    Possibly damaging   −.29    Decrease
rs200789696   R90H      .55    Tolerated      1      Probably damaging   −1.13   Decrease
rs200709345   T78A      .1     Tolerated      .999   Probably damaging   −1.21   Decrease
rs200702853   T11I      .15    Tolerated      .003   Benign              −.09    Increase
rs144232569   C137Y     0      Deleterious    1      Probably damaging   −.19    Decrease
rs141862602   T49M      .04    Deleterious    .431   Benign              −.15    Increase
rs140714668   A66V      .67    Tolerated      0      Benign              −.19    Decrease
rs138041990   S15L      .08    Tolerated      .043   Benign              .33     Increase
rs121908615   V144M     .01    Deleterious    .999   Probably damaging   −.64    Decrease
rs104894522   L122V     .02    Deleterious    .996   Probably damaging   −.75    Decrease
rs104894521   W116G     0      Deleterious    1      Probably damaging   −1.57   Decrease
rs104894520   T115N     .01    Deleterious    .991   Probably damaging   −.69    Decrease
rs104894519   G112S     0      Deleterious    .998   Probably damaging   −1.26   Decrease
rs74716607    D89Y      .01    Deleterious    1      Probably damaging   .26     Increase
rs4280262     I92V      1      Tolerated      0      Benign              −.94    Decrease
rs11544251    P91H      0      Deleterious    1      Probably damaging   −1.31   Decrease
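The filtering step in this section is a consensus (set intersection) over the three predictors. As an illustration, the sketch below transcribes the prediction labels of Table 1 and keeps the mutations that are deleterious in SIFT, damaging in PolyPhen 2.0, and stability-decreasing in I-Mutant 3.0. The variable and function names are hypothetical, and the labels are taken from the table as printed rather than recomputed from the scores.

```python
# (mutation, SIFT prediction, PolyPhen 2.0 prediction, I-Mutant 3.0 stability),
# transcribed from Table 1.
TABLE1 = [
    ("R90C", "Tolerated", "Probably damaging", "Decrease"),
    ("P39S", "Tolerated", "Probably damaging", "Decrease"),
    ("V104M", "Deleterious", "Possibly damaging", "Decrease"),
    ("P17L", "Deleterious", "Probably damaging", "Increase"),
    ("T44P", "Tolerated", "Benign", "Increase"),
    ("V76M", "Tolerated", "Possibly damaging", "Decrease"),
    ("N30Y", "Deleterious", "Benign", "Increase"),
    ("P135T", "Deleterious", "Probably damaging", "Decrease"),
    ("A111G", "Tolerated", "Possibly damaging", "Decrease"),
    ("V81M", "Deleterious", "Probably damaging", "Decrease"),
    ("A129G", "Tolerated", "Benign", "Decrease"),
    ("V29G", "Tolerated", "Benign", "Decrease"),
    ("K101R", "Tolerated", "Possibly damaging", "Decrease"),
    ("R90H", "Tolerated", "Probably damaging", "Decrease"),
    ("T78A", "Tolerated", "Probably damaging", "Decrease"),
    ("T11I", "Tolerated", "Benign", "Increase"),
    ("C137Y", "Deleterious", "Probably damaging", "Decrease"),
    ("T49M", "Deleterious", "Benign", "Increase"),
    ("A66V", "Tolerated", "Benign", "Decrease"),
    ("S15L", "Tolerated", "Benign", "Increase"),
    ("V144M", "Deleterious", "Probably damaging", "Decrease"),
    ("L122V", "Deleterious", "Probably damaging", "Decrease"),
    ("W116G", "Deleterious", "Probably damaging", "Decrease"),
    ("T115N", "Deleterious", "Probably damaging", "Decrease"),
    ("G112S", "Deleterious", "Probably damaging", "Decrease"),
    ("D89Y", "Deleterious", "Probably damaging", "Increase"),
    ("I92V", "Tolerated", "Benign", "Decrease"),
    ("P91H", "Deleterious", "Probably damaging", "Decrease"),
]


def sift_and_polyphen(rows):
    """Mutations called deleterious by SIFT and damaging by PolyPhen 2.0."""
    return [m for m, sift, pph, _ in rows
            if sift == "Deleterious" and pph.endswith("damaging")]


def consensus(rows):
    """Mutations flagged by all three predictors."""
    return [m for m, sift, pph, stab in rows
            if sift == "Deleterious" and pph.endswith("damaging")
            and stab == "Decrease"]
```

Applied to the table, `sift_and_polyphen(TABLE1)` yields the 12 mutations shared by SIFT and PolyPhen 2.0, and `consensus(TABLE1)` yields the 10 mutations carried forward; P17L and D89Y drop out at the last step because I-Mutant 3.0 labels them as stability-increasing.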

3.3. Prediction of disease-associated nsSNPs
In total, 10 nsSNPs are commonly predicted by SIFT, PolyPhen 2.0, and I-Mutant 3.0. These 10 mutations were analyzed using PhD-SNP, a support vector machine-based tool, to further classify the predicted deleterious nsSNPs as disease associated. In the PhD-SNP server, 7 of the 10 mutations (P135T, C137Y, V144M, L122V, W116G, G112S, and P91H) are predicted to be disease associated (Table 2). In SNP&GO, 7 nsSNPs (P135T, V81M, C137Y, V144M, W116G, G112S, and P91H) are predicted to be disease associated (Table 2). Six mutations (P135T, C137Y, V144M, W116G, G112S, and P91H) are predicted to be disease associated by both PhD-SNP and SNP&GO (Table 2). These 6 mutations were further analyzed with the MutPred tool to predict the SNP disease-association probability and the probable change in the molecular mechanism in the mutant. The general probability (g) scores and property (p) scores of all 6 mutations can be seen in Table 3. This prediction is supported by reported experimental data (Zhang et al., 2011). The mutation W116G was seen to cause CMT1C in a family (Street et al., 2003). Based on the results from the various tools, we found W116G to be a disease-associated nsSNP for CMT1C.

Table 2. Disease-associated nsSNPs predicted by PhD-SNP and SNP&GO.

                        SNP&GO                PhD-SNP
SNP ID        Mutation  RI score  Effect      Score  Effect
rs373445989   V104M     1         Neutral     3      Neutral
rs281865135   P135T     7         Disease     3      Disease
rs201653834   V81M      6         Disease     0      Neutral
rs144232569   C137Y     9         Disease     9      Disease
rs121908615   V144M     6         Disease     1      Disease
rs104894522   L122V     0         Neutral     1      Disease
rs104894521   W116G     7         Disease     7      Disease
rs104894520   T115N     5         Neutral     0      Neutral
rs104894519   G112S     6         Disease     9      Disease
rs11544251    P91H      6         Disease     6      Disease

3.4. Molecular dynamics
We conducted molecular dynamics simulations of the wild-type and mutant LITAF protein in order to understand the structural consequences of the prioritized deleterious mutation. We followed the concept of multiple MD simulations (each simulation run of 200 ns) to improve the quality and accuracy of the MD results (Grossfield, Feller, & Pitman, 2006, 2007; Grossfield & Zuckerman, 2009). The factors considered above, namely tolerance index, PSIC score, DDG value, subPSEC score, disease-association study, general score (g), and property score (p), correspond to the conformational changes in protein residues due to the mutation, which in turn affect the functional behavior of the protein molecule. The results obtained in the above analysis further motivated us to study the dynamic behavior of the wild-type and mutant structures. The impact of mutations at a specific position can be understood using MD simulation (Dixit, Torkamani, Schork, & Verkhivker, 2009; Guo, Ning, Ren, Liu, & Yao, 2012; Wu et al., 2011).
We then investigated the variations in RMSD, RMSF, Rg, solvent-accessible surface area (SASA), and the number of H bonds (NH bonds) between the wild-type and mutant structures. We computed the RMSD for all Cα atoms from the initial structure and examined it to study the convergence of the protein system (Grossfield et al., 2006, 2007; Grossfield & Zuckerman, 2009).

3.4.1. Root-mean-square deviation
After the multiple MD simulations (each run of 200 ns), the RMSD plot was generated. The backbone RMSD values of the wild-type and mutant protein structures were calculated against simulation time between 0 and 200,000 ps. RMSD is a crucial parameter for analyzing the equilibration of MD trajectories. It is estimated for the backbone atoms of the wild-type and mutant structures, relative to the corresponding starting structures. The RMSD graph for the wild-type protein structure in all three simulations can be seen in Figure 2. The RMSD of the wild-type protein across the three simulations ranges between .23 and .59 nm. In simulation one, the wild-type protein structure is in the range of .23–.54 nm, with an average of .38 nm. We carried out two more simulations for the wild-type protein structure (each for 200 ns), and the results are similar in all three simulations. In simulation two, the wild-type protein structure is in the range of .30–.59 nm, with an average of .44 nm. In simulation three, the wild-type protein structure is in the range of .24–.48 nm, with an average of .37 nm. From the RMSD results of the wild-type protein structure in all three simulations, we can see that the trajectories lie very close to one another. All the trajectories of the wild-type protein are seen to converge after 53,500 ps (Figure 2). After the MD simulation, the wild-type protein structure was validated using ProSA, and the validation result showed the protein structure to be of X-ray quality, exactly as it was before MD simulation (Figure 1). We then checked the RMSD of the mutant trajectories from all three MD simulations. The MD trajectory of the mutant structure ranges between .27 and .80 nm. In all the simulations, the mutant structure shows a higher RMSD range (Figure 2). All the individual mutant trajectories are higher than all the individual wild-type trajectories. In simulation one (200 ns), the RMSD of the wild-type structure ranges between .23 and .54 nm, whereas that of the mutant structure ranges between .27 and .80 nm. The average RMSD value of both trajectories was noted over four phases of the MD simulation. Between 0 and 50,000 ps, the average RMSD of the wild-type protein is .35 nm whereas the mutant shows an average of .52 nm, which indicates that the mutant is less stable than the wild-type protein from the beginning of the MD simulation. Between 50,000 and 100,000 ps, the average RMSD of the wild-type protein is .45 nm and that of the mutant is .57 nm; in this phase, too, the RMSD of the mutant is higher than that of the wild-type protein. From 100,000 ps until the end of the simulation, the average RMSD of the wild-type protein is .48 nm, which suggests that this structure has converged after 100,000 ps, whereas the mutant protein shows large variations after 100,000 ps. From 100,000 to 150,000 ps the average RMSD of the mutant is .58 nm, and between 150,000 and 200,000 ps it is .70 nm. The RMSD of the mutant thus shows greater variations throughout the simulation time period, making it less stable than the wild-type protein structure (Lee et al., 2011). The average values are given in Table 4. Similar variations in the RMSD value can be seen in simulations two and three.

Figure 2. Backbone RMSDs for the multiple simulations are shown as a function of time for wild-type and mutant LITAF protein motor domain structures at 300 K. (A) RMSD for the wild-type protein structure from all the three simulations with black being simulation one, green being simulation two, and orange being simulation three. (B) RMSD for the mutant protein structure from all the three simulations with black being simulation one, green being simulation two, and orange being simulation three.

3.4.2. Root-mean-square fluctuation
With the aim of determining whether the mutation affects the dynamic behavior of residues, the RMSF values of the wild-type and mutant (W116G) structures were compiled (Figure 3). The RMSF with respect to the average MD simulation conformation is used as a means of describing flexibility differences among residues. The backbone RMSF of each residue of wild-type and mutant LITAF
are calculated in order to analyze the flexibility of the backbone structure. A larger RMSF value indicates greater flexibility, whereas a low RMSF value indicates limited movement during the simulation relative to the average position. The RMSF of the wild-type protein ranges between .07 and .60 nm and that of the mutant protein between .09 and .66 nm. From the RMSF plot, we can see that the mutant protein is less stable than the wild-type protein. Individual residues in the wild-type structure that show large fluctuations are residues 2 (.60 nm), 17 (.26 nm), 28 (.23 nm), 42 (.31 nm), 67 (.30 nm), 89 (.20 nm), 135 (.29 nm), and 151 (.35 nm), and the target residue 116 (.15 nm). The mutant shows high fluctuations at residues 16 (.64 nm), 19 (.66 nm), 24 (.43 nm), 30 (.37 nm), 72 (.4 nm), 98 (.28 nm), 125 (.28 nm), 135 (.35 nm), and 157 (.50 nm), and at the target residue 116 (.17 nm). The terminal residues show high fluctuations in both protein structures. Overall, the mutant protein shows more fluctuations than the wild-type protein structure, making it unstable (Lee et al., 2011). Similar results can be seen in the other two MD simulations. The RMSF values in all three simulations show that the residues of the mutant protein structure fluctuate relatively more than those of the wild-type protein structure. The mutant fluctuated more throughout the simulation time period, indicating that it is less stable than the wild type.

Figure 3. RMSF of the backbone Cα atoms of wild-type and mutant LITAF protein motor domain vs. residue number at 300 K. (A) RMSF for simulation one where the wild type is shown in black and mutant in blue. (B) RMSF for simulation two where the wild type is shown in black and mutant in red. (C) RMSF for simulation three where the wild type is shown in black and mutant in green.

3.4.3. NH bonds
H-bonds play a vital role in molecular recognition and the overall stability of the protein structure. Intermolecular H-bonds were analyzed for the wild-type and mutant structures of the LITAF protein during the simulation period.

8

C.V. Kumar et al. protein has more number of H bonds when compared to the mutant protein hence making it more stable when compared to the mutant protein. Hence, the occurrence of W116G results in the loss of H bonds and hence reduces the stability if the protein (Lee et al., 2011). Similar results can be seen in simulation two and three. The mutant protein has relatively less number of hydrogen bonds when compared to the wildtype protein structure, making it unstable compared to the wild-type protein (Figure 4).

Downloaded by [Pennsylvania State University] at 23:31 04 December 2014

After the MD simulation of 200,000 ps, the average number of H bonds in both the wild-type and the mutant structures is calculated. The greater the number of H bonds, more is the stability of the protein. The number of H bonds is directly proportional to the stability of the protein. From the results obtained, we can see that the average number of H bonds in the wildtype protein structure is 69.225 whereas the average number of H bonds in the mutant protein structure is 63.256. From this result, we can say that the wild-type

Figure 4. Intermolecular hydrogen bonds in wild-type and mutant LITAF protein vs. time at 300 K. (A) H-bonds for simulation one where the wild type is shown in black and mutant in blue. (B) H-bonds for simulation two where the wild type is shown in black and mutant in red. (C) H-bond for simulation three where the wild type is shown in black and mutant in green.
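As an illustration of how an H-bond count of the kind plotted in Figure 4 can be obtained from coordinates, the following minimal sketch counts donor-hydrogen-acceptor triplets in a single frame using a geometric criterion. The cutoffs (donor-acceptor distance of at most 0.35 nm and hydrogen-donor-acceptor angle of at most 30°) are assumptions, since the paper does not state the criterion it used; the function name `count_hbonds` is hypothetical, and periodic boundary conditions are ignored for brevity.

```python
import numpy as np


def count_hbonds(donors, hydrogens, acceptors, r_cut=0.35, angle_cut=30.0):
    """Count H bonds in one frame with a geometric criterion (assumed cutoffs).

    donors, hydrogens: (n_d, 3) coordinates in nm; hydrogens[i] is bonded to donors[i].
    acceptors: (n_a, 3) coordinates in nm.
    A pair counts when the donor-acceptor distance is <= r_cut (nm) and the
    hydrogen-donor-acceptor angle is <= angle_cut (degrees).
    """
    donors = np.asarray(donors, dtype=float)
    hydrogens = np.asarray(hydrogens, dtype=float)
    acceptors = np.asarray(acceptors, dtype=float)

    # Donor -> acceptor vectors for every donor/acceptor pair: shape (n_d, n_a, 3).
    da = acceptors[None, :, :] - donors[:, None, :]
    dist = np.linalg.norm(da, axis=2)

    # Donor -> hydrogen vectors: shape (n_d, 3).
    dh = hydrogens - donors
    dh_len = np.linalg.norm(dh, axis=1)

    with np.errstate(invalid="ignore", divide="ignore"):
        cos_angle = np.einsum("ik,ijk->ij", dh, da) / (dh_len[:, None] * dist)
        angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

    # dist > 0 excludes a donor atom that also appears in the acceptor list.
    mask = (dist > 0) & (dist <= r_cut) & (angle <= angle_cut)
    return int(np.count_nonzero(mask))


def mean_hbonds(frames):
    """Average the per-frame count over an iterable of (donors, hydrogens, acceptors)."""
    counts = [count_hbonds(d, h, a) for d, h, a in frames]
    return float(np.mean(counts))
```

Averaging the per-frame counts over a trajectory, as `mean_hbonds` does, gives a single number per structure comparable in kind to the averages quoted in Section 3.4.3.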

3.4.4. Radius of gyration

We calculated the Rg in order to understand the level of compaction of the wild-type and mutant LITAF. The Rg is generally defined as the mass-weighted root mean square distance of a collection of atoms from their common center of mass. Hence, this analysis gives us the overall dimensions of the protein. The Rg of the wild-type protein ranges from 1.55 to 1.67 nm, and that of the mutant from 1.58 to 1.77 nm. Until 6000 ps, the Rg trajectories of both structures are almost constant, after which the mutant shows a higher Rg value for the rest of the simulation period. From 0 to 50,000 ps, the average Rg of the wild-type protein is 1.62 nm, whereas the mutant shows a higher Rg of 1.70 nm. Between 50,000 and 100,000 ps, the average Rg is 1.65 nm for the wild type and 1.67 nm for the mutant. From 100,000 to 150,000 ps, the wild type shows an average Rg of 1.61 nm whereas the mutant shows 1.67 nm. From 150,000 to 200,000 ps, the average Rg of the wild type is 1.58 nm and that of the mutant is 1.69 nm. From the Rg plot, we can say that the mutant is less compact, and therefore less stable, than the wild-type protein (Lee et al., 2011). The average values are given in Table 4. The Rg values obtained in simulations two and three are consistent with those of simulation one. In simulation two, the wild-type protein structure shows an Rg range between 1.39 and 1.54 nm, whereas the mutant shows an Rg range between 1.42 and 1.60 nm. In simulation three, the Rg value ranged between 1.38–1.50 nm and 1.42–1.53 nm in the wild-type and mutant protein structures, respectively (Figure 5).

Figure 5. Rg of Cα atoms of wild-type and mutant LITAF protein motor domain vs. time at 300 K. (A) Rg for simulation one where the wild type is shown in black and mutant in blue. (B) Rg for simulation two where the wild type is shown in black and mutant in red. (C) Rg for simulation three where the wild type is shown in black and mutant in green.

Figure 6. SASA of wild-type and mutant LITAF protein vs. time at 300 K. (A) SASA for simulation one where the wild type is shown in black and mutant in blue. (B) SASA for simulation two where the wild type is shown in black and mutant in red. (C) SASA for simulation three where the wild type is shown in black and mutant in green.
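The Rg definition given in Section 3.4.4 and the 50,000 ps window averages reported there can be sketched as follows. This is a minimal NumPy illustration, not the authors' workflow; the function names `radius_of_gyration` and `window_means` and the default window edges are choices made for this example.

```python
import numpy as np


def radius_of_gyration(coords, masses):
    """Mass-weighted RMS distance of atoms from their common center of mass.

    coords: (n_atoms, 3) positions in nm; masses: (n_atoms,) atomic masses.
    """
    coords = np.asarray(coords, dtype=float)
    masses = np.asarray(masses, dtype=float)
    center_of_mass = np.average(coords, axis=0, weights=masses)
    sq_dist = np.sum((coords - center_of_mass) ** 2, axis=1)
    return float(np.sqrt(np.average(sq_dist, weights=masses)))


def window_means(times_ps, values, edges_ps=(0, 50_000, 100_000, 150_000, 200_000)):
    """Mean of a time series inside consecutive windows [lo, hi).

    The last window also includes its upper edge. Each window is assumed to
    contain at least one frame.
    """
    times = np.asarray(times_ps, dtype=float)
    values = np.asarray(values, dtype=float)
    n_windows = len(edges_ps) - 1
    means = []
    for i in range(n_windows):
        lo, hi = edges_ps[i], edges_ps[i + 1]
        upper = (times <= hi) if i == n_windows - 1 else (times < hi)
        means.append(float(values[(times >= lo) & upper].mean()))
    return means
```

Applying `radius_of_gyration` to every frame and passing the resulting series to `window_means` yields four averages per structure, the same layout as the Rg columns of Table 4. The same windowing applies unchanged to an RMSD or SASA time series.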

3.4.5. Solvent-accessible surface area

SASA was calculated to understand the solvent accessibility of the wild-type and mutant LITAF structures. The SASA plot accounts for the biomolecular surface area that is accessible to solvent molecules, and a rise in the SASA value denotes relative expansion. SASA was calculated for both the wild-type and the mutant protein throughout the simulation period. The SASA of the wild-type structure ranges between 48.12 and 64.16 nm², and that of the mutant between 53.10 and 69.32 nm². From 0 to 50,000 ps, the wild type showed an average SASA value of 57.40 nm² and the mutant an average of 62.45 nm². Between 50,000 and 100,000 ps, the average SASA value of the wild-type protein is 54.78 nm², whereas the mutant structure showed an average of 58.93 nm². The average SASA between 100,000 and 150,000 ps is 55.18 nm² for the wild-type protein and 58.78 nm² for the mutant. Finally, the average SASA value between 150,000 and 200,000 ps is 52.27 nm² for the wild-type protein and 59.54 nm² for the mutant. From the SASA plot, we can say that the mutant protein is more expanded, and therefore less stable, than the wild-type protein (Lee et al., 2011). The average values are given in Table 4. The SASA values in simulations two and three support the results obtained in simulation one. In simulation two, the wild-type protein structure has a SASA range of 41.06–61.42 nm² and the mutant a range of 44.76–64.20 nm². In simulation three, the SASA range is 42.98–60.51 nm² and 45.49–61.91 nm² in the wild-type and mutant structures, respectively. From this we can say that the mutant protein structure is less stable than the wild-type protein structure (Figure 6).

3.4.6. Principal component analysis

A better view of the dynamical mechanical properties of the investigated system was obtained by using ED analysis. The large-scale collective motions of the wild-type and mutant structures were determined by ED analysis to further support our MD simulation results. The dynamics of a structure is usually best characterized through its phase space behavior. The eigenvectors of the covariance matrix are called its principal components, and projecting a trajectory onto each eigenvector gives the change of that trajectory along the eigenvector. The spectrum of the corresponding eigenvalues indicates that the fluctuation of the system is essentially confined within the first two eigenvectors. The projection of trajectories obtained at 300 K onto the first two principal components (PC1, PC2) shows the motion of the two proteins in phase space. On these projections, we see clusters of stable states. Two features are apparent from the plots. First, the clusters are better defined in the wild type than in the mutant. Second, the mutant covers a larger region of phase space in the PC1–PC2 plane than the wild type, as depicted in Figure 7. The values obtained at 300 K are 5.965109 nm² for the wild-type protein and 6.509638 nm² for the mutant protein. The conformational changes of the mutant differ markedly from those of the wild type. Similar results can be seen in simulations two and three, where the PCA shows that the mutant protein is more scattered than the wild type, making it less stable than the wild-type protein.

Figure 7. Projection of the motion of the protein in phase space along the first two principal eigenvectors at 300 K. (A) PCA for simulation one where the wild type is shown in black and mutant in blue. (B) PCA for simulation two where the wild type is shown in black and mutant in red. (C) PCA for simulation three where the wild type is shown in black and mutant in green.

3.5. Secondary structure prediction

After molecular dynamics, we submitted the PDB files of both structures, before and after MD, to the secondary structure prediction server to predict protein aggregation. The results from the server show that there are no sheets in this protein. The percentage of helix at 0 ns is 48.4%. After 200 ns, both protein structures show a loss of helix configuration: the wild-type protein showed 42.2% helix, whereas the mutant showed 41.6% helix (Table 5). We attribute the larger loss in the mutant to the W116G mutation. The superimposed structures of the wild-type and mutant proteins obtained after 200 ns can be seen in Figure 8.

Figure 8. Superimposed structure of wild-type and mutant form of LITAF protein structures.

4. Discussion

The genotype–phenotype correlation is explained by using genome sequencing and its analysis. The existence of a disease can be predicted from observations of the effect of point mutations at the protein level. Such observations can be made using advanced methods in computational biology, and the consequences of deleterious mutations can thereby be predicted. Computational studies that determine genotype–phenotype associations and possible pathogenic consequences at the disease level have generally not been carried out to a high level of accuracy. To examine the structural consequence and the stability of the above predicted CMT1C-associated nsSNP, we performed MD simulations of the prioritized mutant and the wild-type LITAF protein. The results from the present study give a clear understanding of the structural aspects of the W116G LITAF mutation and its effect on CMT1C. Secondary structure prediction pointed to protein aggregation, which supports the loss of function caused by the W116G mutation in LITAF. A total of 28 SNPs obtained from dbSNP were selected for this study and submitted to various servers to examine and predict the CMT1C-associated mutations with relatively high accuracy (Tables 1–3). Initially, out of the 28 inputs, 10 were predicted to be deleterious (Table 1). We analyzed these 10 deleterious mutations further and found 6 mutations to be disease associated (Table 2). The mutation W116G was seen to cause CMT1C in a family (Street et al., 2003), so we decided to carry out our further analysis on this mutation.

Table 3. The g score, p score, molecular variations, and prediction reliability calculated from MutPred server.

SNP ID        Mutation   g score   p score   MutPred molecular variation               Prediction reliability
rs281865135   P135T      .829      .1706     Loss of helix                             No reliable inference
rs144232569   C137Y      .701      .1945     Gain of sheet                             No reliable inference
rs121908615   V144M      .851      .1636     Loss of stability                         No reliable inference
rs104894521   W116G      .774      .0543     Loss of catalytic residue at L114         No reliable inference
rs104894519   G112S      .859      .0082     Gain of relative solvent accessibility    Very confident hypotheses
rs11544251    P91H       .75       .0357     Loss of sheet                             Actionable hypotheses

Table 4. The statistically calculated average values of the MD simulation results.

              Average RMSD (nm)    Average Rg (nm)     Average SASA (nm²)
Time window   Wild     Mutant      Wild     Mutant     Wild     Mutant
0–50 ns       .35      .52         1.62     1.70       57.40    62.45
50–100 ns     .45      .57         1.65     1.67       54.78    58.93
100–150 ns    .48      .58         1.61     1.68       55.18    58.78
150–200 ns    .48      .70         1.58     1.69       52.27    59.54

Table 5. Secondary structure prediction of the protein.

Duration                                      Helix (%)   Sheet (%)   Other (%)
Wild type before MD simulation (at 0 ns)      48.4        0           46.6
Wild type after MD simulation (at 200 ns)     42.2        0           57.8
Mutant before MD simulation (at 0 ns)         48.4        0           46.6
Mutant after MD simulation (at 200 ns)        41.6        0           58.4

We performed multiple MD simulations for both the wild-type and the mutant structures separately and calculated the

RMSD for all the Cα atoms from the initial structure, which was taken as the central criterion for measuring the convergence of the protein system concerned. In Figure 2, the wild-type structure showed lower overall RMSD values than the mutant. The RMSD values converged after 53,500 ps. These findings are comparable to earlier results with multiple simulations (Grossfield et al., 2006, 2007; Grossfield & Zuckerman, 2009). We examined the RMSF of the backbone carbon atoms by analyzing the trajectories obtained from the MD simulations, and NH bond analyses were performed to understand the flexibility of the residues. The Rg is generally defined as the mass-weighted root mean square distance of a collection of atoms from their common center of mass. Hence, this analysis gives us an insight into the overall dimensions of the protein. The plot of Rg of Cα atoms of the protein vs. time at 300 K is shown in Figure 5. The Rg value of the wild-type protein structure decreased over the course of the MD simulation. The Rg of the mutant varied widely but remained higher than that of the wild type throughout the simulation. This rise in Rg in the mutant structure relative to the wild type further supported our hypothesis that the mutant structure is more flexible than the wild type. To investigate the flexible behavior of binding residues, we plotted the RMSF of Cα atoms for the residues. Mutant structure residues were found to exhibit larger flexibility than those of the wild type. Intermolecular NH bonds were calculated for the wild-type and mutant structures over the simulation time. Notable differences in protein–solvent interactions between the wild-type and mutant structures are evident in Figure 4. On the basis of the RMSF and NH bond analyses, we confirmed that the occurrence of the mutation led to a more flexible conformation owing to the formation of fewer hydrogen bonds. The RMSD plot also confirms the loss of stability in the mutant.
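The two fluctuation measures used in this study, the per-atom RMSF and the essential dynamics (PCA) of Section 3.4.6, both derive from the mean-removed coordinates of a superimposed trajectory, which the following minimal NumPy sketch illustrates. It is not the authors' workflow; the function names are hypothetical, the trajectory is assumed to be already fitted to a reference structure, and the covariance is normalized by the number of frames as an assumption. The paper reports one value in nm² per structure without naming the quantity, so the covariance trace returned here is included only as a guess at what such a value might represent.

```python
import numpy as np


def rmsf_per_atom(traj):
    """RMSF of each atom about its mean position.

    traj: (n_frames, n_atoms, 3) coordinates in nm, already superimposed.
    """
    traj = np.asarray(traj, dtype=float)
    deviations = traj - traj.mean(axis=0)
    return np.sqrt((deviations ** 2).sum(axis=2).mean(axis=0))


def essential_dynamics(traj, n_components=2):
    """Eigen-decompose the positional covariance matrix of a trajectory.

    Returns (eigenvalues sorted in descending order,
             projections of each frame onto the leading eigenvectors,
             trace of the covariance matrix in nm^2).
    """
    traj = np.asarray(traj, dtype=float)
    n_frames = traj.shape[0]
    x = traj.reshape(n_frames, -1)        # (n_frames, 3 * n_atoms)
    x = x - x.mean(axis=0)                # remove the average structure
    cov = x.T @ x / n_frames              # covariance matrix (assumed normalization)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    projections = x @ eigvecs[:, :n_components]  # columns are PC1, PC2, ...
    return eigvals, projections, float(np.trace(cov))
```

Plotting the two columns of `projections` against each other gives a PC1 vs. PC2 scatter of the kind shown in Figure 7. With this normalization the covariance trace equals the sum of the squared per-atom RMSF values, so a larger trace corresponds to larger overall fluctuation.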
On the basis of the MD simulation results, we reached a conclusion regarding the flexibility and the stability of the wild-type and mutant structures. The projection of trajectories obtained at 300 K onto the first two principal components (PC1, PC2) showed the motion of the two proteins in phase space. On these projections, we saw clusters of stable states. There are many conformational differences between the wild-type and mutant structures. The W116G mutant protein was less stable than the wild-type protein structure, and it exhibited loss of function along with this lower stability. Similar results were seen in simulations two and three.

To analyze the loss of stability, we submitted the obtained PDB files to the secondary structure server (Klose et al., 2010). We obtained the secondary structures both before and after the MD simulations. The results obtained from the secondary structure server indicate clear protein aggregation. The protein aggregation in the W116G mutant structure suggests a loss of stability in that structure. The aggregation report showed the loss of helix in the W116G mutant structure, suggesting a functional change in the protein. To understand the aggregates, we submitted the LITAF-W116G structure to the secondary structure prediction server and analyzed the protein aggregation. Hence, the protein aggregation in the mutant leads to the instability of the mutant structure.

5. Conclusion

CMT is one of the most common inherited neurological disorders. CMT1C is caused by mutation in the LITAF gene. The CMT1C-linked W116G mutation in LITAF is shown to cause remarkable changes in the biochemical and biological properties of LITAF. We noticed a loss of helix in the mutant. We conclude that the resulting protein aggregation leads to the loss of function in the mutant, which leads to CMT1C. Our SNP results showed W116G to be very deleterious and “damaging”. Our results suggest that the W116G mutation in the LITAF gene affects stability and is disease related. This suggests that the W116G mutation may alter the stability of the protein, which is further confirmed by the MD simulation. The MD simulation results show that the W116G mutation in LITAF decreases the stability of the protein, resulting in the loss of function, which correlates well with the previous experimental reports. We then checked the protein aggregation, which showed the loss of helix that leads to protein misfolding and results in protein malfunction. Our results indicate that LITAF-W116G decreases the stability of the protein and exhibits protein aggregation that leads to loss of function.

Acknowledgments

RGS thanks ICMR for the Senior Research Fellowship. The authors would also like to thank the management of VIT University for providing the necessary facilities to carry out this research project.

Funding

SR and AA gratefully acknowledge the Indian Council of Medical Research (ICMR), Government of India Agency for the research grant [IRIS ID: 2014-0099].

References

Adzhubei, I. A., Schmidt, S., Peshkin, L., Ramensky, V. E., Gerasimova, A., Bork, P., … Sunyaev, S. R. (2010). A method and server for predicting damaging missense mutations. Nature Methods, 7, 248–249.
Amadei, A., Linssen, A. B. M., & Berendsen, H. J. C. (1993). Essential dynamics of proteins. Proteins: Structure, Function, and Genetics, 17, 412–425.
Amberger, J., Bocchini, C. A., Scott, A. F., & Hamosh, A. (2009). McKusick’s online mendelian inheritance in man (OMIM(R)). Nucleic Acids Research, 37, D793–D796.
Berendsen, H. J. C., Postma, J. P. M., Gunsteren, W. F. V., & Hermans, J. (1981). Interaction models for water in relation to protein hydration. In B. Pullman (Ed.), Intermolecular forces (pp. 331–342). Dordrecht: D. Riedel.
Buchthal, F., & Behse, F. (1977). Peroneal muscular atrophy (PMA) and related disorders. Brain, 100, 41–66.
Calabrese, R., Capriotti, E., Fariselli, P., Martelli, P. L., & Casadio, R. (2009). Functional annotations improve the predictive score of human disease-related mutations in proteins. Human Mutation, 30, 1237–1244.
Campbell, R., Dorey, H., Naegeli, M., Grubstein, L. K., Bennett, K. K., Bonter, F., … Davidson, W. S. (2004). An empowerment evaluation model for sexual assault programs: Empirical evidence of effectiveness. American Journal of Community Psychology, 34, 251–262.
Capriotti, E., Fariselli, P., Rossi, I., & Casadio, R. (2008). A three state prediction of single point mutations on protein stability changes. BMC Bioinformatics, 9, S6.
Carvalho, M. A., Marsillac, S. M., Karchin, R., Manoukian, S., Grist, S., Swaby, R. F., … Monteiro, A. N. (2007). Determination of cancer risk associated with germ line BRCA1 missense variants by functional analysis. Cancer Research, 67, 1494–1501.
Carvalho, M., Pino, M. A., Karchin, R., Beddor, J., Godinho-Netto, M., Mesquita, R. D., … Billack, B. (2009). Analysis of a set of missense, frameshift, and in-frame deletion variants of BRCA1. Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis, 660, 1–11.
Dixit, A., Torkamani, A., Schork, N. J., & Verkhivker, G. (2009). Computational modeling of structurally conserved cancer mutations in the RET and MET kinases: The impact on protein structure, dynamics, and stability. Biophysical Journal, 96, 858–874.
Dyck, P. J., & Lambert, E. H. (1968). Lower motor and primary sensory neuron diseases with peroneal muscular atrophy. Archives of Neurology, 18, 603–618.
Essmann, U., Perera, L., Berkowitz, M. L., Darden, T., Lee, H., & Pedersen, L. G. (1995). A smooth particle mesh Ewald method. The Journal of Chemical Physics, 103, 8577–8593.
Eswar, N., Marti-Renom, M. A., Webb, B., Madhusudhan, M. S., Eramian, D., Shen, M., … Sali, A. (2006). Comparative protein structure modeling with MODELLER. Current Protocols in Bioinformatics, 15, 5.6.1–5.6.30.
Gerding, W. M., Koetting, J., Epplen, J. T., & Neusch, C. (2009). Hereditary motor and sensory neuropathy caused by a novel mutation in LITAF. Neuromuscular Disorders, 19, 701–703.
Goldgar, D. E., Easton, D. F., Deffenbaugh, A. M., Monteiro, A. N., Tavtigian, S. V., & Couch, F. J. (2004). Integrated evaluation of DNA sequence variants of unknown clinical significance: Application to BRCA1 and BRCA2. The American Journal of Human Genetics, 75, 535–544.
Grossfield, A., Feller, S. E., & Pitman, M. C. (2006). A role for direct interactions in the modulation of rhodopsin by ω-3 polyunsaturated lipids. Proceedings of the National Academy of Sciences, 103, 4888–4893.
Grossfield, A., Feller, S. E., & Pitman, M. C. (2007). Convergence of molecular dynamics simulations of membrane proteins. Proteins: Structure, Function, and Bioinformatics, 67, 31–40.
Grossfield, A., & Zuckerman, D. M. (2009). Quantifying uncertainty and sampling quality in biomolecular simulations. Annual Reports in Computational Chemistry, 5, 23–48.
Gunsteren, W. F. V., Billeter, S. R., Eising, A. A., Hunenberger, P. H., Kruger, P. K., Mark, A. E., … Tironi, I. G. (1996). Biomolecular simulation: The GROMOS96 manual and user guide (pp. 1–1024). Zurich: Verlag der Fachvereine.
Guo, J., Ning, L., Ren, H., Liu, H., & Yao, X. (2012). Influence of the pathogenic mutations T188K/R/A on the structural stability and misfolding of human prion protein: Insight from molecular dynamics simulations. Biochimica et Biophysica Acta (BBA) – General Subjects, 1820, 116–123.
Harding, A. E., & Thomas, P. K. (1980). The clinical features of hereditary motor and sensory neuropathy types I and II. Brain, 103, 259–280.
Henne, W. M., Buchkovich, N. J., & Emr, S. D. (2011). The ESCRT pathway. Developmental Cell, 21, 77–91.
Hess, B., Bekker, H., Berendsen, H. J. C., & Fraaije, J. G. E. M. (1997). LINCS: A linear constraint solver for molecular simulations. Journal of Computational Chemistry, 18, 1463–1472.
Hess, B., Kutzner, C., van der Spoel, D., & Lindahl, E. (2008). GROMACS 4: Algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of Chemical Theory and Computation, 4, 435–447.
Hicks, S., Wheeler, D. A., Plon, S. E., & Kimmel, M. (2011). Prediction of missense mutation functionality depends on both the algorithm and sequence alignment employed. Human Mutation, 32, 661–668.
Karchin, R. (2009). Next generation tools for the annotation of human SNPs. Briefings in Bioinformatics, 10, 35–52.
Katzmann, D. J., Odorizzi, G., & Emr, S. D. (2002). Receptor downregulation and multivesicular-body sorting. Nature Reviews Molecular Cell Biology, 3, 893–905.
Klose, D. P., Wallace, B. A., & Janes, R. W. (2010). 2Struc: The secondary structure server. Bioinformatics, 26, 2624–2625.
Kumar, P., Henikoff, S., & Ng, P. C. (2009). Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols, 4, 1073–1081.
Laskowski, R. A. (1997). PDBsum: A web-based database of summaries and analyses of all PDB structures. Trends in Biochemical Sciences, 22, 488–490.

Latour, P., Gonnaud, P. M., Ollagnon, E., Chan, V., Perelman, S., Stojkovic, T., … Maire, I. (2006). SIMPLE mutation analysis in dominant demyelinating Charcot–Marie–Tooth disease: Three novel mutations. Journal of the Peripheral Nervous System, 11, 148–155.
Lee, S. M., Chin, L. S., & Li, L. (2012). Charcot–Marie–Tooth disease-linked protein SIMPLE functions with the ESCRT machinery in endosomal trafficking. Journal of Cell Biology, 199, 799–816.
Lee, S. M., Olzmann, J. A., Chin, L.-S., & Li, L. (2011). Mutations associated with Charcot–Marie–Tooth disease cause SIMPLE protein mislocalization and degradation by the proteasome and aggresome–autophagy pathways. Journal of Cell Science, 124, 3319–3331.
Li, B., Krishnan, V. G., Mort, M. E., Xin, F., Kamati, K. K., Cooper, D. N., … Radivojac, P. (2009). Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics, 25, 2744–2750.
Lupski, J. R. (1998). Charcot–Marie–Tooth disease: Lessons in genetic mechanisms. Molecular Medicine, 4, 3–11.
Lupski, J. R., de Oca-Luna, R., Slaugenhaupt, S., Pentao, L., Guzzetta, V., Trask, B. J., … Patel, P. I. (1991). DNA duplication associated with Charcot–Marie–Tooth disease type 1A. Cell, 66, 219–232.
Matsunami, N., Smith, B., Ballard, L., William Lensch, M., Robertson, M., Albertsen, H., … Chance, P. F. (1992). Peripheral myelin protein-22 gene maps in the duplication in chromosome 17p11.2 associated with Charcot–Marie–Tooth 1A. Nature Genetics, 1, 176–179.
Moriwaki, Y., Begum, N. A., Kobayashi, M., Matsumoto, M., Toyoshima, K., & Seya, T. (2001). Mycobacterium bovis Bacillus Calmette-Guerin and its cell wall complex induce a novel lysosomal membrane protein, SIMPLE, that bridges the missing link between lipopolysaccharide and p53-inducible gene, LITAF(PIG7), and estrogen-inducible gene, EET1. Journal of Biological Chemistry, 276, 23065–23076.
Oostenbrink, C., Villa, A., Mark, A. E., & Gunsteren, W. F. V. (2004). A biomolecular force field based on the free enthalpy of hydration and solvation: The GROMOS forcefield parameter sets 53A5 and 53A6. Journal of Computational Chemistry, 25, 1656–1676.
Patel, P. I., Roa, B. B., Welcher, A. A., Schoener-Scott, R., Trask, B. J., Pentao, L., … Suter, U. (1992). The gene for the peripheral myelin protein PMP-22 is a candidate for Charcot–Marie–Tooth disease type 1A. Nature Genetics, 1, 159–165.
Raeymaekers, P., Timmerman, V., Nelis, E., De Jonghe, P., Hoogenduk, J. E., Baas, F., … Van Broeckhoven, C. (1991). Duplication in chromosome 17p11.2 in Charcot–Marie–Tooth neuropathy type 1a (CMT 1a). Neuromuscular Disorders, 1, 93–97.
Roxrud, I., Stenmark, H., & Malerød, L. (2010). ESCRT & Co. Biology of the Cell, 102, 293–318.
Saifi, G. M., Szigeti, K., Wiszniewski, W., Shy, M. E., Krajewski, K., Hausmanowa-Petrusewicz, I., … Lupski, J. R. (2005). SIMPLE mutations in Charcot–Marie–Tooth disease and the potential role of its protein product in protein degradation. Human Mutation, 25, 372–383.
Sherry, S. T., Ward, M. H., Kholodov, M., Baker, J., Phan, L., Smigielski, E. M., & Sirotkin, K. (2001). dbSNP: The NCBI database of genetic variation. Nucleic Acids Research, 29, 308–311.
Skre, H. (1974). Genetic and clinical aspects of Charcot–Marie–Tooth’s disease. Clinical Genetics, 6, 98–118.
Street, V. A., Bennett, C. L., Goldy, J. D., Shirk, A. J., Kleopa, K. A., Tempel, B. L., … Chance, P. F. (2003). Mutation of a putative protein degradation gene LITAF/SIMPLE in Charcot–Marie–Tooth disease 1C. Neurology, 60, 22–26.
Thusberg, J., & Vihinen, M. (2009). Pathogenic or not? and if so, then how? Studying the effects of missense mutations using bioinformatics methods. Human Mutation, 30, 703–714.
Timmerman, V., Nelis, E., Van Hul, W., Nieuwenhuijsen, B. W., Chen, K. L., Wang, S., … Van Broeckhoven, C. (1992). The peripheral myelin protein gene PMP-22 is contained within the Charcot–Marie–Tooth disease type 1A duplication. Nature Genetics, 1, 171–175.
Turner, P. J. (2005). XMGRACE, Version 5.1.19. Beaverton, OR: Center for Coastal and Land-Margin Research; Oregon Graduate Institute of Science and Technology.
Van Der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, A. E., & Berendsen, H. J. C. (2005). GROMACS: Fast, flexible, and free. Journal of Computational Chemistry, 26, 1701–1718.
Vinay Kumar, C., Kumar, K. M., Swetha, R., Ramaiah, S., & Anbarasu, A. (2014). Protein aggregation due to nsSNP resulting in P56S VABP protein is associated with amyotrophic lateral sclerosis. Journal of Theoretical Biology, 354, 72–80.
Warner, L. E., Mancias, P., Butler, I. J., McDonald, C. M., Keppen, L., Koob, K., & Lupski, J. R. (1998). Mutations in the early growth response 2 (EGR2) gene are associated with hereditary myelinopathies. Nature Genetics, 18, 382–384.
Warner, L. E., Roa, B. B., & Lupski, J. R. (1996). Absence of PMP22 coding region mutations in CMT1A duplication patients: Further evidence supporting gene dosage as a mechanism for Charcot-Marie-Tooth disease type 1A. Human Mutation, 8, 362–365.
Waterman, H., & Yarden, Y. (2001). Molecular mechanisms underlying endocytosis and sorting of ErbB receptor tyrosine kinases. FEBS Letters, 490, 142–152.
Wu, X., Yang, G., Zu, Y., Fu, Y., Zhou, L., & Yuan, X. (2011). The Trp-cage miniprotein with single-site mutations: Studies of stability and dynamics using molecular dynamics. Computational and Theoretical Chemistry, 973, 1–8.
Zhang, K. H., Li, Z., Lei, J., Pang, T., Xu, B., Jiang, W. Y., & Li, H. Y. (2011). Oculocutaneous Albinism type 3 (OCA3): Analysis of two novel mutations in TYRP1 gene in two Chinese patients. Cell Biochemistry and Biophysics, 61, 523–529.
