This article was downloaded by: [Michigan State University] On: 05 March 2015, At: 02:01 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Biomolecular Structure and Dynamics Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tbsd20

Distinct molecular features facilitating ice-binding mechanisms in hyperactive antifreeze proteins closely related to an Antarctic sea ice bacterium

Rachana Banerjee (a), Pratim Chakraborti (b), Rupa Bhowmick (c) & Subhasish Mukhopadhyay (a)

(a) Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, A.P.C. Road, Kolkata 700009, India
(b) Apt Software Avenues Pvt. Ltd, Unit G 301, Block DC, City Centre, Sector I, Salt Lake, Kolkata 700064, India
(c) Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pashan, Pune 411008, India

Published online: 05 Sep 2014.

To cite this article: Rachana Banerjee, Pratim Chakraborti, Rupa Bhowmick & Subhasish Mukhopadhyay (2014): Distinct molecular features facilitating ice-binding mechanisms in hyperactive antifreeze proteins closely related to an Antarctic sea ice bacterium, Journal of Biomolecular Structure and Dynamics, DOI: 10.1080/07391102.2014.952665 To link to this article: http://dx.doi.org/10.1080/07391102.2014.952665


Journal of Biomolecular Structure and Dynamics, 2014 http://dx.doi.org/10.1080/07391102.2014.952665

Distinct molecular features facilitating ice-binding mechanisms in hyperactive antifreeze proteins closely related to an Antarctic sea ice bacterium

Rachana Banerjee (a)*, Pratim Chakraborti (b), Rupa Bhowmick (c) and Subhasish Mukhopadhyay (a)

(a) Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, A.P.C. Road, Kolkata 700009, India; (b) Apt Software Avenues Pvt. Ltd, Unit G 301, Block DC, City Centre, Sector I, Salt Lake, Kolkata 700064, India; (c) Chemical Engineering and Process Development Division, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pashan, Pune 411008, India

Communicated by Ramaswamy H. Sarma

Downloaded by [Michigan State University] at 02:01 05 March 2015

(Received 16 June 2014; accepted 5 August 2014)

Antifreeze proteins or ice-binding proteins (IBPs) facilitate the survival of certain cellular organisms in freezing environments by inhibiting the growth of ice crystals in solution. The present study identifies orthologs of the IBP of Colwellia sp. SLW05 from a wide range of taxa. Phylogenetic analysis based on the conserved regions present in all the orthologs (predicted as the ‘ice-binding domain’ [IBD]) separates the bacterial and archaeal orthologs from those of the eukaryotes. Correspondence analysis showed that the bacterial and archaeal IBDs have a relatively higher average hydrophobicity than the eukaryotic members. IBDs of the bacterial and archaeal AFPs contain comparatively more strands and are therefore inferred to be under higher evolutionary selection pressure. Molecular docking studies show that ice crystals form more stable complexes with the bacterial and archaeal proteins than with the eukaryotic orthologs. Analysis of the docked structures has traced out the ice-binding sites (IBSs) in all the orthologs. These sites continue to facilitate ice-binding activity even after being mutated with respect to the well-studied IBSs of Typhula ishikariensis; notably, all these mutated sites, which bind ice by the ‘anchored clathrate mechanism’, have been found to prefer polar and hydrophilic amino acids. Horizontal gene transfer studies point toward a strong selection pressure favoring independent evolution of the IBPs in some polar organisms, including prokaryotes as well as eukaryotes, because these proteins help polar organisms acclimatize to the adversities of their niche, thus safeguarding their existence.

Keywords: antifreeze proteins; ice-recrystallization inhibition; phylogenetic analysis; molecular docking; horizontal gene transfer

1. Introduction

Antifreeze proteins (AFPs) or ice-binding proteins (IBPs) have been discovered in various taxa that inhabit sub-zero environments, including unicellular eukaryotes, plants, bacteria, fungi, fish, crustaceans, and insects, where they support strategies that protect the organisms from freezing (Duman, 2001; Duman & Olsen, 1993; Griffith & Yaish, 2004; Hoshino et al., 2003; Janech, Krell, Mock, Kang, & Raymond, 2006). On the basis of this functionality, two major groups of AFPs have been recognized. One type prevents body fluids from freezing (otherwise known as freeze avoidance). Freeze avoidance is thought to result from the secretion of extrapolymeric substances that accumulate in brine pockets in sea ice and help to preserve a liquid environment (Krembs, Eicken, Junge, & Deming, 2002). The other type consists of proteins that make it possible for organisms to survive the freezing of body fluids (otherwise known as freeze tolerance) (Liu et al., 2007). The freeze tolerance mechanism is also known as ice-recrystallization inhibition (RI) activity.

*Corresponding author. Email: [email protected]
© 2014 Taylor & Francis

AFPs were initially described in marine fishes (Devries & Wohlschlag, 1969; Duman & Devries, 1974), where they have been found to facilitate freeze tolerance by RI activity. In this mechanism, AFPs bind to the surface of ice crystals and thus prevent their growth (Guo, Garnham, Whitney, Graham, & Davies, 2012; Raymond & Devries, 1977). This causes thermal hysteresis, the non-colligative depression of the freezing temperature of a solution containing ice below its melting point (Tm), which enables the survival of organisms inhabiting ice-laden environments (Fletcher, Kao, & Fourney, 1986; Knight, Devries, & Oolman, 1984). Within the hysteresis temperature gap, AFPs modify the ice-crystal habit in such a way that the AFP-saturated ice crystal takes on a unique hexagonal bipyramidal shape (Scotter et al., 2006; Xiao et al., 2010). Adsorption of AFP to the surface of ice crystals also causes RI (Knight et al., 1984; Knight, Hallett, & DeVries, 1988). Owing to this RI property, AFPs are highly useful in preservation techniques (Knight et al., 1984). For example, AFPs have potential applications in agriculture for protecting crops from freezing, in maintaining the texture of frozen foods, and in producing cold-hardy plants using transgenic technology. AFPs are also used in cryosurgery and for the low-temperature preservation of cells, tissues, and organs (Fletcher, Goddard, & Wu, 1999).

By contributing to both freeze avoidance and freeze tolerance, AFPs have helped to increase species diversity in the harshest of environments (Jia & Davies, 2002). In response to climate change, AFPs have modified their structure and distribution across this variety of species, and in this way they have evolved their ice-binding abilities. This structural diversity, together with the inability to examine the AFP–ice interaction directly, has made structure–function studies extremely challenging (Jia & Davies, 2002). To date, five different AFPs from fish have been isolated and classified as AFP Type I, Type II, Type III, Type IV, and Type I hyperactive AFP, based on their properties, molecular weight, and structure (Davies & Hew, 1990). AFPs have also been isolated from bacteria, insects, and plants (Kwan et al., 2005; Patel & Graether, 2010; Siemer & McDermott, 2008). Crystal structures of six different types of AFPs from freeze-avoiding organisms are available (Antson et al., 2001; Liou, Tocilj, Davies, & Jia, 2000; Nishimiya et al., 2008; Pentelute et al., 2008; Sicheri & Yang, 1995). They have remarkably diverse protein folds that include an α-helix (Sicheri & Yang, 1995), globular proteins of mixed secondary structure (Antson et al., 2001; Nishimiya et al., 2008), β-solenoids (Graether et al., 2000; Liou et al., 2000), and polyproline type II coils (Pentelute et al., 2008).
The ice-binding sites (IBSs) of these AFPs have already been defined by site-directed mutagenesis, which revealed that, in spite of their different structures, the key features of their IBSs are shared (DeLuca, Davies, Ye, & Jia, 1998; Kondo et al., 2012; Marshall, Daley, Graham, Sykes, & Davies, 2002). The IBSs are relatively flat and somewhat hydrophobic, and they cover an extensive area of the protein, which provides good surface complementarity with the ice crystal. For the globular type III AFP, small adjoining flat surfaces may combine to form a ‘compound’ IBS (Garnham et al., 2010). Molecular dynamics simulations have indicated that the relatively hydrophobic IBS of an AFP is capable of ordering water molecules into an ice-like lattice (Gallagher & Sharp, 2003; Jorov, Zhorov, & Yang, 2004; Nutt & Smith, 2008; Smolin & Daggett, 2008; Wierzbicki et al., 2007) and that, instead of being shed upon ice binding, the ordered waters might facilitate the interaction of the AFP with ice by matching certain ice planes (Nutt & Smith, 2008).

Colwellia sp. strain SLW05 is an Antarctic sea-ice bacterium containing an IBP with ice-RI activity. In the present study, AFPs sharing more than 40% sequence identity at the amino acid level with the AFP from Colwellia sp. strain SLW05 have been considered. It is significant that, although the orthologs were obtained from divergent taxa, including algae, fungi, bacteria, diatoms, and copepods, their amino acid sequences share a conserved domain region. Interestingly, this conserved domain was found to be associated with ice-binding activity. The structural and functional characteristics of this conserved ice-binding domain (IBD) therefore gained our attention. We were also interested in evaluating the physico-chemical properties of the conserved IBD regions present within the related AFPs.

According to Kelley et al., genes coding for AFPs have probably evolved independently in a number of lineages (Kelley, Aagaard, MacCoss, & Swanson, 2010). On the other hand, a few other researchers report horizontal gene transfer (HGT) as being responsible for the acquisition of AFPs in some organisms (Graham, Lougheed, Ewart, & Davies, 2008; Janech et al., 2006). The evolution of prokaryotic genomes is highly influenced by lateral gene transfer (Keeling & Palmer, 2008), and a considerable number of HGT events have also been recognized in eukaryotes, particularly in unicellular eukaryotes, suggesting that HGT promotes their evolution (Armbrust, 2009). As our data-set includes AFPs from a wide range of taxa, i.e. prokaryotes as well as unicellular eukaryotes, we were motivated to study their mode of evolution.

2. Materials and methods

2.1. Sequence extraction and ortholog identification

Colwellia sp. strain SLW05 is an Antarctic sea-ice bacterium exhibiting ice-binding and ice-RI activities (Raymond, Fritsen, & Shen, 2007). We downloaded the coding sequence of the AFP of Colwellia sp. strain SLW05 (accession number: ABH08428.1) from NCBI GenBank (http://www.ncbi.nlm.nih.gov/genbank/) and used the corresponding amino acid sequence as the query in PSI-Blast (Johnson et al., 2008) to search for its distant orthologs. Orthologs satisfying the PSI-Blast criteria of e-value ≤0.0001, identity ≥40%, and query coverage ≥60% were selected.

2.2. Phylogenetic analysis and calculation of evolutionary rate

Multiple sequence alignment of the AFPs from Colwellia sp. strain SLW05 and its orthologs was carried out with CLUSTALW2 (Thompson, Higgins, & Gibson, 1994). The conserved regions revealed by this alignment were extracted from each of the orthologs. On the basis of the alignment of these conserved regions, a neighbor-joining phylogenetic tree was generated using MEGA5 (Saitou & Nei, 1987). The evolutionary rate of each individual residue of a given ortholog was calculated using the SWAKK server (Liang, Zhou, & Landweber, 2006).
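The selection criteria of Section 2.1 amount to a simple threshold filter over the search hits. The following is a minimal sketch of that filter, not the authors' pipeline: the `Hit` record, its field names, and the example hits are hypothetical, and only the three thresholds are taken from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in Section 2.1.
MAX_EVALUE = 1e-4          # e-value <= 0.0001
MIN_IDENTITY = 40.0        # identity >= 40%
MIN_QUERY_COVERAGE = 60.0  # query coverage >= 60%


@dataclass(frozen=True)
class Hit:
    """One search hit. Field names are hypothetical, not a PSI-Blast output format."""
    accession: str
    evalue: float
    identity_pct: float
    query_coverage_pct: float


def select_orthologs(hits, max_evalue=MAX_EVALUE,
                     min_identity=MIN_IDENTITY,
                     min_coverage=MIN_QUERY_COVERAGE):
    """Keep only the hits that satisfy all three thresholds."""
    return [
        h for h in hits
        if h.evalue <= max_evalue
        and h.identity_pct >= min_identity
        and h.query_coverage_pct >= min_coverage
    ]


# Made-up hits for illustration; the accessions are placeholders, not real records.
example_hits = [
    Hit("HIT_A", 1e-30, 55.0, 92.0),  # passes all three criteria
    Hit("HIT_B", 1e-3, 61.0, 95.0),   # fails the e-value threshold
    Hit("HIT_C", 1e-20, 38.0, 88.0),  # fails the identity threshold
    Hit("HIT_D", 1e-12, 44.0, 59.0),  # fails the coverage threshold
]
selected = select_orthologs(example_hits)
```

Passing the thresholds as keyword arguments makes it easy to rerun the same filter with a different coverage cut-off.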


2.3. Multivariate analyses on synonymous codon and amino acid usage

Correspondence analysis (COA) is an ordination technique that identifies the key trends in the variation of the data and distributes the genes along continuous axes in accordance with those trends. It makes no assumption about the data falling into discrete clusters and therefore represents continuous variation accurately (Banerjee, Roy, Ahmad, Das, & Basak, 2012; Basak, Banerjee, Gupta & Ghosh, 2004; Basak & Ghosh, 2006; Sabbia et al., 2007). COA on the relative amino acid usage (RAAU) of the genes was carried out using the program CODONW 1.4.2 in order to identify any significant variation in the usage of amino acids among the orthologs considered in our study (Peden, 2000). The amino acid usage of all the genomes is illustrated with a Heat Map, in which every column represents a color gradient from yellow to green that stands for increasing frequency of a single amino acid. Correlation analysis (Spearman’s rank correlation) was performed (at a level of significance of p < 0.01 or p < 0.05) using SPSS version 15.0. In addition, t-tests were performed using the GraphPad t-test calculator (http://www.graphpad.com/quickcalcs/ttest1.cfm).

2.4. Calculation of Gravy Score and estimation of secondary structure of proteins

The average hydrophobicity (Gravy Score) (Kyte & Doolittle, 1982) of the orthologs was calculated using the ExPASy proteomics server (Gasteiger et al., 2003). Secondary structures (viz., helices, sheets, and coils) of the orthologs were derived using the software PREDATOR (Frishman & Argos, 1997), and the variations in their percentages were also calculated.

2.5. Prediction and validation of three-dimensional structure and molecular docking studies

Three-dimensional models of all the orthologous proteins considered in the present study were prepared by the ab-initio method in the ROSETTA online modeling service (http://robetta.bakerlab.org/) (Raman et al., 2009). The models were validated using the Procheck server (Laskowski, MacArthur, Moss, & Thornton, 1993). We performed molecular docking between the three-dimensional models constructed by us and a hexagonal ice crystal using the molecular docking software Hex version 8.0.0 (http://hex.loria.fr/) (Ghoorah, Devignes, Smaïl-Tabbone, & Ritchie, 2013; Macindoe, Mavridis, Venkatraman, Devignes, & Ritchie, 2010; Ritchie, 2003; Ritchie, Kozakov, & Vajda, 2008). Hex is an interactive protein docking and molecular superposition program that reads protein structures in PDB format (Ritchie & Venkatraman, 2010). Hex can calculate protein–ligand docking, assuming the ligand is rigid (Mustard & Ritchie, 2005; Ritchie, 2005; Ritchie & Kemp, 1999). For the docking study, we used hexagonal ice, which is the form of all natural snow and ice on earth, as evidenced by the sixfold symmetry of ice crystals grown from water vapor. The PDB structure of hexagonal ice was downloaded from http://www1.lsbu.ac.uk/water/ice1hsc.html. Subsequently, the docked complexes were analyzed with LigPlot+ (https://www.ebi.ac.uk/thornton-srv/software/LigPlus/) to identify the interactions between the proteins and the ice crystal (Laskowski & Swindells, 2011).

2.6. Prediction of HGT

A species phylogeny was created based on the NCBI taxonomic classification (http://itol.embl.de/other_trees.shtml). To infer the gene tree, we applied maximum-likelihood analyses in TreeFinder (Jobb, von Haeseler, & Strimmer, 2004). To assess support for particular clades, non-parametric bootstrap analyses were performed on 1000 replicates in TreeFinder. Additionally, to determine whether topological differences between the gene and species trees are statistically significant, we applied the Local Rearrangements-Expected Likelihood Weights (LR-ELW) method in TreeFinder and obtained the edge support values (Anisimova & Gascuel, 2006). The T-REX (Tree and reticulogram REConstruction) web server was used to detect the complete HGT events between the species included in this study (Boc, Diallo, & Makarenkov, 2012).
The gene and species phylogenies were provided as input to the T-REX server. The HGT-Detection algorithm of T-REX determines an optimal, minimum-cost scenario by gradually reconciling the given gene and species trees (Boc, Philippe, & Makarenkov, 2010). T-REX statistically validates the inferred HGT events by bootstrapping. The HGT bootstrap scores of the predicted gene transfers are obtained by considering the uncertainty of the gene tree as well as the number of occurrences of the selected transfers in all minimum-cost HGT scenarios found for the given species tree and the gene tree replicates (Boc & Makarenkov, 2011).

3. Results

The IBP of Colwellia sp. SLW05 (accession number: ABH08428.1) shows very little sequence similarity with NCBI Refseq proteins (http://www.ncbi.nlm.nih.gov/refseq/). Therefore, we performed a PSI-Blast search (Altschul et al., 1997) using the amino acid sequence of the Colwellia sp. SLW05 IBP and identified 45 orthologs satisfying the criteria of identity ≥40% and query coverage ≥80%. The orthologs were obtained from a wide range of taxa, such as Bacteria, Algae, Diatom, Fungi, Yeast, and Copepods. Among these orthologs, only 17 proteins with complete coding sequences were used for further studies (Table 1). Sequence alignment of these 17 orthologs using CLUSTALW2 (Thompson et al., 1994) revealed the presence of a conserved region within these orthologous proteins (Figure 1). This conserved region is referred to as the ‘IBD’ in the present analysis (as represented in Figure 1). Prodom (Servant et al., 2002) identified a structural domain region annotated as ‘ICE-BINDING GROUP DOMAIN MEMBRANE’ (Prodom ID: PDB758U2; underlined in Figure 1) within the IBD regions of all 17 orthologous proteins.

3.1. Phylogenetic analysis

The conserved nature of the IBDs in the 17 orthologous protein sequences led us to analyze their phylogenetic relationship. We therefore extracted the amino acid sequences of the IBDs (represented in Figure 1) from each of the 17 sequences and constructed a phylogenetic tree on the basis of the IBDs using the neighbor-joining method (Saitou & Nei, 1987). The phylogenetic tree differentiated the 17 members into two distinct groups (Figure 2). Group 1 comprised eight members, mainly bacteria and archaea, whereas Group 2 comprised nine members drawn from algae, fungi, diatoms, and a copepod, symbolizing different patterns of amino acid usage for the two groups.
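Neighbor-joining starts from a matrix of pairwise distances between the aligned sequences. As a rough illustration of that input, the sketch below computes the simplest such measure, the p-distance (fraction of differing sites among the ungapped aligned positions). It is an assumption that p-distance is a reasonable stand-in here: the distance model used in MEGA5 is not stated in the text, and the toy sequences are invented, not IBD sequences from the study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of differing residues over aligned positions without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return differing / compared


def distance_matrix(aligned):
    """Symmetric p-distance matrix as a nested dict keyed by sequence name."""
    names = list(aligned)
    matrix = {name: {name: 0.0} for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(aligned[a], aligned[b])
        matrix[a][b] = d
        matrix[b][a] = d
    return matrix


# Invented toy alignment (hypothetical names and residues).
toy_alignment = {
    "ibd_1": "MKTLLVA-GS",
    "ibd_2": "MKTLIVAAGS",
    "ibd_3": "MRSLIVTAGT",
}
toy_matrix = distance_matrix(toy_alignment)
```

A tree-building routine would then repeatedly join the pair of taxa that minimizes the neighbor-joining criterion on such a matrix; that step is left out of this sketch.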

Table 1. Taxa and GenBank accession numbers of the 17 orthologous proteins used in the study.

| Species | Taxon | Higher taxon | Accession number | GenBank annotation |
|---|---|---|---|---|
| Methanosphaerula palustris E1-9c | Euryarchaeota | Archaea | YP_002465308.1 | Periplasmic copper-binding protein |
| Methanoregula boonei 6A8 | Euryarchaeota | Archaea | YP_001404652.1 | Hypothetical protein |
| Halovivax ruber XH-70 | Euryarchaeota | Archaea | WP_007698523.1 | Protein of unknown function |
| Polaribacter irgensii | Bacteroidetes | Bacteria | WP_004570305.1 | Hypothetical protein |
| Cytophaga hutchinsonii ATCC 33406 | Bacteroidetes | Bacteria | YP_676864.1 | Antifreeze-like protein |
| Aequorivita sublithincola DSM 14238 | Bacteroidetes | Bacteria | YP_006418469.1 | Hypothetical protein |
| Rhodoferax ferrireducens T118 | Proteobacteria | Bacteria | YP_523138.1 | Ig-like protein |
| Colwellia sp. SLW05 | Proteobacteria | Bacteria | ABH08428.1 | Ice-binding protein |
| Stephos longipes | Arthropoda | Copepods | ACL00838.1 | Antifreeze protein |
| Pyramimonas gelidicola | Chlorophyta | Algae | AFK64812.1 | Ice-binding protein |
| Chaetoceros neogracile | Bacillariophyta | Diatom | ACU09498.1 | Antifreeze protein |
| Navicula glaciei | Bacillariophyta | Diatom | AAZ76251.1 | Ice-binding protein |
| Typhula ishikariensis | Basidiomycota | Fungus | BAD02894.1 | Antifreeze protein |
| Leucosporidium sp. AY30 | Basidiomycota | Fungus | ACU30806.1 | Antifreeze protein |
| Flammulina populicola | Basidiomycota | Fungus | ACL27143.1 | Ice-binding protein |
| Lentinula edodes | Basidiomycota | Fungus | ACL27145.1 | Ice-binding protein |
| Glaciozyma antarctica | Basidiomycota | Yeast | AGE93832.1 | Antifreeze protein |

3.2. Evaluation of RAAU of the IBP and its orthologs

We carried out COA on the basis of the RAAU of the IBDs of the 17 orthologous proteins. Figure 3 shows the Axis 1–Axis 2 plot of the COA on RAAU, in which the 17 IBDs are segregated into two clusters along Axis 1 (representing 64.15% of the total variation). In Figure 3, red squares represent Group 1 members and blue diamonds represent Group 2 members. To understand which physical property of the amino acids might be responsible for this pattern of segregation, we performed the same analysis for each of the 17 IBDs. Gravy Score is the only biophysical property found to have a significant positive correlation with Axis 1 of the COA on RAAU (ρ = 0.56, p ≤ 0.001). Analysis of the Gravy Scores, all of which are positive, points toward the hydrophobic nature of all 17 IBDs considered here. However, the average hydrophobicity of the IBDs belonging to Group 1 is nearly two times higher than that of Group 2. A more in-depth analysis of amino acid usage revealed that 11 of the 20 amino acids (Cys, Gly, Ile, Leu, Met, Phe, Pro, Ser, Thr, Trp, Val) are preferred by all 17 IBDs. Among these 11 amino acids, 7 are hydrophobic (Cys, Leu, Ile, Met, Phe, Trp, Val) and the remaining 4 (Gly, Pro, Ser, Thr) are less hydrophobic in nature (Figure 4). Overall, the usage of these 11 amino acids partly explains the hydrophobic character of all 17 IBDs. However, a comparison of the usage of these 11 amino acids between the two groups of IBDs (Figure 2) shows that the significantly higher usage of the hydrophobic amino acids Ile, Leu, Met, Trp, and Val (p < 0.0001) and the relatively lower usage of the less hydrophobic amino acids Pro, Ser, and Thr (p < 0.0001) by Group 1 IBDs relative to Group 2 members make their hydrophobic nature more prominent.
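The two quantities discussed in this section are straightforward to compute from a sequence. The sketch below is an illustration only, assuming that relative amino acid usage is the per-sequence frequency of each of the 20 standard residues and that the Gravy Score is the mean Kyte & Doolittle (1982) hydropathy value over all residues. It is not the CODONW or ExPASy implementation, and the two toy sequences are invented.

```python
from collections import Counter

# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def relative_aa_usage(sequence):
    """Frequency of each standard amino acid; non-standard symbols are ignored."""
    counts = Counter(r for r in sequence.upper() if r in KYTE_DOOLITTLE)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: counts.get(aa, 0) / total for aa in sorted(KYTE_DOOLITTLE)}


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    residues = [r for r in sequence.upper() if r in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


# Invented toy sequences, enriched in the residue classes contrasted in the text.
toy_hydrophobic = "ILVMILVW"  # rich in Ile, Leu, Met, Trp, Val
toy_polar = "STPSTPGS"        # rich in Pro, Ser, Thr

usage_hydrophobic = relative_aa_usage(toy_hydrophobic)
gravy_hydrophobic = gravy(toy_hydrophobic)
gravy_polar = gravy(toy_polar)
```

On these toy inputs the sequence enriched in Ile, Leu, Met, Trp, and Val has the higher Gravy Score, which is the direction of the Group 1 versus Group 2 contrast described above.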


Figure 1. Multiple sequence alignment of the conserved regions of the 17 orthologs using CLUSTALW2. Notes: Residues marked in blue indicate ≥90% conservation; pink indicates ≥60% conservation; green indicates ≥40% conservation among all orthologs. ‘ICE-BINDING GROUP DOMAIN MEMBRANE’, the conserved IBD region predicted by Prodom in all 17 orthologous proteins, is underlined.


Figure 2. Neighbor-joining tree (with 500 bootstrap replicates) showing the evolutionary relationship of the conserved IBD regions of the orthologs. Notes: The two distinct clades obtained through the phylogenetic analysis are marked as Group 1 and Group 2. The numerical values above the nodes are the bootstrap values for those nodes. Scale bar: 0.1 substitutions per site. Group 1 comprises eight members from Bacteria and Archaea, whereas Group 2 consists of nine members from Algae, Fungi, Diatom and Copepod.

Figure 3. COA on RAAU of the IBD regions from the 17 orthologs. Notes: The positions of Group 1 and Group 2 members (as defined in Figure 2) are shown along the first and second principal axes generated by COA on the relative amino acid usage values of the 17 IBD regions. Group 1 members are depicted by red squares and Group 2 members by blue diamonds.


Figure 4. Heat Map representing the RAAU values in IBD regions of the orthologous proteins under study. Notes: The over-representation and under-representations of amino acid residues in the organisms are shown in green and yellow colored blocks of varying color intensities, respectively. Groups have been mentioned according to the phylogenetic tree in Figure 2.

3.3. Evolutionary pressure on the orthologs of IBP

The evolutionary rate of amino acids within a protein may differ depending on functional constraints (Kimura, 1983). Hence, to study the functional importance of the IBD regions present in the 17 orthologous proteins, we measured the evolutionary rate of all amino acids in each of the 17 proteins using the SWAKK web server (Liang et al., 2006). The ω values (ω = ratio of the number of nonsynonymous substitutions per nonsynonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks)) for the 17 orthologous proteins are presented in Table 2(a). Deviation of Ka from Ks occurs due to positive Darwinian selection or purifying selection (Sinha, Roy, Das, Das, & Basak, 2009; Thomas et al., 2003). Thus, ω > 1.0 indicates positive selection, ω < 1.0 indicates purifying selection, and ω approaching 0 reflects the highest intensity of purifying selection (Thomas et al., 2003; Wang, Zhang, Zhao, Wang, & Pan, 2010). The ω value was found to be comparatively lower within the IBDs for all 17 orthologous proteins. A more detailed analysis of the ω values revealed that both Ka and Ks are higher within the 17 IBDs than in their corresponding full-length proteins. However, the increase in Ks is much greater than the increase in Ka, which results in the overall reduction of ω within the IBD regions. In addition, the decrease in ω within the IBD regions is much larger for Group 1 members (on average, the increment of Ks values within IBDs in comparison to full-length proteins is nearly 86% more than that of Ka values) than for Group 2 members (on average, the increment of Ks values within IBDs in comparison to full-length proteins is nearly 42% more than that of Ka values).

A linear regression line was fitted through the origin between the Ka values for the full-length proteins and the Ka values for the IBDs (referred to as ‘Ka_line’ in our study), on the assumption that both values were zero at the moment of lineage divergence. Next, we calculated the slope of ‘Ka_line’. Similarly, a linear regression line was fitted through the origin between the Ks values for the full-length proteins and the Ks values for the IBDs (referred to as ‘Ks_line’), and the slope of ‘Ks_line’ was also calculated (Table 2(b)). For both groups, Ks_line is much steeper than Ka_line (for Group 1, Ks_line is 1.36 times steeper than Ka_line, whereas for Group 2, Ks_line is 1.15 times steeper than Ka_line). This indicates a faster increment of Ks values than of Ka values within the domain regions relative to the full-length proteins. These results signify that the residues within the IBDs are under maximum evolutionary constraint and thereby have a significant influence on protein structure and function. The comparatively higher steepness of Ks_line relative to Ka_line is more prominent in Group 1 than in Group 2 (1.2 times

Table 2(a). Evolutionary rates (ω) in two groups of orthologous proteins.

| Name of the organism | Group | Ka, full-length protein | Ks, full-length protein | ω (Ka/Ks), full-length protein | Ka, IBD | Ks, IBD | ω (Ka/Ks), IBD | % difference in ω values |
|---|---|---|---|---|---|---|---|---|
| Methanoregula | 1 | 1.56 | 1.69 | 0.92 | 1.60 | 2.08 | 0.77 | 16.69 |
| Halovivax | 1 | 1.65 | 1.84 | 0.90 | 1.70 | 2.62 | 0.65 | 27.67 |
| Colwellia | 1 | 1.18 | 1.76 | 0.67 | 1.20 | 2.43 | 0.49 | 26.29 |
| Polaribacter | 1 | 1.40 | 1.68 | 0.84 | 1.49 | 2.62 | 0.57 | 31.79 |
| Rhodoferax | 1 | 1.32 | 1.70 | 0.78 | 1.43 | 2.46 | 0.58 | 25.15 |
| Aequorivita | 1 | 1.20 | 1.65 | 0.73 | 1.22 | 2.36 | 0.52 | 29.08 |
| Methanosphaerula | 1 | 1.61 | 1.13 | 1.42 | 1.65 | 1.68 | 0.98 | 30.70 |
| Cytophaga | 1 | 1.10 | 1.13 | 0.98 | 1.20 | 1.68 | 0.71 | 27.16 |
| Glaciozyma | 2 | 1.29 | 1.99 | 0.65 | 1.43 | 2.52 | 0.57 | 12.55 |
| Leucosporidium | 2 | 0.85 | 1.90 | 0.45 | 1.15 | 2.81 | 0.41 | 8.53 |
| Typhula | 2 | 0.87 | 1.19 | 0.73 | 1.02 | 1.60 | 0.64 | 13.18 |
| Flammulina | 2 | 1.16 | 1.81 | 0.64 | 1.36 | 2.38 | 0.57 | 10.89 |
| Lentinula | 2 | 1.00 | 1.73 | 0.57 | 1.16 | 2.28 | 0.51 | 11.88 |
| Pyramimonas | 2 | 1.12 | 1.71 | 0.65 | 1.22 | 2.13 | 0.57 | 12.51 |
| Navicula | 2 | 1.02 | 1.41 | 0.72 | 1.27 | 1.93 | 0.66 | 9.02 |
| Stephos | 2 | 0.99 | 1.39 | 0.71 | 1.13 | 1.81 | 0.62 | 12.82 |
| Chaetoceros | 2 | 1.04 | 1.60 | 0.65 | 1.24 | 2.24 | 0.55 | 14.52 |

Note: Groups have been indicated according to the phylogenetic tree in Figure 2.

Table 2(b). Values of the slopes for Ka_line and Ks_line in two groups of orthologous proteins.

| Group | Slope of Ka_line | Slope of Ks_line | Ks/Ka |
|---|---|---|---|
| Group 1 | 1.04 | 1.42 | 1.36 |
| Group 2 | 1.16 | 1.34 | 1.15 |

Note: For details about Ka_line and Ks_line, see Results, Section 3.3. (Groups have been indicated according to the phylogenetic tree in Figure 2.)
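The quantities in Tables 2(a) and 2(b) can be re-derived with a few lines of arithmetic. The sketch below is a hedged reconstruction, not the authors' script. It assumes that the regression is an ordinary least-squares fit through the origin with the full-length value as predictor and the IBD value as response (the text does not say which variable is on which axis), and that the percentage difference in ω is taken relative to the full-length protein. Under those assumptions the Group 1 slopes come out close to the Group 1 entries of Table 2(b).

```python
def omega(ka, ks):
    """Ratio of nonsynonymous to synonymous substitution rates (Ka/Ks)."""
    return ka / ks


def percent_reduction(omega_full, omega_ibd):
    """Drop in omega inside the IBD, as a percentage of the full-length value."""
    return (omega_full - omega_ibd) / omega_full * 100.0


def slope_through_origin(x, y):
    """Least-squares slope b of y = b * x (no intercept): sum(x*y) / sum(x*x)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


# Group 1 rows of Table 2(a), in table order.
ka_full = [1.56, 1.65, 1.18, 1.40, 1.32, 1.20, 1.61, 1.10]
ka_ibd = [1.60, 1.70, 1.20, 1.49, 1.43, 1.22, 1.65, 1.20]
ks_full = [1.69, 1.84, 1.76, 1.68, 1.70, 1.65, 1.13, 1.13]
ks_ibd = [2.08, 2.62, 2.43, 2.62, 2.46, 2.36, 1.68, 1.68]

ka_slope = slope_through_origin(ka_full, ka_ibd)  # slope of 'Ka_line'
ks_slope = slope_through_origin(ks_full, ks_ibd)  # slope of 'Ks_line'
steepness_ratio = ks_slope / ka_slope             # the 'Ks/Ka' column of Table 2(b)

# First row of Table 2(a) (Methanoregula) as a worked example.
omega_full = omega(ka_full[0], ks_full[0])
omega_ibd = omega(ka_ibd[0], ks_ibd[0])
reduction = percent_reduction(omega_full, omega_ibd)
```

Small differences from the tabulated percentages are expected, because the table entries are rounded to two decimals before being reused here.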

higher in Group 1 than Group 2). This supports the view that the residues within the IBDs of Group 1 members are under higher evolutionary constraint than those of Group 2 members.

3.4. Analysis of secondary structure

Next, we analyzed the secondary structures of the 17 orthologous proteins using PREDATOR (Frishman & Argos, 1996). All 17 IBDs were found to contain more strands and less helix than their corresponding full-length proteins (Table 3). In addition, the increase in strand content and the decrease in helix content within the IBDs are more prominent in Group 1 members than in Group 2 members. According to Sitbon and Pietrokovski, strands can tolerate fewer sequence changes while keeping their specific shape and function (Sitbon & Pietrokovski, 2007). They therefore tend to be more conserved than helices, which can keep their shape and function even with more changes. Thus, our previous finding of maximum evolutionary constraint on the IBDs in comparison to the corresponding full-length proteins is further supported by the secondary structural conformations. In addition, the trends in the secondary structural arrangements of Group 1 members are consistent with much higher evolutionary constraints within the IBDs of Group 1 proteins than within those of Group 2 proteins.

3.5. Construction of three-dimensional structures and identification of IBSs

In the next phase of our work, we constructed a three-dimensional model [Supplementary Figure 1(A)] of the IBP of Colwellia sp. SLW05 using the ab-initio method, which was subsequently validated with the PROCHECK server (Laskowski et al., 1993). The presence of >92% of the amino acid residues within the most-favored regions of the Ramachandran plot indicated >95% accuracy of the model (Supplementary Figure 2). A distinctive beta-helical structure, a long alpha-helical region, and an adjoining loop region were found within the structural model. Similar structural features were found for Leucosporidium sp. by Lee et al. (2012). Using the same methodology, we also constructed structural models for the remaining 16 orthologous proteins, each of which was again validated by PROCHECK. All of them were found to acquire substantial helical structures within the conserved IBD regions shown in Figure 1. The structural models are also analogous to the high-resolution crystal structure of the AFP from the snow mold fungus Typhula ishikariensis, as reported by Kondo et al. In support, we superimposed the structural model of the IBP of Colwellia sp. SLW05 on the resolved crystal structure of the AFP from the snow

Table 3. Percentage composition (average) of secondary structural traits within the full-length protein and the IBD of the protein.

                              Full-length protein               IBD
Organism                      Helix    Strand   Random coil     Helix    Strand   Random coil
Glaciozyma (Group 1)          16.72    15.33    67.95           9.34     23.52    67.14
Navicula (Group 1)            19.46    14.13    66.41           12.35    21.54    66.11
Leucosporidium (Group 1)      15.78    15.01    69.21           8.38     22.87    68.75
Chaetoceros (Group 1)         19.34    18.7     61.96           10.53    26.7     62.77
Typhula (Group 1)             13.58    22.16    64.26           7.13     27.89    64.98
Flammulina (Group 1)          19.51    23.81    56.68           10.1     32.92    56.98
Lentinula (Group 1)           12.98    21.89    65.13           7.15     28.92    63.93
Pyramimonas (Group 1)         14.62    20.11    65.27           9.39     25.61    65
Stephos (Group 1)             13.31    25.19    61.5            6.98     32.51    60.51
Methanoregula (Group 2)       18.75    17.01    64.24           6.27     29.57    64.16
Halovivax (Group 2)           18.49    18       63.51           5.45     31.39    63.16
Colwellia (Group 2)           21.08    12.04    66.88           7.02     25.1     67.88
Polaribacter (Group 2)        22.64    20.92    56.44           7.98     34.88    57.14
Rhodoferax (Group 2)          20.87    12.21    66.92           7.56     25.32    67.12
Aequorivita (Group 2)         25.36    19.09    55.55           9.41     34.4     56.19
Methanosphaerula (Group 2)    23.29    18.4     58.31           9.98     32.07    57.95
Cytophaga (Group 2)           21.47    12.16    66.37           11.1     22.33    66.57

Note: Groups have been indicated according to the phylogenetic tree in Figure 2.

mold fungus T. ishikariensis (RCSB PDB ID: 3VN3) (Kondo et al., 2012), and the result shows that the two structures are considerably similar. The alignment of the two superimposed structures reveals nearly 94% equivalent positions with an RMSD of 1.58 [Supplementary Figure 1(B)]. The superimposition was carried out using the FATCAT server (Flexible structure AlignmenT by Chaining Aligned fragment pairs allowing Twists) (Ye & Godzik, 2003). Lee et al. identified six amino acid residues, 137PGLYKW142, in Leucosporidium sp. that are highly conserved across IBPs from diverse species. According to them, this conserved region may play a significant role in the correct folding of the IBP from Leucosporidium sp. They also identified another highly conserved residue, Ser80, which is important for the structural stability of the Leucosporidium sp. IBP (Lee et al., 2012). In accordance with the findings of Lee et al., we also noticed that the residues corresponding to 137PGLYKW142 and Ser80 of Leucosporidium sp. are highly preserved within the diverse AFPs considered in the present study (Figure 1). Further, we performed molecular docking between an ice crystal and each of the above structural models and measured the stability (potential energy) of each docked structure in terms of the Gibbs free energy change (ΔG). Interestingly, the docked complexes containing bacterial and archaeal proteins were found to be more stable than those containing the proteins from algae, fungi, diatoms, and the copepod (Table 4). In a very recent study, Hanada, Nishimiya, Miura, Tsuda, and Kondo (July, 2014) have determined the

Table 4. Variation in potential energies of the complexes containing ice-crystal and 17 orthologous proteins.

Name of the organism          Potential energy of the complex (kcal/mol)
Methanoregula (Group 1)       −5102.15
Halovivax (Group 1)           −5317.59
Colwellia (Group 1)           −5018.12
Polaribacter (Group 1)        −5740.46
Rhodoferax (Group 1)          −5655.89
Aequorivita (Group 1)         −5740.45
Methanosphaerula (Group 1)    −4913.02
Cytophaga (Group 1)           −5178.01
Glaciozyma (Group 2)          −5037.84
Leucosporidium (Group 2)      −4737.84
Typhula (Group 2)             −4613.14
Flammulina (Group 2)          −4253.14
Lentinula (Group 2)           −4471.49
Pyramimonas (Group 2)         −3800.32
Navicula (Group 2)            −4544.68
Stephos (Group 2)             −4005.26
Chaetoceros (Group 2)         −4603.92

Note: Groups have been mentioned according to the phylogenetic tree in Figure 2.
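As an illustrative check of the trend in Table 4, the following minimal Python sketch averages the tabulated potential energies within each group. The values are copied from Table 4; the names ENERGIES and group_mean are hypothetical and are not part of the original analysis.

```python
from statistics import mean

# Potential energies of the docked complexes (kcal/mol), copied from Table 4.
ENERGIES = {
    "Group 1": {
        "Methanoregula": -5102.15, "Halovivax": -5317.59,
        "Colwellia": -5018.12, "Polaribacter": -5740.46,
        "Rhodoferax": -5655.89, "Aequorivita": -5740.45,
        "Methanosphaerula": -4913.02, "Cytophaga": -5178.01,
    },
    "Group 2": {
        "Glaciozyma": -5037.84, "Leucosporidium": -4737.84,
        "Typhula": -4613.14, "Flammulina": -4253.14,
        "Lentinula": -4471.49, "Pyramimonas": -3800.32,
        "Navicula": -4544.68, "Stephos": -4005.26,
        "Chaetoceros": -4603.92,
    },
}


def group_mean(group: str) -> float:
    """Return the mean potential energy (kcal/mol) of one group (hypothetical helper)."""
    return mean(ENERGIES[group].values())


if __name__ == "__main__":
    for group in ENERGIES:
        print(f"{group}: mean potential energy = {group_mean(group):.2f} kcal/mol")
```

With these inputs the Group 1 mean is more negative than the Group 2 mean, in line with the stated trend; the comparison holds on average rather than for every individual pair of proteins.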

crystal structure of the hyperactive AFP from Colwellia sp. (RCSB PDB ID: 3WP9) (Hanada et al., 2014) and observed that the protein binds to ice through a compound IBS located on a flat surface of the β-helix and the adjoining loop region. They also found that the IBS of Colwellia sp. lacks the repetitive sequences that are characteristic of hyperactive AFPs. We have superimposed our ab-initio structural model of the IBP of Colwellia sp. SLW05 on its resolved crystal structure (RCSB PDB ID: 3WP9) (Hanada et al., 2014), and the acquired result


reveals significant similarity between them, with nearly 95% equivalent positions and an RMSD of 0.95 [Supplementary Figure 1(C)]. In the next part of our work, we studied the IBSs of the 17 orthologous proteins. According to Kondo et al., the main structural element of the β-helical hyperactive AFPs is an irregular β-helix with six loops of 18 or more residues that lies alongside an α-helix. They also found that β-helices have evolved independently as AFPs and are ideally structured to bind to several planes of ice. Moreover, they identified the IBS of T. ishikariensis by site-directed mutagenesis; it was found on the flattest surface of the protein, supporting the view that flatness is one of the key attributes of an IBS (Yang et al., 1998). According to Kondo et al., a set of four to five residues from peptides located in the middle of five parallel β-strands creates the ice-binding β-sheet of T. ishikariensis. We have marked them with asterisks in Figure 1. They are as follows: (1) β1 (T-G-V-S-T-V), (2) β6 (T-S-V-A-L-Q), (3) β5 (T-A-V-T-F-K), (4) β4 (G-A-V-N-I-E), and (5) β3 (G-T-L-D-V). Of the 29 residues on this face, 12 IBSs were found within the IBD regions. Each residue of these five peptides has a distinct role in maintaining the structure and function of the IBP. The first residues of the five peptides (T,T,T,G,G) facilitate the corner formation of the triangular molecule. The Thr side chains are directed outside of the β-strands. The second (G,S,A,A,T), fourth (S,A,T,N,D), and sixth (V,Q,K,E) residues present their side chains toward the solvent. The third (V,V,V,V,L) and fifth (T,L,F,I,V) residues have their side chains directed toward the inside of the β-helix to form part of the interior hydrophobic core.
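The position-by-position grouping described above can be reproduced with a short Python sketch. The peptide sequences are taken from the list above, and the orientation labels follow the description in the text; the names STRANDS, ORIENTATION, and residues_by_position are hypothetical and used only for illustration.

```python
# Ice-binding β-strand peptides of T. ishikariensis, in the order listed in the text.
STRANDS = {
    "β1": "TGVSTV",
    "β6": "TSVALQ",
    "β5": "TAVTFK",
    "β4": "GAVNIE",
    "β3": "GTLDV",  # only five residues are listed for this strand
}

# Side-chain orientation per position, as described in the text:
# positions 1, 2, 4 and 6 point outward; positions 3 and 5 point into the core.
ORIENTATION = {1: "out", 2: "out", 3: "in", 4: "out", 5: "in", 6: "out"}


def residues_by_position(strands):
    """Collect the residues found at each 1-based position across all strands."""
    columns = {}
    for seq in strands.values():
        for pos, aa in enumerate(seq, start=1):
            columns.setdefault(pos, []).append(aa)
    return columns


if __name__ == "__main__":
    for pos, residues in residues_by_position(STRANDS).items():
        print(pos, ORIENTATION[pos], ",".join(residues))
```

Grouping by position in this way makes it easy to confirm that the face comprises 29 residues in total and that the inward-facing positions (third and fifth) carry the hydrophobic residues.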
As a consequence, each β-strand presents an out-out-in-out-in-out pattern of side chains with respect to the core of the protein, which repeats in each loop of the β-helix. Along with these repetitive structural motifs, regularly aligned surface waters have been observed in ‘trough-like’ regions on the IBS. Surface waters forming cages around hydrophobic groups were in turn anchored by hydrogen bonding to side-chain and backbone hydrophilic groups (Kondo et al., 2012). This pattern of AFP binding to ice-like waters has been termed the ‘anchored clathrate mechanism’ (Nutt & Smith, 2008). In Figure 1, we have marked these IBSs according to their percentage of conservation across the 17 orthologous proteins (residues conserved in more than 40% of positions are marked in green, those conserved in more than 60% in pink, and those conserved in more than 90% in blue). In Table 5, we have marked in red the IBSs of the remaining orthologs that are conserved with respect to the well-studied IBSs of T. ishikariensis. Next, we were interested in

those IBSs where mutations have occurred in other orthologs relative to the IBSs of T. ishikariensis. We performed molecular docking studies to determine whether these mutated positions still facilitate ice binding. Indeed, we found that the ice crystal binds in the middle (flat surface) of the five parallel β-strands through hydrogen bonding (Supplementary Figure 3), and we identified the IBSs that still assist in ice binding even after mutation. These are shown in Table 5, marked in blue. The mutations found to support ice binding are as follows: T → S (1st position of β1), S → T (4th position of β1), A → T (4th position of β6), Q → D (6th position of β6), A → Q (2nd position of β5), A → Q (2nd position of β4), and N → T (4th position of β4). Notably, all of the above mutations favor hydrophilic amino acids with polar side chains (S, T, D, and Q). Supplementary Figures 4(a)–(g), produced using LigPlot+, illustrate each of the above mutations. In all seven figures, the hydrogen bonds between the hydrophilic amino acids and the oxygen atoms of neighboring water molecules are marked with red circles, and the hydrophobic amino acids surrounding the ice-binding hydrophilic amino acid are marked with black circles. Two distinctive features were found for all of these mutated hydrophilic amino acids. First, they form a hydrogen bond with the oxygen atom of the neighboring ice molecule. Second, they are surrounded by different hydrophobic amino acids. This result is consistent with Garnham et al., who hypothesized that hydrophobic sites on the ice-binding surface force water molecules near the surface of the protein into an ice-like cage that reflects the pattern of water molecules on the surface of the ice crystal.
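As a rough cross-check of the observation that the tolerated substitutions favor polar residues, the sketch below looks up the Kyte-Doolittle hydropathy index (Kyte & Doolittle, 1982) of each substituted residue. Using this particular scale is an assumption made for illustration and is not stated to be the criterion applied in the docking analysis; the names KD, MUTATIONS, and is_hydrophilic are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the residues involved
# (negative values indicate hydrophilic residues on this scale).
KD = {"A": 1.8, "D": -3.5, "N": -3.5, "Q": -3.5, "S": -0.8, "T": -0.7}

# (strand, position, original residue, substituted residue), as listed in the text.
MUTATIONS = [
    ("β1", 1, "T", "S"),
    ("β1", 4, "S", "T"),
    ("β6", 4, "A", "T"),
    ("β6", 6, "Q", "D"),
    ("β5", 2, "A", "Q"),
    ("β4", 2, "A", "Q"),
    ("β4", 4, "N", "T"),
]


def is_hydrophilic(residue: str) -> bool:
    """Treat a residue as hydrophilic if its hydropathy value is negative (assumed cutoff)."""
    return KD[residue] < 0


if __name__ == "__main__":
    for strand, pos, old, new in MUTATIONS:
        print(f"{strand} position {pos}: {old} -> {new} "
              f"(hydropathy {KD[old]:+.1f} -> {KD[new]:+.1f}, "
              f"hydrophilic: {is_hydrophilic(new)})")
```

On this scale every substituted residue (S, T, D, Q) has a negative hydropathy value, whereas alanine, the original residue in three of the seven cases, has a positive one.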
The water-loving or hydrophilic sites on the protein’s surface anchor this ice-like cage to the protein via hydrogen bonds (Garnham et al., 2010). In addition, we identified three mutations (marked in green in Table 5) that favor hydrophobic amino acids such as A and I; notably, in all these cases, no ice-binding activity was found. These three mutations are illustrated in Supplementary Figures 5(a)–(c).

3.6. Detection of HGT

The species phylogeny was generated based on the NCBI taxonomic classification information (Figure 5). The species phylogeny was rooted between Eubacteria


Table 5. IBSs in 17 orthologous proteins. IBSs that are conserved with respect to the well-studied IBSs of T. ishikariensis are marked in red. IBSs shown by the molecular docking study to remain ice binding after mutation are marked in blue.


and Eukaryota. Within the Eubacterial lineage, two major clades were inferred, i.e. Bacteroidetes and Proteobacteria. The phylogenetic relationships also show that the clade holding the diatoms (Bacillariophyta) and Basidiomycota is well supported. Figure 6 represents the gene tree, i.e. the unrooted maximum-likelihood tree derived from the alignment of the amino acid sequences of the 17 orthologous proteins used in our study. We observed a number of major discrepancies between the species tree and the gene tree, represented in Figures 5 and 6, respectively. Four prokaryotes, viz. Rhodoferax, Aequorivita, Methanoregula, and Methanosphaerula, cluster with the eukaryotes, i.e. the diatoms Navicula and Chaetoceros and the copepod Stephos. Moreover, the Euryarchaeota Halovivax was found within the eukaryote clade comprising Pyramimonas and five Basidiomycota. In addition, the Proteobacteria Colwellia and the Bacteroidetes Polaribacter form a sister relationship with the above-mentioned eukaryote clade.


HGT events were inferred using the HGT Detection tool of T-REX (http://www.trex.uqam.ca/index.php?action=hgt&project=trex) (Figure 5). The arrows in Figure 5 show the direction of the HGTs. For an HGT to be inferred, the topological inconsistency between the gene and species phylogenies had to have a minimum LR-ELW edge support value of 95% in the gene tree (Figure 6). Three transfers have occurred among eukaryotes. The first is from Chaetoceros neogracile to the copepod Stephos longipes. The second is from the Microbotryomycetes group, comprising Leucosporidium sp. AY30 and Glaciozyma antarctica, to the Agaricales, consisting of Flammulina populicola and Lentinula edodes. The Microbotryomycetes group belongs to the clade Basidiomycota, and Agaricales is the ancestral group of that clade. The third transfer is from the Basidiomycota L. edodes to T. ishikariensis. The fourth transfer is from Polaribacter (present within the ancestral group of the Bacteroidetes) to the Proteobacterial clade consisting of Colwellia sp. SLW05 and Rhodoferax ferrireducens.
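The four inferred transfers can be summarized in a small data structure and tallied by the domains involved. This is only a bookkeeping sketch based on the events listed above; the names Transfer, TRANSFERS, and tally are hypothetical, and the domain labels follow the taxa as described in the text.

```python
from collections import Counter
from typing import NamedTuple


class Transfer(NamedTuple):
    """One inferred HGT event (hypothetical record type for illustration)."""
    donor: str
    recipient: str
    donor_domain: str
    recipient_domain: str


# The four HGT events described in the text, in the order they are listed.
TRANSFERS = [
    Transfer("Chaetoceros neogracile", "Stephos longipes",
             "eukaryote", "eukaryote"),
    Transfer("Microbotryomycetes (Leucosporidium, Glaciozyma)",
             "Agaricales (Flammulina, Lentinula)",
             "eukaryote", "eukaryote"),
    Transfer("Lentinula edodes", "Typhula ishikariensis",
             "eukaryote", "eukaryote"),
    Transfer("Polaribacter", "Proteobacterial clade (Colwellia, Rhodoferax)",
             "prokaryote", "prokaryote"),
]


def tally(transfers):
    """Count transfers by 'donor domain to recipient domain'."""
    return Counter(f"{t.donor_domain} to {t.recipient_domain}" for t in transfers)


if __name__ == "__main__":
    for kind, count in tally(TRANSFERS).items():
        print(f"{kind}: {count}")
```

Because a Counter returns zero for missing keys, the same tally can be queried for cross-domain categories that do not occur among the listed events.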

Figure 5. Species phylogeny showing four HGT events. Notes: Arrows indicate the direction of gene transfer. The HGTs were inferred based on minimum 95% LR-ELW edge support values for recognizing topological conflict between the gene (Figure 6) and species trees.


Figure 6. Maximum-likelihood tree inferred with TREEFINDER from the alignment of the amino acid sequences of the 17 orthologous proteins used in our study. Notes: The values next to the nodes are LR-ELW edge supports. The scale bar shows the number of substitutions per nucleotide.

4. Discussion

The IBP of Colwellia sp. SLW05 has 17 orthologs from varied taxa with complete coding sequences. They possess a conserved region recognized as the ‘IBD.’ Phylogenetic analysis based on the conserved IBD regions of these 17 orthologs separates them into two groups: Group 1 comprises only bacterial and archaeal members, whereas Group 2 comprises algae, fungi, diatoms, and a copepod. Analysis of the sequence and structural features of the conserved regions of the orthologous proteins can help to visualize their relationship. COA on the RAAU of the IBDs of the 17 orthologous proteins segregates them along Axis 1. The Gravy Score was found to be the driving factor behind this separation, as the average hydrophobicity of the IBDs belonging to Group 1 is nearly two times higher than that of Group 2, owing to significantly higher usage of the hydrophobic amino acids Ile, Leu, Met, Trp, and Val and relatively lower usage of the less hydrophobic amino acids Pro, Ser, and Thr. A much larger percentage increase in Ks than in Ka within the IBDs, in comparison to the complete protein sequences, results in lower ω values for the IBDs of all 17 orthologous proteins. This signifies that the residues within the IBDs are under maximum evolutionary pressure and therefore have considerable influence on protein structure and function. On average, the percentage increment of Ks values over that of Ka values within the IBDs, in comparison to the full-length proteins, is nearly twofold higher for Group 1 members than for Group 2 members. This implies that the IBDs of bacterial and archaeal members are under higher evolutionary constraint than those of algae, fungi, diatoms, and copepods. The more conserved nature of the bacterial and archaeal IBDs is further supported by their comparatively higher strand content and lower helix content. The three-dimensional structures reveal, for all 17 orthologous proteins, a β-helical structure with six loops of 18 or more residues that lies alongside an α-helix. The IBD was found to lie within the stacked parallel β-sheet regions. We primarily observed higher hydrophobicity for the bacterial and archaeal IBDs representing Group 1 (Figure 2). The protein fold in a β-helical structure is stabilized by a hydrophobic core (Graether et al., 2000). Notably, of the five hydrophobic amino acids favored by bacterial and archaeal IBDs, three (Ile, Leu, and Val) are normally buried inside the protein core. Thus, the more hydrophobic IBDs of the bacterial and archaeal orthologs representing Group 1 may support a more stable protein fold than those of the orthologs from fungi, diatoms, and copepods. Besides this, as discussed before, water-repellent or hydrophobic sites present in the ice-binding surface help


to form an ice-like cage by constraining the water molecules present near the surface of the protein, and the water-loving or hydrophilic sites on the protein’s surface then anchor this ice-like cage to the protein via hydrogen bonds. Therefore, the elevated hydrophobicity of bacterial and archaeal IBDs can result in greater firmness in forcing ice-like water molecules to aggregate near the surface of the IBPs. In support of this hypothesis, our molecular docking studies using Hex version 8.0.0 revealed the complexes of the ice crystal with bacterial and archaeal proteins to be more stable than its complexes with the proteins from algae, fungi, diatoms, and the copepod used in the present work. Analysis of the docked structures using LigPlot+ identified the IBSs in the remaining orthologs that are conserved with respect to the well-studied IBSs of T. ishikariensis. The molecular docking studies also traced the positions within the IBSs that continue to facilitate ice-binding activity even after mutation (Table 5). Notably, all these mutations, which were found to interact with the oxygen atoms of nearby water molecules through hydrogen bonding, favor polar and hydrophilic amino acids. Moreover, all these polar and hydrophilic amino acids are surrounded by hydrophobic amino acids, facilitating the formation of the ice-like cage in the way described above. Finally, in the analysis of HGT events, conflicts between species and gene trees can be attributed to lateral gene transfers between distantly related species (Abby, Tannier, Gouy, & Daubin, 2010). The work by Sorhannus has already rejected the involvement of the prokaryotic genus Cytophaga in transferring AFP genes to the diatoms (Janech et al., 2006; Sorhannus, 2011). Therefore, the AFP gene of the diatom C. neogracile is expected to have been transmitted to the copepod S. longipes and does not appear to have been acquired from other lineages. Thus, it can be concluded that the AFPs of the diatoms may have evolved from ancestral genes with different functions (Bayer-Giraldi, Uhlig, John, Mock, & Valentin, 2010) and that their origin may not necessarily involve HGT events. However, the transfer of the AFP gene from the diatom C. neogracile to the copepod S. longipes is plausible, as both inhabit the same environment, i.e. the sea, and Stephos ‘graze’ on diatoms (Sorhannus, 2011). The HGT from Polaribacter to the Proteobacterial clade is the only example of prokaryote-to-prokaryote transfer. It is also worth noting that we did not observe any ‘prokaryote to eukaryote’ or ‘eukaryote to prokaryote’ transfer. Gene transfers from eukaryotes to prokaryotes are generally not sustained for two main reasons: first, the presence of introns in eukaryotes and, second, the conjugation/transduction events that take place in


prokaryotes (Keeling & Palmer, 2008; Sorhannus, 2011). Consequently, the absence of ‘Eukaryotic to Prokaryotic’ HGTs in our results is not very surprising. However, ‘Prokaryotic to Eukaryotic’ HGTs are very common (Keeling & Palmer, 2008), so their absence is quite unexpected. It can be explained by the idea that proteins with antifreeze/ice-binding or other properties (Table 1) have evolved independently in some eukaryotes. For example, the gene tree (Figure 6) reveals a close association among some distantly related taxa. Four prokaryotes, viz. Rhodoferax, Aequorivita, Methanoregula, and Methanosphaerula, were found to cluster with the eukaryotes Navicula, Chaetoceros, and Stephos. It is worth mentioning that, despite having sufficient sequence similarity to the eukaryotic proteins from Navicula, Chaetoceros, and Stephos, which have antifreeze/ice-binding properties, the prokaryotic proteins from Rhodoferax, Aequorivita, Methanoregula, and Methanosphaerula do not show antifreeze functions. Among these four prokaryotes, Rhodoferax and Aequorivita are psychrotolerant bacteria, but Methanoregula and Methanosphaerula are mesophilic in nature. Moreover, except for Aequorivita, the other three are not even known to occur in polar habitats or in ice. Although the present HGT analysis by T-REX-Online did not recognize the above-mentioned clustering of distantly related taxa, this relationship might be explained by convergent evolution. This is supported by the fact that the antifreeze activity of proteins found in organisms dwelling in polar environments may have evolved from proteins with a different function (Sorhannus, 2011).
Our HGT analysis predicts that the proteins in Rhodoferax (without antifreeze/ice-binding property) and Colwellia (with antifreeze property) were transmitted from Polaribacter, a psychrophilic bacterium residing in polar habitats whose protein does not have antifreeze/ice-binding property. Our HGT analysis also infers a close association of Polaribacter with Aequorivita, a psychrotolerant bacterial species dwelling in an Antarctic habitat whose protein likewise lacks antifreeze/ice-binding property. According to Bayer-Giraldi et al., some AFPs show a high degree of similarity to surface proteins with ‘adhesin’-like functions that could easily be modified to have antifreeze properties through molecular evolution (Bayer-Giraldi et al., 2010). Owing to the strong selection pressure favoring the evolution of AFPs, it is possible that proteins without antifreeze/ice-binding functions are transferred to organisms living in polar environments if they help those organisms adapt better, and it might be expected that, in the course of time, natural selection will favor the modification of their functions to confer antifreeze properties.


5. Conclusion

The AFP from the Antarctic sea-ice bacterium Colwellia sp. strain SLW05 has orthologs from divergent taxa, including algae, fungi, bacteria, diatoms, and copepods, that contain a conserved domain region in their amino acid sequences. Interestingly, this conserved domain was found to be associated with ice-binding activity. Phylogenetic analysis based on these conserved IBDs separates the orthologs into two groups: one containing bacterial and archaeal members and the other consisting of algae, fungi, diatoms, and a copepod. Statistical analysis points toward the Gravy Score as the driving force behind this segregation. The relatively more hydrophobic IBDs from the AFPs of bacterial and archaeal members were found to be under higher evolutionary selection pressure and to form more stable complexes with ice crystals than their orthologs from algae, fungi, diatoms, and copepods. Molecular docking identified IBSs in the orthologs that are mutated with respect to the well-studied IBSs of T. ishikariensis yet continue to facilitate ice-binding activity. All these mutations favor polar and hydrophilic amino acids, which perform ice binding using the ‘anchored clathrate mechanism.’ One prokaryote-to-prokaryote HGT was found, from Polaribacter to the Proteobacterial clade. Surprisingly, no ‘prokaryote to eukaryote’ or ‘eukaryote to prokaryote’ transfers were noticed, suggesting an independent evolution of the proteins with antifreeze/ice-binding or other properties in some prokaryotes as well as in some eukaryotes. The HGT events, along with the positions of the orthologous proteins in the gene tree, point toward a strong selection pressure favoring the evolution of the AFPs. In addition, the results imply that if proteins without any ice-binding function help polar organisms adapt better, selection pressure would favor the transfer of those proteins to organisms living in polar environments, and in the course of time those proteins might modify their functions to acquire antifreeze properties.

Acknowledgments

We acknowledge UGC-RFSMS and the computational facility of DIC and the Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta.

Supplemental data

Supplemental data for this article can be accessed here: http://dx.doi.org/10.1080/07391102.2014.952665.

References

Abby, S. S., Tannier, E., Gouy, M., & Daubin, V. (2010). Detecting lateral gene transfers by statistical reconciliation of phylogenetic forests. BMC Bioinformatics, 11, 324–336.

Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Research, 25, 3389–3402. Anisimova, M., & Gascuel, O. (2006). Approximate likelihoodratio test for branches: A fast, accurate, and powerful alternative. Systematic Biology, 55, 539–552. Antson, A. A., Smith, D. J., Roper, D. I., Lewis, S., Caves, L. S., Verma, C. S., … Hubbard, R. E. (2001). Understanding the mechanism of ice binding by type III antifreeze proteins. Journal of Molecular Biology, 305, 875–889. Armbrust, E. V. (2009). The life of diatoms in the world’s oceans. Nature, 459, 185–192. Banerjee, R., Roy, A., Ahmad, F., Das, S., & Basak, S. (2012). Evolutionary patterning of hemagglutinin gene sequence of 2009 H1N1 pandemic. Journal of Biomolecular Structure & Dynamics, 29, 733–742. Basak, S., Banerjee, T., Gupta, S. K., & Ghosh, T. C. (2004). Investigation on the causes of codon and amino acid usages variation between thermophilic Aquifex aeolicus and mesophilic Bacillus subtilis. Journal of Biomolecular Structure & Dynamics, 22, 205–214. Basak, S., & Ghosh, T. C. (2006). Temperature adaptation of synonymous codon usage in different functional categories of genes: A comparative study between homologous genes of Methanococcus jannaschii and Methanococcus maripaludis. FEBS Letters, 580, 3895–3899. Bayer-Giraldi, M., Uhlig, C., John, U., Mock, T., & Valentin, K. (2010). Antifreeze proteins in polar sea ice diatoms: Diversity and gene expression in the genus Fragilariopsis. Environmental Microbiology, 12, 1041–1052. Boc, A., Diallo, A. B., & Makarenkov, V. (2012). T-REX: A web server for inferring, validating and visualizing phylogenetic trees and networks. Nucleic Acids Research, 40, W573–W579. Boc, A., & Makarenkov, V. (2011). Towards an accurate identification of mosaic genes and partial horizontal gene transfers. Nucleic Acids Research, 39, e144. 
Boc, A., Philippe, H., & Makarenkov, V. (2010). Inferring and validating horizontal gene transfer events using bipartition dissimilarity. Systematic Biology, 59, 195–211. Davies, P. L., & Hew, C. L. (1990). Biochemistry of fish antifreeze proteins. The FASEB Journal, 4, 2460–2468. DeLuca, C. I., Davies, P. L., Ye, Q., & Jia, Z. (1998). The effects of steric mutations on the structure of type III antifreeze protein and its interaction with ice. Journal of Molecular Biology, 275, 515–525. Devries, A. L., & Wohlschlag, D. E. (1969). Freezing resistance in some Antarctic fishes. Science, 163, 1073–1075. Duman, J. G. (2001). Antifreeze and ice nucleator proteins in terrestrial arthropods. Annual Review of Physiology, 63, 327–357. Duman, J. G., & Devries, A. L. (1974). Freezing resistance in winter flounder Pseudopleuronectes-Americanus. Nature, 247, 237–238. Duman, J. A., & Olsen, T. M. (1993). Thermal hysteresis protein activity in bacteria, fungi and phylogenetically diverse plants. Cryobiology, 30, 322–328. Fletcher, G. L., Goddard, S. V., & Wu, Y. (1999). Antifreeze proteins and their genes: From basic research to business opportunity. Chemtech -Washington DC, 6, 17–29. Fletcher, G. L., Kao, M. H., & Fourney, R. M. (1986). Antifreeze peptides confer freezing resistance to fish. Canadian Journal of Zoology, 64, 1897–1901.


Frishman, D., & Argos, P. (1996). Incorporation of non-local interactions in protein secondary structure prediction from the amino acid sequence. Protein Engineering, Design and Selection, 9, 133–142. Frishman, D., & Argos, P. (1997). Seventy-five percent accuracy in protein secondary structure prediction. Proteins: Structure, Function, and Genetics, 27, 329–335. Gallagher, K. R., & Sharp, K. A. (2003). Analysis of thermal hysteresis protein hydration using the random network model. Biophysical Chemistry, 105, 195–209. Garnham, C. P., Natarajan, A., Middleton, A. J., Kuiper, M. J., Braslavsky, I., & Davies, P. L. (2010). Compound ice-binding site of an antifreeze protein revealed by mutagenesis and fluorescent tagging. Biochemistry, 49, 9063–9071. Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R. D., & Bairoch, A. (2003). ExPASy: The proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Research, 31, 3784–3788. Ghoorah, A. W., Devignes, M. D., Smaïl-Tabbone, M., & Ritchie, D. W. (2013). Protein docking using case-based reasoning. Proteins: Structure, Function, and Bioinformatics, 81, 2150–2158. Graether, S. P., Kuiper, M. J., Gagné, S. M., Walker, V. K., Jia, Z., Sykes, B. D., & Davies, P. L. (2000). β-helix structure and ice-binding properties of a hyperactive antifreeze protein from an insect. Nature, 406, 325–328. Graham, L. A., Lougheed, S. C., Ewart, K. V., & Davies, P. L. (2008). Lateral transfer of a lectin-like antifreeze protein gene in fishes. PLoS ONE, 3, e2616. Griffith, M., & Yaish, M. W. F. (2004). Antifreeze proteins in overwintering plants: A tale of two activities. Trends in Plant Science, 9, 399–405. Guo, S., Garnham, C. P., Whitney, J. C., Graham, L. A., & Davies, P. L. (2012). Re-evaluation of a bacterial antifreeze protein as an adhesin with ice-binding activity. PLoS ONE, 7, e48805.
Hanada, Y., Nishimiya, Y., Miura, A., Tsuda, S., & Kondo, H. (2014). Hyperactive antifreeze protein from an Antarctic sea ice bacterium Colwellia sp. has a compound ice-binding site without repetitive sequences. FEBS Journal, 281, 3576–3590. Hoshino, T., Kiriaki, M., Ohgiya, S., Fujiwara, M., Kondo, H., Nishimiya, Y., … Tsuda, S. (2003). Antifreeze proteins from snow mold fungi. Canadian Journal of Botany, 81, 1175–1181. Janech, M. G., Krell, A., Mock, T., Kang, J. S., & Raymond, J. A. (2006). Ice-binding proteins from sea ice diatoms (Bacillariophyceae). Journal of Phycology, 42, 410–416. Jia, Z., & Davies, P. L. (2002). Antifreeze proteins: An unusual receptor-ligand interaction. Trends in Biochemical Sciences, 27, 101–106. Jobb, G., von Haeseler, A., & Strimmer, K. (2004). TREEFINDER: A powerful graphical analysis environment for molecular phylogenetics. BMC Evolutionary Biology, 4, 18–26. Johnson, M., Zaretskaya, I., Raytselis, Y., Merezhuk, Y., McGinnis, S., & Madden, T. L. (2008). NCBI BLAST: A better web interface. Nucleic Acids Research, 36, W5–W9. Jorov, A., Zhorov, B. S., & Yang, D. S. (2004). Theoretical study of interaction of winter flounder antifreeze protein with ice. Protein Science, 13, 1524–1537. Keeling, P. J., & Palmer, J. D. (2008). Horizontal gene transfer in eukaryotic evolution. Nature Reviews Genetics, 9, 605–618. Kelley, J. L., Aagaard, J. E., MacCoss, M. J., & Swanson, W. J. (2010). Functional diversification and evolution of antifreeze proteins in the Antarctic fish Lycodichthys dearborni. Journal of Molecular Biology, 71, 111–118.


Kimura, M. (1983). The neutral theory of molecular evolution. Cambridge: Cambridge University Press.
Knight, C. A., DeVries, A. L., & Oolman, L. D. (1984). Fish antifreeze protein and the freezing and recrystallization of ice. Nature, 308, 295–296.
Knight, C. A., Hallett, J., & DeVries, A. L. (1988). Solute effects on ice recrystallization: An assessment technique. Cryobiology, 25, 55–60.
Kondo, H., Hanada, Y., Sugimoto, H., Hoshino, T., Garnham, C. P., Davies, P. L., & Tsuda, S. (2012). Ice-binding site of snow mold fungus antifreeze protein deviates from structural regularity and high conservation. Proceedings of the National Academy of Sciences, 109, 9360–9365.
Krembs, C., Eicken, H., Junge, K., & Deming, J. W. (2002). High concentrations of exopolymeric substances in Arctic winter sea ice: Implications for the polar ocean carbon cycle and cryoprotection of diatoms. Deep-Sea Research Part I: Oceanographic Research Papers, 49, 2163–2181.
Kwan, A. H., Fairley, K., Anderberg, P. I., Liew, C. W., Harding, M. M., & Mackay, J. P. (2005). Solution structure of a recombinant type I sculpin antifreeze protein. Biochemistry, 44, 1980–1988.
Kyte, J., & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. Journal of Molecular Biology, 157, 105–132.
Laskowski, R. A., MacArthur, M. W., Moss, D. S., & Thornton, J. M. (1993). PROCHECK – A program to check the stereochemical quality of protein structures. Journal of Applied Crystallography, 26, 283–291.
Laskowski, R. A., & Swindells, M. B. (2011). LigPlot+: Multiple ligand-protein interaction diagrams for drug discovery. Journal of Chemical Information and Modeling, 51, 2778–2786.
Lee, J. H., Park, A. K., Do, H., Park, K. S., Moh, S. H., Chi, Y. M., & Kim, H. J. (2012). Structural basis for antifreeze activity of ice-binding protein from arctic yeast. The Journal of Biological Chemistry, 287, 11460–11468.
Liang, H., Zhou, W., & Landweber, L. F. (2006). SWAKK: A web server for detecting positive selection in proteins using a sliding window substitution rate analysis. Nucleic Acids Research, 34, W382–W384.
Liou, Y. C., Tocilj, A., Davies, P. L., & Jia, Z. (2000). Mimicry of ice structure by surface hydroxyls and water of a β-helix antifreeze protein. Nature, 406, 322–324.
Liu, Y., Li, Z., Lin, Q., Kosinski, J., Seetharaman, J., Bujnicki, J. M., … Hew, C. L. (2007). Structure and evolutionary origin of Ca²⁺-dependent herring type II antifreeze protein. PLoS ONE, 2, e548.
Macindoe, G., Mavridis, L., Venkatraman, V., Devignes, M. D., & Ritchie, D. W. (2010). HexServer: An FFT-based protein docking server powered by graphics processors. Nucleic Acids Research, 38, W445–W449.
Marshall, C. B., Daley, M. E., Graham, L. A., Sykes, B. D., & Davies, P. L. (2002). Identification of the ice-binding face of antifreeze protein from Tenebrio molitor. FEBS Letters, 529, 261–267.
Mustard, D., & Ritchie, D. W. (2005). Docking essential dynamics eigen structures. Proteins: Structure, Function, and Bioinformatics, 60, 269–274.
Nishimiya, Y., Kondo, H., Takamichi, M., Sugimoto, H., Suzuki, M., Miura, A., & Tsuda, S. (2008). Crystal structure and mutational analysis of Ca²⁺-independent type II antifreeze protein from longsnout poacher, Brachyopsis rostratus. Journal of Molecular Biology, 382, 734–746.


Nutt, D. R., & Smith, J. C. (2008). Dual function of the hydration layer around an antifreeze protein revealed by atomistic molecular dynamics simulations. Journal of the American Chemical Society, 130, 13066–13073.
Patel, S. N., & Graether, S. P. (2010). Structures and ice-binding faces of the alanine-rich type I antifreeze proteins. Biochemistry and Cell Biology, 88, 223–229.
Peden, J. F. (2000). Analysis of codon usage (Dissertation). University of Nottingham, England.
Pentelute, B. L., Gates, Z. P., Tereshko, V., Dashnau, J. L., Vanderkooi, J. M., Kossiakoff, A. A., & Kent, S. B. (2008). X-ray structure of snow flea antifreeze protein determined by racemic crystallization of synthetic protein enantiomers. Journal of the American Chemical Society, 130, 9695–9701.
Raman, S., Vernon, R., Thompson, J., Tyka, M., Sadreyev, R., Pei, J., … Baker, D. (2009). Structure prediction for CASP8 with all-atom refinement using Rosetta. Proteins: Structure, Function, and Bioinformatics, 77, 89–99.
Raymond, J. A., & DeVries, A. L. (1977). Adsorption inhibition as a mechanism of freezing resistance in polar fishes. Proceedings of the National Academy of Sciences, 74, 2589–2593.
Raymond, J. A., Fritsen, C., & Shen, K. (2007). An ice-binding protein from an Antarctic sea ice bacterium. FEMS Microbiology Ecology, 61, 214–221.
Ritchie, D. W. (2003). Evaluation of protein docking predictions using Hex 3.1 in CAPRI rounds 1 and 2. Proteins: Structure, Function, and Genetics, 52, 98–106.
Ritchie, D. W. (2005). High order analytic translation matrix elements for real space six-dimensional polar Fourier correlations. Journal of Applied Crystallography, 38, 808–818.
Ritchie, D. W., & Kemp, G. J. L. (1999). Fast computation, rotation, and comparison of low resolution spherical harmonic molecular surfaces. Journal of Computational Chemistry, 20, 383–395.
Ritchie, D. W., Kozakov, D., & Vajda, S. (2008). Accelerating and focusing protein-protein docking correlations using multi-dimensional rotational FFT generating functions. Bioinformatics, 24, 1865–1873.
Ritchie, D. W., & Venkatraman, V. (2010). Ultra-fast FFT protein docking on graphics processors. Bioinformatics, 26, 2398–2405.
Sabbia, V., Piovani, R., Naya, H., Rodriguez-Maseda, H., Romero, H., & Musto, H. (2007). Trends of amino acid usage in the proteins from the human genome. Journal of Biomolecular Structure & Dynamics, 25, 55–59.
Saitou, N., & Nei, M. (1987). The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution, 4, 406–425.
Scotter, A. J., Marshall, C. B., Graham, L. A., Gilbert, J. A., Garnham, C. P., & Davies, P. L. (2006). The basis for hyperactivity of antifreeze proteins. Cryobiology, 53, 229–239.

Servant, F., Bru, C., Carrere, S., Courcelle, E., Gouzy, J., Peyruc, D., & Kahn, D. (2002). ProDom: Automated clustering of homologous domains. Briefings in Bioinformatics, 3, 246–251.
Sicheri, F., & Yang, D. S. C. (1995). Ice-binding structure and mechanism of an antifreeze protein from winter flounder. Nature, 375, 427–431.
Siemer, A. B., & McDermott, A. E. (2008). Solid-state NMR on a type III antifreeze protein in the presence of ice. Journal of the American Chemical Society, 130, 17394–17399.
Sinha, N. K., Roy, A., Das, B., Das, S., & Basak, S. (2009). Evolutionary complexities of swine flu H1N1 gene sequences of 2009. Biochemical and Biophysical Research Communications, 390, 349–351.
Sitbon, E., & Pietrokovski, S. (2007). Occurrence of protein structure elements in conserved sequence regions. BMC Structural Biology, 7, 3–17.
Smolin, N., & Daggett, V. (2008). Formation of ice-like water structure on the surface of an antifreeze protein. The Journal of Physical Chemistry B, 112, 6193–6202.
Sorhannus, U. (2011). Evolution of antifreeze protein genes in the diatom genus Fragilariopsis: Evidence for horizontal gene transfer, gene duplication and episodic diversifying selection. Evolutionary Bioinformatics Online, 7, 279–289.
Thomas, M. A., Weston, B., Joseph, M., Wu, W., Nekrutenko, A., & Tonellato, P. J. (2003). Evolutionary dynamics of oncogenes and tumor suppressor genes: Higher intensities of purifying selection than other genes. Molecular Biology and Evolution, 20, 964–968.
Thompson, J. D., Higgins, D. G., & Gibson, T. J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22, 4673–4680.
Wang, M., Zhang, X., Zhao, H., Wang, Q., & Pan, Y. (2010). Comparative analysis of vertebrate PEPT1 and PEPT2 genes. Genetica, 138, 587–599.
Wierzbicki, A., Dalal, P., Cheatham, T. E. 3rd, Knickelbein, J. E., Haymet, A. D., & Madura, J. D. (2007). Antifreeze proteins at the ice/water interface: Three calculated discriminating properties for orientation of type I proteins. Biophysical Journal, 93, 1442–1451.
Xiao, N., Suzuki, K., Nishimiya, Y., Kondo, H., Miura, A., Tsuda, S., & Hoshino, T. (2010). Comparison of functional properties of two fungal antifreeze proteins from Antarctomyces psychrotrophicus and Typhula ishikariensis. FEBS Journal, 277, 394–403.
Yang, D. S., Hon, W. C., Bubanko, S., Xue, Y., Seetharaman, J., Hew, C. L., & Sicheri, F. (1998). Identification of the ice-binding surface on a type III antifreeze protein with a “flatness function” algorithm. Biophysical Journal, 74, 2142–2151.
Ye, Y., & Godzik, A. (2003). Flexible structure alignment by chaining aligned fragment pairs allowing twists. Bioinformatics, 19, 246–255.
