GENE-39572; No. of pages: 8; 4C: Gene xxx (2014) xxx–xxx

Contents lists available at ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Functional annotation of putative hypothetical proteins from Candida dubliniensis

Kundan Kumar, Amresh Prakash, Munazzah Tasleem, Asimul Islam, Faizan Ahmad, Md. Imtaiyaz Hassan ⁎

Center for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi 110025, India

Article info

Article history: Received 23 November 2013; received in revised form 27 March 2014; accepted 28 March 2014; available online xxxx

Keywords: Candida dubliniensis; Hypothetical protein; Sequence analysis; Functional annotation; Functional genomics

Abstract

An extensive analysis of C. dubliniensis proteomics data showed that ~22% of its proteins are conserved hypothetical proteins (HPs) whose functions have not yet been determined precisely. Analysis of the gene sequences of HPs provides a platform for establishing sequence–function relationships and for a more profound understanding of the molecular machinery of organisms at the systems level. Here we combined the latest versions of bioinformatics tools, covering protein families, motifs, intrinsic features of the amino acid sequence, sequence–function relationships, pathway analysis, etc., to assign a precise function to HPs for which no experimental information is available. Our results show that 27 HPs have well-defined functions, and we categorized them as enzymes, nucleic acid-binding proteins, transport proteins, etc. Five HPs showed adhesin character, which is likely to be essential for the survival of the yeast and for pathogenesis. We also addressed sub-cellular localization and signal peptide identification, which give an idea of the localization and function of each protein. The outcome of the present study may facilitate a better understanding of the mechanisms of virulence, drug resistance, pathogenesis, adaptability to the host and tolerance of the host immune response, and may aid drug discovery for the treatment of C. dubliniensis infections. © 2014 Elsevier B.V. All rights reserved.

Abbreviations: HP, hypothetical protein; BLAST, basic local alignment search tool; PSI-BLAST, position specific iterative basic local alignment search tool; HMMTOP, prediction of transmembrane helices and topology of proteins; TMHMM, membrane protein topology prediction method based on a hidden Markov model; CATH, class, architecture, topology and homology; GRAVY, grand average of hydropathicity; CDD, Conserved Domain Database; SMART, simple modular architecture research tool; PANTHER, Protein ANalysis THrough Evolutionary Relationships; SVM, Support Vector Machine; PP2C, protein phosphatase 2C; SAM, S-adenosyl methionine; DGK, diacylglycerol kinase; CMD, carboxymuconolactone decarboxylase; MFS, major facilitator superfamily.

⁎ Corresponding author. E-mail address: [email protected] (M.I. Hassan).

http://dx.doi.org/10.1016/j.gene.2014.03.060 0378-1119/© 2014 Elsevier B.V. All rights reserved.

Please cite this article as: Kumar, K., et al., Functional annotation of putative hypothetical proteins from Candida dubliniensis, Gene (2014), http://dx.doi.org/10.1016/j.gene.2014.03.060

1. Introduction

Candida dubliniensis is a germ tube-positive yeast that acts as an opportunistic pathogen (Sebti et al., 2001). Normally, this species of Candida is harmless in different body parts, but it may become virulent under certain conditions (Sullivan et al., 2005). Candida is present in most body parts, including the oral cavity, urine, vagina, lung, feces and sputum, especially in immunocompromised individuals and HIV-infected patients (Sebti et al., 2001). Clinically, 2 to 7% of candidemia cases caused by C. dubliniensis showed its presence in the gastrointestinal tract. Candida causes a wide range of infections, from superficial infections of the vaginal and oral mucosa to serious systemic infections (Sullivan et al., 2005). These infections are usually countered with antifungal drugs; however, treatment becomes more difficult as resistance to antifungal agents develops (Moran et al., 1997). Furthermore, the close phenotypic resemblance of C. dubliniensis to Candida albicans makes clinical diagnosis more difficult (O'Connor et al., 2010). Although C. dubliniensis is less pathogenic than C. albicans, its ability to produce hyphae and its longer survival time contribute to its pathogenicity (Jackson et al., 2009). Hence, this species is a prime target for the investigation of fungal infection, especially in conditions of low immunity and frequent development of resistance to antifungal agents (Sullivan and Coleman, 1998).

Recently, the genome of C. dubliniensis was sequenced, opening a promising new channel for extensive research (Jackson et al., 2009). The genome of C. dubliniensis comprises eight chromosomes, assembled from 262288 reads, with a total length of 14.6 Mb. An extensive analysis of the C. dubliniensis genome led to the identification of 1323 proteins as hypothetical out of 5860 open reading frames (Jackson et al., 2009). HPs are predicted from open reading frames but have no experimental evidence of translation or functional annotation (Nimrod et al., 2008). Nearly half of the proteins in most genomes are HPs, and they are important for completing genomic and proteomic information (Loewenstein et al., 2009). Recent studies suggest many significant roles for HPs, because they constitute a considerable fraction of proteomes and there is a reasonable probability that these proteins are novel, with uncharacterized biological roles (Adams et al., 2007; Desler et al., 2009; Eisenstein et al., 2000). HPs generally show low sequence identity to known or annotated proteins (Galperin and Koonin, 2004). However, recent studies showed that a large fraction of genes encoding HPs have strong phylogenetic linkages with known proteins (Mazandu and Mulder, 2012; Shahbaaz et al., 2013). Furthermore, we have been working on structure-based drug design and searching for novel therapeutic targets (Hassan et al., 2007a, 2007b; Thakur and Hassan, 2011; Thakur et al., 2013). Therefore, HPs may also serve as markers and potential drug targets for drug design, discovery and screening. A precise annotation of the HPs of a particular genome leads to the discovery of new functions and helps to bring out a list of additional protein pathways and cascades, thus completing our fragmentary knowledge of the biological significance of many novel proteins. The use of advanced bioinformatics tools for sequence analysis is an initial step in identifying homology shared between proteins, which could lead to a robust function prediction.

Here, we have successfully characterized 43 HPs of C. dubliniensis using various computational tools. Preliminary sequence analysis of all 43 HPs was carried out using BLAST-P, PSI-BLAST, Pfam and CDD search. Their functions were inferred on the basis of the presence of specific motifs, important region(s) and specific folds, using InterProScan, InterPro, ScanProsite and PFP-FunDSeqE. Other bioinformatics tools such as ProtParam, HMMTOP, TMHMM, SOSUI and CATH were used to define physicochemical properties, subcellular localization and protein family. Furthermore, adhesin-like proteins, or human pathogenic fungal adhesins, were identified with FungalRV. C. dubliniensis is one of the major causative agents of infection in immunocompromised individuals, especially HIV/AIDS patients. Therefore, functional annotation of HPs may lead to the identification of novel targets for better treatment and understanding of C. dubliniensis infections.

2. Materials and methods

2.1. Sequence retrieval and homology search

The search for HP sequences of C. dubliniensis was carried out on the UniProt database (http://www.uniprot.org/). The FASTA sequences of 43 HPs, along with their UniProt IDs and primary accession numbers, were retrieved separately for sequence analysis. The UniProt ID of each protein was used to identify its sequence. Table 1 lists all tools and software used for the functional annotation of HPs from C. dubliniensis. We used BLAST-P and PSI-BLAST to search for similar sequences with known function (Altschul and Koonin, 1998; Altschul et al., 1997). Top hits were selected and further analyzed using ClustalW to align the functional residues of proteins of known function with the sequences of HPs (Thompson et al., 2002).

2.2. Physicochemical characterization

Theoretical physicochemical parameters such as molecular weight, isoelectric point, aliphatic index, instability index and grand average of hydropathicity (GRAVY) were calculated for each protein on Expasy's ProtParam server (http://web.expasy.org/protparam/). Results of this analysis are listed in Table S1.

2.3. Sub-cellular localization

To identify a protein as a drug or vaccine target, knowledge of its sub-cellular localization is essential. Surface membrane proteins can be used as potential vaccine targets, while cytoplasmic proteins may act as promising drug targets (Vahisalu et al., 2008). We used the PSORT II tool (Nakai and Horton, 1999) to predict sub-cellular localization. The online tools TMHMM and HMMTOP, which are based on hidden Markov models, and SOSUI were used to predict the propensity of a protein to be a membrane protein (Chen et al., 2003; Hirokawa et al., 1998). SignalP 4.1 (Petersen et al., 2011) was used to predict signal peptides and the location of their cleavage sites in the peptide chain, based on a neural network method. Results of these predictions are summarized in Table 2.

2.4. Function prediction

In order to assign a precise function to HPs from C. dubliniensis, we first analyzed all sequences with the Conserved Domain Database (CDD) (Marchler-Bauer et al., 2011), SMART (Letunic et al., 2012), ScanProsite, CATH and PANTHER. CDD includes manually curated domain models based on the tertiary structure of proteins, providing sequence/structure/function relationships in an organized hierarchy of families and superfamilies (Marchler-Bauer et al., 2011). SMART compares

Table 1
List of bioinformatics tools and databases used for function prediction.

S. no. | Tool | URL | Use

1. Sequence similarity search tools
i. | BLAST | http://blast.ncbi.nlm.nih.gov/Blast.cgi | To find similar sequences in the gene database
ii. | ClustalW2 | https://www.ebi.ac.uk/Tools/msa/clustalw2/ | Sequence comparison to compare homologous regions

2. Biophysical and chemical characterization
i. | ProtParam | http://web.expasy.org/protparam/ | To calculate various physical and chemical parameters for a given protein sequence

3. Function prediction
i. | Conserved Domain | http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi | Used to search for conserved domains in the sequences
ii. | InterProScan | http://www.ebi.ac.uk/Tools/pfa/iprscan/ | For functional analysis of amino acid sequences by finding specific motifs in the sequences
iii. | InterPro | http://www.ebi.ac.uk/interpro/ | For functional analysis of proteins on the basis of protein family categorization by predicting domains and important sites
iv. | ScanProsite | http://prosite.expasy.org/scanprosite/ | Used to scan profiles based on domains, motifs and patterns
v. | PANTHER | http://www.pantherdb.org/ | Classifies proteins on the basis of evolutionary relationships and biological process
vi. | Pfam | http://pfam.sanger.ac.uk/ | Classifies proteins into families on the basis of multiple sequence alignment
vii. | SMART | http://smart.embl-heidelberg.de/ | Allows analysis of the domains in protein sequences

5. Sub-cellular localization of the protein
i. | SOSUI | http://bp.nuap.nagoya-u.ac.jp/sosui/sosui_submit.html | Used to identify whether a given protein sequence is that of a soluble protein or of a transmembrane protein
ii. | TMHMM | http://www.cbs.dtu.dk/services/TMHMM/ | Used to predict the transmembrane topology of the protein
iii. | PSORT II | http://psort.hgc.jp/form2.html | Used to predict sub-cellular localization with good reliability
iv. | SignalP | http://www.cbs.dtu.dk/services/SignalP/ | Predicts the cleavage site of the signal peptide
v. | HMMTOP | http://www.enzim.hu/hmmtop/index.php | Predicts transmembrane helices and topology of the protein

6. Prediction of fold pattern
i. | PFP-FunDSeqE | http://www.csbio.sjtu.edu.cn/bioinf/PFP-FunDSeqE/ | Used to find the type of protein fold in the protein sequence

7. Virulence prediction
i. | FungalRV | fungalrv.igib.res.in/query.php | Used in adhesin prediction
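As an illustration of the kind of calculation behind the ProtParam entry in Table 1, the sketch below computes two of the parameters named in Section 2.2, GRAVY and the aliphatic index, directly from a sequence. It is a minimal sketch under stated assumptions, not the ProtParam implementation: GRAVY is taken as the mean Kyte-Doolittle hydropathy value, the aliphatic index uses the standard coefficients 2.9 and 3.9, and the example sequence is hypothetical rather than one of the 43 HPs.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[res] for res in seq) / len(seq)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    seq = sequence.upper()

    def mole_percent(res: str) -> float:
        return 100.0 * seq.count(res) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    example = "MKTLLVAGGSIVA"  # hypothetical sequence, for illustration only
    print(f"GRAVY = {gravy(example):.3f}")
    print(f"Aliphatic index = {aliphatic_index(example):.2f}")
```

A positive GRAVY value indicates an overall hydrophobic sequence and a negative value a hydrophilic one. Sequences containing non-standard residue codes would raise a `KeyError` in this sketch and would need filtering first.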


Table 2
Sub-cellular localization of HPs.

S. no. | UniProt ID | HMMTOP | SOSUI | TMHMM | SignalP | Psort
1 | B9W9J1 | NIL | S | NIL | NSP | Nuclear
2 | B9WBA5 | 3 TMH | M, 2 TMH | 2 TMH | NSP | Nuclear
3 | B9WFD2 | NIL | S | NIL | NSP | Nuclear
4 | B9WFD7 | NIL | S | NIL | NSP | Nuclear
5 | B9WFE4 | 1 TMH | S | NIL | NSP | Nuclear
6 | B9WFE6 | NIL | S | NIL | NSP | Nuclear
7 | B9WFF1 | NIL | S | NIL | NSP | Nuclear
8 | B9WFF7 | NIL | S | NIL | NSP | Nuclear
9 | B9WFG4 | NIL | S | NIL | NSP | Nuclear
10 | B9WFG8 | 1 TMH | S | NIL | SP | Nuclear
11 | B9WFG9 | NIL | S | NIL | NSP | Nuclear
12 | B9WFH2 | 10 TMH | M, 10 TMH | 9 TMH | NSP | Cytoplasmic
13 | B9WFH4 | NIL | S | NIL | NSP | Nuclear
14 | B9WFM3 | 1 TMH | S | NIL | NSP | Nuclear
15 | B9WFP3 | 1 TMH | S | 1 TMH | NSP | Nuclear
16 | B9WFR1 | NIL | S | NIL | NSP | Nuclear
17 | B9WFR8 | 4 TMH | M, 1 TMH | NIL | NSP | Nuclear
18 | B9WFR9 | 2 TMH | S | NIL | NSP | Cytoplasmic
19 | B9WFS0 | NIL | S | NIL | NSP | Cytoplasmic
20 | B9WFS1 | NIL | S | NIL | NSP | Cytoplasmic
21 | B9WFS2 | 2 TMH | S | NIL | NSP | Nuclear
22 | B9WFS4 | 2 TMH | M, 1 TMH | 2 TMH | NSP | Cytoplasmic
23 | B9WFS6 | NIL | S | NIL | NSP | Nuclear
24 | B9WFT3 | 1 TMH | M, 1 TMH | 1 TMH | NSP | Nuclear
25 | B9WFT7 | 1 TMH | M, 2 TMH | NIL | SP | Nuclear
26 | B9WFT8 | NIL | M, 1 TMH | 1 TMH | NSP | Nuclear
27 | B9WFU3 | 1 TMH | M, 1 TMH | NIL | SP | Nuclear
28 | B9WFU7 | NIL | S | NIL | NSP | Nuclear
29 | B9WFU9 | NIL | S | NIL | NSP | Cytoplasmic
30 | B9WFV3 | NIL | S | NIL | NSP | Nuclear
31 | B9WFV7 | 2 TMH | M, 2 TMH | 2 TMH | NSP | Cytoplasmic
32 | B9WFW2 | NIL | S | NIL | NSP | Cytoplasmic
33 | B9WFW8 | NIL | S | NIL | NSP | Nuclear
34 | B9WFX1 | NIL | S | NIL | NSP | Nuclear
35 | B9WIA6 | NIL | S | NIL | SP | Nuclear
36 | B9WIB2 | 2 TMH | M, 1 TMH | 2 TMH | NSP | Cytoplasmic
37 | B9WIC3 | NIL | S | NIL | NSP | Nuclear
38 | B9WIF0 | NIL | S | NIL | NSP | Nuclear
39 | B9WIF4 | 1 TMH | M, 1 TMH | 1 TMH | NSP | Nuclear
40 | B9WIG1 | NIL | S | NIL | NSP | Nuclear
41 | B9WIG2 | NIL | S | NIL | NSP | Nuclear
42 | B9WIH4 | 1 TMH | S | 1 TMH | NSP | Nuclear
43 | B9WIH5 | 1 TMH | M, 2 TMH | 1 TMH | SP | Nuclear

TMH — transmembrane helix.
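The three transmembrane predictors in Table 2 do not always agree, so one way to read the table is to take a simple majority vote per protein. The sketch below illustrates that idea on a handful of rows transcribed from Table 2. The two-out-of-three rule and the function name `consensus_membrane` are assumptions made for illustration; the study does not state that such a rule was applied.

```python
from typing import Dict, Tuple

# Predicted transmembrane helix counts as (HMMTOP, SOSUI, TMHMM),
# transcribed from Table 2; 0 stands for "NIL" or a soluble ("S") call.
TABLE2_SUBSET: Dict[str, Tuple[int, int, int]] = {
    "B9W9J1": (0, 0, 0),
    "B9WBA5": (3, 2, 2),
    "B9WFH2": (10, 10, 9),
    "B9WFR8": (4, 1, 0),
    "B9WIH4": (1, 0, 1),
}


def consensus_membrane(helix_counts: Tuple[int, ...], min_votes: int = 2) -> bool:
    """Return True when at least `min_votes` tools predict one or more helices.

    This majority rule is an illustrative assumption, not the authors' method.
    """
    votes = sum(1 for count in helix_counts if count > 0)
    return votes >= min_votes


if __name__ == "__main__":
    for uniprot_id, counts in TABLE2_SUBSET.items():
        label = "membrane" if consensus_membrane(counts) else "soluble"
        print(uniprot_id, counts, label)
```

Raising `min_votes` to 3 gives a stricter call that only accepts proteins on which all three tools agree.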

the query sequence with the database and searches for sequences with similar domains on the basis of domain architecture and profiles. SMART performs multiple sequence alignment and identifies compositionally biased regions in the sequence, such as transmembrane segments, coiled-coil regions and signal peptides (Letunic et al., 2012). ScanProsite is a publicly available web-based tool for scanning sequences against PROSITE profiles and patterns, which describe protein domains, families and functional sites that are structurally and functionally critical. CATH groups structurally related proteins even when their sequence identity is low (Orengo et al., 1997). Likewise, PANTHER is a wide-ranging, curated database of protein families, subfamilies and trees, and was used to find evolutionary relationships from which to deduce the functionality of HPs (Thomas et al., 2003). In a protein, motifs are signatures of function that can be used as a basis for defining protein families, particularly for enzymes, in which motifs are associated with catalytic function (Bork and Koonin, 1996). We used InterProScan (Quevillon et al., 2005), which combines different protein signature recognition methods from the InterPro consortium, for motif discovery. We used the web server PFP-FunDSeqE to identify protein fold patterns on the basis of a combination of functional domain information and evolutionary information.
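Pattern-based motif scanning of the kind ScanProsite performs can be illustrated by translating a PROSITE-style pattern into a regular expression. The sketch below is a simplified illustration, not the ScanProsite implementation: it handles only literal residues, `x`, `[..]` classes, `{..}` exclusions and repeat counts, and the example sequences are hypothetical. The pattern `[AG]-x(4)-G-K-[ST]` is the PROSITE P-loop (Walker A) pattern, and `C-x(2)-C` expresses a CXXC motif.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"Unsupported pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                      # any residue
            token = "."
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"  # excluded residues
        else:
            token = core                     # literal residue or [..] class
        if low is not None:                  # repeat count, e.g. x(4) or x(2,4)
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


def find_motif(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (1-based start, matched segment) for every match, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical sequences, used only to exercise the two patterns.
    print(find_motif("MSGAPGSGKSTLA", "[AG]-x(4)-G-K-[ST]"))
    print(find_motif("WCGHCK", "C-x(2)-C"))
```

The lookahead wrapper lets overlapping occurrences be reported, which a plain `finditer` over the pattern would skip.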


2.5. Virulence factor analysis

Adhesins are considered potential targets for vaccine development because they are essential factors that enable the fungus to interact with the host cell and cause infection. FungalRV (Chaudhuri et al., 2011) is a tool based on the Support Vector Machine (SVM) method, trained on a large number of compositional properties, that is used to classify human pathogenic fungal adhesins and adhesin-like proteins.

3. Results

3.1. Sequence analysis

In the present study, we systematically analyzed the sequences of 43 HPs from the C. dubliniensis genome using modern bioinformatics tools. BLAST-P, PSI-BLAST, Pfam, CDD search, InterProScan, InterPro and PFP-FunDSeqE were used for the functional annotation of these HPs. We successfully assigned functions to 27 HPs (Fig. 1, Table 3). We found a well-defined domain in 22 HPs, with corresponding functions (Table 4). All 43 HPs were characterized for their folding patterns, and the types of fold present in each protein are listed in Table 5 and Fig. 2. Interestingly, 19 HPs showed a close resemblance to immunoglobulin-type proteins. In contrast, HPs B9WFG8, B9WFU7, B9WFX1, B9WIG1 and B9WIH5 showed a close structural resemblance to viral coat and capsid proteins. A few HPs have TIM-barrel and thioredoxin-like folds. The adhesin-like character studied using FungalRV showed that, of the 43 proteins, five may have an adhesin-like signature, indicating a role in pathogenesis. Furthermore, eight HPs have nucleic acid-binding properties, of which three are RNA-binding and five are DNA-binding proteins (Table 6). Many HPs possess enzymatic activities and are categorized as hydrolases, phosphatases, transferases, kinases, oxidoreductases and peroxiredoxins. B9WFE4 and B9WFD2 showed ATP- and phospholipid-binding activities, respectively. HPs B9WFH2 and B9WIH5 may act as transporter proteins. Here, we provide a detailed analysis of each group of proteins.

Fig. 1. HPs classified into different groups based on their functions. Legend categories: enzyme; DNA-binding protein; RNA-binding protein; protein binding; ATP binding; phosphoinositide binding; transport; structural.

3.2. Enzymes

Enzymes produced by the yeast play a key role in its survival in the host, because they provide nutrients for growth and are responsible for pathogenesis. Enzymes modify the local environment to favor growth inside the host and are essential for various metabolic processes. These enzymes may also affect the physiology of the organism (Bjornson, 1984). We found 17 HPs showing catalytic activity, and these have been categorized into six classes. A detailed knowledge of these enzymes is important for understanding the molecular basis of pathogenesis and host–pathogen interaction.

3.2.1. Hydrolase

Hydrolytic enzymes play key roles in invading host tissue and evading host defense mechanisms. They are an important virulence factor in vaginal infection caused by C. albicans (Schaller et al., 2005). In our study, we found that HP B9WFS1 comprises an α/β hydrolase fold and possesses hydrolytic activity (Marchler-Bauer et al., 2011). This fold is very common in hydrolytic enzymes of different phylogenetic origins and catalytic functions; however, they all have a similar core architecture and topology (Ollis et al., 1992). These enzymes have a catalytic triad of three specific residues, namely serine,


Table 3
Predicted function of HPs from Candida dubliniensis.

S. no. | Gene ID | UniProt ID | Protein function
1 | 8045310 | B9W9J1 | Peroxiredoxin activity
2 | 8047341 | B9WFD2 | Phosphoinositide binding
3 | 8047346 | B9WFD7 | Structural protein
4 | 8047351 | B9WFE4 | ATP binding
5 | 8047353 | B9WFE6 | RNA binding
6 | 8047358 | B9WFF1 | Protein binding
7 | 8047371 | B9WFG4 | Phosphatase
8 | 8047376 | B9WFG9 | RNA binding
9 | 8047379 | B9WFH2 | Transporter activity
10 | 8047381 | B9WFH4 | Kinase activity
11 | 8047605 | B9WFM3 | Protein binding
12 | 8047460 | B9WFR1 | Transferase activity
13 | 8047467 | B9WFR8 | DNA binding
14 | 8047468 | B9WFR9 | Transferase activity
15 | 8047470 | B9WFS1 | Hydrolase activity
16 | 8047471 | B9WFS2 | DNA binding
17 | 8047600 | B9WFS4 | Protein binding
18 | 8047474 | B9WFS6 | RNA binding
19 | 8047491 | B9WFU3 | Oxidoreductase activity
20 | 8047495 | B9WFU7 | DNA binding
21 | 8047497 | B9WFU9 | Hydrolase
22 | 8047654 | B9WFV3 | DNA binding
23 | 8047663 | B9WFW2 | Kinase activity
24 | 8047669 | B9WFW8 | DNA binding
25 | 8048708 | B9WIB2 | Protein binding
26 | 8048742 | B9WIF0 | Oxidoreductase activity
27 | 8048763 | B9WIH5 | Transport activity

Table 4
List of domains identified in the HPs from Candida dubliniensis.

S. no. | UniProt ID | Conserved domain (superfamily)
1 | B9W9J1 | Carboxymuconolactone decarboxylase (CMD)
2 | B9WBA5 | Hypothetical protein FLILHELTA
3 | B9WFD2 | ANTH domain family
4 | B9WFE4 | Archaeal ATPase (a)
5 | B9WFG4 | PP2Cc super family
6 | B9WFG8 | Lipoprotein A (RlpA)-like double-psi beta-barrel
7 | B9WFG9 | PIN domain
8 | B9WFH2 | (1) Major facilitator superfamily (MFS) (2) Sugar (and other) transporter (a)
9 | B9WFH4 | (1) Diacylglycerol kinase catalytic domain (DAG) (2) LCB5; Sphingosine kinase and enzymes (a)
10 | B9WFM3 | CUE domain
11 | B9WFP3 | (1) Oxidoreductase-like protein, N-terminal (2) Oxidoreductase-like protein, N-terminal
12 | B9WFR8 | (1) GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain (2) GAL4-like Zn(II)2Cys6 (C6 zinc) binuclear cluster DNA-binding domain (a)
13 | B9WFR9 | (1) CoA-transferase family III (2) Predicted acyl-CoA transferases/carnitine dehydratase (a)
14 | B9WFS1 | (1) Putative lysophospholipase (2) Alpha/beta hydrolase family (a)
15 | B9WFS2 | (1) Fungal transcription factor regulatory middle homology region (2) GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain (3) GAL4-like Zn(II)2Cys6 (C6 zinc) binuclear cluster DNA-binding domain (a)
16 | B9WFS4 | Chaperone for protein-folding within the ER, fungal (a)
17 | B9WFS6 | Putative RNA methyltransferase
18 | B9WFU3 | (1) Protein disulfide isomerase (PDIa) family (2) Protein disulfide oxidoreductases and proteins with a thioredoxin fold
19 | B9WFU7 | (1) Fungal transcription factor regulatory middle homology region (2) GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain (3) GAL4-like Zn(II)2Cys6 (C6 zinc) binuclear cluster DNA-binding domain (a)
20 | B9WFV3 | (1) GAL4-like Zn2Cys6 binuclear cluster DNA-binding domain (2) GAL4-like Zn(II)2Cys6 (C6 zinc) binuclear cluster DNA-binding domain (a)
21 | B9WFW2 | (1) Yersinia pseudotuberculosis carbohydrate kinase-like subgroup (2) Nucleotide-binding domain of the sugar kinase/HSP70/actin superfamily (3) FGGY-family pentulose kinase (a)
22 | B9WFW8 | Rad17 cell cycle checkpoint protein (a)

glutamate or aspartate and a histidine, in their catalytic domain (Marchler-Bauer et al., 2011), a signature of serine proteases (Chen and Bode, 1983). B9WFU9 shows the presence of an N-glycanase signature sequence (Quevillon et al., 2005) and possesses hydrolase activity, similar to P21163A, an amidase found in the bacterium Elizabethkingia miricola that cleaves the β-aspartylglucosylamine bond of asparagine-linked glycans (Kuhn et al., 1994), essential for pathogenesis. Our sequence-based function analysis clearly indicates the presence of various previously unknown hydrolases that may be involved in pathogenesis and essential for the survival of Candida.

3.2.2. Phosphatase

A decrease in the phosphate concentration of the host may lead to an increase in the virulence of pathogens such as C. albicans, Candida glabrata and Saccharomyces cerevisiae (Powell et al., 2012). Phosphatase enzymes secreted by these pathogens deplete phosphate in the local environment of the infection site to enhance their pathogenicity. Protein B9WFG4 has a conserved domain shared with protein phosphatase 2C (PP2C), a major family of serine/threonine phosphatases (Marchler-Bauer et al., 2011; Quevillon et al., 2005). This is a Mn2+- or Mg2+-dependent Ser/Thr phosphatase essential for regulating cellular stress responses in eukaryotes (Das et al., 1996), similar to PTC1 of S. cerevisiae, which has shown similar activity and has a catalytic domain conserved with PP2C (Maeda et al., 1993). However, PTC1 requires a higher concentration of divalent ions than PP2C to function (Maeda et al., 1993).

3.2.3. Transferase

In yeast, some transferases have been found to play a significant role against oxidative stress (Garcera et al., 2010). In our study, protein B9WFR1 shows a signature domain of the methylase subunit of type I DNA methyltransferase (Quevillon et al., 2005), which is presumably involved in adenine-specific DNA methyltransferase activity (Quevillon et al., 2005). In humans, N-6 adenine-specific DNA methyltransferase 1 is orthologous to the yeast MTQ2 gene and encodes an S-adenosyl methionine (SAM)-dependent methyltransferase that participates in arsenic metabolism to detoxify the cytotoxic monomethylarsonous acid (Ren et al., 2011). Similarly, HP

a: Multi-domain protein (footnote to Table 4).

B9WFR9 also shows transferase activity. A CoA-transferase family III domain was found in this protein (Marchler-Bauer et al., 2011; Quevillon et al., 2005). Formyl-CoA transferase is found in

Table 5
Different types of folds identified in HPs from Candida dubliniensis.

S. no. | Fold type | UniProt ID
1 | Beta-trefoil | B9W9J1, B9WFG9
2 | Small inhibitors, toxins, lectins | B9WBA5, B9WFS4
3 | Immunoglobulin-like | B9WFD2, B9WFE4, B9WFF7, B9WFG4, B9WFH2, B9WFH4, B9WFP3, B9WFR1, B9WFS0, B9WFS6, B9WFT3, B9WFT7, B9WFT8, B9WFV7, B9WIB2, B9WIC3, B9WIF4, B9WIG2, B9WIH4
4 | DNA binding 3-helical | B9WFD7, B9WFR8
5 | 4-helical cytokines | B9WFE6, B9WFS2
6 | OB-fold | B9WFF1
7 | Viral coat and capsid proteins | B9WFG8, B9WFU7, B9WFX1, B9WIG1, B9WIH5
8 | 4-helical up and down bundle | B9WFM3
9 | TIM-barrel | B9WFR9, B9WFW8
10 | Hydrolases | B9WFS1
11 | Thioredoxin-like | B9WFU3, B9WIF0
12 | Beta-grasp | B9WFU9
13 | Cupredoxins | B9WFV3
14 | Ribonuclease H-like motif | B9WFW2
15 | ConA-like lectin/glucanases | B9WIA6


Fig. 2. HPs classified on the basis of types of fold present. Legend categories: beta-trefoil; small inhibitors, toxins, lectins; immunoglobulin-like; DNA binding 3-helical; 4-helical cytokines; OB-fold; viral coat and capsid proteins; 4-helical up and down bundle; TIM-barrel; hydrolase; thioredoxin-like; beta-grasp; cupredoxins; ribonuclease H-like motif; ConA-like lectin/glucanases.

Oxalobacter formigenes, a bacterium that is present in the mammalian intestine and is involved in oxalate catabolism, where it catalyzes the transfer of CoA from formyl-CoA to oxalate in the first step of oxalate degradation (Ricagno et al., 2003). This is the key enzyme for oxalate-dependent ATP synthesis.

3.2.4. Kinase activity

Kinases play an essential role in cell cycle regulation, filamentous growth and signal transduction in Candida (Bruckmann et al., 2000; Monge et al., 2006). HP B9WFH4 has a catalytic domain for diacylglycerol kinase (DGK) (Marchler-Bauer et al., 2011; Quevillon et al., 2005). Proteins belonging to this family are known to prevent activation of protein kinase C by converting diacylglycerol, a protein kinase C activator, to phosphatidic acid (Bakali et al., 2007). More than ten isozymes of DGK have been reported. These isoforms possess a variety of regulatory domains that play key roles in signal transduction pathways, neural and immune responses, cytoskeleton reorganization and carcinogenesis (Sakane et al., 2007). HP B9WFH4 also has a domain of sphingosine kinase, a DGK-related enzyme (Marchler-Bauer et al., 2011).

Table 6
Functional categories of HPs.

Predicted functions | HPs

Enzymatic activity
Hydrolase activity | B9WFS1, B9WFU9
Phosphatase activity | B9WFG4
Transferase activity | B9WFR1, B9WFR9
Kinase activity | B9WFH4, B9WFW2
Oxidoreductase activity | B9WIF0, B9WFU3
Peroxiredoxin activity | B9W9J1

Binding protein
DNA binding | B9WFR8, B9WFS2, B9WFU7, B9WFV3, B9WFW8
RNA binding | B9WFE6, B9WFS6, B9WFG9
Protein binding | B9WFF1, B9WFM3, B9WFS4, B9WIB2
ATP binding | B9WFE4
Phosphoinositide binding | B9WFD2

Other proteins
Transport activity | B9WFH2, B9WIH5
Structural protein | B9WFD7
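The grouping in Table 6 can be held in a small nested mapping so that per-group counts are easy to check. The sketch below transcribes the table and tallies it. The data structure and function names are illustrative choices and are not part of the original analysis.

```python
from typing import Dict, List

# Table 6 transcribed as {group: {predicted function: [UniProt IDs]}}.
TABLE6: Dict[str, Dict[str, List[str]]] = {
    "Enzymatic activity": {
        "Hydrolase activity": ["B9WFS1", "B9WFU9"],
        "Phosphatase activity": ["B9WFG4"],
        "Transferase activity": ["B9WFR1", "B9WFR9"],
        "Kinase activity": ["B9WFH4", "B9WFW2"],
        "Oxidoreductase activity": ["B9WIF0", "B9WFU3"],
        "Peroxiredoxin activity": ["B9W9J1"],
    },
    "Binding protein": {
        "DNA binding": ["B9WFR8", "B9WFS2", "B9WFU7", "B9WFV3", "B9WFW8"],
        "RNA binding": ["B9WFE6", "B9WFS6", "B9WFG9"],
        "Protein binding": ["B9WFF1", "B9WFM3", "B9WFS4", "B9WIB2"],
        "ATP binding": ["B9WFE4"],
        "Phosphoinositide binding": ["B9WFD2"],
    },
    "Other proteins": {
        "Transport activity": ["B9WFH2", "B9WIH5"],
        "Structural protein": ["B9WFD7"],
    },
}


def count_by_group(table: Dict[str, Dict[str, List[str]]]) -> Dict[str, int]:
    """Number of HPs listed under each top-level group."""
    return {group: sum(len(ids) for ids in functions.values())
            for group, functions in table.items()}


def all_ids(table: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Flat list of every UniProt ID in the table."""
    return [uid for functions in table.values()
            for ids in functions.values() for uid in ids]


if __name__ == "__main__":
    print(count_by_group(TABLE6))
    print("Total annotated HPs:", len(all_ids(TABLE6)))
```

The total of the listed entries is 27, in line with the number of HPs reported as functionally annotated in Table 3, and the DNA- and RNA-binding rows together account for the eight nucleic acid-binding HPs.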

Recently, sphingosine kinase has been reported as an oncogene and a potential drug target in neoplastic disease. This kinase plays a significant role in pro-inflammatory and anti-apoptotic pathways (Bakali et al., 2007). Another protein showing kinase activity is HP B9WFW2. This protein has a domain of the Yersinia pseudotuberculosis carbohydrate kinase-like subgroup, which belongs to the FGGY family of carbohydrate kinases (Marchler-Bauer et al., 2011; Quevillon et al., 2005). Proteins of this family catalyze ATP-dependent phosphorylation in the presence of Mg2+ (Lim and Cohen, 1966).

3.2.5. Oxidoreductase

We found that HP B9WIF0 contains a domain signature for NADH–ubiquinone oxidoreductase, suggesting that it may be involved in dehydrogenase activity (Quevillon et al., 2005). Another HP, B9WFU3, is expected to show oxidoreductase activity because it has a redox-active TRX domain. It contains a CXXC motif, a signature of the protein disulfide isomerase (PDIa) family (Marchler-Bauer et al., 2011; Quevillon et al., 2005). Members of this family act as oxidases by catalyzing the formation of disulfide bonds in polypeptides in the endoplasmic reticulum, and as isomerases by correcting non-native disulfide bonds (Ellgaard and Ruddock, 2005). Such proteins also show chaperone activity (Ferrari and Soling, 1999). In S. cerevisiae, PDI plays an essential role in the isomerization of disulfide bonds, along with its redox activity, for substrates such as carboxypeptidase Y. The role of a periplasmic disulfide oxidoreductase in pathogenesis is already well established in the case of Haemophilus influenzae (Rosadini et al., 2008), indicating the significance of oxidoreductase enzymes as potential therapeutic targets.

3.2.6. Peroxiredoxin activity

HP B9W9J1 is predicted to be an enzyme with peroxiredoxin activity because it has a carboxymuconolactone decarboxylase (CMD) domain (Marchler-Bauer et al., 2011; Quevillon et al., 2005). Proteins of this family play a vital role in aromatic compound degradation under aerobic conditions in bacteria, as they are involved in protocatechuate catabolism via the 3-oxoadipate pathway (Eulberg et al., 1998). Alkyl hydroperoxide reductase shows antioxidant activity with hydroperoxidase activity and, together with proteins such as AhpC, DlaT and Lpd, constitutes an NADH-dependent peroxidase that protects the bacterium from reactive nitrogen species


by serving as a peroxynitrite reductase (Bryk et al., 2002). Furthermore, mycobacterial peroxiredoxin AhpC, a member of the family of non-heme peroxidases, protects heterologous bacterial and human cells from oxidative and nitrosative injury (Chen et al., 1998). These observations clearly indicate a potential role in pathogenesis for enzymes with a CMD domain.

3.3. Binding proteins

3.3.1. Nucleic acid binding protein
Five HPs are predicted to be DNA-binding proteins. HP B9WFR8 has a domain similar to a fungal-specific transcription factor domain, found for example in the transcriptional activator xlnR, and a Zn2+-Cys6 binuclear cluster DNA-binding domain. These domains are present in GAL4, a transcriptional regulator, and contain a Zn2+-Cys6 motif that binds to sequences containing two DNA half-sites, each made up of 3–5 C/G combinations (Marmorstein et al., 1992). These domains bind, together with zinc, to the major groove of DNA (Marmorstein et al., 1992). Similarly, HP B9WFS2 contains a GAL4 domain along with the fungal transcription factor regulatory middle homology region (fungal_TF_MHR) (Marchler-Bauer et al., 2011). Fungal_TF_MHR is found in a large family of fungal zinc cluster transcription factors that have an N-terminal GAL4-like C6 zinc binuclear cluster DNA-binding domain. This protein showed 84% sequence identity with its known orthologue ZCF25 of C. albicans (Letunic et al., 2012; Powell et al., 2012). Interestingly, these domains are also conserved in HP B9WFU7; however, no significant hits were obtained in the search for orthologous proteins. The GAL4 domain is also conserved in the DNA-binding protein HP B9WFV3, which showed significant sequence similarity with the orthologous protein RGT1 of S. cerevisiae and C. albicans (Letunic et al., 2012). HP B9WFW8 showed high similarity with the RAD24 protein of S. cerevisiae and C. albicans and is presumably involved in the DNA damage checkpoint mechanism (Marchler-Bauer et al., 2011; Quevillon et al., 2005).

HP B9WFG9 shows close sequence similarity with the PIN (PilT N terminus) domain of RBP1, a well-characterized protein of S. cerevisiae. It is involved in nonsense-mediated mRNA decay and binds either RNA or single-stranded DNA (Lee and Moss, 1993; Letunic et al., 2012; Marchler-Bauer et al., 2011). Similarly, HP B9WFE6 was predicted to be an RNA-binding protein and showed close sequence similarity to the tobacco mosaic virus RNA-binding protein, which presumably plays a significant role in cell-to-cell movement during the early stage of infection (Gafny et al., 1992; Quevillon et al., 2005). HP B9WFS6 contains a domain resembling an RNA methyltransferase and may be involved in RNA methylation (Zarembinski et al., 2003).

3.3.2. Protein binding
HP B9WFF1 showed a motif resembling the 'tetratrico peptide repeat', which mediates protein–protein interactions and the assembly of multi-protein complexes. Proteins with this motif are generally involved in neurogenesis, cell cycle regulation, transcriptional control, mitochondrial and peroxisomal transport, and protein folding (Lamb et al., 1995). HP B9WFM3 has a CUE-like domain, which is involved in ubiquitin interaction. It also shows similarity with the interleukin 1 protein and is involved in a signal transduction pathway (Donaldson et al., 2003). The sequence of HP B9WFS4 is conserved with the fungal chaperone Rot1, an essential molecular chaperone found in the membrane of the endoplasmic reticulum of S. cerevisiae. Molecular chaperones are involved in the folding of denatured proteins in vivo and prevent self-aggregation of proteins in vitro (Marchler-Bauer et al., 2011; Quevillon et al., 2005). HP B9WIB2 shows a close relationship with the class S protein, a member of the phosphatidylinositol–glycan biosynthesis family. It forms a complex with the glycosylphosphatidylinositol (GPI) transamidase, anchoring GPI in the endoplasmic reticulum (Ohishi et al., 2001). HP B9WFE4 shares a domain with proteins of the P-loop NTPase superfamily, which have a phosphate-binding motif known as the Walker A motif. Members of this superfamily participate in nucleotide/nucleoside binding (Ohishi et al., 2001). HP B9WFD2 showed a close phylogenetic relationship with YAP180, a protein of S. cerevisiae involved in phosphoinositide binding that acts as a universal adaptor for the nucleation of clathrin coats (Bruckmann et al., 2000; Monge et al., 2006).

3.4. Other proteins

3.4.1. Structural
In our study we also found that HP B9WFD7 has a cuticular protein signature, which marks it as a structural protein (Quevillon et al., 2005). Cuticular proteins form composite structures with mechanical properties optimized for biological function. The cuticular protein Lm-76, isolated from the pharate cuticle of Locusta migratoria, has homology with B9WFD7 and is rich in the amino acids Gly, Leu and Tyr at the N-terminal position of the conserved sequence (Andersen et al., 1993). Fungi are often covered by a proteinaceous surface layer that acts as a sieve for external molecular influx and protects the microbe from external aggression (Kwan et al., 2006). Hence, structural proteins are equally important for survival and pathogenesis.

3.4.2. Transport
HP B9WFH2 is predicted to be involved in transport because of its close resemblance to major facilitator superfamily (MFS) proteins. MFS proteins act as secondary transporters that move a variety of substrates, including drugs, neurotransmitters, amino acids, sugar phosphates and ions, across the cytoplasmic or internal membranes (Law et al., 2008; Marchler-Bauer et al., 2011; Quevillon et al., 2005). Multidrug transporters of the MFS have been reported to play a crucial role in the treatment of infectious disease: they can handle a wide range of cytotoxic compounds, even ones that are structurally and electrically dissimilar (Lewinson et al., 2006). Another protein predicted to have transport activity is HP B9WIH5, which shows homology with Mae1 of Schizosaccharomyces pombe and Ss1 of S. cerevisiae. Mae1 is a malate transporter, whereas Ss1 is reported to function as a sulfite efflux pump in yeast (Quevillon et al., 2005; Vahisalu et al., 2008). Generally, therapeutic drugs act on four main categories of molecular targets: enzymes, receptors, ion channels and transporters. Among these potential drug targets, 60–70% are membrane proteins, which suggests a potential therapeutic application of these HPs (St Georgiev, 2000).
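Several of the assignments in Section 3.3 rest on short sequence signatures, such as the Zn2+-Cys6 binuclear cluster and the Walker A motif of P-loop NTPases. As a rough illustration of what such a signature search looks like, the sketch below scans a protein sequence with two simplified regular expressions. The patterns are assumptions written for this example (a commonly described cysteine spacing for GAL4-type clusters and the G-x(4)-G-K-[ST] consensus for Walker A); they are not the domain models of the tools used in the study, and the test fragments are hypothetical rather than sequences of the HPs themselves.

```python
import re

# Simplified signature patterns (assumptions for illustration only; the study
# relied on curated domain databases, not on these regular expressions).
MOTIFS = {
    # Six cysteines spaced as commonly described for GAL4-type Zn2-Cys6 clusters.
    "zn2_cys6": re.compile(r"C.{2}C.{6}C.{5,12}C.{2}C.{6,8}C"),
    # Walker A (P-loop) consensus: G-x(4)-G-K-[ST].
    "walker_a": re.compile(r"G.{4}GK[ST]"),
}


def scan_motifs(sequence):
    """Return {motif_name: [(start, matched_text), ...]} with 1-based starts."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        found = [(m.start() + 1, m.group()) for m in pattern.finditer(seq)]
        if found:
            hits[name] = found
    return hits


# Hypothetical test fragments, not sequences of the HPs discussed in the text.
gal4_like = "EQACDICRLKKLKCSKEKPKCAKCLKNNWECRYS"
p_loop_like = "MSLVGAPGSGKSTLLA"

print(scan_motifs(gal4_like))
print(scan_motifs(p_loop_like))
```

A regular expression of this kind only flags candidate sites. Profile-based searches are more sensitive and more specific, which is one reason a pattern hit on its own would not normally justify an annotation.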

3.4.3. Adhesins
Adhesins in Candida have been reported to play a very important role in host cell recognition; they bind to carbohydrate-containing receptors during invasion (Sturtevant and Calderone, 1997). We used the FungalRV server to predict HPs with an adhesin-like signature, which is one of the important factors in pathogenesis (Krogfelt, 1991). Adherence of microorganisms to host tissue leads to tissue damage, invasion and dissemination. Among the 43 HPs, we found five (B9WFD7, B9WFE6, B9WFG8, B9WFT7 and B9WIH4) that showed adhesin character and could be used as targets for vaccine development, because adhesins of the fungal cell wall are primarily involved in adherence to host tissue, which is critical for colonization and leads to invasion and damage of the host tissue. Within the group of adhesin proteins, HP B9WFD7 is a structural protein that may participate in hypha formation and in host–pathogen interaction. Furthermore, such proteins may be potential targets for drug design and discovery, because a striking feature of pathogenic fungi, including Candida spp., is their ability to adhere tightly to different surfaces such as human skin and endothelial and epithelial mucosal tissues (de Groot et al., 2013). We expect that future experimental studies focused on the functional characterization of novel putative adhesins will provide many new insights into their role in pathogenesis and into the host niche where the fungus lives and survives.
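The functional classes reported in this section were assigned with an agreement rule: a function was accepted only when the sequence similarity, domain, motif and family searches suggested the same function. A minimal sketch of that decision rule follows. The function name, the `min_sources` threshold, the evidence labels and the three example records are all hypothetical and serve only to show the logic; they do not reproduce the authors' pipeline or data.

```python
def consensus_function(evidence, min_sources=3):
    """Return a function label only if all informative searches agree.

    `evidence` maps a search type (for example "similarity", "domain",
    "motif", "family") to a predicted function, or to None when that search
    returned nothing. `min_sources` is an assumed threshold for how many
    searches must report a function before any label is accepted.
    """
    calls = [label for label in evidence.values() if label is not None]
    if len(calls) < min_sources:
        return None  # too little evidence to annotate
    # Accept the label only when every reporting search gives the same answer.
    return calls[0] if len(set(calls)) == 1 else None


# Hypothetical records, not results from the study.
candidates = {
    "hp_a": {"similarity": "transporter", "domain": "transporter",
             "motif": "transporter", "family": "transporter"},
    "hp_b": {"similarity": "kinase", "domain": "transporter",
             "motif": "transporter", "family": "transporter"},
    "hp_c": {"similarity": None, "domain": None,
             "motif": "DNA binding", "family": None},
}

results = {name: consensus_function(ev) for name, ev in candidates.items()}
print(results)
```

Under this rule a protein with conflicting or sparse evidence stays unannotated, which is the conservative outcome when the individual searches disagree.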


4. Discussion

The set of HPs still awaits experimental validation to show their existence at the protein level. Hence, bioinformatic analysis of these protein sequences to assign a tentative function is necessary (Lubec et al., 2005). Understanding HPs is of utmost importance for completing genomic and proteomic information. Furthermore, the detection of new HPs may reveal not only new structures but also new functions. Here we aligned the sequences of HPs against global and focal databases for homology searches. Searches for distinct motifs and domains provided clues to the possible functions of HPs, which are listed in Table 3. Moreover, prediction of subcellular localization and identification of signal peptides were used to determine the likely functional site of each protein at the cellular level (Table 2). Complementary strategies for determining physicochemical properties, such as amino acid composition and hydrophobicity scoring, are addressed in Table S1. We first annotated the HP sequences on the basis of sequence similarity, followed by domain, motif and family searches. If all results suggested the same function, we assigned that function to the corresponding protein sequence (Table 4). Moreover, the protein fold plays a significant role in function, and hence fold prediction was also applied to further validate the predicted function (Table 5). Based on these results, we annotated the function of 27 HPs, which can be used as leads for designing experimental approaches geared towards evaluating the exact function of each gene.

C. dubliniensis is the species most closely related to C. albicans, a yeast pathogenic to humans. We used an in silico approach to predict the function of 43 HPs. Five proteins are predicted to be DNA-binding proteins that may be involved in transcription regulation. Three proteins are predicted to function as RNA-binding proteins. Ten HPs showed catalytic activity, which is important for pathogenesis. HP B9WFE4 showed ATP-binding activity, while B9WFD7 is predicted to act as a structural protein. HPs B9WFH2 and B9WIH5 may act as transporter proteins. We identified five HPs that are adhesin-like proteins and may be involved in host–pathogen interaction. We did not find sufficient evidence to predict the functions of 16 proteins.

5. Conclusions

The in silico analysis described here provides a simple method for assigning functions to various HPs of C. dubliniensis. Insufficient sequence similarity between some of the HPs and database entries hampers accurate functional prediction. Our study facilitates rapid identification of the hidden functions of HPs that are potential therapeutic targets and may play a significant role in host–pathogen interactions. Once these HPs are established as novel drug or vaccine targets, further research on new inhibitors and vaccines can be conducted for other clinically important pathogens.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2014.03.060.

Conflict of interest

We do not have any conflict of interest.

Acknowledgments

This work is supported by the Indian Council of Medical Research (BIC/12(04)/2012) to MIH and FA. AP is thankful to the UGC (BSR grant), Delhi, India, for providing the Dr. DS Kothari post-doctoral fellowship to carry out this work.

References

Adams, M.A., Suits, M.D., Zheng, J., Jia, Z., 2007. Piecing together the structure–function puzzle: experiences in structure-based functional annotation of hypothetical proteins. Proteomics 7, 2920–2932.
Altschul, S.F., Koonin, E.V., 1998. Iterated profile searches with PSI-BLAST – a tool for discovery in protein databases. Trends in Biochemical Sciences 23, 444–447.
Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25, 3389–3402.
Andersen, J.S., Andersen, S.O., Hojrup, P., Roepstorff, P., 1993. Primary structure of a 14 kDa basic structural protein (Lm-76) from the cuticle of the migratory locust, Locusta migratoria. Insect Biochemistry and Molecular Biology 23, 391–402.
Bakali, H.M., Herman, M.D., Johnson, K.A., Kelly, A.A., Wieslander, A., Hallberg, B.M., Nordlund, P., 2007. Crystal structure of YegS, a homologue to the mammalian diacylglycerol kinases, reveals a novel regulatory metal binding site. Journal of Biological Chemistry 282, 19644–19652.
Bjornson, H.S., 1984. Enzymes associated with the survival and virulence of gram-negative anaerobes. Reviews of Infectious Diseases 6 (Suppl. 1), S21–S24.
Bork, P., Koonin, E.V., 1996. Protein sequence motifs. Current Opinion in Structural Biology 6, 366–376.
Bruckmann, A., Kunkel, W., Hartl, A., Wetzker, R., Eck, R., 2000. A phosphatidylinositol 3-kinase of Candida albicans influences adhesion, filamentous growth and virulence. Microbiology 146 (Pt 11), 2755–2764.
Bryk, R., Lima, C.D., Erdjument-Bromage, H., Tempst, P., Nathan, C., 2002. Metabolic enzymes of mycobacteria linked to antioxidant defense by a thioredoxin-like protein. Science 295, 1073–1077.
Chaudhuri, R., Ansari, F.A., Raghunandanan, M.V., Ramachandran, S., 2011. FungalRV: adhesin prediction and immunoinformatics portal for human fungal pathogens. BMC Genomics 12, 192.
Chen, Z., Bode, W., 1983. Refined 2.5 Å X-ray crystal structure of the complex formed by porcine kallikrein A and the bovine pancreatic trypsin inhibitor. Crystallization, Patterson search, structure determination, refinement, structure and comparison with its components and with the bovine trypsin–pancreatic trypsin inhibitor complex. Journal of Molecular Biology 164, 283–311.
Chen, L., Xie, Q.W., Nathan, C., 1998. Alkyl hydroperoxide reductase subunit C (AhpC) protects bacterial and human cells against reactive nitrogen intermediates. Molecular Cell 1, 795–805.
Chen, Y., Yu, P., Luo, J., Jiang, Y., 2003. Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT. Mammalian Genome 14, 859–865.
Das, A.K., Helps, N.R., Cohen, P.T., Barford, D., 1996. Crystal structure of the protein serine/threonine phosphatase 2C at 2.0 Å resolution. EMBO Journal 15, 6798–6809.
de Groot, P.W., Bader, O., de Boer, A.D., Weig, M., Chauhan, N., 2013. Adhesins in human fungal pathogens: glue with plenty of stick. Eukaryotic Cell 12, 470–481.
Desler, C., Suravajhala, P., Sanderhoff, M., Rasmussen, M., Rasmussen, L.J., 2009. In silico screening for functional candidates amongst hypothetical proteins. BMC Bioinformatics 10, 289.
Donaldson, K.M., Yin, H., Gekakis, N., Supek, F., Joazeiro, C.A., 2003. Ubiquitin signals protein trafficking via interaction with a novel ubiquitin binding domain in the membrane fusion regulator, Vps9p. Current Biology 13, 258–262.
Eisenstein, E., Gilliland, G.L., Herzberg, O., Moult, J., Orban, J., Poljak, R.J., Banerjei, L., Richardson, D., Howard, A.J., 2000. Biological function made crystal clear – annotation of hypothetical proteins via structural genomics. Current Opinion in Biotechnology 11, 25–30.
Ellgaard, L., Ruddock, L.W., 2005. The human protein disulphide isomerase family: substrate interactions and functional properties. EMBO Reports 6, 28–32.
Eulberg, D., Lakner, S., Golovleva, L.A., Schlomann, M., 1998. Characterization of a protocatechuate catabolic gene cluster from Rhodococcus opacus 1CP: evidence for a merged enzyme with 4-carboxymuconolactone-decarboxylating and 3-oxoadipate enol-lactone-hydrolyzing activity. Journal of Bacteriology 180, 1072–1081.
Ferrari, D.M., Soling, H.D., 1999. The protein disulphide-isomerase family: unravelling a string of folds. Biochemical Journal 339 (Pt 1), 1–10.
Gafny, R., Lapidot, M., Berna, A., Holt, C.A., Deom, C.M., Beachy, R.N., 1992. Effects of terminal deletion mutations on function of the movement protein of tobacco mosaic virus. Virology 187, 499–507.
Galperin, M.Y., Koonin, E.V., 2004. 'Conserved hypothetical' proteins: prioritization of targets for experimental study. Nucleic Acids Research 32, 5452–5463.
Garcera, A., Casas, C., Herrero, E., 2010. Expression of Candida albicans glutathione transferases is induced inside phagocytes and upon diverse environmental stresses. FEMS Yeast Research 10, 422–431.
Hassan, M.I., Kumar, V., Singh, T.P., Yadav, S., 2007a. Structural model of human PSA: a target for prostate cancer therapy. Chemical Biology & Drug Design 70, 261–267.
Hassan, M.I., Kumar, V., Somvanshi, R.K., Dey, S., Singh, T.P., Yadav, S., 2007b. Structure-guided design of peptidic ligand for human prostate specific antigen. Journal of Peptide Science 13, 849–855.
Hirokawa, T., Boon-Chieng, S., Mitaku, S., 1998. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14, 378–379.
Jackson, A.P., Gamble, J.A., Yeomans, T., Moran, G.P., Saunders, D., Harris, D., Aslett, M., Barrell, J.F., Butler, G., Citiulo, F., et al., 2009. Comparative genomics of the fungal pathogens Candida dubliniensis and Candida albicans. Genome Research 19, 2231–2244.
Krogfelt, K.A., 1991. Bacterial adhesion: genetics, biogenesis, and role in pathogenesis of fimbrial adhesins of Escherichia coli. Reviews of Infectious Diseases 13, 721–735.
Kuhn, P., Tarentino, A.L., Plummer Jr., T.H., Van Roey, P., 1994. Crystal structure of peptide-N4-(N-acetyl-beta-D-glucosaminyl)asparagine amidase F at 2.2-Å resolution. Biochemistry 33, 11699–11706.


Kwan, A.H., Winefield, R.D., Sunde, M., Matthews, J.M., Haverkamp, R.G., Templeton, M.D., Mackay, J.P., 2006. Structural basis for rodlet assembly in fungal hydrophobins. Proceedings of the National Academy of Sciences of the United States of America 103, 3621–3626.
Lamb, J.R., Tugendreich, S., Hieter, P., 1995. Tetratrico peptide repeat interactions: to TPR or not to TPR? Trends in Biochemical Sciences 20, 257–259.
Law, C.J., Maloney, P.C., Wang, D.N., 2008. Ins and outs of major facilitator superfamily antiporters. Annual Review of Microbiology 62, 289–305.
Lee, F.J., Moss, J., 1993. An RNA-binding protein gene (RBP1) of Saccharomyces cerevisiae encodes a putative glucose-repressible protein containing two RNA recognition motifs. Journal of Biological Chemistry 268, 15080–15087.
Letunic, I., Doerks, T., Bork, P., 2012. SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Research 40, D302–D305.
Lewinson, O., Adler, J., Sigal, N., Bibi, E., 2006. Promiscuity in multidrug recognition and transport: the bacterial MFS Mdr transporters. Molecular Microbiology 61, 277–284.
Lim, R., Cohen, S.S., 1966. D-phosphoarabinoisomerase and D-ribulokinase in Escherichia coli. Journal of Biological Chemistry 241, 4304–4315.
Loewenstein, Y., Raimondo, D., Redfern, O.C., Watson, J., Frishman, D., Linial, M., Orengo, C., Thornton, J., Tramontano, A., 2009. Protein function annotation by homology-based inference. Genome Biology 10, 207.
Lubec, G., Afjehi-Sadat, L., Yang, J.W., John, J.P., 2005. Searching for hypothetical proteins: theory and practice based upon original data and literature. Progress in Neurobiology 77, 90–127.
Maeda, T., Tsai, A.Y., Saito, H., 1993. Mutations in a protein tyrosine phosphatase gene (PTP2) and a protein serine/threonine phosphatase gene (PTC1) cause a synthetic growth defect in Saccharomyces cerevisiae. Molecular and Cellular Biology 13, 5408–5417.
Marchler-Bauer, A., Lu, S., Anderson, J.B., Chitsaz, F., Derbyshire, M.K., DeWeese-Scott, C., Fong, J.H., Geer, L.Y., Geer, R.C., Gonzales, N.R., et al., 2011. CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Research 39, D225–D229.
Marmorstein, R., Carey, M., Ptashne, M., Harrison, S.C., 1992. DNA recognition by GAL4: structure of a protein–DNA complex. Nature 356, 408–414.
Mazandu, G.K., Mulder, N.J., 2012. Function prediction and analysis of Mycobacterium tuberculosis hypothetical proteins. International Journal of Molecular Sciences 13, 7283–7302.
Monge, R.A., Roman, E., Nombela, C., Pla, J., 2006. The MAP kinase signal transduction network in Candida albicans. Microbiology 152, 905–912.
Moran, G.P., Sullivan, D.J., Henman, M.C., McCreary, C.E., Harrington, B.J., Shanley, D.B., Coleman, D.C., 1997. Antifungal drug susceptibilities of oral Candida dubliniensis isolates from human immunodeficiency virus (HIV)-infected and non-HIV-infected subjects and generation of stable fluconazole-resistant derivatives in vitro. Antimicrobial Agents and Chemotherapy 41, 617–623.
Nakai, K., Horton, P., 1999. PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends in Biochemical Sciences 24, 34–36.
Nimrod, G., Schushan, M., Steinberg, D.M., Ben-Tal, N., 2008. Detection of functionally important regions in "hypothetical proteins" of known structure. Structure 16, 1755–1763.
O'Connor, L., Caplice, N., Coleman, D.C., Sullivan, D.J., Moran, G.P., 2010. Differential filamentation of Candida albicans and Candida dubliniensis is governed by nutrient regulation of UME6 expression. Eukaryotic Cell 9, 1383–1397.
Ohishi, K., Inoue, N., Kinoshita, T., 2001. PIG-S and PIG-T, essential for GPI anchor attachment to proteins, form a complex with GAA1 and GPI8. EMBO Journal 20, 4088–4098.
Ollis, D.L., Cheah, E., Cygler, M., Dijkstra, B., Frolow, F., Franken, S.M., Harel, M., Remington, S.J., Silman, I., Schrag, J., et al., 1992. The alpha/beta hydrolase fold. Protein Engineering 5, 197–211.
Orengo, C.A., Michie, A.D., Jones, S., Jones, D.T., Swindells, M.B., Thornton, J.M., 1997. CATH – a hierarchic classification of protein domain structures. Structure 5, 1093–1108.
Petersen, T.N., Brunak, S., von Heijne, G., Nielsen, H., 2011. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nature Methods 8, 785–786.
Powell, S., Szklarczyk, D., Trachana, K., Roth, A., Kuhn, M., Muller, J., Arnold, R., Rattei, T., Letunic, I., Doerks, T., et al., 2012. eggNOG v3.0: orthologous groups covering 1133 organisms at 41 different taxonomic ranges. Nucleic Acids Research 40, D284–D289.
Quevillon, E., Silventoinen, V., Pillai, S., Harte, N., Mulder, N., Apweiler, R., Lopez, R., 2005. InterProScan: protein domains identifier. Nucleic Acids Research 33, W116–W120.
Ren, X., Aleshin, M., Jo, W.J., Dills, R., Kalman, D.A., Vulpe, C.D., Smith, M.T., Zhang, L., 2011. Involvement of N-6 adenine-specific DNA methyltransferase 1 (N6AMT1) in arsenic biomethylation and its role in arsenic-induced toxicity. Environmental Health Perspectives 119, 771–777.
Ricagno, S., Jonsson, S., Richards, N., Lindqvist, Y., 2003. Formyl-CoA transferase encloses the CoA binding site at the interface of an interlocked dimer. EMBO Journal 22, 3210–3219.
Rosadini, C.V., Wong, S.M., Akerley, B.J., 2008. The periplasmic disulfide oxidoreductase DsbA contributes to Haemophilus influenzae pathogenesis. Infection and Immunity 76, 1498–1508.
Sakane, F., Imai, S., Kai, M., Yasuda, S., Kanoh, H., 2007. Diacylglycerol kinases: why so many of them? Biochimica et Biophysica Acta 1771, 793–806.
Schaller, M., Borelli, C., Korting, H.C., Hube, B., 2005. Hydrolytic enzymes as virulence factors of Candida albicans. Mycoses 48, 365–377.
Sebti, A., Kiehn, T.E., Perlin, D., Chaturvedi, V., Wong, M., Doney, A., Park, S., Sepkowitz, K.A., 2001. Candida dubliniensis at a cancer center. Clinical Infectious Diseases 32, 1034–1038.
Shahbaaz, M., Hassan, M.I., Ahmad, F., 2013. Functional annotation of conserved hypothetical proteins from Haemophilus influenzae Rd KW20. PLoS One 8, e84263.
St Georgiev, V., 2000. Membrane transporters and antifungal drug resistance. Current Drug Targets 1, 261–284.
Sturtevant, J., Calderone, R., 1997. Candida albicans adhesins: biochemical aspects and virulence. Revista Iberoamericana de Micología 14, 90–97.
Sullivan, D., Coleman, D., 1998. Candida dubliniensis: characteristics and identification. Journal of Clinical Microbiology 36, 329–334.
Sullivan, D.J., Moran, G.P., Coleman, D.C., 2005. Candida dubliniensis: ten years on. FEMS Microbiology Letters 253, 9–17.
Thakur, P.K., Hassan, I., 2011. Discovering a potent small molecule inhibitor for gankyrin using de novo drug design approach. International Journal of Computational Biology and Drug Design 4, 373–386.
Thakur, P.K., Kumar, J., Ray, D., Anjum, F., Hassan, M.I., 2013. Search of potential inhibitor against New Delhi metallo-beta-lactamase 1 from a series of antibacterial natural compounds. Journal of Natural Science, Biology and Medicine 4, 51–56.
Thomas, P.D., Campbell, M.J., Kejariwal, A., Mi, H., Karlak, B., Daverman, R., Diemer, K., Muruganujan, A., Narechania, A., 2003. PANTHER: a library of protein families and subfamilies indexed by function. Genome Research 13, 2129–2141.
Thompson, J.D., Gibson, T.J., Higgins, D.G., 2002. Multiple sequence alignment using ClustalW and ClustalX. Current Protocols in Bioinformatics (Chapter 2, Unit 2.3).
Vahisalu, T., Kollist, H., Wang, Y.F., Nishimura, N., Chan, W.Y., Valerio, G., Lamminmaki, A., Brosche, M., Moldau, H., Desikan, R., et al., 2008. SLAC1 is required for plant guard cell S-type anion channel function in stomatal signalling. Nature 452, 487–491.
Zarembinski, T.I., Kim, Y., Peterson, K., Christendat, D., Dharamsi, A., Arrowsmith, C.H., Edwards, A.M., Joachimiak, A., 2003. Deep trefoil knot implicated in RNA binding found in an archaebacterial protein. Proteins 50, 177–183.
