Sexually transmitted diseases putative drug target database: A comprehensive database of putative drug targets of pathogens identified by comparative genomics

Vijayakumari Malipatil, Shivkumar Madagi, Biplab Bhattacharjee1
ABSTRACT Objective: Sexually transmitted diseases (STD) are serious public health problems and also impose a financial burden on the economy. Sexually transmitted infections are treated with single or multiple antibiotics. However, in many cases the organism persists even after treatment. In the current study, a set of druggable targets in STD pathogens has been identified by comparative genomics. Materials and Methods: The subtractive genomics scheme exploits the properties of non-homology, essentiality, membrane localization and metabolic pathway uniqueness to identify drug targets. To make effective use of the data and to present the properties of the drug targets under a single canopy, an integrated knowledge database of drug targets in STD bacteria was created. Data for each drug target include biochemical pathway, function, cellular localization, essentiality score and structural details. Results: When subjected to the algorithm, the proteome of STD pathogens yielded 44 membrane-associated proteins possessing unique metabolic pathways. The database can be accessed at http://biomedresearchasia.org/index.html. Conclusion: The diverse data merged in the common framework of this database are expected to be valuable not only for basic studies in clinical bioinformatics, but also for studies in immunological, biotechnological and clinical fields.
Department of Bioinformatics, Karnataka State Women University, Bijapur, 1Department of Biotechnology, PES Institute of Technology, Bangalore, Karnataka, India Received: 27-04-2013 Revised: 20-05-2013 Accepted: 07-07-2013 Correspondence to: Mr. Biplab Bhattacharjee, E-mail: [email protected]
KEY WORDS: Drug targets, homology modeling, sexually transmitted infections, subtractive genomics
Introduction

The human genome project and genomic sequencing of pathogenic bacteria have increased momentum in the field of drug discovery. Sexually transmitted diseases (STD) are generally classified as acute or chronic and occur as penile, anal, oral and vaginal infections and as infections of other body humors. The Centers for Disease Control and Prevention, USA, estimated that there are approximately 19 million STD infections every year and that almost half of them occur in people between the ages of 15 and 24 years. A comparable situation is also observed in Europe. Sexually transmitted infections are mainly caused by Chlamydia
434 Indian Journal of Pharmacology | October 2013 | Vol 45 | Issue 5
trachomatis, Haemophilus ducreyi, Streptococcus agalactiae, Neisseria gonorrhoeae, Mycoplasma genitalium, Ureaplasma urealyticum and Chlamydia pneumoniae. Together these organisms amount to a considerable health menace. At advanced stages they are collectively responsible for conditions such as blindness, infertility, bone deformities, brain damage, birth defects and even death. Prophylactic treatment for chlamydial infections includes macrolide antibiotics such as erythromycin and its derivative azithromycin; polyketide antibiotics such as tetracycline and doxycycline are also used. Even after treatment with these antibiotics, non-compliance is observed, and the reason for it is yet to be discovered.[2,3] H. ducreyi showed more resistance than Escherichia coli to antimicrobial peptides. It is also known to possess β-lactamase activity, which makes it resistant to the antibiotic ampicillin.[4,5] U. urealyticum showed sensitivity to doxycycline; however, an increase in inoculum size to about 10^5/ml resulted in resistance. Resistant strains of Ureaplasma have been observed in patients with hypogammaglobulinemia. Animal studies show the persistence of Chlamydia
Malipatil, et al.: STD putative drug target database
even after antibiotic treatment. Treponema pallidum possesses the membrane-bound protein Tp47, which is analogous to β-lactamase and other penicillin-binding proteins and confers resistance to antibiotics. Mutations in 23S ribosomal RNA also contribute to resistance to macrolides.[8,9] Similarly, N. gonorrhoeae has also shown resistance to antibiotics. The rising antibiotic resistance exhibited by STD organisms calls for alternative remedies, and there is a need to explore additional potential drug targets for these organisms. The present study serves as an investigative platform for identifying potential drug targets that are unique to a microorganism, essential for its survival and non-cross-reactive with the human proteome. The discovery of unique proteins and their 3-D modeling may also provide an opportunity to improve diagnostic methods for detecting the organism in clinical settings. The present study can extend to areas of research such as virtual screening and epitope mapping.

Materials and Methods

Identification of the Putative Drug Targets

Figure 1 depicts the workflow for identification of putative drug targets of the STD pathogens.

Proteome Retrieval of Host and Pathogen

The complete proteomes of the STD organisms (C. trachomatis, H. ducreyi, S. agalactiae, N. gonorrhoeae, M. genitalium, U. urealyticum and C. pneumoniae) and of Homo sapiens were retrieved from NCBI Protein and SwissProt.

Filtering the Non-Homologous Proteins of Pathogen

Redundant sequences of the STD organisms were eliminated by taking the NR (non-redundant) reference sequence dataset from NCBI. The filtered dataset was then subjected to BlastP analysis against the Homo sapiens proteome with a cutoff of 60% sequence identity. The screened set of proteins thus obtained has no significant similarity with the host proteome.
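The non-homology filter described above can be illustrated with a minimal sketch. It assumes the BlastP results are available in the standard 12-column tabular format (query ID in the first column, percent identity in the third) and that a pathogen protein is retained when none of its human hits reaches the 60% identity cutoff; the function name, the protein IDs and the handling of proteins without any hit are illustrative assumptions, not details taken from the study.

```python
import csv
import io


def non_homologous(pathogen_ids, blast_tsv, identity_cutoff=60.0):
    """Return pathogen protein IDs with no human hit at or above the cutoff.

    blast_tsv is a file-like object holding tab-separated BLAST hits
    (columns: query, subject, percent identity, ...).
    """
    best_identity = {}
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, pident = row[0], float(row[2])
        best_identity[query] = max(best_identity.get(query, 0.0), pident)
    # Assumption: proteins with no hit at all count as non-homologous.
    return [p for p in pathogen_ids if best_identity.get(p, 0.0) < identity_cutoff]


# Hypothetical hits: protA resembles a human protein, protB only weakly,
# and protC has no hit against the human proteome.
example_hits = io.StringIO(
    "protA\thuman1\t72.5\t200\t55\t0\t1\t200\t1\t200\t1e-80\t250\n"
    "protB\thuman2\t35.0\t150\t97\t1\t1\t150\t5\t154\t1e-10\t60\n"
)
kept = non_homologous(["protA", "protB", "protC"], example_hits)
print(kept)
```

In this toy input only protA is discarded, because its best human hit exceeds the identity cutoff.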
Figure 1: Flow chart for identification of drug targets in sexually transmitted diseases organisms
Detection of Essential Proteins in STD Organisms

To detect the essential proteins in the filtered dataset, BlastP analysis against the Database of Essential Genes was performed with a bit score cutoff of 100. This step yields a subset of essential proteins that have insignificant sequence similarity with the host proteome.

Prediction of Sub-Cellular Location

To identify the location of the essential proteins in the different cellular compartments, sub-cellular localization was predicted with CELLO, an online prediction server.

Protein Functional Annotation

Many of the identified essential non-homologous proteins had no proper functional assignment and were mostly annotated as putative. To uncover the function of these proteins, the online PFP Automated Protein Function Prediction Server (http://kiharalab.org/web/pfp.php) was used.

Identification of Metabolic Pathway Involvement

In keeping with the objective of subtractive genomics, the screened essential non-homologous protein dataset was examined for involvement in pathogen-specific pathways. This step was conducted using the KEGG automatic annotation server. Once the pathways were identified, only those proteins that were involved in unique pathogen-specific pathways and had no participation in host pathways were analyzed further.

Structure Prediction of Putative Drug Targets

From the earlier steps, membrane-associated targets were identified for H. ducreyi, N. gonorrhoeae, S. agalactiae, C. trachomatis, C. pneumoniae and T. pallidum, and cytoplasmic targets were identified for the remaining two pathogens, U. urealyticum and M. genitalium [Tables 1 and 2]. Experimentally determined structures were not available for any of these targets, so we employed computational tertiary structure prediction techniques to derive theoretical models of these druggable targets.
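The screening steps described in this section (essentiality, localization and pathway uniqueness) can be summarized in a short sketch. It is only an illustration under stated assumptions: the record fields, the accession labels and the pathway names in the example are hypothetical, and a protein is treated as essential when its best bit score against the essential-gene database is at least 100, which is one reading of the cutoff given above.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    accession: str          # hypothetical identifier
    deg_bit_score: float    # best BlastP bit score against essential genes
    localization: str       # predicted compartment label
    pathways: set = field(default_factory=set)


def select_targets(candidates, host_pathways, min_bit_score=100.0):
    """Keep essential proteins whose pathways are all absent from the host."""
    selected = []
    for c in candidates:
        if c.deg_bit_score < min_bit_score:
            continue  # not considered essential
        if not c.pathways or not c.pathways.isdisjoint(host_pathways):
            continue  # no pathway assigned, or shares a pathway with the host
        selected.append((c.accession, c.localization, sorted(c.pathways)))
    return selected


# Hypothetical candidates, for illustration only.
candidates = [
    Candidate("hypo_1", 250.0, "InnerMembrane", {"Peptidoglycan biosynthesis"}),
    Candidate("hypo_2", 80.0, "Cytoplasmic", {"Two-component system"}),
    Candidate("hypo_3", 300.0, "Cytoplasmic", {"Glycolysis"}),
]
targets = select_targets(candidates, host_pathways={"Glycolysis"})
print(targets)
```

Here hypo_2 fails the essentiality cutoff and hypo_3 participates in a host pathway, so only hypo_1 is carried forward to structure prediction.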
The first step in the computational structure prediction was comparative modeling. However, the lack of genuine similarity between the target and template sequences meant that this approach did not yield results. To obtain better prediction accuracy, the fold recognition server Phyre2 was used. Fold recognition is the prediction of protein structure based on similarity to folds occurring in known protein structures. Once the tertiary structure was generated, energy minimization was carried out in GROMACS (OPLS force field) with the steepest descent and conjugate gradient algorithms.

Structure Validation

A Ramachandran plot generated in the program RAMPAGE allows elucidation of the atomic and inter-atomic physical parameters, showing the accommodation of amino acids in 3-D space and the atomic nomenclature. The ERRAT program was employed to establish the overall quality of the theoretical model based on the contribution of non-covalent interactions between different atoms. Root-mean-square deviation and root-mean-square fluctuation were calculated for the modeled structures. Scatter
Table 1: Membrane-associated putative drug targets in STD pathogens with unique pathways
Biochemical pathway | Protein accession no.
Phospho-N-acetylmuramoyl-pentapeptide-transferase activity Electrochemical potential-driven transporters-porters Magnesium-binding Transferases-glycosyltransferases Transmembrane
NP_225095.1cp, YP_208585.1ng, NP_687323.1sa YP_328449.1ct YP_328585.1ct YP_328589.1ct NP_872841.1hd, NP_874315.1hd, YP_208751.1ng, YP_001933518.1tp NP_873332.1hd YP_208595.1ng, NP_687174.1sa NP_687322.1sa NP_225158.1cp YP_328228.1ct NP_872818.1hd, NP_872980.1hd NP_873583.1hd NP_225098.1cp YP_328588.1ct YP_328589.1ct NP_872843.1hd, NP_873653.1hd, YP_001933392.1tp NP_873332.1hd
Primary active transporters-P-P-bond-hydrolysis-driven transporters Undecaprenyl-diphosphatase activity Penicillin binding Lipid-A-disaccharide synthase activity Transferases-glycosyltransferases All lipid-binding proteins Transmembrane Dopamine receptor activity Electrochemical potential-driven transporters-porters Transferases-glycosyltransferases Transmembrane Primary active transporters-P-P-bond-hydrolysis-driven transporters Binding Adenyl nucleotide binding Transferases-transferring phosphorus-containing groups Binding
YP_208582.1ng, YP_208830.1ng, NP_688903.1sa NP_687776.1sa NP_873883.1hd NP_687160.1sa, NP_687355.1sa, NP_687428.1sa, NP_687735.1sa, NP_688947.1sa, NP_689041.1sa, NP_689108.1sa YP_001933953.1tp NP_873268.1hd
cp: Chlamydophila pneumoniae CWL029; ct: Chlamydia trachomatis A/HAR-13; hd: Haemophilus ducreyi 35000HP; ng: Neisseria gonorrhoeae FA 1090; sa: Streptococcus agalactiae 2603V/R; tp: Treponema pallidum ssp. pallidum strain SS14; STD: Sexually transmitted diseases
Table 2: Cytoplasmic putative drug targets in STD pathogens with unique pathways
Biochemical pathway | Protein accession no.
Cell cycle — caulobacter
Hydrolase activity, acting on acid anhydrides Oxidoreductases-acting on the CH-OH group of donors Zinc-binding All DNA-binding All DNA-binding
Translation regulator activity Zinc-binding
Two-component system Plant-pathogen interaction
NP_072890.1mg NP_072905.1mg NP_073140.1mg NP_073140.1mg
up: Ureaplasma urealyticum serovar 10 str. ATCC 33699; mg: Mycoplasma genitalium G37; STD: Sexually transmitted diseases
plot diagrams generated by the program PROCHECK were analyzed to give a graphical view of the amino acids falling in allowed and disallowed regions. The disallowed region indicates steric hindrance. A higher percentage of amino acids in the allowed region indicates a stable structure. The study was conducted entirely on in-silico platforms and no
experimental animal models were used. Hence, Animal Ethics Committee approval is not included.

Database Construction

Database schema

The repository of the putative drug targets of STD organisms was built as a relationally organized database. The database was implemented using HTML and PHP and uploaded by FTP. A Windows-based web server was used to deliver the interface. The database has been launched and is available to the scientific community at http://biomedresearchasia.org/index.html. The pipeline used to construct the database was partially automated and was also manually verified at particular stages to keep the error level in the data to a minimum. The schematic representation of the workflow also sheds light on the overall data content of this database, which comprises the list of putative drug targets of each microbe, in addition to annotated data about each target, i.e., FASTA sequence, biochemical pathways, function, cellular localization and theoretical 3-D structure. We have also incorporated a "User Submission page," which supports the incorporation of additional information acquired by experimental techniques or computational methods. The derived data for each individual target, such as protein sequence and theoretical structure, are stored in standard file formats, i.e., FASTA and PDB respectively,
which are processed and accessed through the home page. The schema diagram and the home page of the final database are displayed in Figures 2 and 3.

Cross-referencing with the Related Biological Database

External data repositories containing data relevant to the drug targets are hyperlinked by echoing the appropriate HTML to call the URL of the target database [Figure 4]. The essential data from these repositories can be mined and extracted through the hyperlinks created. The primary data in the present study are the accession numbers of the protein drug targets, which are hyperlinked to the NCBI protein database and give the complete GenBank record of the protein. The biochemical pathways identified are hyperlinked to NCBI Biosystems, which provides a detailed explanation of each metabolic pathway with a diagram. Cross-linking the relevant content in an external database aids the retrieval of in-depth information about each data type. Because the information in the external databases is updated regularly as experimental work progresses, cross-referencing helps to avoid obsolete data.

Results

The STD database provides a single platform for accessing diverse information pertaining to potential drug targets in human STD organisms. It integrates and filters information from databases and various tools. The function of each druggable target, its cellular localization, its metabolic pathway involvement and its theoretical protein structure model are available in this database. A total of 10435 reference protein sequences of STD organisms were retrieved from the NCBI database. The sequences represent the chief organisms that cause STD. Retrieval of the sequence information is possible because the complete genomes of these pathogens have been deciphered through genome projects.
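The cross-referencing of accession numbers to the NCBI protein database, described under Database Construction, can be illustrated with a brief sketch. The database itself was implemented in HTML and PHP; Python is used here only for illustration, and the function names, the regular expression and the URL pattern are assumptions rather than the code behind the published site. The two-letter organism codes are those appended to the accession numbers in Tables 1 and 2.

```python
import html
import re

# Organism codes as given in the footnotes of Tables 1 and 2.
ORGANISMS = {
    "cp": "Chlamydophila pneumoniae CWL029",
    "ct": "Chlamydia trachomatis A/HAR-13",
    "hd": "Haemophilus ducreyi 35000HP",
    "ng": "Neisseria gonorrhoeae FA 1090",
    "sa": "Streptococcus agalactiae 2603V/R",
    "tp": "Treponema pallidum ssp. pallidum strain SS14",
    "mg": "Mycoplasma genitalium G37",
    "up": "Ureaplasma urealyticum serovar 10 str. ATCC 33699",
}

# A RefSeq-style accession followed by a two-letter organism code.
TAGGED = re.compile(r"^([A-Z]{2}_\d+\.\d+)([a-z]{2})$")


def split_tagged_accession(tagged):
    """Split e.g. 'NP_225095.1cp' into the accession and the organism name."""
    match = TAGGED.match(tagged)
    if match is None:
        raise ValueError("unrecognized accession: " + tagged)
    accession, code = match.groups()
    return accession, ORGANISMS[code]


def protein_link(tagged):
    """Build an HTML hyperlink to the protein record (assumed URL pattern)."""
    accession, organism = split_tagged_accession(tagged)
    url = "https://www.ncbi.nlm.nih.gov/protein/" + accession
    return '<a href="{}">{}</a> ({})'.format(
        html.escape(url, quote=True), html.escape(accession), html.escape(organism)
    )


print(protein_link("YP_001933518.1tp"))
```

Keeping the organism legend in one lookup table means that each accession in the tables can be resolved to both its source organism and its external record from a single tagged string.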
In the present study, a subtractive genomics approach was used to find drug targets in STD pathogens.[18,19] About 10435 proteins were studied for the properties that would qualify them as drug targets. Of these, 3955 were metabolically essential for the bacteria under study and 44 were associated with unique pathways and so can act as unique drug targets. 36 of the 44 are membrane associated and 6 are cytoplasmic. The unique drug target proteins in STD organisms were mostly membrane-associated, as such proteins act to trigger an immune response. Tables 1 and 2 display the list of organisms causing STD and the putative drug targets identified in each organism. The functional and biochemical pathway involvement is also displayed in these tables. The database also contains Protein Data Bank (PDB) files that give structural information about the drug targets.
Figure 2: A schematic representation of the pipeline in the development of sexually transmitted diseases putative drug target database
Figure 3: Screen shot of the sexually transmitted diseases putative drug target database home page
Figure 4: Screen shot of webpage of one sexually transmitted diseases organism with hyperlinking
Discussion

The STD database provides a user-friendly web interface with the flexibility to select a single entry or a set of entries matching the user's criteria, such as name of the drug target, biochemical pathway, cellular localization and PDB structure. Metabolic pathway analysis indicated the involvement of the identified drug targets in several essential pathways, viz. cell cycle caulobacter; methane metabolism;
lipopolysaccharide biosynthesis; peptidoglycan biosynthesis and two-component system. These pathways represent the unique pathways of the STD organisms. Although U. urealyticum and M. genitalium did not show any membrane protein with a unique pathway, they showed cytoplasmic proteins with unique pathways. The majority of the drug targets identified in our study are associated with the peptidoglycan biosynthetic pathway, which is the key machinery for synthesizing the peptidoglycan layer of the bacterial cell wall. Peptidoglycan is the chief structural element providing resistance to water imbalance and plays a key role in the defense machinery and virulence of the organism. The second pathway relevant to the identified set of drug targets is the lipopolysaccharide biosynthesis pathway. Lipopolysaccharide is principally found in Gram-negative bacteria and serves a key function in structural integrity, immunogenicity and shielding from chemicals. The third pathway of drug targets is cell cycle caulobacter. The drug targets also participate significantly in two other pathways, namely methane metabolism and the two-component system. In the methane metabolism pathway, one-carbon compounds such as methane and methanol are used to generate energy for growth-related processes. The two-component system helps the bacteria to respond to environmental variations. It thus forms the key signal-transducing machinery for modifying cell physiology in order to adapt to environmental changes. Since the identified druggable proteins are key components of the above-mentioned metabolic pathways, inhibition of these proteins by competitive or non-competitive inhibitors, or by metabolite analogs, can upset the normal processes of the pathway and may prove bacteriostatic or bactericidal.
Any obstruction of a pathway prevents the accumulation of its product, which in turn is a potential reactant of the succeeding reaction in the metabolic cascade. This has a substantial effect in slowing down or inhibiting further reactions down the respective metabolic pathway. In the last two decades, drug resistance has turned into a major health menace for health-care providers treating STD infections. To curb this menace, new drugs aimed at novel druggable targets are the need of the hour. Our database is an attempt to use computational techniques to generate a repository of putative drug targets for STD pathogens. An in-depth understanding of the predicted drug targets will facilitate the screening of ligand libraries to generate a new set of drug molecules for effective treatment of drug-resistant pathogens. Structural knowledge of the drug targets is also important in finding the active part of an antigen during epitope mapping for vaccine design. In pursuit of the above applications, we expect that this database will serve as a useful resource of manually curated information relating to sequence, structure and function, all integrated into a single platform. The putative targets should be further validated in an
experimental setting to gather concrete evidence of their druggability.

Acknowledgment

The study has been carried out with financial support from DBT-BIF, Karnataka State Women University, Bijapur, Karnataka, India.
References

1. Centers for Disease Control and Prevention. Most Widely Reported, Curable STDs Remain Significant Health Threat, CDC fact sheet, March 2009. Available from http://www.cdc.gov/nchhstp/newsroom/docs/stdfastfacts-3.27.09-508%20compliant.pdf.
2. Jones RB. New treatments for Chlamydia trachomatis. Am J Obstet Gynecol 1991;164:1789-93.
3. Mårdh PA, Persson K. Is there a need for rescreening of patients treated for genital chlamydial infections? Int J STD AIDS 2002;13:363-7.
4. Sturani E, Zippel R, Morello L, Brambilla R, Comoglio PM, Alberghina L. Kinetics of tyrosine phosphorylation and internalization of human EGF receptors overexpressed in NIH 3T3 fibroblasts. Exp Cell Res 1990;191:323-7.
5. Kobayashi R, Nakadaira H, Ishigami K, Muto K, Anesaki S, Yamamoto M. Effects of physical exercise on fall risk factors in elderly at home in intervention trial. Environ Health Prev Med 2006;11:250-5.
6. Jalil N, Gilchrist C, Taylor-Robinson D. Factors influencing the in-vitro sensitivity of Ureaplasma urealyticum to tetracyclines. J Antimicrob Chemother 1989;23:341-5.
7. Taylor-Robinson D, Furr PM. Clinical antibiotic resistance of Ureaplasma urealyticum. Pediatr Infect Dis 1986;5:S335-7.
8. Woznicová V, Heroldová M. Detection of Treponema pallidum DNA in the serum of an adequately treated patient with latent syphilis. Acta Derm Venereol 2007;87:379-80.
9. Tipple C, McClure MO, Taylor GP. High prevalence of macrolide resistant Treponema pallidum strains in a London centre. Sex Transm Infect 2011;87:486-8.
10. Li W, Godzik A. Cd-hit: A fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 2006;22:1658-9.
11. Zhang R, Lin Y. DEG 5.0, a database of essential genes in both prokaryotes and eukaryotes. Nucleic Acids Res 2009;37:D455-8.
12. www.ncbi.nlm.nih.gov.
13. Chitale M, Kihara D. Computational protein function prediction: Framework and challenges. In: Kihara D, editor. Protein Function Prediction for Omics Era. 1st ed. West Lafayette: Springer; 2011. p. 1-17.
14. Moriya Y, Itoh M, Okuda S, Yoshizawa AC, Kanehisa M. KAAS: An automatic genome annotation and pathway reconstruction server. Nucleic Acids Res 2007;35:W182-5.
15. Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 2000;28:27-30.
16. Kelley LA, Sternberg MJ. Protein structure prediction on the Web: A case study using the Phyre server. Nat Protoc 2009;4:363-71.
17. Ramachandran GN, Ramakrishnan C, Sasisekharan V. Stereochemistry of polypeptide chain configurations. J Mol Biol 1963;7:95-9.
18. Sakharkar KR, Sakharkar MK, Chow VT. A novel genomics approach for the identification of drug targets in pathogens, with special reference to Pseudomonas aeruginosa. In Silico Biol 2004;4:355-60.
19. Madagi S, Patil VM, Sadegh S, Singh AK, Garwal B, Banerjee A, et al. Identification of membrane associated drug targets in Borrelia burgdorferi ZS7-subtractive genomics approach. Bioinformation 2011;6:356-9.

Cite this article as: Malipatil V, Madagi S, Bhattacharjee B. Sexually transmitted diseases putative drug target database: A comprehensive database of putative drug targets of pathogens identified by comparative genomics. Indian J Pharmacol 2013;45:434-8.

Source of Support: The study has been carried out with financial support by DBT-BIF, Karnataka State Women University, Bijapur, Karnataka, India. Conflict of Interest: No.