Motivation: In silico prediction of drug–target interactions from heterogeneous biological data is critical in the search for drugs and therapeutic targets for known diseases such as cancers. There is therefore a strong incentive to develop new methods capable of detecting these potential drug–target interactions efficiently. Results: In this article, we investigate the relationship between the chemical space, the pharmacological space and the topology of drug–target interaction networks, and show that drug–target interactions are more correlated with pharmacological effect similarity than with chemical structure similarity. We then develop a new method to predict unknown drug–target interactions from chemical, genomic and pharmacological data on a large scale. The proposed method consists of two steps: (i) prediction of pharmacological effects from chemical structures of given compounds and (ii) inference of unknown drug–target interactions based on the pharmacological effect similarity in the framework of supervised bipartite graph inference. The originality of the proposed method lies in the prediction of potential pharmacological similarity for any drug candidate compounds and in the integration of chemical, genomic and pharmacological data in a unified framework. In the results, we make predictions for four classes of important drug–target interactions involving enzymes, ion channels, GPCRs and nuclear receptors. Our comprehensively predicted drug–target interaction networks enable us to suggest many potential drug–target interactions and to increase research productivity toward genomic drug discovery. Supplementary information: Datasets and all prediction results are available at http://cbio.ensmp.fr/~yyamanishi/pharmaco/. Availability: Software is available upon request. Contact: yoshihiro.yamanishi@ensmp.fr
The EC (Enzyme Commission) numbers represent a hierarchical classification of enzymatic reactions, but they are also commonly utilized as identifiers of enzymes or enzyme genes in the analysis of complete genomes. This duality of the EC numbers makes it possible to link the genomic repertoire of enzyme genes to the chemical repertoire of metabolic pathways, a process called metabolic reconstruction. Unfortunately, there are numerous reactions known to be present in various pathways that will never get EC numbers, because EC number assignment requires published articles on the full characterization of enzymes. Here we report a computerized method to automatically assign the EC numbers up to the sub-subclasses, i.e., without the fourth serial number for substrate specificity, given pairs of substrates and products. The method is based on a new classification scheme of enzymatic reactions, named the RC (reaction classification) number. Each reaction in the current dataset of the EC numbers is first decomposed into reactant pairs. Each pair is then structurally aligned to identify the reaction center, the matched region, and the difference region. The RC number represents the conversion patterns of atom types in these three regions. We examined the correspondence between computationally assigned RC numbers and manually assigned EC numbers by the jackknife cross-validation test and found that the EC sub-subclasses could be assigned with an accuracy of about 90%. Furthermore, we examined the correlation with genomic information as represented by the KEGG ortholog clusters (OC) and confirmed that the RC numbers are correlated not only with elementary reaction mechanisms but also with protein families.
The KEGG RPAIR database is a collection of biochemical structure transformation patterns, called RDM patterns, and chemical structure alignments of substrate-product pairs (reactant pairs) in all known enzyme-catalyzed reactions taken from the Enzyme Nomenclature and the KEGG PATHWAY database. Here, we present PathPred (http://www.genome.jp/tools/pathpred/), a web-based server to predict plausible pathways of multi-step reactions starting from a query compound, based on the local RDM pattern match and the global chemical structure alignment against the reactant pair library. In this server, we focus on predicting pathways for microbial biodegradation of environmental compounds and biosynthesis of plant secondary metabolites, which correspond to characteristic RDM patterns in 947 and 1397 reactant pairs, respectively. The server provides transformed compounds and reference transformation patterns in each predicted reaction, and displays all predicted multi-step reaction pathways in a tree-shaped graph.
Motivation: Unexpected drug activities derived from off-targets are usually undesired and harmful; however, they can occasionally be beneficial for different therapeutic indications. There are many uncharacterized drugs whose target proteins (including the primary target and off-targets) remain unknown. The identification of all potential drug targets has become an important issue in drug repositioning to reuse known drugs for new therapeutic indications. Results: We defined pharmacological similarity for all possible drugs using the US Food and Drug Administration's (FDA's) adverse event reporting system (AERS) and developed a new method to predict unknown drug–target interactions on a large scale from the integration of pharmacological similarity of drugs and genomic sequence similarity of target proteins in the framework of a pharmacogenomic approach. The proposed method was applicable to a large number of drugs and it was useful especially for predicting unknown drug–target interactions that could not be expected from drug chemical structures. We made a comprehensive prediction for potential off-targets of 1874 drugs with known targets and potential target profiles of 2519 drugs without known targets, which suggests many potential drug–target interactions that were not predicted by previous chemogenomic or pharmacogenomic approaches. Availability: Software is available upon request. Contact: yamanishi@bioreg.kyushu-u.ac.jp Supplementary Information: Datasets and all results are available at http://cbio.ensmp.fr/~yyamanishi/aers/.
Drug side-effects, or adverse drug reactions, have become a major public health concern and remain one of the main causes of drug failure and of drug withdrawal once drugs have reached the market. Therefore, the identification of potential severe side-effects is a challenging issue. In this paper, we develop a new method to predict potential side-effect profiles of drug candidate molecules based on their chemical structures and target protein information on a large scale. We propose several extensions of the kernel regression model for multiple responses to deal with heterogeneous data sources. The originality lies in the integration of the chemical space of drug chemical structures and the biological space of drug target proteins in a unified framework. As a result, we demonstrate the usefulness of the proposed method on the simultaneous prediction of 969 side-effects for approved drugs from their chemical substructure and target protein profiles and show that the prediction accuracy consistently improves owing to the proposed regression model and integration of chemical and biological information. We also conduct a comprehensive side-effect prediction for uncharacterized drug molecules stored in DrugBank and confirm interesting predictions using independent information sources. The proposed method is expected to be useful at many stages of the drug development process.
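As a rough illustration of predicting several side-effects at once from two data sources, the sketch below uses a Nadaraya-Watson style kernel-weighted average, which is a deliberately simpler stand-in and not the regression model proposed in the abstract above. The Tanimoto similarity, the equal 0.5/0.5 weighting of chemical and target similarities, and all names and toy data (`tanimoto`, `predict_side_effects`, `training`) are assumptions made for illustration only.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_side_effects(query, training, n_effects):
    """Kernel-weighted average of known side-effect vectors.

    query:    (substructure_set, target_set) for the new molecule
    training: list of (substructure_set, target_set, side_effect_vector)
    Returns one score per side-effect (multiple responses at once).
    """
    weights = []
    for chem, targ, _ in training:
        # Assumed integration rule: average the chemical-space and
        # biological-space similarities with equal weight.
        k = 0.5 * tanimoto(query[0], chem) + 0.5 * tanimoto(query[1], targ)
        weights.append(k)
    total = sum(weights)
    if total == 0:
        # No similar training drug: no evidence for any side-effect.
        return [0.0] * n_effects
    return [
        sum(w * row[2][j] for w, row in zip(weights, training)) / total
        for j in range(n_effects)
    ]


# Hypothetical toy data: two drugs, two side-effects.
training = [
    ({"s1", "s2"}, {"P1"}, [1, 0]),
    ({"s3"}, {"P2"}, [0, 1]),
]
scores = predict_side_effects(({"s1", "s2"}, {"P1"}), training, 2)
```

Averaging the two similarities is only one way to combine heterogeneous sources; a weighted sum or product of kernels would fit the same interface.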
The metabolic network is both a network of chemical reactions and a network of enzymes that catalyze reactions. Toward better understanding of this duality in the evolution of the metabolic network, we developed a method to extract conserved sequences of reactions called reaction modules from the analysis of chemical compound structure transformation patterns in all known metabolic pathways stored in the KEGG PATHWAY database. The extracted reaction modules are repeatedly used as if they are building blocks of the metabolic network and contain chemical logic of organic reactions. Furthermore, the reaction modules often correspond to traditional pathway modules defined as sets of enzymes in the KEGG MODULE database and sometimes to operon-like gene clusters in prokaryotic genomes. We identified well-conserved, possibly ancient, reaction modules involving 2-oxocarboxylic acids. The chain extension module that appears as the tricarboxylic acid (TCA) reaction sequence in the TCA cycle is now shown to be used in other pathways together with different types of modification modules. We also identified reaction modules and their connection patterns for aromatic ring cleavages in microbial biodegradation pathways, which are most characteristic in terms of both distinct reaction sequences and distinct gene clusters. The modular architecture of biodegradation modules will have a potential for predicting degradation pathways of xenobiotic compounds. The collection of these and many other reaction modules is made available as part of the KEGG database.
In this chapter, we demonstrate the usability of the KEGG (Kyoto Encyclopedia of Genes and Genomes) databases and tools, especially focusing on the visualization of omics data. The desktop application KegArray and many Web-based tools are tightly integrated with the KEGG knowledgebase, which helps visualize and interpret large amounts of data derived from high-throughput measurement techniques including microarray, metagenome, and metabolome analyses. Recently developed resources for human disease, drug, and plant research are also mentioned.
Motivation: The IUBMB's Enzyme Nomenclature system, commonly known as the Enzyme Commission (EC) numbers, plays key roles in classifying enzymatic reactions and in linking the enzyme genes or proteins to reactions in metabolic pathways. There are numerous reactions known to be present in various pathways but without any official EC numbers, most of which are unlikely ever to be given one because of the lack of published articles on enzyme assays. Results: In this article we propose a new method to predict the potential EC numbers for given reactant pairs (substrates and products) or uncharacterized reactions, and a web-server named E-zyme as an application. This technology is based on our original biochemical transformation pattern which we call an ‘RDM pattern’, and consists of three steps: (i) graph alignment of a query reactant pair (substrates and products) for computing the query RDM pattern, (ii) multi-layered partial template matching by comparing the query RDM pattern with template patterns related to known EC numbers and (iii) weighted major voting scheme for selecting appropriate EC numbers. As a result, cross-validation experiments show that the proposed method achieves both high coverage and high prediction accuracy at a practical level, and consistently outperforms the previous method. Availability: The E-zyme system is available at http://www.genome.jp/tools/e-zyme/ Contact: kanehisa@kuicr.kyoto-u.ac.jp
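Step (iii) above can be pictured with a minimal weighted-voting sketch. It assumes that template matching has already produced a list of (EC number, weight) hits. The weights, the truncation to the sub-subclass level, and the names `weighted_vote` and `hits` are hypothetical and are not taken from E-zyme.

```python
from collections import defaultdict


def weighted_vote(matches):
    """Rank EC sub-subclasses by the total weight of their template hits.

    matches: iterable of (ec_number, weight) pairs, e.g. ("1.1.1.27", 0.5).
    Each EC number is truncated to its first three fields (sub-subclass)
    before the weights are summed.
    """
    totals = defaultdict(float)
    for ec, weight in matches:
        sub_subclass = ".".join(ec.split(".")[:3])
        totals[sub_subclass] += weight
    # Highest total weight first; ties broken by EC string for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical template hits for one query reactant pair.
hits = [("1.1.1.1", 1.0), ("1.1.1.27", 0.5), ("2.7.1.1", 1.2), ("1.1.1.37", 0.25)]
ranking = weighted_vote(hits)
```

In this toy case the single strongest hit belongs to 2.7.1, but 1.1.1 ranks first because three weaker hits agree on it, which is the point of voting over matches rather than taking the best one.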