Information on protein targets and small molecules is highly valued in biomedical and pharmaceutical research. Several protein target databases are available online for FDA-approved drugs and promising precursors, and they have greatly facilitated mechanistic studies and subsequent drug discovery research. However, comparable resources for herbal active ingredients are rarely found, even though these ingredients are highly valued as a precious resource for new drug development. In this article, a comprehensive and fully curated database of Herb Ingredients’ Targets (HIT, http://lifecenter.sgst.cn/hit/) has been constructed to complement the above resources. Herbal ingredients with protein target information were carefully curated. The molecular target information covers proteins that are directly or indirectly activated or inhibited, protein binders, and enzymes whose substrates or products are those compounds. Genes that are up- or down-regulated under treatment with individual ingredients are also included. In addition, the experimental conditions, observed bioactivity and supporting references are provided. Derived from more than 3250 publications, HIT currently contains 5208 entries covering 1301 known protein targets (221 of them described as direct targets) affected by 586 herbal compounds from more than 1300 reputable Chinese herbs, overlapping with 280 therapeutic targets from the Therapeutic Targets Database (TTD) and 445 protein targets from DrugBank corresponding to 1488 drug agents. The database can be queried via keyword search or similarity search. Crosslinks have been made to TTD, DrugBank, KEGG, PDB, Uniprot, Pfam, NCBI, TCM-ID and other databases.
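The abstract does not say how HIT implements its similarity search. As an illustration only, the sketch below assumes a common approach: compounds are represented as sets of fingerprint bits and compared with the Tanimoto coefficient. The function names, the fingerprint representation, and the threshold are all hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two fingerprint bit sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_search(query: set, library: dict, threshold: float = 0.5) -> list:
    """Return (name, score) pairs with score >= threshold, best first.

    `library` maps a compound name to its fingerprint bit set (hypothetical layout).
    """
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in hits if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

A query returns only the library compounds whose fingerprints overlap sufficiently with the query, ordered from most to least similar.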
The identification of synergistic chemotherapeutic agents from a large pool of candidates is highly challenging. Here, we present a Ranking-system of Anti-Cancer Synergy (RACS) that combines features of targeting networks and transcriptomic profiles, and validate it on three types of cancer. Using data on human β-cell lymphoma from the Dialogue for Reverse Engineering Assessments and Methods consortium we show a probability concordance of 0.78 compared with 0.61 obtained with the previous best algorithm. We confirm 63.6% of our breast cancer predictions through experiment and literature, including four strong synergistic pairs. Further in vivo screening in a zebrafish MCF7 xenograft model confirms one prediction with strong synergy and low toxicity. Validation using A549 lung cancer cells shows similar results. Thus, RACS can significantly improve drug synergy prediction and markedly reduce the experimental prescreening of existing drugs for repurposing to cancer treatment, although the molecular mechanism underlying particular interactions remains unknown.
B-cell epitope information is critical to immune therapy and vaccine design. Protein epitopes can be significantly affected by glycosylation, yet no method has taken this into account so far. Building on previous versions of Spatial Epitope Prediction of Protein Antigens (SEPPA), we here present an enhanced tool, SEPPA 3.0, which supports glycoprotein antigens. Parameters were updated based on the latest and largest dataset. Additional micro-environmental features of glycosylation triangles and glycosylation-related amino acid indexes were then added as important classifiers, coupled with a final calibration based on neighboring antigenicity. The logistic regression model was retained, as in SEPPA 2.0. An AUC of 0.794 was obtained through 10-fold cross-validation in internal validation. Independent testing on general protein antigens resulted in an AUC of 0.740 with a balanced accuracy (BA) of 0.657, which serves as the baseline of SEPPA 3.0. Most importantly, when tested on independent glycoprotein antigens only, SEPPA 3.0 gave an AUC of 0.749 and a BA of 0.665, the top performance among peer tools. As the first server enabling accurate epitope prediction for glycoproteins, SEPPA 3.0 shows significant advantages over popular peers on both general protein and glycoprotein antigens. It can be accessed at http://bidd2.nus.edu.sg/SEPPA3/ or at http://www.badd-cao.net/seppa3/index.html. Batch query is supported.
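For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how AUC and balanced accuracy are conventionally computed from per-residue labels and scores. It is not SEPPA code; the function names and the decision threshold are assumptions made for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, scores, threshold):
    """Mean of sensitivity and specificity at a given score threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

Balanced accuracy is useful here because epitope residues are typically a small minority of surface residues, so plain accuracy would reward predicting everything as non-epitope.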
The Spatial Epitope Prediction server for Protein Antigens (SEPPA) has received a great deal of feedback since it was published in 2009. In this improved version, the relative ASA preference of the unit patch and a consolidated amino acid index were added as further classification parameters, in addition to the previously reported unit-triangle propensity and clustering coefficient. A logistic regression model was then adopted in place of the previous simple additive one. Most importantly, the subcellular localization of the protein antigen and the species of the immune host were fully taken into account to improve prediction. The results show that an AUC of 0.745 (5-fold cross-validation) is roughly the baseline performance when, as in all the other tools, no such differentiation is made. Specifying the subcellular localization of the protein antigen and the species of the immune host generally pushes the AUC up; for secretory proteins immunized in mouse, the AUC can reach 0.823. In this version, the false positive rate has also been substantially decreased. As the first method to consider the subcellular localization of the protein antigen and the species of the immune host, SEPPA 2.0 shows clear advantages over other popular servers such as SEPPA, PEPITO, DiscoTope-2, B-pred, Bpredictor and Epitopia in supporting more specific biological needs. SEPPA 2.0 can be accessed at http://badd.tongji.edu.cn/seppa/. Batch query is also supported.
Literature-described targets of herbal ingredients have been explored to facilitate mechanistic studies of herbs as well as new drug discovery. Although several databases provide similar information, the majority of them are limited to literature published before 2010 and urgently need to be updated. HIT 2.0 was constructed as the latest curated dataset focusing on Herbal Ingredients’ Targets, covering PubMed literature from 2000–2020. Currently, HIT 2.0 hosts 10 031 compound-target activity pairs with quality indicators, linking 2208 targets and 1237 ingredients from more than 1250 reputable herbs. The molecular targets cover genes and proteins that are directly or indirectly activated or inhibited, protein binders, and enzymes whose substrates or products are those ingredients. Genes regulated under treatment with individual ingredients are also included. Crosslinks were made to TTD, DrugBank, KEGG, PDB, UniProt, Pfam, NCBI, TCM-ID and other databases. More importantly, HIT enables automatic Target-mining and My-target curation from daily released PubMed literature. Users can thus retrieve and download the latest abstracts containing potential targets for compounds of interest, even for those not yet covered in HIT. Further, users can log into the ‘My-target’ system to curate personal target profiles online based on the retrieved abstracts. HIT can be accessed at http://hit2.badd-cao.net.
As an extension of conventional quantitative structure-activity relationship models, proteochemometric (PCM) modelling is a computational method that can predict the bioactivity relations between multiple ligands and multiple targets. Traditional PCM modelling includes three essential elements: descriptors (including target descriptors, ligand descriptors and cross-term descriptors), bioactivity data, and appropriate learning functions that link the descriptors to the bioactivity data. Since its appearance, PCM modelling has developed rapidly over the past decade by taking advantage of progress in descriptors and machine learning techniques, along with the increasing amounts of available bioactivity data. Specifically, newly emerging target descriptors and cross-term descriptors have not only significantly increased the performance of PCM modelling but also expanded its application scope from traditional protein-ligand interactions to a wider range of interactions, including protein-peptide, protein-DNA and even protein-protein interactions. In this review, target descriptors and cross-term descriptors, as well as the corresponding application scope, are comprehensively summarized. Additionally, we look forward to seeing PCM modelling extend into new application scopes, such as Target-Catalyst-Ligand systems, with the further development of descriptors, machine learning techniques and increasing amounts of available bioactivity data.
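The three descriptor blocks named above can be illustrated with a small sketch. It assumes the simplest common form of cross-term descriptor, the pairwise product of every ligand descriptor with every target descriptor; the review covers other forms, and the function name and input vectors here are hypothetical.

```python
def pcm_features(ligand, target):
    """Build one PCM feature vector for a ligand-target pair.

    Layout: ligand descriptors, then target descriptors, then cross-terms.
    Cross-terms here are all pairwise products (an assumed, simple choice).
    """
    cross_terms = [l * t for l in ligand for t in target]
    return list(ligand) + list(target) + cross_terms
```

A learning function is then fitted on these vectors against the measured bioactivity of each pair, so a single model spans multiple ligands and multiple targets.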
Low drug productivity has been a significant problem for the pharmaceutical industry for several decades, even though numerous novel technologies were introduced during this period. The current pharmacologic dogma, "single drug, single target, single disease", is at the root of this lack of drug productivity. From a systems biology viewpoint, network pharmacology has been proposed to complement the established guiding pharmacologic approaches. The rationale for network pharmacology as a major component of drug discovery and development is that a disease can be caused by perturbation of the disease-causing network, and a drug may be designed to interact with multiple targets to modulate such a network from the disease state toward the normal state. Network pharmacology has therefore been applied to guide and assist drug repositioning. Drugs exerting their therapeutic effects may directly target disease-associated proteins, but they may also modulate the pathways involved in the pathological process. In this review, we discuss the progress and prospects of network pharmacology, focusing on drug off-target discovery, disease-associated protein identification, and pathway analysis for elucidating relationships between drug targets and disease-associated proteins.
Background: The rapid increase in the emergence of novel chemical substances creates a substantial demand for more sophisticated computational methodologies for drug discovery. In this study, the idea of Learning to Rank from web search was applied to drug virtual screening, where it offers two unique capabilities: (1) it can identify compounds for novel targets when not enough training data are available for these targets, and (2) it can integrate heterogeneous data when compound affinities are measured on different platforms. Results: A standard pipeline was designed to carry out Learning to Rank in virtual screening. Six Learning to Rank algorithms were investigated based on two public datasets collected from Binding Database and the newly published Community Structure-Activity Resource benchmark dataset. The results demonstrate that Learning to Rank is an efficient computational strategy for drug virtual screening, particularly due to its novel use in cross-target virtual screening and heterogeneous data integration. Conclusions: To the best of our knowledge, we have introduced here the first application of Learning to Rank in virtual screening. The experimental workflow and algorithm assessment designed in this study will provide a standard protocol for other similar studies. All the datasets as well as the implementations of the Learning to Rank algorithms are available at http://www.tongji.edu.cn/~qiliu/lor_vs.html. Graphical abstract: The analogy between web search and ligand-based drug discovery.
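The web-search analogy can be sketched as follows: a target plays the role of a query, and its compounds play the role of documents to be ranked by affinity. The example below is a generic pairwise ranking perceptron, chosen for brevity; it is not claimed to be one of the six algorithms evaluated in the study, and all names and data layouts are hypothetical. Preference pairs are formed only within a target, so affinities measured on different scales never need to be compared directly.

```python
def make_pairs(groups):
    """Turn per-target (features, affinity) lists into difference vectors.

    Each vector is x_better - x_worse, built only within one target.
    """
    pairs = []
    for compounds in groups.values():
        for xi, yi in compounds:
            for xj, yj in compounds:
                if yi > yj:
                    pairs.append([a - b for a, b in zip(xi, xj)])
    return pairs


def train_pairwise(pairs, n_features, epochs=20, lr=0.1):
    """Perceptron-style updates on pairs the current weights misorder."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for d in pairs:
            margin = sum(wi * di for wi, di in zip(w, d))
            if margin <= 0:  # better compound is not scored higher yet
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w


def rank(w, compounds):
    """Order feature vectors from highest to lowest predicted score."""
    return sorted(compounds,
                  key=lambda x: sum(wi * xi for wi, xi in zip(w, x)),
                  reverse=True)
```

Because only the relative order within each target is learned, the trained weights can be applied to compounds for a target that contributed little or no training data.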