ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 and 2014 Nucleic Acids Research Database Issues. Since then, alongside the continued extraction of data from the medicinal chemistry literature, new sources of bioactivity data have also been added to the database. These include: deposited data sets from neglected disease screening; crop protection data; drug metabolism and disposition data and bioactivity data from patents. A number of improvements and new features have also been incorporated. These include the annotation of assays and targets using ontologies, the inclusion of targets and indications for clinical candidates, addition of metabolic pathways for drugs and calculation of structural alerts. The ChEMBL data can be accessed via a web-interface, RDF distribution, data downloads and RESTful web-services.
We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.
Additive models for the estimation of Abraham's molecular descriptors R₂, π₂ᴴ, Σα₂ᴴ, Σβ₂ᴴ, Σβ₂ᴼ, and log L¹⁶ have been developed. For five of the six descriptors, one set of 81 atom and functional group fragments is capable of reproducing experimentally derived results with correlation coefficients ranging from 0.95 to 0.99. However, one descriptor, Σα₂ᴴ, required an entirely separate set of 51 fragments to be developed, resulting in a correlation coefficient of 0.97. Of particular importance is the speed of calculation (approximately 700 molecules/min), allowing so-called "high-throughput screening". Several applications of this model for molecules containing intramolecular interactions are discussed.
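As a rough illustration of what an additive fragment model computes, the sketch below sums per-fragment contributions weighted by how often each fragment occurs in a molecule. The fragment names, the contribution values, the intercept term and the function name `estimate_descriptor` are all hypothetical placeholders chosen for illustration; they are not the fitted fragments or coefficients from the paper, and the abstract does not say whether the published model includes an intercept.

```python
from typing import Dict, Mapping

# Hypothetical contributions for ONE descriptor (illustrative numbers only,
# not the fitted values from the paper).
CONTRIBUTIONS: Dict[str, float] = {
    "CH3": 0.25,
    "CH2": 0.5,
    "OH": 1.0,
}
INTERCEPT = 0.125  # assumed constant term; also hypothetical


def estimate_descriptor(
    fragment_counts: Mapping[str, int],
    contributions: Mapping[str, float] = CONTRIBUTIONS,
    intercept: float = INTERCEPT,
) -> float:
    """Return intercept + sum(count_i * contribution_i) over all fragments."""
    total = intercept
    for fragment, count in fragment_counts.items():
        if fragment not in contributions:
            # An additive model cannot score a fragment it was never fitted on.
            raise KeyError(f"no contribution available for fragment {fragment!r}")
        total += count * contributions[fragment]
    return total


# Ethanol decomposed into the hypothetical fragments above.
ethanol = {"CH3": 1, "CH2": 1, "OH": 1}
ethanol_estimate = estimate_descriptor(ethanol)
```

Because each estimate is a single pass over a molecule's fragment counts, a scheme of this kind is cheap to evaluate, which is consistent with the throughput quoted in the abstract.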
The ‘druggable genome’ encompasses several protein families, but only a subset of targets within them have attracted significant research attention and thus have information about them publicly available. The Illuminating the Druggable Genome (IDG) program, initiated in 2014, has the goal of developing experimental techniques and a Knowledge Management Center (KMC) that would collect and organize information about protein targets from four families, representing the most common druggable targets with an emphasis on understudied proteins. Here, we describe two resources developed by the KMC: the Target Central Resource Database (TCRD), which collates many heterogeneous gene/protein datasets, and Pharos (https://pharos.nih.gov), a multimodal web interface that presents the data from TCRD. We briefly describe the types and sources of data considered by the KMC and then highlight features of the Pharos interface designed to enable intuitive access to the IDG knowledgebase. The aim of Pharos is to encourage ‘serendipitous browsing’, whereby related, relevant information is made easily discoverable. We conclude by describing two use cases that highlight the utility of Pharos and TCRD.
Background: The ChEMBL database is one of a number of public databases that contain bioactivity data on small molecule compounds curated from diverse sources. Incoming compounds are typically not standardised according to consistent rules. In order to maintain the quality of the final database and to easily compare and integrate data on the same compound from different sources it is necessary for the chemical structures in the database to be appropriately standardised. Results: A chemical curation pipeline has been developed using the open source toolkit RDKit. It comprises three components: a Checker to test the validity of chemical structures and flag any serious errors; a Standardizer which formats compounds according to defined rules and conventions and a GetParent component that removes any salts and solvents from the compound to create its parent. This pipeline has been applied to the latest version of the ChEMBL database as well as uncurated datasets from other sources to test the robustness of the process and to identify common issues in database molecular structures. Conclusion: All the components of the structure pipeline have been made freely available for other researchers to use and adapt for their own use. The code is available in a GitHub repository and it can also be accessed via the ChEMBL Beaker webservices. It has been used successfully to standardise the nearly 2 million compounds in the ChEMBL database and the compound validity checker has been used to identify compounds with the most serious issues so that they can be prioritised for manual curation.
This kinetic and thermodynamic study is concerned with a comparison of the binding of simple anions, azo-dyes and long-chain surfactants to α-cyclodextrin. Two competition reactions have been employed. (i) In order to obtain information on the association of simple anions and surfactants with α-cyclodextrin, the competitive binding of an azo-dye with α-cyclodextrin has been used in stopped-flow experiments. (ii) In order to study the binding of the azo-dye pyridine-2-azo-p-dimethylaniline (PADA) to α-cyclodextrin, the competition reaction used is that between PADA and aquo-metal ions. Ni²⁺(aq) and Zn²⁺(aq) are used as competitors for PADA in stopped-flow and Joule-heating temperature-jump experiments. It is confirmed that when azo-dyes interact with α-cyclodextrin, they do so by a two-step mechanism. The first step involves the formation of an intermediate complex in a fast pre-equilibrium step. The second step, which is rate-determining, results in the formation of the final stable inclusion complex. Studies with I⁻(aq) as a competing ion suggest that the intermediate is associated with dye binding to the inside rather than the outside of the cyclodextrin. Small anions in general have low stability constants for binding to α-cyclodextrin. In marked contrast, surfactants of general formula CₙH₂ₙ₊₁SO₄⁻ (where n > 8) interact strongly with α-cyclodextrin, and inclusion compounds are formed in which one surfactant binds two cyclodextrin molecules.
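For orientation, the two-step mechanism described in the abstract can be written as a generic pre-equilibrium scheme. The symbols below (D for dye, CD for cyclodextrin, K_1, k_2, k_{-2}, k_obs) are assumed here for illustration and are not taken from the paper, and the rate expression is the standard textbook result for such a scheme when cyclodextrin is in large excess over dye, not necessarily the form the authors fitted.

```latex
\begin{align}
  \mathrm{D} + \mathrm{CD}
    &\;\overset{K_1}{\rightleftharpoons}\; \mathrm{D{\cdot}CD}^{*}
    && \text{(fast pre-equilibrium, intermediate complex)} \\
  \mathrm{D{\cdot}CD}^{*}
    &\;\underset{k_{-2}}{\overset{k_2}{\rightleftharpoons}}\; \mathrm{D{\cdot}CD}
    && \text{(rate-determining, final inclusion complex)} \\
  k_{\mathrm{obs}}
    &= \frac{k_2 K_1 [\mathrm{CD}]}{1 + K_1 [\mathrm{CD}]} + k_{-2}
    && \text{(assuming } [\mathrm{CD}] \gg [\mathrm{D}] \text{)}
\end{align}
```

Under these assumptions k_obs rises linearly with [CD] at low concentration and levels off at k_2 + k_{-2} once the pre-equilibrium is saturated, which is the usual kinetic signature of a mechanism of this type.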
Materials and Methods: The dyes MY7 and PADA were supplied by Aldrich and Sigma, respectively. Elemental analysis was used to confirm purity. The structures are shown below.