In the last decade, many statistics-based approaches have been developed to improve poor pharmacokinetics (PK) and to reduce the toxicity of lead compounds, which are among the main causes of the high failure rate in drug development. Predictive QSAR models are not always efficient because of the limited amount of available biological data and differences in experimental protocols. Fortunately, the number of available databases continues to grow every year. However, determining the source and the quality of the original data remains a challenge. The main goal is to identify the relevant databases required to generate the most robust predictive models. In this study, an interactive network of databases is proposed to make it easy to find online data sources related to ADME-Tox parameters. In this map, relevant information regarding scope of application, data availability and data redundancy can be obtained for each data source. To illustrate data mining from the network, a dataset on plasma protein binding (PPB) was assembled from various sources such as the DrugBank, PubChem and ChEMBL databases. A total of 2,606 unique molecules with experimental PPB values were extracted and can constitute a consistent dataset for QSAR modeling.
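The abstract does not say how records from the different sources were combined into unique molecules. As a rough illustration only, the sketch below groups PPB records by a structure key and keeps one value per molecule. The field names (`inchikey`, `ppb_percent`, `source`), the choice of an InChIKey-style key, and the use of the median to reconcile duplicates are all assumptions made for this example, not the authors' procedure.

```python
from statistics import median


def merge_ppb_records(records):
    """Collapse multi-source PPB records into one entry per molecule.

    Each record is a dict with the hypothetical keys 'inchikey',
    'ppb_percent' and 'source'. Duplicate measurements of the same
    molecule are reconciled with the median (an assumption).
    """
    by_key = {}
    for rec in records:
        by_key.setdefault(rec["inchikey"], []).append(rec)

    merged = []
    for key, recs in by_key.items():
        merged.append(
            {
                "inchikey": key,
                "ppb_percent": median(r["ppb_percent"] for r in recs),
                # Keeping the list of sources makes redundancy visible.
                "sources": sorted({r["source"] for r in recs}),
            }
        )
    return merged
```

Keeping the source list for each merged entry preserves the kind of redundancy information that the database map is meant to expose.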
Thirteen pairs of enantiomers belonging to the same structural family (phenylthiohydantoin-amino acids) were analyzed on two polysaccharide chiral stationary phases, namely tris-(3,5-dimethylphenylcarbamate) of amylose (Chiralpak AD-H) and of cellulose (Chiralcel OD-H), in supercritical fluid chromatography with a carbon dioxide/methanol mobile phase (90:10 v/v). Five different temperatures (5, 10, 20, 30, 40°C) were applied to evaluate the thermodynamic behavior of these enantioseparations. On the cellulose stationary phase, the retention and separation trends were mostly similar among the set of probe analytes, suggesting that the chiral cavities in this stationary phase have little diversity, or that all analytes accessed the same cavities. Conversely, the retention and separation trends on the amylose phase were much more diverse, and could be related to structural differences among the set of probe analytes (carbon chain length in the amino acid residue, secondary amine in proline, existence of covalent rings, or formation of pseudo-rings via intramolecular hydrogen bonds). The large variability of behaviors on the amylose phase suggests that the chiral-binding sites in this chiral stationary phase have more variety than those on the cellulose phase, and that the analytes did access different cavities.
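Temperature studies of this kind are commonly evaluated with a van't Hoff plot, in which ln α is regressed against 1/T according to ln α = -ΔΔH/(RT) + ΔΔS/R. The abstract does not state which treatment the authors used, so the following is only a minimal sketch under that assumption. The function name is hypothetical, and the linear form assumes temperature-independent ΔΔH and ΔΔS, which may not hold in supercritical fluids where mobile phase density also changes with temperature.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def vant_hoff_fit(temps_celsius, alphas):
    """Least-squares fit of ln(alpha) against 1/T.

    Returns (ddH, ddS): the differential enthalpy in J/mol and the
    differential entropy in J/(mol K) between the two enantiomers,
    assuming a linear van't Hoff relationship.
    """
    xs = [1.0 / (t + 273.15) for t in temps_celsius]
    ys = [math.log(a) for a in alphas]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # slope = -ddH / R and intercept = ddS / R
    return -slope * R, intercept * R
```

Called with the five temperatures from the study and the measured selectivity factors for one enantiomer pair, the function returns one (ΔΔH, ΔΔS) estimate for that pair. Comparing these estimates across analytes is one way to quantify how diverse the behavior is on each stationary phase.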
The molecule generation process in de novo design can produce many hundreds of thousands of compounds from an input pool of one or more known compounds. Property filtering and other fast scoring methods can reduce this set in accordance with project goals. However, there may still be many tens of thousands of compounds to select from. The question we pose here is how to identify the most relevant compounds from a set for further analysis and selection. We define relevant compounds as those most suitable for exploring the SAR. Whilst similarity methods can be used to identify the compounds most similar to the input set, we show that they are not well suited to this more general task. We introduce the Chemical Annotation Score as a novel method for calculating chemical distance. We show its superiority over fingerprint-based similarity methods for identifying the most relevant set of compounds from a large pool for further exploration, given an input query or queries.
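For context, the fingerprint-based baseline that the abstract compares against can be sketched as a Tanimoto ranking of the pool against the query compounds. The abstract gives no details of the Chemical Annotation Score itself, so it is not implemented here. Representing fingerprints as plain Python sets of feature identifiers, and the function names below, are assumptions made for this illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' features."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


def rank_by_similarity(queries, pool, top_n=10):
    """Rank pool compounds by their best Tanimoto similarity to any query.

    queries: list of fingerprints (sets of feature ids)
    pool:    dict mapping a compound name to its fingerprint
    Returns up to top_n (name, score) pairs, highest score first.
    """
    scored = []
    for name, fp in pool.items():
        best = max(tanimoto(fp, q) for q in queries)
        scored.append((best, name))
    # Sort by descending score, then by name for a stable order on ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, score) for score, name in scored[:top_n]]
```

A ranking like this favors near neighbors of the queries, which is the behavior the abstract argues is poorly suited to selecting compounds for broader SAR exploration.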