Physicochemical properties are fundamental to predicting the pharmacokinetic and pharmacodynamic behavior of drug candidates. Easily calculated descriptors such as molecular weight and logP have been found to correlate with the success rate of clinical trials. These properties have previously been shown to highlight a sweet spot in chemical space associated with favorable pharmacokinetics, which is superior to other regions during hit identification and optimization. In this study, we applied self-organizing maps (SOMs) trained on sixteen calculated properties of a subset of known drugs to the analysis of commercially available compound databases, as well as public biological and chemical databases frequently used for drug discovery. Interestingly, we identified several regions of the property space that are heavily overrepresented in commercially available chemical libraries, while other regions of the maps are almost completely unoccupied (commercially neglected chemical space resembling the properties of known drugs). Moreover, these underrepresented portions of chemical space are compatible with the most rigorous property filters applied by the pharma industry in medicinal chemistry optimization programs. Our results suggest that SOMs may be used directly in library design strategies for drug discovery, to sample previously unexplored parts of chemical space and to target proteins that are not yet druggable.
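The SOM workflow described in this abstract (train a map on the properties of known drugs, then project a compound library onto it and look for crowded and empty cells) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `train_som` and `occupancy`, the grid size, the learning-rate and neighbourhood schedules, and the random 16-column arrays standing in for standardized property tables. It is not the authors' implementation or their data.

```python
import numpy as np

def train_som(data, grid_shape=(10, 10), n_iter=2000, lr0=0.5, sigma0=3.0, seed=0):
    """Train a rectangular SOM; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.normal(size=(rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5    # neighbourhood radius, never below 0.5
        x = data[rng.integers(len(data))]      # one random training sample
        # Best matching unit: the node whose weight vector is closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood centred on the BMU, measured on the grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
    return weights

def occupancy(weights, data):
    """Count how many rows of `data` map to each SOM node."""
    rows, cols, _ = weights.shape
    flat = weights.reshape(rows * cols, -1)
    d2 = ((data[:, None, :] - flat[None, :, :]) ** 2).sum(axis=-1)
    bmus = d2.argmin(axis=1)
    return np.bincount(bmus, minlength=rows * cols).reshape(rows, cols)

# Placeholder data: random numbers standing in for 16 standardized properties.
rng = np.random.default_rng(1)
drug_props = rng.normal(size=(300, 16))      # hypothetical "known drugs" table
library_props = rng.normal(size=(1000, 16))  # hypothetical vendor library table

som = train_som(drug_props, grid_shape=(8, 8), n_iter=1000)
library_counts = occupancy(som, library_props)
print("empty cells in the library map:", int((library_counts == 0).sum()))
```

In a sketch like this, cells with a zero library count but a non-zero drug count would correspond to the "commercially neglected" regions the abstract mentions; real use would replace the random arrays with computed descriptors scaled consistently across both sets.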
The term ‘scavengome’ refers to the chemical space of all the metabolites that may be formed from an antioxidant upon scavenging reactive oxygen or nitrogen species (ROS/RNS). This chemical space is very rich in structures of greater chemical complexity than the parent antioxidant: a wide range of unusual heterocyclic structures, new C-C bonds, etc. may be formed. Further, in a biological environment, this increased chemical complexity is a direct result of the localized conditions of oxidative stress, which determine the amounts and types of ROS/RNS present. Biomimetic oxidative chemistry provides an excellent tool to model chemical reactions between antioxidants and ROS/RNS. In this chapter, we provide an overview of the known metabolites obtained by biomimetic oxidation of a few selected natural antioxidants, i.e., a stilbene (resveratrol), a pair of hydroxycinnamates (caffeic acid and methyl caffeate), and a flavonol (quercetin), and discuss the drug discovery perspectives of the related chemical space.
Compound availability is a critical property for design prioritization across the drug discovery pipeline. Historically, and despite their multiple limitations, compound-oriented synthetic accessibility scores have been used as proxies for this problem. However, the size of the catalogues of commercially available molecules has dramatically increased over the last decade, redefining the problem of compound accessibility as a matter of budget. In this paper we show that if compound prices are the desired proxy for compound availability, then synthetic accessibility scores are not effective strategies for use in selection. Our approach, CoPriNet, is a retrosynthesis-free deep learning model trained on 2D graph representations of compounds alongside their prices extracted from the Mcule catalogue. We show that CoPriNet provides price predictions that correlate far better with actual compound prices than any synthetic accessibility score. Moreover, unlike standard retrosynthesis methods, CoPriNet is rapid, with execution times comparable to popular synthetic accessibility metrics, and thus is suitable for high-throughput experiments including virtual screening and de novo compound generation. While the Mcule catalogue is a proprietary dataset, the CoPriNet source code, the model trained on the proprietary data, and the fraction of the catalogue (100K compound/price pairs) used as the test dataset have been made publicly available at https://github.com/oxpig/CoPriNet.
Compound availability is a critical property for design prioritization across the drug discovery pipeline. Historically, and despite their multiple limitations, compound-oriented synthetic accessibility scores have been used as proxies for this problem. However, the size of the catalogues of commercially available molecules has dramatically increased over the last decade, redefining the problem of compound accessibility as a matter of budget. In this paper we show that if compound prices are an alternative proxy for compound availability, then synthetic accessibility scores are not effective strategies for assessing availability. Instead, we learn how to predict prices directly from the catalogues. Our approach, CoPriNet, is a retrosynthesis-free deep learning model trained on compound/price pairs extracted from the Mcule catalogue. CoPriNet provides price predictions that correlate far better with actual compound prices than any synthetic accessibility measurement. Moreover, unlike standard retrosynthesis methods, CoPriNet is rapid, comparable in execution time to popular synthetic accessibility metrics, and thus is suitable for high-throughput experiments including virtual screening and de novo compound generation.
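The comparison the abstract describes (how well a predicted price or a synthetic accessibility score tracks catalogue prices) comes down to a correlation measurement. The sketch below shows one way such a comparison might be set up with Pearson and Spearman coefficients. The helper names and all numeric values are invented placeholders for illustration only; they are not CoPriNet outputs, Mcule prices, or results from the paper.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

def ordinal_ranks(v):
    """Ranks 0..n-1; ties are broken by position, which is enough for a sketch."""
    v = np.asarray(v, dtype=float)
    order = np.argsort(v, kind="stable")
    ranks = np.empty(len(v), dtype=float)
    ranks[order] = np.arange(len(v))
    return ranks

def spearman(a, b):
    """Spearman rank correlation, computed as Pearson on ordinal ranks."""
    return pearson(ordinal_ranks(a), ordinal_ranks(b))

# Invented placeholder values, NOT data from the paper or the catalogue.
log_price = [1.0, 1.5, 2.2, 3.0, 3.1]        # hypothetical log catalogue prices
predicted_price = [1.1, 1.4, 2.0, 2.8, 3.3]  # hypothetical model predictions
sa_score = [2.5, 3.9, 2.1, 3.0, 2.7]         # hypothetical accessibility scores

print("prediction vs price (Spearman):", spearman(log_price, predicted_price))
print("SA score vs price (Spearman):", spearman(log_price, sa_score))
```

Rank correlation is a reasonable choice when only the ordering of compounds by cost matters for prioritization; for data with many tied prices, a tie-aware implementation would be preferable to the ordinal ranking used here.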