Background: A method to estimate ease of synthesis (synthetic accessibility) of drug-like molecules is needed in many areas of the drug discovery process. The development and validation of such a method that is able to characterize molecule synthetic accessibility as a score between 1 (easy to make) and 10 (very difficult to make) is described in this article.
Results: The method for estimation of the synthetic accessibility score (SAscore) described here is based on a combination of fragment contributions and a complexity penalty. Fragment contributions have been calculated based on the analysis of one million representative molecules from PubChem and therefore one can say that they capture historical synthetic knowledge stored in this database. The molecular complexity score takes into account the presence of non-standard structural features, such as large rings, non-standard ring fusions, stereocomplexity and molecule size. The method has been validated by comparing calculated SAscores with ease of synthesis as estimated by experienced medicinal chemists for a set of 40 molecules. The agreement between calculated and manually estimated synthetic accessibility is very good with r² = 0.89.
Conclusion: A novel method to estimate synthetic accessibility of molecules has been developed. This method uses historical synthetic knowledge obtained by analyzing information from millions of already synthesized chemicals and considers also molecule complexity. The method is sufficiently fast and provides results consistent with estimation of ease of synthesis by experienced medicinal chemists. The calculated SAscore may be used to support various drug discovery processes where a large number of molecules needs to be ranked based on their synthetic accessibility, for example when purchasing samples for screening, selecting hits from high-throughput screening for follow-up, or ranking molecules generated by various de novo design approaches.
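The abstract above gives only the overall shape of the SAscore: a fragment term combined with a complexity penalty, reported on a scale from 1 (easy) to 10 (very difficult). The following is a minimal sketch of that shape, not the published implementation. The fragment names, contribution values, penalty weights, the averaging of contributions, and the `raw_min`/`raw_max` scaling bounds are all hypothetical placeholders chosen for illustration.

```python
from typing import Dict, List

# Hypothetical fragment contributions (NOT the published values).
# Positive means the fragment is common among already-synthesized molecules.
FRAGMENT_CONTRIBUTIONS: Dict[str, float] = {
    "benzene": 1.0,
    "amide": 0.8,
    "spiro_center": -1.5,
}
UNKNOWN_FRAGMENT = -2.0  # assumed contribution for fragments never seen before


def fragment_score(fragments: List[str]) -> float:
    """Mean contribution of a molecule's fragments (averaging is an assumption)."""
    if not fragments:
        raise ValueError("a molecule needs at least one fragment")
    total = sum(FRAGMENT_CONTRIBUTIONS.get(f, UNKNOWN_FRAGMENT) for f in fragments)
    return total / len(fragments)


def complexity_penalty(n_atoms: int, n_stereocenters: int, n_large_rings: int) -> float:
    """Penalty for size, stereocomplexity and large rings (weights are assumed)."""
    return 0.01 * n_atoms + 0.5 * n_stereocenters + 1.0 * n_large_rings


def sa_score(fragments: List[str], n_atoms: int, n_stereocenters: int,
             n_large_rings: int, raw_min: float = -4.0, raw_max: float = 1.0) -> float:
    """Combine both terms and map the result linearly onto 1 (easy) .. 10 (hard)."""
    raw = fragment_score(fragments) - complexity_penalty(
        n_atoms, n_stereocenters, n_large_rings)
    raw = max(raw_min, min(raw_max, raw))  # clamp to the assumed range
    return 1.0 + 9.0 * (raw_max - raw) / (raw_max - raw_min)


easy = sa_score(["benzene", "amide"], n_atoms=20, n_stereocenters=0, n_large_rings=0)
hard = sa_score(["spiro_center", "unseen_fragment"], n_atoms=40,
                n_stereocenters=4, n_large_rings=1)
print(f"easy: {easy:.2f}, hard: {hard:.2f}")
```

A molecule built from common fragments with no stereocenters lands near 1, while one with rare fragments, several stereocenters and a large ring is pushed toward 10. In practice the fragments and structural counts would come from a cheminformatics toolkit rather than being supplied by hand.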
Fingerprint-based similarity searching is widely used for virtual screening when only a single bioactive reference structure is available. This paper reviews three distinct ways of carrying out such searches when multiple bioactive reference structures are available: merging the individual fingerprints into a single combined fingerprint; applying data fusion to the similarity rankings resulting from individual similarity searches; and approximations to substructural analysis. Extended searches on the MDL Drug Data Report database suggest that fusing similarity scores is the most effective general approach, with the best individual results coming from the binary kernel discrimination technique.
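Two of the strategies named above, fusing similarity scores and merging fingerprints, can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the protocol used in the review: fingerprints are modeled as sets of "on" bit positions, similarity is the Tanimoto coefficient, the fusion rule is MAX (each database molecule keeps its best score over all references), and merging is a simple union of bits. The function names and toy data are hypothetical.

```python
from typing import Dict, FrozenSet, List

Fingerprint = FrozenSet[int]  # set of "on" bit positions in a binary fingerprint


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank(scores: Dict[str, float]) -> List[str]:
    """Names sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda name: (-scores[name], name))


def max_fusion_ranking(references: List[Fingerprint],
                       database: Dict[str, Fingerprint]) -> List[str]:
    """Data fusion: search with every reference, keep each molecule's best score."""
    if not references:
        raise ValueError("at least one reference structure is required")
    fused = {name: max(tanimoto(ref, fp) for ref in references)
             for name, fp in database.items()}
    return rank(fused)


def merged_fingerprint_ranking(references: List[Fingerprint],
                               database: Dict[str, Fingerprint]) -> List[str]:
    """Merging: combine all references into one fingerprint (union), search once."""
    merged = frozenset().union(*references)
    return rank({name: tanimoto(merged, fp) for name, fp in database.items()})


# Toy data: two references with no bits in common.
references = [frozenset({1, 2, 3}), frozenset({7, 8, 9})]
database = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({7, 8}),
    "mol_c": frozenset({20, 21}),
}
print(max_fusion_ranking(references, database))
print(merged_fingerprint_ranking(references, database))
```

With MAX fusion, a database molecule that closely resembles any one reference scores well. With a merged fingerprint, the same molecule is diluted by bits contributed by the other references, so its absolute score drops even when the ranking is unchanged.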
The identification of small molecules that fall within the biologically relevant subfraction of vast chemical space is of utmost importance to chemical biology and medicinal chemistry research. The prerequisite of biological relevance to be met by such molecules is fulfilled by natural product-derived compound collections. We report a structural classification of natural products (SCONP) as an organizing principle for charting the known chemical space explored by nature. SCONP arranges the scaffolds of the natural products in a tree-like fashion and provides a viable analysis- and hypothesis-generating tool for the design of natural product-derived compound collections. The validity of the approach is demonstrated in the development of a previously undescribed class of selective and potent inhibitors of 11β-hydroxysteroid dehydrogenase type 1 with activity in cells, guided by SCONP and protein structure similarity clustering. 11β-hydroxysteroid dehydrogenase type 1 is a target in the development of new therapies for the treatment of diabetes, the metabolic syndrome, and obesity.
Keywords: chemical biology | compound libraries | hydroxysteroid dehydrogenase | cheminformatics
The efficient identification of small molecules that modulate protein function in vitro and in vivo is at the heart of chemical biology and medicinal chemistry research and the development of new therapies and diagnostics for disease. Key to their discovery is the identification and charting of biologically relevant space, i.e., the regions of complete chemical space that are relevant to biology (1-5). The underlying structures of evolutionarily selected natural products (NPs) define structural prerequisites for binding to proteins (4, 6). Their structural scaffolds represent the biologically relevant and prevalidated fractions of chemical structure space explored by nature so far.
Consequently, the probability that compound libraries designed to mimic the structures and properties of NP classes will be biologically relevant is high, and it is also to be expected that "NP-guided compound library development" (1, 4) will prove to be a viable guiding principle for the identification of small molecules for chemical biology and medicinal chemistry research (1-6).
A systematic structure-orientated organizing principle of the known NPs combined with annotations of biological origin and pharmacological activity would chart the regions of chemical space explored by nature, provide a structural rationalization and categorization of NP diversity, and also provide guidance for the development of NP-like compound libraries.
Statistical analyses of different NP databases have been performed in a few cases (7-10); however, a systematic and annotated structural categorization of NPs leading to development principles for compound library design is missing. Here, we introduce a structural classification of NPs (SCONP) as an idea- and hypothesis-generating tool to define structural relationships between different NP classes in a tree-like arrangement and for the design of NP-de...