2017
DOI: 10.1002/biot.201700503

Analysing and Navigating Natural Products Space for Generating Small, Diverse, But Representative Chemical Libraries

Abstract: Armed with the digital availability of two natural products libraries, amounting to some 195 885 molecular entities, we ask the question of how we can best sample from them to maximize their "representativeness" in smaller and more usable libraries of 96, 384, 1152, and 1920 molecules. The term "representativeness" is intended to include diversity, but for numerical reasons (and the likelihood of being able to perform a QSAR) it is necessary to focus on areas of chemical space that are more highly populated. E…

Citations: cited by 29 publications (32 citation statements)
References: 130 publications (130 reference statements)
“…Although this is a continuous function, Tanimoto similarities above 0.8 are commonly found (i) to be resistant to the precise encoding used, and (ii) indeed to have broadly similar effects in most cases (review at [213]). An alternative encoding uses calculated molecular 'descriptors' or parametrised properties [252] such as polarity, number of hydrogen bond donors, and so on (CDK includes 22, for instance [253]). Of course fingerprints can also be combined with descriptors.…”
Section: The Importance Of QSARs (Quantitative Structure-Activity Rel…
Citation type: mentioning
confidence: 99%
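The Tanimoto similarity mentioned in the statement above can be illustrated with a minimal sketch. It assumes each fingerprint is represented as a set of "on" bit positions; the function name `tanimoto`, the example fingerprints, and the convention for two empty fingerprints are illustrative assumptions, not taken from the cited papers.

```python
from __future__ import annotations


def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity: shared on-bits divided by all on-bits."""
    union = a | b
    if not union:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical fingerprints given as sets of bit indices (illustrative only).
fp_x = {1, 4, 7, 9, 12}
fp_y = {1, 4, 7, 9, 15}

sim = tanimoto(fp_x, fp_y)
print(f"Tanimoto similarity = {sim:.3f}")
```

The value is continuous between 0 (no shared bits) and 1 (identical bit sets), which is why a cut-off such as 0.8 has to be chosen by convention rather than derived.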
“…Note, however, that 'pure' structural similarity analyses do not take any pharmacological activities into account, and such methods are referred to as 'unsupervised' learning methods (Figure 5), an important subset of which includes clustering methods (see e.g. [253–256]).…”
Section: The Importance Of QSARs (Quantitative Structure-Activity Rel…
Citation type: mentioning
confidence: 99%
“…Cheminformatics [96–98] (sometimes called chemoinformatics [96, 99–103]) describes the discipline that helps researchers assess questions such as the degrees of similarity between individual molecules [104–106] or the molecular diversity within a library [107, 108]. We applied standard cheminformatics methods [46, 47, 109, 110] to the analysis of the relative diversity of our palette of 39 dyes. Given that a pairwise Tanimoto similarity below 0.8 (or a Tanimoto difference exceeding 0.2) is usually taken to mean a significant difference in bioactivity [47, 49–52], it is encouraging that while there were some small clusters (Figures 5, 6), the median Tanimoto similarity was just 0.6, implying strong orthogonality in the behaviour of our palette, as was borne out experimentally.…”
Section: Cheminformatics Of Chosen Fluorophores
Citation type: mentioning
confidence: 99%
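The kind of library-level summary described in the statement above (a median pairwise Tanimoto similarity, plus a check against the 0.8 cut-off) might be sketched as follows. The molecule names, bit sets, and the helper `tanimoto` are hypothetical and chosen only for illustration; real work would use fingerprints from a cheminformatics toolkit.

```python
from itertools import combinations
from statistics import median


def tanimoto(a, b):
    """Shared on-bits divided by all on-bits; empty pairs treated as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical library: name -> set of "on" fingerprint bits (illustrative only).
library = {
    "dye_a": {1, 2, 3, 4},
    "dye_b": {1, 2, 3, 5},
    "dye_c": {6, 7, 8},
    "dye_d": {1, 2, 3, 4, 9},
}

# Similarity for every unordered pair of library members.
pairs = {
    (m, n): tanimoto(library[m], library[n])
    for m, n in combinations(sorted(library), 2)
}

median_sim = median(pairs.values())

# Pairs at or above the conventional 0.8 threshold (assumed inclusive here).
near_duplicates = [pair for pair, s in pairs.items() if s >= 0.8]

print(f"median pairwise similarity = {median_sim:.2f}")
print("pairs at or above 0.8:", near_duplicates)
```

A low median with few pairs above the threshold is what a diverse palette would look like under this measure; whether the threshold is inclusive is a choice made here, not something the quoted text specifies.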
“…However, difficulties in producing novel molecules by current generative methods arise because of the discrete nature of chemical space, as well as the large number of molecules [29]. For example, the number of drug-like molecules has been estimated to be between 10^23 and 10^60 [30–34]. Moreover, a slight change in molecular structure can lead to a drastic change in a molecular property such as binding potency (so-called activity cliffs [35–37]).…”
Section: Introduction
Citation type: mentioning
confidence: 99%