2018
DOI: 10.1002/cmdc.201700561

Mapping of the Available Chemical Space versus the Chemical Universe of Lead‐Like Compounds

Abstract: This is, to our knowledge, the most comprehensive analysis to date based on generative topographic mapping (GTM) of fragment-like chemical space (40 million molecules with no more than 17 heavy atoms, both from the theoretically enumerated GDB-17 and real-world PubChem/ChEMBL databases). The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of a limited (≪10 ) maximal number of compounds ("frame set") usable for fitting the GTM manifold. An evolutionary ma…

citations
Cited by 40 publications
(53 citation statements)
references
References 30 publications
“…10^63 molecules with 30 atoms or less); even if only the subsection of the lead-like part is considered, it is still impossible to comprehend by any synthetic and even most virtual screening methods. [33] To sample the chemical space in an efficient manner, one might select molecules that are as diverse as possible; indeed, diversity is one of the key features considered in compound library design. Although understandable at the intuitive level, molecular diversity, as well as the related concept of molecular similarity, is difficult to define in quantitative terms. Primarily, this is related to the fact that these concepts are not uniquely defined and cannot be inherently "objective."…”
Section: Diversity Of Chemical Libraries (mentioning)
confidence: 99%
“…The GTM method relates the data points positions in the initial N-dimensional space and in the latent 2D space. The GTM algorithm is described in a range of publications [10][11][12]18]. Briefly speaking, GTM injects a 2D hypersurface (manifold) into a multidimensional data space populated by a set of representative items (the Frame Set, FS).…”
Section: GTM Training (mentioning)
confidence: 99%
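The mapping described in the statement above can be illustrated with a minimal sketch. It assumes a regular 10 x 10 latent grid, 16 Gaussian RBF centers, a fixed inverse variance `beta`, and random placeholder weights instead of weights fitted by expectation-maximization; none of these values come from the cited papers, and the random `frame_set` array only stands in for real descriptor vectors.

```python
import numpy as np

def square_grid(n_per_side):
    """Regular n x n grid of points in the latent square [-1, 1]^2."""
    axis = np.linspace(-1.0, 1.0, n_per_side)
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

def rbf_design_matrix(nodes, centers, width):
    """Gaussian RBF activation of every latent node, shape (K, M)."""
    sq_dist = ((nodes[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def responsibilities(data, manifold, beta):
    """Posterior probability of each manifold node for each item, shape (N, K)."""
    sq_dist = ((data[:, None, :] - manifold[None, :, :]) ** 2).sum(axis=2)
    log_p = -0.5 * beta * sq_dist
    log_p -= log_p.max(axis=1, keepdims=True)  # avoid underflow in exp
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_dims = 5                                   # assumed descriptor dimensionality

nodes = square_grid(10)                      # K = 100 latent grid nodes
centers = square_grid(4)                     # M = 16 RBF centers
phi = rbf_design_matrix(nodes, centers, width=0.5)

# Placeholder weights: a real GTM fits these to the frame set by EM.
weights = rng.normal(size=(centers.shape[0], n_dims))
manifold = phi @ weights                     # node images in data space, (K, D)

frame_set = rng.normal(size=(20, n_dims))    # stand-in for descriptor vectors
resp = responsibilities(frame_set, manifold, beta=1.0)
latent_xy = resp @ nodes                     # responsibility-weighted 2D position
```

Each row of `latent_xy` is the responsibility-weighted mean position of one item on the 2D map, which is one common way to place a compound on a GTM landscape.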
“…In our early study [12], frame set compounds were randomly selected from large chemical libraries. Here, a FS containing 25K AMS compounds of controlled diversity (featuring no two compounds more similar than a given threshold) was prepared.…”
Section: GTM Training (mentioning)
confidence: 99%
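A frame set "featuring no two compounds more similar than a given threshold" can be approximated with a greedy filter. The sketch below is an illustration only, not the procedure used in the cited study: it assumes fingerprints are given as sets of "on" bit indices, uses Tanimoto similarity, and the function names, toy fingerprints, and threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 1.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)

def select_frame_set(fingerprints, max_similarity):
    """Greedily keep indices of compounds whose similarity to every
    previously kept compound does not exceed max_similarity."""
    kept = []
    for index, fingerprint in enumerate(fingerprints):
        if all(tanimoto(fingerprint, fingerprints[k]) <= max_similarity
               for k in kept):
            kept.append(index)
    return kept

# Toy fingerprints (hypothetical bit indices, not real compounds).
fingerprints = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),     # similarity 0.6 to compound 0
    frozenset({6, 7, 8}),
    frozenset({6, 7, 8, 9}),     # similarity 0.75 to compound 2
    frozenset({10, 11}),
]

selected = select_frame_set(fingerprints, max_similarity=0.5)
print(selected)
```

The result of a greedy pass depends on the input order, so a real selection would typically shuffle or rank the library first; the pairwise check is also quadratic in the number of kept compounds, which matters at library scale.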
“…A general lecture in cheminformatics must by necessity be selective and will not be able to cover all aspects of the field. An important area that won't be discussed is visualization of the chemical space, which is an active research area. Besides traditional cheminformatics tasks as outlined above, the use of machine learning in cheminformatics has become a very hot topic and will be extensively discussed in the lecture…”
Section: Introduction (mentioning)
confidence: 99%