2014
DOI: 10.1021/ci5001778

Modeling a Crowdsourced Definition of Molecular Complexity

Abstract: This paper brings together the concepts of molecular complexity and crowdsourcing. An exercise was done at Merck where 386 chemists voted on the molecular complexity (on a scale of 1-5) of 2681 molecules taken from various sources: public, licensed, and in-house. The meanComplexity of a molecule is the average over all votes for that molecule. As long as enough votes are cast per molecule, we find meanComplexity is quite easy to model with QSAR methods using only a handful of physical descriptors (e.g., number…

Cited by 60 publications (78 citation statements)
References 30 publications (46 reference statements)
“…Figure S14: Correlation between the length of the first synthetic pathway found by ASKCOS and expert scores assigned by chemists in Sheridan et al. 17 A length of 0 indicates that the molecule can be found in our database of readily-purchasable compounds; a length of 11 indicates that no pathway was found with the fixed expansion settings (see Methods). Figure S15: Correlation between the length of the first synthetic pathway found by ASKCOS using all compound datasets and the SA Score 31 heuristic.…”
Section: Biasing Techniques For Molecular Generation
mentioning
confidence: 99%
“…2 Current procedures for quantifying synthesizability are based on (1) structure complexity and similarity or (2) synthetic pathways. The structure-based approach usually involves constructing a heuristic definition based on domain expertise or chemical substructure diversity 14,15 or designing a model that can be fit to expert scores [16][17][18] or reaction data. 19,20 This kind of method is widely used due to its ease of implementation and low computational cost.…”
Section: Introduction
mentioning
confidence: 99%
“…There have been many attempts to quantify synthetic accessibility, primarily involving heuristic scoring functions trained on subjective expert ratings. 56 Here, we use a success criterion that enables a more objective evaluation: that when given the products of reactions in the United States patent literature, the program recovers and ranks highly the recorded reactants without having seen that reaction previously.…”
Section: Approach
mentioning
confidence: 99%
“…In particular we based the development of the QC metric on a crowd-sourced assessment of response matrix quality. The use of crowd-sourced assessments of “quality” and other abstract descriptors has precedents 14,15,16,17. For example, Lajiness et al. 18.…”
mentioning
confidence: 99%