2022
DOI: 10.1002/cphc.202200255

An Ensemble Structure and Physicochemical (SPOC) Descriptor for Machine‐Learning Prediction of Chemical Reaction and Molecular Properties

Abstract: Feature representations, or descriptors, are machines’ chemical language that largely shapes the prediction capability, generalizability and interpretability of machine learning models. To develop a generally applicable descriptor is highly warranted for chemists to deal with conventional prediction tasks in the context of sparsely distributed and small datasets. Inspired by the chemist's vision on molecules, we presented herein an ensemble descriptor, SPOC, curated on the principles of physical organic chemis…

citations
Cited by 17 publications
(10 citation statements)
references
References 65 publications
(78 reference statements)
“…The first comparison was conducted over the results presented by [ 40 ] using a random forest (i.e., RF) and Morgan MFs on BBBP (0.909 ± 0.028 AUC), Tox21 (0.819 ± 0.017), SIDER (0.687 ± 0.014), and ClinTox (0.759 ± 0.060). In [ 41 ], the authors evaluated MACCS fingerprints over the Tox21 dataset achieving an AUC of 0.805 ± 0.01, an AUC of 0.721 ± 0.004 for BBBP, and an AUC equal to 0.797 ± 0.151 for ClinTox, applying an ensemble of decision trees over 5-fold cross-validation. Another paper [ 42 ] focused on the Tox21 dataset reporting the outcomes of the in silico toxicity evaluation by five classifiers on Morgan fingerprints: the LightGBM outperformed other classifiers, reaching an AUC of 0.795 on the test set (standard deviation was not reported) for NR-AR.…”
Section: Results (mentioning)
confidence: 99%
“…Standard deviation was included if reported in the original papers. AUC values from [ 40 , 41 , 42 , 43 , 44 , 45 , 47 , 48 ].…”
Section: Figure (mentioning)
confidence: 99%
“…17 Yang et al combined fingerprint and physicochemical descriptors for reaction prediction. 18 However, these descriptors are not easy to interpret nor generalize well outside the training reaction space except the Hammett and TSEI descriptors.…”
Section: Introduction (mentioning)
confidence: 99%
“…17 Luo et al combined fingerprint and physicochemical descriptors for reaction prediction. 18 However, these descriptors are not easy to interpret nor generalize well outside the training reaction space except the Hammett and TSEI descriptors.…”
Section: Introduction (mentioning)
confidence: 99%