2023
DOI: 10.1088/2632-2153/acee42
Comment on ‘Physics-based representations for machine learning properties of chemical reactions’

Kevin A Spiekermann,
Thijs Stuyver,
Lagnajit Pattanaik
et al.

Abstract: In a recent article in this journal, van Gerwen et al (2022 Mach. Learn.: Sci. Technol. 3 045005) presented a kernel ridge regression model to predict reaction barrier heights. Here, we comment on the utility of that model and present references and results that contradict several statements made in that article. Our primary interest is to offer a broader perspective by presenting three aspects that are essential for researchers to consider when creating models for chemical kinetics: (1) are …

Cited by 4 publications (6 citation statements)
References 196 publications (212 reference statements)
“…Graph-based neural-network models have become notorious in many contexts for overfitting and poor out-of-distribution performance. Although the models trained on RGD1 show excellent testing performance on unseen reactions, this is a large data set, and reactions typically involve a small number of bond changes and conserved mechanisms. This means that even if the testing set involves unseen reactions in terms of reactants or products, it is not expected to necessarily present novel reactivity (e.g., in terms of new types of bonds being broken and formed) that is not seen elsewhere in the training data.…”
Section: Results
Confidence: 99%
“…The negotiation of these tradeoffs remains a live issue.44,56,57 Despite many practical demonstrations of the graph-to-activation-energy (G2Ea) concept, several challenges persist that limit the usefulness of these models as drop-in replacements for quantum-chemistry-based TS searches. One challenge is that the scarcity of large reaction data sets has limited convincing out-of-distribution tests of the transferability of G2Ea models.…”
Section: Introduction
Confidence: 99%
“…The random splitting approach employed above assesses the interpolation ability of the tree models. 55 Here, we assess the extrapolation ability of the empirical tree model and the SIDT model with a more challenging data split, i.e., cluster split. The details on how we perform the cluster split can be found in Section 3.3.3.…”
Section: Comparison With the Empirical Tree
Confidence: 99%
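The distinction drawn in the statement above — random splits testing interpolation versus cluster splits testing extrapolation — can be illustrated with a minimal sketch. A cluster split assigns whole clusters to either the training or the test set, so no cluster spans both. The `cluster_of` labeling function and the toy reaction-family labels below are hypothetical illustrations, not taken from the cited work:

```python
import random

def cluster_split(items, cluster_of, test_frac=0.2, seed=0):
    """Assign whole clusters to train or test, so no cluster spans both.

    items      : list of data points
    cluster_of : function mapping an item to its cluster label
    """
    # Group items by their cluster label.
    clusters = {}
    for it in items:
        clusters.setdefault(cluster_of(it), []).append(it)
    # Shuffle the cluster labels reproducibly, then hold out a fraction
    # of entire clusters as the test set.
    labels = sorted(clusters)
    random.Random(seed).shuffle(labels)
    n_test = max(1, round(test_frac * len(labels)))
    test_labels = labels[:n_test]
    train = [it for lbl in labels[n_test:] for it in clusters[lbl]]
    test = [it for lbl in test_labels for it in clusters[lbl]]
    return train, test

# Toy usage: cluster reactions by a (hypothetical) reaction-family label.
reactions = [("r1", "H-abstraction"), ("r2", "H-abstraction"),
             ("r3", "Diels-Alder"), ("r4", "ring-opening")]
train, test = cluster_split(reactions, cluster_of=lambda r: r[1])
```

Because entire clusters are held out, every test reaction belongs to a family the model has never seen, which probes extrapolation rather than interpolation.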
“…We nd that the MAE is 0.42 eV on a dataset which spans z10 eV, similar to the MAE of obtained with the SLATM (2) d kernel and nearly double that of a trained directed message-passing neural network for slightly different dataset splits. 40 We do not perform extensive hyperparameter tests, instead we use the same parameters as listed in Section VI; while out of the scope of our current work, we expect that further hyperparameter and model tuning to be benecial to reduce the mean absolute error for this dataset. Furthermore, we nd that the standard deviation of the error to be 0.68 eV (corresponding to a root-mean-square error of 0.65 eV), well within the margin of error expected to accurately predict reactions of interest which can then undergo computationally intensive DFT and transition state nding calculations.…”
Section: E Model Performance: Activation Barriersmentioning
confidence: 99%