2022
DOI: 10.1063/5.0088404
Comment on “Manifolds of quasi-constant SOAP and ACSF fingerprints and the resulting failure to machine learn four-body interactions” [J. Chem. Phys. 156, 034302 (2022)]

Abstract: The "quasi-constant" SOAP and ACSF fingerprint manifolds recently discovered by Parsaeifard and Goedecker [J. Chem. Phys. 156, 034302 (2022)] are closely related to the degenerate pairs of configurations that are a known shortcoming of all low-body-order atom-density correlation representations of molecular structures. Configurations that are rigorously singular -- which we demonstrate can only occur in finite, discrete sets, and not as continuous manifolds -- determine the complete failure of machine-learnin…

Cited by 7 publications (5 citation statements)
References 22 publications
“…Physically interpretable descriptors like the atomic charges and electrostatic potentials improve the description of the atomic environments and are thus suitable to overcome current limitations in the resolution of the atomic energies due to the incompleteness of local atomic environments represented by two- and three-body structural descriptors. 44 In terms of scalability of the model, the charge equilibration step in the framework of the 4G- and ee4G-HDNNPs incurs additional computational cost compared to second-generation methods. Despite the availability of highly optimized linear solvers, 78,79 the 4G- and ee4G-HDNNPs could still become demanding for simulating very large systems beyond tens of thousands of atoms.…”
Section: Discussion (mentioning)
confidence: 99%
“…In spite of these advances, the accuracy of current MLPs is still limited by their ability to distinguish different bonding situations, which requires sufficient information about the system. Recent studies 44,45 have shown that commonly used atomic environment descriptors such as atom-centered symmetry functions (ACSFs) 39 and the smooth overlap of atomic positions 46 provide an incomplete description of the atomic environment due to the lack of higher-order terms. As a result, in rare cases different atomic environments can map to the exact same descriptor values, which leads to the atomic NN predicting the same atomic energy.…”
Section: Introduction (mentioning)
confidence: 99%
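The degeneracy described in the excerpt above — distinct structures mapping to identical low-body-order descriptor values — can be illustrated with a textbook one-dimensional analogue (a sketch, not an example taken from the cited papers): Bloom's homometric pair, two point sets that share the same multiset of pairwise distances (a purely two-body fingerprint) yet are not related by translation or reflection.

```python
# Sketch: two 1D point sets with identical two-body "descriptors"
# (the multiset of all pairwise distances) but distinct geometries.
# The sets are the classical homometric pair of Bloom.
from collections import Counter
from itertools import combinations

A = [0, 1, 4, 10, 12, 17]
B = [0, 1, 8, 11, 13, 17]

def pair_distances(points):
    """Multiset of all pairwise distances -- a two-body-only fingerprint."""
    return Counter(abs(p - q) for p, q in combinations(points, 2))

# Identical two-body fingerprints...
assert pair_distances(A) == pair_distances(B)

# ...but the structures differ, even allowing for reflection about x = 17:
reflected_B = sorted(17 - b for b in B)
assert sorted(A) != sorted(B) and sorted(A) != reflected_B
print("distinct structures, identical pair-distance multiset")
```

Any model fed only this two-body fingerprint is forced to predict the same output for A and B, which is the mechanism behind the "same descriptor values, same atomic energy" failure the excerpt describes; the degenerate SOAP/ACSF configurations discussed in the Comment are the three-dimensional, higher-body-order counterpart of this effect.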
“…In particular, degeneracies or near-degeneracies can occur, meaning that different structures (with different properties) are mapped to the same representation. [70–74] This is highly problematic for ML models, which obviously cannot assign different outputs to identical inputs. Furthermore, it is clear that not all molecular properties are invariant to rotations.…”
Section: Inductive Biases and Physical Priors (mentioning)
confidence: 99%
“…While invariance is thus a key property, it has recently been found that important structural information can be lost in the process of making representations rotationally invariant. In particular, degeneracies or near-degeneracies can occur, meaning that different structures (with different properties) are mapped to the same representation. [70–74] This is highly problematic for ML models, which obviously cannot assign different outputs to identical inputs.…”
Section: Inductive Biases and Physical Priors (mentioning)
confidence: 99%
“…Structures of any size and complexity can be generated with this construction, even though the increase in co-dimension suggests that they become less 'dense'. One should keep in mind that the presence of degenerate configurations affects the accuracy and numerical stability of ML models even for other structures [39,40]. From the point of view of the input features (or, more broadly, the hidden representation of a deep-learning framework), an overlap of two structures that should be distinct determines a distortion that brings close together structures that should be far apart (figure 5(a)).…”
Section: A Counterexample (mentioning)
confidence: 99%