High-throughput quantum theory of atoms in molecules (QTAIM) for geometric deep learning of molecular and reaction properties
Santiago Vargas,
Winston Gee,
Anastassia Alexandrova
Abstract: We present a package, Generator, for geometric molecular property prediction based on topological features of quantum mechanical electron density. Generator computes Quantum Theory of Atoms in Molecules (QTAIM) features, at...
“…Despite these advances for molecular property prediction, the prediction of computed reaction properties (principally, reaction barriers) is still in its infancy. Machine learning approaches span from utilizing simple two-dimensional fingerprints of reaction components (reactants and products) to physical-organic descriptors or electronic structure-inspired features, to transformer models adapted for regression, and 2D graph-based approaches. The latter, particularly the ChemProp model, are often best-in-class in predicting reaction properties. It has been shown that these models achieve their impressive performance by exploiting atom-mapping information, which provide information analogous to the reaction mechanism.…”
Geometric deep learning models, which incorporate the relevant molecular symmetries within the neural network architecture, have considerably improved the accuracy and data efficiency of predictions of molecular properties. Building on this success, we introduce 3DREACT, a geometric deep learning model to predict reaction properties from three-dimensional structures of reactants and products. We demonstrate that the invariant version of the model is sufficient for existing reaction data sets. We illustrate its competitive performance on the prediction of activation barriers on the GDB7-22-TS, Cyclo-23-TS, and Proparg-21-TS data sets in different atom-mapping regimes. We show that, compared to existing models for reaction property prediction, 3DREACT offers a flexible framework that exploits atom-mapping information, if available, as well as geometries of reactants and products (in an invariant or equivariant fashion). Accordingly, it performs systematically well across different data sets, atom-mapping regimes, as well as both interpolation and extrapolation tasks.
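To make the term "invariant" concrete, the sketch below shows a geometric feature that does not change when a molecule is rotated. It is not the 3DREACT architecture and does not come from either paper: the helper names `pairwise_distances` and `rotate_z` and the three-atom geometry are hypothetical, chosen only to illustrate that interatomic distances are unchanged by a rotation even though the Cartesian coordinates are not.

```python
import math


def pairwise_distances(coords):
    """Return the matrix of interatomic distances for a list of (x, y, z) tuples."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def rotate_z(coords, angle):
    """Rotate every atom about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


# Hypothetical three-atom geometry (arbitrary units), for illustration only.
geometry = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
rotated = rotate_z(geometry, 1.0)

d_original = pairwise_distances(geometry)
d_rotated = pairwise_distances(rotated)

# The distances agree up to floating-point error...
distances_match = all(
    math.isclose(p, q, abs_tol=1e-12)
    for row_p, row_q in zip(d_original, d_rotated)
    for p, q in zip(row_p, row_q)
)
# ...while at least one Cartesian coordinate has changed.
coords_changed = any(
    not math.isclose(u, v, abs_tol=1e-12)
    for atom_a, atom_b in zip(geometry, rotated)
    for u, v in zip(atom_a, atom_b)
)
print(distances_match, coords_changed)
```

A model built only on such quantities gives the same prediction for any orientation of the input structures. An equivariant model instead carries directional features that rotate together with the input.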