The on-top pair density Π(r) is a local quantum-chemical property that reflects the probability of two electrons of any spin occupying the same position in space. Being the simplest quantity related to the two-particle density matrix, the on-top pair density is a powerful indicator of electron correlation effects, and as such, it has been extensively used to combine density functional theory and multireference wavefunction theory. The widespread application of Π(r) is currently hindered by the need for post-Hartree–Fock or multireference computations for its accurate evaluation. In this work, we propose the construction of a machine learning model capable of predicting the complete active space self-consistent field (CASSCF)-quality on-top pair density of a molecule from its structure and composition alone. Our model, trained on the GDB11-AD-3165 database, is able to predict with minimal error the on-top pair density of organic molecules, completely bypassing the need for ab initio computations. The accuracy of the regression is demonstrated using the on-top ratio as a visual metric of electron correlation effects and bond-breaking in real space. In addition, we report the construction of a specialized basis set, built to fit the on-top pair density in a single atom-centered expansion. This basis, the cornerstone of the regression, could potentially also be used in the same spirit as the resolution-of-the-identity approximation for the electron density.
Physics-inspired molecular representations are the cornerstone of similarity-based learning applied to solve chemical problems. Despite their conceptual and mathematical diversity, this class of descriptors shares a common underlying philosophy: they...
Machine learning in quantum chemistry is currently booming, with reported applications spanning all molecular properties from simple atomization energies to complex mathematical objects such as the many-body wavefunction. Due to its central role in density functional theory, the electron density is a particularly compelling target for non-linear regression. Nevertheless, the scalability and the transferability of the existing machine-learning models of ρ(r) are limited by its complex rotational symmetries. Recently, in collaboration with Ceriotti and coworkers, we combined an efficient electron density decomposition scheme with a local regression framework based on symmetry-adapted Gaussian process regression able to accurately describe the covariance of the electron density spherical tensor components. The learning exercise is performed on local environments, allowing high transferability and linear scaling of the prediction with respect to the number of atoms. Here, we review the main characteristics of the model and show its predictive power in a series of applications. The scalability and transferability of the trained model are demonstrated through the prediction of the electron density of Ubiquitin.
Atomic effective one-electron potentials in a compact analytic form in terms of a few Gaussian charge distributions are developed, for Hydrogen through Nobelium, for starting molecular electronic structure calculations by a simple diagonalization. For each element, all terms but one are optimized in an isolated-atom Hartree-Fock calculation, and the last one is parametrized on a set of molecules. This one-parameter-per-atom model gives a good starting guess for typical molecules and may be of interest even on its own.
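As a rough illustration of what a potential built from a few Gaussian charge distributions looks like, the sketch below evaluates a point-nucleus attraction screened by spherical Gaussian charges, using the standard result that a normalized Gaussian charge q with exponent a produces the potential q·erf(√a·r)/r. The function name, the charges, and the exponents are hypothetical placeholders chosen for this example; they are not the optimized parameters of the abstract, and the functional form actually used there may differ.

```python
import math

def effective_potential(r, nuclear_charge, charges, exponents):
    """Potential energy (atomic units) felt by an electron at distance r
    from a point nucleus screened by spherical Gaussian charge clouds.

    Assumption of this sketch: each cloud is a normalized Gaussian
    q * (a/pi)**1.5 * exp(-a r^2), whose potential is q * erf(sqrt(a) r) / r.
    """
    v = -nuclear_charge / r  # bare nuclear attraction
    for q, a in zip(charges, exponents):
        v += q * math.erf(math.sqrt(a) * r) / r  # repulsion from one cloud
    return v

# Hypothetical, non-optimized parameters for a carbon-like atom (Z = 6).
Z = 6.0
CHARGES = [2.0, 3.0]      # made-up electron counts carried by each cloud
EXPONENTS = [30.0, 0.5]   # made-up Gaussian exponents (bohr^-2)

for r in (0.1, 1.0, 10.0):
    print(r, effective_potential(r, Z, CHARGES, EXPONENTS))
```

Far from the nucleus every error function tends to one, so the potential decays as minus the unscreened charge (Z minus the sum of the cloud charges) divided by r. Close to the nucleus the clouds contribute only a finite shift on top of the bare -Z/r attraction.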
Machine learning (ML) algorithms have undergone an explosive development impacting every aspect of computational chemistry. To obtain reliable predictions, one needs to maintain a proper balance between the black-box nature of ML frameworks and the physics of the target properties. One of the most appealing quantum-chemical properties for regression models is the electron density, and some of us recently proposed a transferable and scalable model based on the decomposition of the density onto an atom-centered basis set. The decomposition, as well as the training of the model, is at its core a minimization of some loss function, which can be arbitrarily chosen and may lead to results of different quality. Although well studied in the context of density fitting (DF), the impact of the metric on the performance of ML models has not yet been analyzed. In this work, we compare predictions obtained using the overlap and the Coulomb-repulsion metrics for both decomposition and training. As expected, the Coulomb metric used as both the DF and ML loss functions leads to the best results for the electrostatic potential and dipole moments. The origin of this difference lies in the fact that the model is not constrained to predict densities that integrate to the exact number of electrons N. Since an a posteriori correction for the number of electrons decreases the errors, we proposed a modification of the model in which N is included directly in the kernel function, which lowered the errors on the test and out-of-sample sets.
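The role of the metric in the decomposition step can be made concrete with a toy fit. The sketch below expands a single spherical Gaussian "density" in three concentric s-type Gaussians, once with the overlap metric and once with the Coulomb metric, using closed-form integrals. The optional electron-number constraint is imposed with a Lagrange multiplier, which is a textbook density-fitting device and is assumed here for illustration only; it is not the kernel modification described in the abstract. All names, exponents, and the target density are hypothetical.

```python
import numpy as np

def overlap(a, b):
    """Overlap integral of concentric Gaussians exp(-a r^2) and exp(-b r^2)."""
    return (np.pi / (a + b)) ** 1.5

def coulomb(a, b):
    """Coulomb repulsion integral between the same two concentric Gaussians."""
    return 2.0 * np.pi ** 2.5 / (a * b * np.sqrt(a + b))

def fit(metric, exps, target_exp, target_scale, n_elec=None):
    """Coefficients minimizing the fitting error measured in `metric`.

    If n_elec is given, the fitted density is constrained to integrate
    to n_elec via a Lagrange multiplier.
    """
    M = metric(exps[:, None], exps[None, :])      # metric matrix (phi_i | phi_j)
    w = target_scale * metric(exps, target_exp)   # projections (phi_i | rho)
    c = np.linalg.solve(M, w)                     # unconstrained solution
    if n_elec is not None:
        q = (np.pi / exps) ** 1.5                 # integral of each basis function
        m_inv_q = np.linalg.solve(M, q)
        c = c + (n_elec - q @ c) / (q @ m_inv_q) * m_inv_q
    return c

# Toy target: a Gaussian with exponent 1.3 scaled to hold N = 2 electrons.
N_ELEC = 2.0
TARGET_EXP = 1.3
TARGET_SCALE = N_ELEC / (np.pi / TARGET_EXP) ** 1.5
EXPS = np.array([0.5, 1.0, 2.0])                  # hypothetical fitting basis
Q = (np.pi / EXPS) ** 1.5

for name, metric in (("overlap", overlap), ("coulomb", coulomb)):
    free = fit(metric, EXPS, TARGET_EXP, TARGET_SCALE)
    fixed = fit(metric, EXPS, TARGET_EXP, TARGET_SCALE, n_elec=N_ELEC)
    print(name, "N (free fit):", Q @ free, "N (constrained):", Q @ fixed)
```

Neither unconstrained fit is guaranteed to reproduce N exactly, and the two metrics generally give different coefficients for the same target. The constrained variant recovers N by construction at the cost of a slightly larger fitting error in the chosen metric.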