Atomistic machine learning (AML) simulations are used in chemistry at an ever-increasing pace. A large number of AML models have been developed, but their implementations are scattered among different packages, each with its own conventions for input and output. Thus, here we give an overview of our MLatom 2 software package, which provides an integrative platform for a wide variety of AML simulations by implementing from scratch and interfacing existing software for a range of state-of-the-art models. These include kernel method-based model types such as KREG (native implementation), sGDML, and GAP-SOAP as well as neural-network-based model types such as ANI, DeepPot-SE, and PhysNet. The theoretical foundations behind these methods are also reviewed. The modular structure of MLatom allows for easy extension to more AML model types. MLatom 2 also has many other capabilities useful for AML simulations, such as support for custom descriptors, farthest-point and structure-based sampling, hyperparameter optimization, model evaluation, and automatic learning curve generation. It can also be used for multi-step tasks such as Δ-learning, self-correction approaches, and absorption spectrum simulation within the machine-learning nuclear-ensemble approach. Several of these MLatom 2 capabilities are showcased in application examples.
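The farthest-point sampling mentioned in this abstract can be illustrated with a minimal greedy sketch. This is not MLatom's implementation: the function name `farthest_point_sampling`, the `start` argument, and the toy 2D points (standing in for molecular descriptor vectors) are all hypothetical, and Euclidean distance is assumed.

```python
import math


def farthest_point_sampling(points, n_select, start=0):
    """Greedily pick n_select indices so that each new point is the one
    farthest (in Euclidean distance) from everything selected so far."""
    if not 0 < n_select <= len(points):
        raise ValueError("n_select must be between 1 and len(points)")
    selected = [start]
    # min_dist[i]: distance from point i to its nearest selected point
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < n_select:
        # The next sample is the point with the largest nearest-neighbor distance
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy "descriptors" (made-up numbers, for illustration only)
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (0.0, 2.0), (0.9, 0.1)]
picked = farthest_point_sampling(points, 3)
print(picked)
```

Starting from index 0, the sketch skips the near-duplicate points at indices 1 and 4 until the more distant ones have been taken, which is the behavior that makes this kind of sampling useful for building diverse training sets.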
We demonstrate that artificial intelligence (AI) can learn four-dimensional (4D) atomistic systems in the spacetime continuum. Given the initial conditions – nuclear positions and velocities at time zero – the proposed 4D-atomistic AI (4D-A2I) models can predict nuclear positions at any time in the future or past for the simplest systems, as we show for H2. For larger polyatomic molecules, AI can learn a distant but finite future, as we demonstrate for an ethanol molecule. 4D-A2I models provide direct access to a multitude of properties at a given time, such as geometries, velocities, forces, and energies, which can be used in simulating physicochemical transformations and spectra. Our approach can be used as a cost-efficient alternative to traditional molecular dynamics. We show an example of a 4D-A2I model describing the dynamical behavior of ethanol at the coupled-cluster level at a speed of one nanosecond of simulation time per hour of wall-clock time on a single GPU card – a previously unachievable feat with traditional Born–Oppenheimer molecular dynamics. The 4D-A2I model is also shown to provide direct access to atomistic time-resolved details of physicochemical transformations.
Molecular dynamics (MD) is a widely used tool for simulating molecular and materials properties. It is common wisdom that molecular dynamics simulations should obey physical laws and, hence, lots...
Quantum-chemistry simulations based on potential energy surfaces of molecules provide invaluable insight into physicochemical processes at the atomistic level and yield important observables such as reaction rates and spectra. Machine learning potentials promise to significantly reduce the computational cost and hence enable otherwise infeasible simulations. However, the surging number of such potentials raises the question of which one to choose or whether we still need to develop yet another one. Here, we address this question by evaluating the performance of popular machine learning potentials in terms of accuracy and computational cost. In addition, we deliver structured information for non-specialists in machine learning to guide them through the maze of acronyms, recognize each potential's main features, and judge what they could expect from each one.
Molecules with strong two-photon absorption (TPA) are important in many advanced applications such as upconverted lasers and photodynamic therapy, but their design is hampered by the high cost of experimental screening and accurate quantum chemical (QC) calculations. Here a systematic study is performed by collecting an experimental TPA database with ≈900 molecules, analyzing with interpretable machine learning (ML) the key molecular features explaining TPA magnitudes, and building a fast ML model for predictions. The ML model's prediction errors are similar in magnitude to experimental errors and to the errors of affordable QC methods, and the model has the potential for high-throughput screening, as additionally validated with new experimental measurements. The ML feature analysis is generally consistent with common beliefs, which are quantified and rectified. The most important feature is conjugation length, followed by features reflecting the effects of donor and acceptor substitution and coplanarity.
The KREG and pKREG models have been shown to enable accurate learning of multidimensional single-molecule surfaces of quantum chemical properties such as ground-state potential energies, excitation energies, and oscillator strengths. These models are based on kernel ridge regression (KRR) with the Gaussian kernel function and employ a relative-to-equilibrium (RE) global molecular descriptor, while pKREG is designed to enforce invariance under atom permutations with a permutationally invariant kernel. Here we extend these two models to also explicitly include the derivative information from the training data, which greatly improves their accuracy. We demonstrate on the example of learning potential energies and energy gradients that the KREG and pKREG models are better than or on par with state-of-the-art machine learning models. We also found that in challenging cases both energy and energy gradient labels should be learned to properly model potential energy surfaces, and that learning only energies or only gradients is insufficient. The models' open-source implementation is freely available in the MLatom package for general-purpose atomistic machine learning simulations, which can also be performed on the MLatom@XACS cloud computing service.
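The combination described above, KRR with a Gaussian kernel on an RE descriptor, can be sketched for the energy-only case as follows. This is an illustrative toy, not the MLatom implementation, and it omits the derivative learning and permutational invariance that the abstract introduces. It assumes that the RE descriptor is the vector of equilibrium-to-current internuclear distance ratios and that the kernel has the form exp(-|x - x'|² / 2σ²). All function names, the Morse-type reference energies, and the parameter values are made up for the example.

```python
import math
from itertools import combinations


def re_descriptor(coords, eq_coords):
    """Assumed RE descriptor: r_eq / r for every atom pair."""
    return [math.dist(eq_coords[i], eq_coords[j]) / math.dist(coords[i], coords[j])
            for i, j in combinations(range(len(coords)), 2)]


def gaussian_kernel(x, y, sigma):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * sigma ** 2))


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train(X, y, sigma, lam):
    """Regression coefficients alpha from (K + lam * I) alpha = y."""
    K = [[gaussian_kernel(xi, xj, sigma) + (lam if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    return solve(K, y)


def predict(x, X, alpha, sigma):
    return sum(a * gaussian_kernel(x, xi, sigma) for a, xi in zip(alpha, X))


# Toy diatomic: made-up equilibrium distance and Morse-type energies
r_eq = 0.8
eq = [(0.0, 0.0, 0.0), (r_eq, 0.0, 0.0)]
bond_lengths = [0.6, 0.7, 0.8, 0.9, 1.0]
X = [re_descriptor([(0.0, 0.0, 0.0), (r, 0.0, 0.0)], eq) for r in bond_lengths]
y = [(1.0 - math.exp(-2.0 * (r - r_eq))) ** 2 for r in bond_lengths]

sigma, lam = 0.1, 1e-10  # hyperparameters chosen by hand for this toy
alpha = train(X, y, sigma, lam)
x_new = re_descriptor([(0.0, 0.0, 0.0), (0.85, 0.0, 0.0)], eq)
print(predict(x_new, X, alpha, sigma))
```

In practice the linear solve would be delegated to a numerical library and the hyperparameters σ and λ would be optimized rather than fixed; the pure-Python solver is used here only to keep the sketch dependency-free.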