Atomistic or ab initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match selected properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach that learns coarse-grained free energy functions and can be trained by a force-matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multibody terms that emerge from the dimensionality reduction.
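The force-matching idea mentioned in the abstract can be illustrated with a minimal one-dimensional sketch (this is not the CGnet architecture itself): a parametric force model is fit by least squares to reference forces sampled from a known free energy surface. The double-well form, the polynomial basis, and the noise level below are illustrative assumptions, not taken from the original work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "atomistic" reference data on a 1D coarse-grained coordinate x:
# forces derived from a double-well free energy G(x) = (x^2 - 1)^2,
# so the reference force is F(x) = -dG/dx = -4x(x^2 - 1), plus noise.
x = rng.uniform(-1.5, 1.5, size=2000)
f_ref = -4.0 * x * (x**2 - 1.0) + rng.normal(0.0, 0.1, size=x.shape)

# Force matching: parametrize the CG force as a polynomial in x and
# minimize the mean-squared deviation from the reference forces.
degree = 5
basis = np.vander(x, degree + 1)                 # columns: x^5 ... x^0
coeffs, *_ = np.linalg.lstsq(basis, f_ref, rcond=None)

def f_cg(q):
    """Learned coarse-grained force at coordinate q."""
    return np.polyval(coeffs, q)

# The learned force vanishes near the stationary points of the double well
# (x = -1, 0, +1), recovering the free energy landscape from forces alone.
print(f_cg(-1.0), f_cg(0.0), f_cg(1.0))
```

In CGnets the polynomial model is replaced by a deep network with physical invariances built in, but the training objective has this same force-matching form.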
Machine learning (ML) is transforming all areas of science. The complex and time-consuming calculations in molecular simulations are particularly suitable for a machine learning revolution and have already been profoundly impacted by the application of existing ML methods. Here we review recent ML methods for molecular simulation, with particular focus on (deep) neural networks for the prediction of quantum-mechanical energies and forces, coarse-grained molecular dynamics, the extraction of free energy surfaces and kinetics, and generative network approaches to sample molecular equilibrium structures and compute thermodynamics. To explain these methods and illustrate open methodological problems, we review some important principles of molecular physics and describe how they can be incorporated into machine learning structures. Finally, we identify and describe a list of open challenges for the interface between ML and molecular simulation.

arXiv:1911.02792v1 [physics.chem-ph] 7 Nov 2019

In image classification, for example, the machine learns the complex relationship between the input (the pixels) and the output (the labels) that is unknown in its explicit form but can be inferred by a suitable algorithm. Clearly, such an operating principle can be very useful in the description of atomic and molecular systems as well. We know that atomistic configurations dictate chemical properties, and the machine can learn to associate the latter with the former, without solving first-principles equations, if presented with enough examples. Although different machine learning tools are available and have been applied to molecular simulation (e.g., kernel methods [2]), here we mostly focus on the use of neural networks, a term now often used synonymously with "deep learning".
We assume the reader has basic knowledge of machine learning and refer to the literature for introductions to statistical learning theory [3,4] and deep learning [5,6].

One of the first applications of machine learning in chemistry has been to extract classical potential energy surfaces from quantum mechanical (QM) calculations, in order to efficiently perform molecular dynamics (MD) simulations that can incorporate quantum effects. The seminal work of Behler and Parrinello in this direction [7] has opened the way to a now rapidly advancing area of research [8,9,10,11,12,13,14,15]. In addition to atomistic force fields, it has recently been shown that, in the same spirit, effective molecular models at resolutions coarser than atomistic can be designed by ML [16,17,18]. The analysis and simulation of MD trajectories have also been affected by ML, for instance in the definition of optimal reaction coordinates [19,20,21,22,23,24], the estimation of free energy surfaces [25,26,27,22], the construction of Markov State Models [21,23,28], and the enhancement of MD sampling by learning bias potentials [29,30,31,32,33] or by selecting starting configurations with active learning [34,35,36]. Finally, ML can be used to generate samples from the equilibrium distribution of a molecular system without performing MD at all, as proposed...
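As an illustration of how the Behler–Parrinello line of work encodes physical invariances, the sketch below implements a radial symmetry function of the type used as input to such networks: it depends only on interatomic distances, so it is invariant to translations, rotations, and reflections of the system. The parameter values and the three-atom toy configuration are assumptions chosen for illustration.

```python
import numpy as np

def cutoff(r, r_c):
    """Smooth cosine cutoff f_c(r) = 0.5*(cos(pi*r/r_c) + 1) for r < r_c, else 0."""
    fc = 0.5 * (np.cos(np.pi * r / r_c) + 1.0)
    return np.where(r < r_c, fc, 0.0)

def radial_symmetry_function(positions, i, eta, r_s, r_c):
    """G_i = sum_{j != i} exp(-eta * (r_ij - r_s)^2) * f_c(r_ij).

    A translation- and rotation-invariant descriptor of atom i's environment.
    """
    rij = np.linalg.norm(positions - positions[i], axis=1)
    rij = np.delete(rij, i)  # exclude the self-distance r_ii = 0
    return np.sum(np.exp(-eta * (rij - r_s) ** 2) * cutoff(rij, r_c))

# Toy configuration: three atoms on a line (units arbitrary).
pos = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [2.0, 0.0, 0.0]])
g0 = radial_symmetry_function(pos, i=0, eta=1.0, r_s=0.0, r_c=6.0)
```

In a Behler–Parrinello-style network, a vector of such descriptors (with several `eta`, `r_s` choices, plus angular terms) feeds a per-atom network whose outputs sum to the total energy, making the prediction invariant by construction.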