With the continual improvement of computing hardware and algorithms, simulations have become a powerful tool for understanding all sorts of (bio)molecular processes. To handle the large simulation data sets and to accelerate slow, activated transitions, a condensed set of descriptors, or collective variables (CVs), is needed to discern the relevant dynamics that describe the molecular process of interest. However, proposing an adequate set of CVs that can capture the intrinsic reaction coordinate of the molecular transition is often extremely difficult. Here, we present a framework to find an optimal set of CVs from a pool of candidates using a combination of artificial neural networks and genetic algorithms. The approach effectively replaces the encoder of an autoencoder network with genes to represent the latent space, i.e., the CVs. Given a selection of CVs as input, the network is trained to recover the atom coordinates underlying the CV values at points along the transition. The network performance is used as an estimator of the fitness of the input CVs. Two genetic algorithms optimize the CV selection and the neural network architecture. The successful retrieval of optimal CVs by this framework is illustrated with two case studies: the well-known conformational change in the alanine dipeptide molecule and the more intricate transition of a base pair in B-DNA from the classic Watson–Crick pairing to the alternative Hoogsteen pairing. Key advantages of our framework are that it yields optimal, interpretable CVs, avoids costly calculation of committor or time-correlation functions, and optimizes hyperparameters automatically. In addition, we show that applying a time delay between the network input and output allows for enhanced selection of slow variables. Moreover, the network can also be used to generate molecular configurations of unexplored microstates, for example, for augmentation of the simulation data.
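The selection idea in this abstract (genes pick CVs, a decoder tries to recover coordinates from them, and the reconstruction quality serves as fitness) can be pictured with a toy sketch. This is not the authors' implementation. As loudly stated assumptions: the neural-network decoder is swapped for a simple 1-nearest-neighbour lookup, only the CV selection is evolved (not the network architecture), the data are synthetic, and all names (`make_data`, `reconstruction_error`, `mutate`, `select_cvs`) are hypothetical.

```python
import random


def make_data(n_frames, n_noise, rng):
    """Synthetic stand-in for a trajectory (an assumption, not real MD data).

    Two hidden variables a, b generate the 'coordinates'. The candidate CV
    pool holds a and b (columns 0 and 1) followed by pure-noise columns.
    """
    cvs, coords = [], []
    for _ in range(n_frames):
        a, b = rng.random(), rng.random()
        coords.append([a, b, a * b, a - b])
        cvs.append([a, b] + [rng.random() for _ in range(n_noise)])
    return cvs, coords


def reconstruction_error(genes, cvs, coords, n_train):
    """Stand-in decoder: 1-nearest-neighbour lookup instead of a neural network.

    For each held-out frame, find the training frame closest in the selected
    CVs and use its coordinates as the reconstruction. Returns the mean
    squared reconstruction error over the held-out frames.
    """
    train = range(n_train)
    test = range(n_train, len(cvs))
    total = 0.0
    for i in test:
        nearest = min(
            train,
            key=lambda j: sum((cvs[i][g] - cvs[j][g]) ** 2 for g in genes),
        )
        total += sum((x - y) ** 2 for x, y in zip(coords[i], coords[nearest]))
    return total / len(test)


def mutate(genes, n_candidates, rng):
    """Replace one selected CV with a random candidate not already selected."""
    genes = list(genes)
    pos = rng.randrange(len(genes))
    choices = [c for c in range(n_candidates) if c not in genes]
    genes[pos] = rng.choice(choices)
    return tuple(sorted(genes))


def select_cvs(cvs, coords, n_select=2, pop_size=10, n_generations=30, seed=0):
    """Minimal genetic algorithm: keep the fitter half, refill with mutants."""
    rng = random.Random(seed)
    n_candidates = len(cvs[0])
    n_train = 2 * len(cvs) // 3
    cache = {}  # avoid re-evaluating the same CV subset

    def error(genes):
        if genes not in cache:
            cache[genes] = reconstruction_error(genes, cvs, coords, n_train)
        return cache[genes]

    population = [
        tuple(sorted(rng.sample(range(n_candidates), n_select)))
        for _ in range(pop_size)
    ]
    for _ in range(n_generations):
        population.sort(key=error)  # lower error means higher fitness
        elite = population[: pop_size // 2]
        children = [
            mutate(rng.choice(elite), n_candidates, rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = min(population, key=error)
    return best, error(best)


data_rng = random.Random(1)
cvs, coords = make_data(n_frames=120, n_noise=6, rng=data_rng)
best, err = select_cvs(cvs, coords)
print("selected CV indices:", best, "reconstruction error:", round(err, 4))
```

In this toy setting the informative candidates are columns 0 and 1, so a working selection loop should converge on that pair. A time-delayed variant in the spirit of the abstract would pair the CVs of frame t with the coordinates of a later frame, which requires time-ordered data rather than the independent samples used here.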
The GTPase KRas is a signaling protein in networks for cell differentiation, growth, and division. KRas mutations can prolong activation of these networks, resulting in tumor formation. When active, KRas tightly binds GTP. Several oncogenic mutations affect the conversion between this rigid state and inactive, more flexible states. Detailed understanding of these transitions may provide valuable insights into how mutations affect KRas. Path sampling simulations, which focus on transitions, show KRas visiting several states, which are the same for wild type and the oncogenic mutant Q61L. Large differences occur when converting between these states, indicating the dramatic effect of the Q61L mutation on KRas dynamics. For Q61L a route to the flexible state is inaccessible, thus shifting the equilibrium to more rigid states. Our methodology presents a novel way to predict dynamical effects of KRas mutations, which may aid in identifying potential therapeutic targets.
Flexibility is essential for many proteins to function, but can be difficult to characterize. Experiments lack resolution in space and time, while the time scales involved are prohibitively long for straightforward molecular dynamics simulations. In this work, we present a multiple state transition path sampling simulation study of a protein that has been notoriously difficult to characterize in its active state. The GTPase enzyme KRas is a signal transduction protein in pathways for cell differentiation, growth, and division. When active, KRas tightly binds guanosine triphosphate (GTP) in a rigid state. The protein–GTP complex can also visit more flexible states, in which it is not active. KRas mutations can affect the conversion between these rigid and flexible states, thus prolonging the activation of signal transduction pathways, which may result in tumor formation. We apply path sampling simulations to investigate the dynamic behavior of KRas-4B (wild type, WT) and its oncogenic mutant Q61L. Our results show that KRas visits several conformational states, which are the same for WT and Q61L. The multiple state transition path sampling (MSTPS) method samples transitions between the different states in a single calculation. Tracking which transitions occur shows large differences between WT and Q61L. The MSTPS results further reveal that for Q61L, a route to a more flexible state is inaccessible, thus shifting the equilibrium to more rigid states. The methodology presented here enables a detailed characterization of protein flexibility on time scales not accessible with brute-force molecular dynamics simulations.
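The bookkeeping behind "tracking which transitions occur" can be pictured as a count over (initial state, final state) pairs of sampled paths. The sketch below is only an illustration of that bookkeeping, not of the MSTPS sampling itself. The state labels and path lists are made-up toy values, not results from the study, and `transition_counts` is a hypothetical helper.

```python
from collections import Counter


def transition_counts(paths):
    """Count paths by their (initial state, final state) pair.

    Each path is a list of per-frame state labels; only its two
    end points matter for this tally.
    """
    return Counter((path[0], path[-1]) for path in paths)


# Toy, made-up paths for two hypothetical systems (NOT data from the study).
system_a_paths = [
    ["rigid", "rigid", "flex1"],
    ["flex1", "flex2"],
    ["rigid", "flex2"],
]
system_b_paths = [
    ["rigid", "flex1"],
    ["flex1", "rigid"],
    ["rigid", "flex1"],
]

counts_a = transition_counts(system_a_paths)
counts_b = transition_counts(system_b_paths)

# Transitions seen in system A but never in system B.
missing_in_b = sorted(set(counts_a) - set(counts_b))
print(missing_in_b)
```

Comparing the two tallies highlights transitions that appear in one system and never in the other, which is the kind of difference the abstract describes between WT and Q61L.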