The weighted ensemble (WE) path sampling approach orchestrates an ensemble of parallel calculations with intermittent communication to enhance the sampling of rare events, such as molecular associations or conformational changes in proteins or peptides. Trajectories are replicated and pruned in a way that focuses computational effort on under-explored regions of configuration space while maintaining rigorous kinetics. To enable the simulation of rare events at any scale (e.g. atomistic, cellular), we have developed an open-source, interoperable, and highly scalable software package for the execution and analysis of WE simulations: WESTPA (The Weighted Ensemble Simulation Toolkit with Parallelization and Analysis). WESTPA scales to thousands of CPU cores and includes a suite of analysis tools that have been implemented in a massively parallel fashion. The software has been designed to interface conveniently with any dynamics engine and has already been used with a variety of molecular dynamics (e.g. GROMACS, NAMD, OpenMM, AMBER) and cell-modeling packages (e.g. BioNetGen, MCell). WESTPA has been in production use for over a year, and its utility has been demonstrated for a broad set of problems, ranging from atomically detailed host-guest associations to non-spatial chemical kinetics of cellular signaling networks. The following describes the design and features of WESTPA, including the facilities it provides for running WE simulations and for storing and analyzing WE simulation data, along with examples of input and output.
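The "replicated and pruned" step in the abstract above can be illustrated with a minimal sketch of weighted ensemble resampling inside a single bin. This is not WESTPA's implementation or API: the function name `resample_bin`, its arguments, and the specific split/merge rules (split the heaviest walker, merge the two lightest) are assumptions chosen for illustration, and `target_count` is assumed to be at least 1. The point it shows is that splitting and merging change the number of trajectories in a bin while conserving the total statistical weight.

```python
import random


def resample_bin(weights, target_count, rng=random):
    """Split/merge walkers in one bin until exactly target_count remain.

    weights: positive statistical weights, one per walker in the bin.
    Returns a list of (parent_index, weight) pairs whose weights sum to
    the same total as the input (hypothetical helper, for illustration).
    """
    walkers = [(i, w) for i, w in enumerate(weights)]
    if not walkers:
        return walkers  # an empty bin cannot be repopulated from nothing

    # Split: replicate the heaviest walker into two half-weight copies.
    while len(walkers) < target_count:
        walkers.sort(key=lambda pw: pw[1])
        parent, w = walkers.pop()
        walkers += [(parent, w / 2), (parent, w / 2)]

    # Merge: combine the two lightest walkers. The survivor is chosen with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target_count:
        walkers.sort(key=lambda pw: pw[1])
        (pa, wa), (pb, wb) = walkers[0], walkers[1]
        survivor = pa if rng.random() < wa / (wa + wb) else pb
        walkers = [(survivor, wa + wb)] + walkers[2:]

    return walkers


if __name__ == "__main__":
    rng = random.Random(0)
    for parent, weight in resample_bin([0.5, 0.3, 0.1, 0.1], 2, rng):
        print(parent, weight)
```

Choosing the merge survivor in proportion to weight is what keeps the resampling statistically unbiased; the sketch omits everything else a production tool needs, such as binning along a progress coordinate and propagating dynamics between resampling steps.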
Equilibrium formally can be represented as an ensemble of uncoupled systems undergoing unbiased dynamics in which detailed balance is maintained. Many nonequilibrium processes can be described by suitable subsets of the equilibrium ensemble. Here, we employ the “weighted ensemble” (WE) simulation protocol [Huber and Kim, Biophys. J. 1996, 70, 97–110] to generate equilibrium trajectory ensembles and extract nonequilibrium subsets for computing kinetic quantities. States do not need to be chosen in advance. The procedure formally allows estimation of kinetic rates between arbitrary states chosen after the simulation, along with their equilibrium populations. We also describe a related history-dependent matrix procedure for estimating equilibrium and nonequilibrium observables when phase space has been divided into arbitrary non-Markovian regions, whether in WE or ordinary simulation. In this proof-of-principle study, these methods are successfully applied and validated on two molecular systems: explicitly solvated methane association and the implicitly solvated Ala4 peptide. We comment on challenges remaining in WE calculations.
Molecular dynamics simulations were used to examine the structural dynamics of two fluorescent probes attached to a typical protein, hen egg-white lysozyme (HEWL). The donor probe (D) was attached via a succinimide group, consistent with the commonly used maleimide conjugation chemistry, and the acceptor probe (A) was bound into the protein as occurs naturally for HEWL and the dye Eosin Y. The average ⟨κ²⟩ is found to deviate significantly from the theoretical value, and a high correlation between the orientation factor κ and the distance R is observed. The correlation is quantified using several possible fixed A orientations; correlations as high as 0.80 are found between κ and R and as high as 0.68 between κ² and R. This correlation highlights the fact that essentially all fluorescence-detected resonance energy transfer studies have assumed that κ and R are independent, an assumption that is clearly not justified in the system studied here. The correlation results in the quantities ⟨κ²⟩⟨R⁻⁶⟩ and ⟨κ²R⁻⁶⟩ differing by a factor of 1.6. The observed correlation between κ and R is caused by the succinimide linkage between D and HEWL, which is found to be relatively inflexible.
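The consequence of a correlation between the orientation factor and the distance can be made concrete with a short sketch. It compares the separable product of averages (mean of kappa squared times mean of R to the minus sixth power), which is what assuming independence amounts to, with the joint average of kappa squared times R to the minus sixth power. The function name `fret_averages` and the sample values are hypothetical and are not data from the study; the sketch only shows that the two quantities coincide when the orientation factor does not vary with distance and diverge when it does.

```python
from statistics import fmean


def fret_averages(kappa2, r):
    """Return (separable, joint) averages for paired samples.

    separable = mean(kappa^2) * mean(R^-6)   (assumes independence)
    joint     = mean(kappa^2 * R^-6)         (makes no such assumption)
    kappa2 and r are equal-length sequences of paired per-frame values.
    """
    r_inv6 = [x ** -6 for x in r]
    separable = fmean(kappa2) * fmean(r_inv6)
    joint = fmean([k * x for k, x in zip(kappa2, r_inv6)])
    return separable, joint


if __name__ == "__main__":
    # Hypothetical paired samples in arbitrary units, for illustration only.
    distances = [1.0, 2.0]
    uncorrelated = fret_averages([1.5, 1.5], distances)  # kappa^2 constant
    correlated = fret_averages([1.0, 2.0], distances)    # kappa^2 grows with R
    print("constant kappa^2:", uncorrelated)
    print("correlated kappa^2:", correlated)
    print("ratio separable/joint:", correlated[0] / correlated[1])
```

In a real analysis the paired values would come from the same trajectory frames, so that any coupling between probe orientation and separation is preserved in the joint average.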
The characterization of protein binding processes, with all of the key conformational changes, has been a grand challenge in the field of biophysics. Here, we have used the weighted ensemble path sampling strategy to orchestrate molecular dynamics simulations, yielding atomistic views of protein–peptide binding pathways involving the MDM2 oncoprotein and an intrinsically disordered p53 peptide. A total of 182 independent, continuous binding pathways were generated, yielding an association rate constant (k_on) that is in good agreement with experiment. These pathways were generated in 15 days using 3500 cores of a supercomputer, substantially faster than would be possible with “brute force” simulations. Many of these pathways involve the anchoring of p53 residue F19 into the MDM2 binding cleft when forming the metastable encounter complex, indicating that F19 may be a kinetically important residue. Our study demonstrates that it is now practical to generate pathways and calculate rate constants for protein binding processes using atomistic simulation on typical computing resources.