PyEMMA 2: A Software Package for Estimation, Validation, and Analysis of Markov Models

Scherer, Martin K.; Trendelkamp-Schroer, Benjamin; Paul, Fabian; Pérez‐Hernández, Guillermo; Hoffmann, Moritz; Plattner, Nuria; Wehmeyer, Christoph; Prinz, Jan-Hendrik; Noé, Frank

doi:10.1021/acs.jctc.5b00743

Cited by 1,027 publications

(1,212 citation statements)

References 126 publications

Supporting · Mentioning: 1,140 · Contrasting · Unclassified


“…This enables us to select the best set of trial ansatz eigenfunctions by choosing the set that yields the maximum GMRQ. The use of a variational approach to choose MSM construction protocol is not new 71 and is similar to the variational selection of the ground-state wavefunction that yields the minimum energy in quantum mechanics.…”

Section: B Variational Principle (mentioning)

confidence: 99%

“…In terms of kinetics, however, it is important to recognize that models can only describe processes captured by the collective degrees of freedom chosen as the system's features. 32,71,[79][80][81] The GMRQ serves as an excellent tool to distinguish between the predictive capabilities of MSMs constructed from different types of features, which enables modelers to choose the most suitable features. This example demonstrates that it is crucial to investigate different featurization choices, since the best model created for a given set of features may be underestimating slow time scales if those features are not capable of describing the corresponding processes.…”

Section: Appropriate Featurization Is Required To Describe Kinetics (mentioning)

confidence: 99%


Optimized parameter selection reveals trends in Markov state models for protein folding

Husic, McGibbon, Sultan, et al. 2016. The Journal of Chemical Physics

119


As molecular dynamics simulations access increasingly longer time scales, complementary advances in the analysis of biomolecular time-series data are necessary. Markov state models offer a powerful framework for this analysis by describing a system's states and the transitions between them. A recently established variational theorem for Markov state models now enables modelers to systematically determine the best way to describe a system's dynamics. In the context of the variational theorem, we analyze ultra-long folding simulations for a canonical set of twelve proteins [K. Lindorff-Larsen et al., Science 334, 517 (2011)] by creating and evaluating many types of Markov state models. We present a set of guidelines for constructing Markov state models of protein folding; namely, we recommend the use of cross-validation and a kinetically motivated dimensionality reduction step for improved descriptions of folding dynamics. We also warn that precise kinetics predictions rely on the features chosen to describe the system and pose the description of kinetic uncertainty across ensembles of models as an open issue. Published by AIP Publishing. [http://dx

“…40,41 There are several software packages available for TICA and MSM analysis. [42][43][44][45] Our calculations are done using the pyEMMA software. 42 …”

Section: TICA and MSM Analysis (mentioning)

confidence: 99%

Markov modeling of peptide folding in the presence of protein crowders

Nilsson, Mohanty, Irbäck. 2018. The Journal of Chemical Physics


We use Markov state models (MSMs) to analyze the dynamics of a β-hairpin-forming peptide in Monte Carlo (MC) simulations with interacting protein crowders, for two different types of crowder proteins [bovine pancreatic trypsin inhibitor (BPTI) and GB1]. In these systems, at the temperature used, the peptide can be folded or unfolded and bound or unbound to crowder molecules. Four or five major free-energy minima can be identified. To estimate the dominant MC relaxation times of the peptide, we build MSMs using a range of different time resolutions or lag times. We show that stable relaxation-time estimates can be obtained from the MSM eigenfunctions through fits to autocorrelation data. The eigenfunctions remain sufficiently accurate to permit stable relaxation-time estimation down to small lag times, at which point simple estimates based on the corresponding eigenvalues have large systematic uncertainties. The presence of the crowders has a stabilizing effect on the peptide, especially with BPTI crowders, which can be attributed to a reduced unfolding rate k_u, while the folding rate k_f is left largely unchanged.
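The "simple estimates based on the corresponding eigenvalues" mentioned in this abstract usually refer to the standard implied-timescale relation t = -τ / ln λ, where τ is the lag time and λ is a non-trivial eigenvalue of the MSM transition matrix. The following is only a minimal sketch of that relation for a hypothetical two-state (folded/unfolded) model, not the authors' procedure. The function names and the rate values are assumptions chosen for illustration, and the code does not use the PyEMMA API.

```python
import math


def exact_two_state_probs(k_u, k_f, lag_time):
    """Transition probabilities of an ideal two-state Markov process.

    k_u: unfolding rate (folded -> unfolded), k_f: folding rate.
    Returns (p_unfold, p_fold), the probabilities of switching state
    within one lag time. Hypothetical helper for illustration only.
    """
    k = k_u + k_f
    decay = 1.0 - math.exp(-k * lag_time)
    return (k_u / k) * decay, (k_f / k) * decay


def two_state_relaxation_time(p_unfold, p_fold, lag_time):
    """Implied relaxation time t = -lag_time / ln(lambda_2).

    For T = [[1 - p_unfold, p_unfold], [p_fold, 1 - p_fold]] the
    eigenvalues are 1 and lambda_2 = 1 - p_unfold - p_fold.
    """
    lambda_2 = 1.0 - p_unfold - p_fold
    if not 0.0 < lambda_2 < 1.0:
        raise ValueError("lambda_2 must lie in (0, 1) for a positive timescale")
    return -lag_time / math.log(lambda_2)


if __name__ == "__main__":
    # Assumed rates (arbitrary units); the exact relaxation time is 1 / (k_u + k_f).
    for tau in (1.0, 2.0, 5.0):
        p_u, p_f = exact_two_state_probs(0.2, 0.3, tau)
        print(tau, two_state_relaxation_time(p_u, p_f, tau))
```

For an ideal Markov process the estimate is independent of the lag time. In a model built from real data, a drift of this estimate with τ is the kind of systematic error at small lag times that the abstract describes.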


“…We used the PME method 28 to handle long range electrostatics using a 1 nm cutoff. The simulations were 29,30 algorithm. For each simulated frame, we used the last reported bias across the tIC CVs as an estimate for input into the MBAR algorithm.…”

Section: Transferable tICs Are An Efficient Methods To Sample Mutations (mentioning)

confidence: 99%

Transfer Learning from Markov models leads to efficient sampling of related systems

Sultan, Pande. 2017. Preprint


Abstract: We recently showed that the time-structure based independent component analysis method from the Markov state model literature provided a set of variationally optimal slow collective variables for Metadynamics (tICA-Metadynamics). In this paper, we extend the methodology towards efficient sampling of related mutants by borrowing ideas from transfer learning methods in machine learning. Our method explicitly assumes that a similar set of slow modes and metastable states is found in both the wild type (baseline) and its mutants. Under this assumption, we describe a few simple techniques using sequence mapping for transferring the slow modes and structural information contained in the wild type simulation to a mutant model for performing enhanced sampling. The resulting simulations can then be reweighted onto the full phase space using the Multi-state Bennett Acceptance Ratio, allowing for thermodynamic comparison against the wild type. We first benchmark our methodology by recapturing alanine dipeptide dynamics across a range of different atomistic force fields, including the polarizable Amoeba force field, after learning a set of slow modes using Amber ff99sb-ILDN. We next extend the method by including structural data from the wild type simulation and apply the technique to recapturing the effects of the GTT mutation on the FIP35 WW domain.

Introduction: Efficient sampling of protein configuration space remains an unsolved problem in computational biophysics. While algorithmic advances in molecular dynamics (MD) code bases 1 combined with distributed computing hardware 2 , specialized chips 3 , and large-scale, increasingly fast GPU clusters have provided routine access to microsecond timescale dynamics, there is still room for significant improvements. One such potential avenue is predicting the effects of mutations on the protein's wild type free energy landscape. At this point it is worth explicitly noting that while we use the biological terms wild type and mutant extensively, our methodology is easily generalizable to scenarios where a baseline (aka the wild type) free energy landscape has been mapped out and we now wish to understand the dynamical consequences of making some changes to the system (aka the mutation). Under the current scheme, one would have to re-run the entire simulation to ascertain the effects of a mutation on a protein's free energy landscape. Due to the vast
