Structural and thermodynamic consistency of coarse-graining models across multiple length scales is essential for the predictive role of multi-scale modeling and molecular dynamics simulations that use mesoscale descriptions. Our approach is a coarse-grained model based on integral equation theory, which can represent polymer chains at variable levels of chemical detail. The model is analytical and depends on molecular and thermodynamic parameters of the system under study, as well as on the direct correlation function in the k → 0 limit, c0. A numerical solution to the PRISM integral equations is used to determine c0, by adjusting the value of the effective hard sphere diameter, dHS, to agree with the predicted equation of state. This single quantity parameterizes the coarse-grained potential, which is used to perform mesoscale simulations that are directly compared with atomistic-level simulations of the same system. We test our coarse-graining formalism by comparing structural correlations, isothermal compressibility, equation of state, Helmholtz and Gibbs free energies, and potential energy and entropy using both united-atom and coarse-grained descriptions. We find quantitative agreement between the analytical formalism for the thermodynamic properties and the results of molecular dynamics simulations, independent of the chosen level of representation. In the mesoscale description, the potential energy of the soft-particle interaction becomes a free energy in the coarse-grained coordinates, which preserves the excess free energy from an ideal gas across all levels of description. The structural consistency between the united-atom and mesoscale descriptions means the relative entropy between descriptions has been minimized without any variational optimization parameters. The approach is general and applicable to any polymeric system in different thermodynamic conditions.
Gap junctions establish direct pathways for cells to transfer metabolic and electrical messages. The local lipid environment is known to affect the structure, stability and intercellular channel activity of gap junctions; however, the molecular basis for these effects remains unknown. Here, we incorporate native connexin-46/50 (Cx46/50) intercellular channels into a dual lipid nanodisc system, mimicking a native cell-to-cell junction. Structural characterization by CryoEM reveals a lipid-induced stabilization of the channel, resulting in a 3D reconstruction at 1.9 Å resolution. Together with all-atom molecular dynamics simulations, it is shown that Cx46/50 in turn imparts long-range stabilization to the dynamic local lipid environment that is specific to the extracellular lipid leaflet. In addition, ~400 water molecules are resolved in the CryoEM map, localized throughout the intercellular permeation pathway and contributing to the channel architecture. These results illustrate how the aqueous-lipid environment is integrated with the architectural stability, structure and function of gap junction communication channels.
Despite the development of massively parallel computing hardware including inexpensive graphics processing units (GPUs), it has remained infeasible to simulate the folding of atomistic proteins at room temperature using conventional molecular dynamics (MD) beyond the μs scale. Here we report the folding of atomistic, implicitly solvated protein systems with folding times τ_f ranging from ~10 μs to ~100 ms using the weighted ensemble (WE) strategy in combination with GPU computing. Starting from an initial structure or set of structures, WE organizes an ensemble of GPU-accelerated MD trajectory segments via intermittent pruning and replication events to generate statistically unbiased estimates of rate constants for rare events such as folding; no biasing forces are used. Although the variance among atomistic WE folding runs is significant, multiple independent runs are used to reduce and quantify statistical uncertainty. Folding times are estimated directly from WE probability flux and from history-augmented Markov analysis of the WE data. Three systems were examined: NTL9 at low solvent viscosity (yielding τ_f = 0.8 − μs), NTL9 at water-like viscosity (τ_f = 0.2–2 ms), and Protein G at low viscosity (τ_f = 3–200 ms). In all cases the folding time, uncertainty, and ensemble properties could be estimated from WE simulation; for Protein G, this characterization required significantly less overall computing than would be required to observe a single folding event with conventional MD simulations. Our results suggest that the use and calibration of force fields and solvent models for precise estimation of kinetic quantities is becoming feasible.
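The pruning and replication events described in the abstract above can be illustrated with a minimal sketch. It assumes a single bin, walkers stored as hypothetical `(state, weight)` pairs, and a simple "split the heaviest, merge the two lightest" rule; the function name `resample_bin` and these choices are illustrative assumptions, not the scheme used in the paper or in any particular WE package.

```python
import random


def resample_bin(walkers, target, rng=None):
    """Resample one bin to `target` walkers while conserving total weight.

    walkers: list of (state, weight) pairs (hypothetical representation).
    target:  desired number of walkers in the bin, must be >= 1.
    """
    if target < 1:
        raise ValueError("target must be at least 1")
    rng = rng or random.Random(0)
    walkers = list(walkers)
    if not walkers:
        return walkers  # an empty bin stays empty

    # Replication: split the heaviest walker into two half-weight copies.
    while len(walkers) < target:
        i = max(range(len(walkers)), key=lambda j: walkers[j][1])
        state, w = walkers.pop(i)
        walkers += [(state, w / 2), (state, w / 2)]

    # Pruning: merge the two lightest walkers. The survivor is chosen with
    # probability proportional to its weight and inherits the summed weight,
    # so no statistical bias is introduced.
    while len(walkers) > target:
        walkers.sort(key=lambda sw: sw[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        keep = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(keep, w1 + w2)] + walkers[2:]

    return walkers


if __name__ == "__main__":
    bin_walkers = [("a", 0.5), ("b", 0.3), ("c", 0.2)]
    print(resample_bin(bin_walkers, 5))  # more walkers, same total weight
    print(resample_bin(bin_walkers, 2))  # fewer walkers, same total weight
```

Because weights are only split or summed, the total probability in the bin is unchanged by resampling, which is what allows rate estimates from probability flux without any biasing forces.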
We utilize a multiscale approach where molecular dynamics simulations are performed to obtain quantitative structural averages used as input to a coarse-grained Langevin equation for protein dynamics, which can be solved analytically. The approach describes proteins as fundamentally semiflexible objects collapsed into the free energy well representing the folded state. The normal-mode analytical solution to this Langevin equation naturally separates into global modes describing the fully anisotropic tumbling of the macromolecule as a whole and internal modes which describe local fluctuations about the folded structure. Complexity in the configurational free-energy landscape of the macromolecule leads to a renormalization of the internal modes, while the global modes provide a basis set in which the dipolar orientation and global anisotropy can be accounted for when comparing to experiments. This simple approach predicts the dynamics of both global rotational diffusion and internal motion from the picosecond to the nanosecond regime and is quantitative when compared to time correlation functions calculated from molecular dynamics simulations and in good agreement with nuclear magnetic resonance relaxation experiments. Fundamental to this approach is the inclusion of internal dissipation, which is absent in any rigid-body hydrodynamical modeling scheme.
Using a manifold-based analysis of experimental diffraction snapshots from an X-ray free electron laser, we determine the three-dimensional structure and conformational landscape of the PR772 virus to a detector-limited resolution of 9 nm. Our results indicate that a single conformational coordinate controls reorganization of the genome, growth of a tubular structure from a portal vertex and release of the genome. These results demonstrate that single-particle X-ray scattering has the potential to shed light on key biological processes.
The weighted ensemble (WE) family of methods is one of several statistical mechanics-based path sampling strategies that can provide estimates of key observables (rate constants and pathways) using a fraction of the time required by direct simulation methods such as molecular dynamics or discrete-state stochastic algorithms. WE methods oversee numerous parallel trajectories using intermittent overhead operations at fixed time intervals, enabling facile interoperability with any dynamics engine. Here, we report on the major upgrades to the WESTPA software package, an open-source, high-performance framework that implements both basic and recently developed WE methods. These upgrades offer substantial improvements over traditional WE methods. The key features of the new WESTPA 2.0 software enhance the efficiency and ease of use: an adaptive binning scheme for more efficient surmounting of large free energy barriers, streamlined handling of large simulation data sets, exponentially improved analysis of kinetics, and developer-friendly tools for creating new WE methods, including a Python API and resampler module for implementing both binned and “binless” WE strategies.
The weighted ensemble (WE) simulation strategy provides unbiased sampling of non-equilibrium processes, such as molecular folding or binding, but the extraction of rate constants relies on characterizing steady-state behavior. Unfortunately, WE simulations of sufficiently complex systems will not relax to steady state on observed simulation times. Here we show that a post-simulation clustering of molecular configurations into "microbins", using methods developed in the Markov State Model (MSM) community, can yield unbiased kinetics from WE data before steady-state convergence of the WE simulation itself. Because WE trajectories are directional and not equilibrium-distributed, the history-augmented MSM (haMSM) formulation can be used, which yields the mean first-passage time (MFPT) without bias for arbitrarily small lag times. Accurate kinetics can be obtained while bypassing the often prohibitive convergence requirements of the non-equilibrium weighted ensemble. We validate the method in a simple diffusive process on a 2D random energy landscape, and then analyze atomistic protein folding simulations using WE molecular dynamics. We report significant progress towards the unbiased estimation of protein folding times and pathways, though key challenges remain.
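To make the MFPT quantity mentioned above concrete, here is a minimal sketch of how a mean first-passage time can be computed once a microbin transition matrix is available. It solves the standard linear system for a plain discrete-time Markov chain; it does not implement the history-augmented construction itself, in which the matrix is built from trajectories labeled by their last-visited macrostate. The function name `mfpt_to_target` and its parameters are illustrative assumptions.

```python
import numpy as np


def mfpt_to_target(T, target, lag=1.0):
    """Mean first-passage time from every state to a set of target states.

    T:      row-stochastic transition matrix estimated at lag time `lag`.
    target: iterable of state indices forming the target set.
    Returns an array m where m[i] is the MFPT from state i (0 for targets).
    """
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    target = set(target)
    others = [i for i in range(n) if i not in target]

    # For non-target states: m_i = lag + sum_j T_ij m_j (j not in target),
    # i.e. (I - T_oo) m = lag * 1.
    T_oo = T[np.ix_(others, others)]
    rhs = lag * np.ones(len(others))
    m_others = np.linalg.solve(np.eye(len(others)) - T_oo, rhs)

    m = np.zeros(n)
    m[others] = m_others
    return m


if __name__ == "__main__":
    # Toy two-state chain: state 0 reaches absorbing state 1 with
    # probability 0.1 per step.
    T = [[0.9, 0.1],
         [0.0, 1.0]]
    print(mfpt_to_target(T, target=[1]))
```

The result is reported in units of the lag time, so a matrix estimated at a short lag has to be accompanied by that lag when converting to physical time.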
Edited by Ronald C. Wek
Myotonic dystrophy type 2 is a genetic neuromuscular disease caused by the expression of expanded CCUG repeat RNAs from the non-coding region of the CCHC-type zinc finger nucleic acid-binding protein (CNBP) gene. These CCUG repeats bind and sequester a family of RNA-binding proteins known as Muscleblind-like 1, 2, and 3 (MBNL1, MBNL2, and MBNL3), and sequestration plays a significant role in pathogenicity. MBNL proteins are alternative splicing regulators that bind to the consensus RNA sequence YGCY (Y = pyrimidine). This consensus sequence is found in the toxic RNAs (CCUG repeats) and in cellular RNA substrates that MBNL proteins have been shown to bind. Replacing the uridine in CCUG repeats with pseudouridine (Ψ) resulted in a modest reduction of MBNL1 binding. Interestingly, Ψ modification of a minimally structured RNA containing YGCY motifs resulted in more robust inhibition of MBNL1 binding. The different levels of inhibition between CCUG repeat and minimally structured RNA binding appear to be due to the ability to modify both pyrimidines in the YGCY motif, which is not possible in the CCUG repeats. Molecular dynamics studies of unmodified and pseudouridylated minimally structured RNAs suggest that reducing the flexibility of the minimally structured RNA leads to reduced binding by MBNL1.
Myotonic dystrophy type 1 (DM1) is a genetic neuromuscular disease caused by expression of expanded CUG repeats in the 3′ UTR of the dystrophia myotonica protein kinase (DMPK) gene. Similar to DM1, myotonic dystrophy type 2 (DM2) is caused by expression of expanded CCUG repeats in an intron of the CCHC-type zinc finger nucleic acid-binding protein (CNBP) gene. DM1 and DM2 occur when the CUG/CCUG repeats are expanded beyond 100 repeats, and patients can have up to thousands of CUG/CCUG repeats (1, 2).
A primary component of the currently accepted DM1 and DM2 disease mechanism is that expanded CUG/CCUG repeats sequester RNA-binding proteins (primarily the Muscleblind-like family), which prevents these proteins from performing their functions in cells (3, 4). The members of the Muscleblind-like family of proteins (MBNL1, MBNL2, and MBNL3) bind RNA and regulate several RNA processing pathways, including alternative splicing, pre-miRNA biogenesis, mRNA localization, alternative polyadenylation, and circular RNA generation (5-9). MBNL proteins bind to the consensus YGCY RNA sequence (6, 10). CUG and CCUG repeats are composed of YGCY motifs, creating hundreds or thousands of perfect MBNL-binding sites, resulting in large numbers of MBNL proteins binding to the repeats and forming nuclear foci (11). When MBNL proteins are sequestered, they are unable to regulate RNA processing events, and consequently, many DM1 and DM2 symptoms are caused by misregulated alternative splicing and potentially the loss of other MBNL activities (12). It is therefore important to understand how MBNL proteins bind to their toxic and cellular RNA substrates to develop mechanisms to alleviate MBNL sequestration in DM1 and DM2. MBNL pro...