We develop a generalizable AI-driven workflow that leverages heterogeneous HPC resources to explore the time-dependent dynamics of molecular systems. We use this workflow to investigate the mechanisms of infectivity of the SARS-CoV-2 spike protein, the main viral infection machinery. Our workflow enables more efficient investigation of spike dynamics in a variety of complex environments, including within a complete SARS-CoV-2 viral envelope simulation, which contains 305 million atoms and shows strong scaling on ORNL Summit using NAMD. We present several novel scientific discoveries, including the elucidation of the spike’s full glycan shield, the role of spike glycans in modulating the infectivity of the virus, and the characterization of the flexible interactions between the spike and the human ACE2 receptor. We also demonstrate how AI can accelerate conformational sampling across different systems and pave the way for the future application of such methods to additional studies in SARS-CoV-2 and other molecular systems.
SARS-CoV-2 infection is controlled by the opening of the spike protein receptor binding domain (RBD), which transitions from a glycan-shielded (down) to an exposed (up) state in order to bind the human ACE2 receptor and infect cells. While snapshots of the up and down states have been obtained by cryoEM and cryoET, details of the RBD opening transition evade experimental characterization. Here, over 200 μs of weighted ensemble (WE) simulations of the fully glycosylated spike ectodomain allow us to characterize more than 300 continuous, kinetically unbiased RBD opening pathways. Together with biolayer interferometry experiments, we reveal a gating role for the N-glycan at position N343, which facilitates RBD opening. Residues D405, R408, and D427 also participate. The atomic-level characterization of the glycosylated spike activation mechanism provided herein achieves a new high-water mark for ensemble pathway simulations and offers a foundation for understanding the fundamental mechanisms of SARS-CoV-2 viral entry and infection.
The weighted ensemble (WE) strategy has been demonstrated to be highly efficient in generating pathways and rate constants for rare events such as protein folding and protein binding using atomistic molecular dynamics simulations. Here we present five tutorials instructing users in the best practices for preparing, carrying out, and analyzing WE simulations for various applications using the WESTPA software. Users are expected to already have significant experience with running standard molecular dynamics simulations using the underlying dynamics engine of interest (e.g. Amber, Gromacs, OpenMM). The tutorials range from a molecular association process in explicit solvent to more complex processes such as host-guest association, peptide conformational sampling, and protein folding.
The weighted ensemble (WE) family of methods is one of several statistical mechanics-based path sampling strategies that can provide estimates of key observables (rate constants and pathways) using a fraction of the time required by direct simulation methods such as molecular dynamics or discrete-state stochastic algorithms. WE methods oversee numerous parallel trajectories using intermittent overhead operations at fixed time intervals, enabling facile interoperability with any dynamics engine. Here, we report on the major upgrades to the WESTPA software package, an open-source, high-performance framework that implements both basic and recently developed WE methods. These upgrades offer substantial improvements over traditional WE methods. The key features of the new WESTPA 2.0 software enhance the efficiency and ease of use: an adaptive binning scheme for more efficient surmounting of large free energy barriers, streamlined handling of large simulation data sets, exponentially improved analysis of kinetics, and developer-friendly tools for creating new WE methods, including a Python API and resampler module for implementing both binned and “binless” WE strategies.
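The "intermittent overhead operations" mentioned above are resampling steps in which trajectories (walkers) are split or merged while total statistical weight is conserved. The following is a minimal, illustrative sketch of one such step for a single bin. It is not the WESTPA implementation or its API; the function name `resample_bin`, the `(state, weight)` tuple layout, and the split/merge ordering are assumptions made for illustration.

```python
import random


def resample_bin(walkers, target, rng):
    """Return `target` walkers carrying the same total weight as `walkers`.

    walkers: list of (state, weight) tuples with positive weights.
    Hypothetical helper for illustration only; not part of WESTPA.
    """
    if target < 1:
        raise ValueError("target must be at least 1")
    walkers = list(walkers)
    if not walkers:
        return walkers
    # Split: replace the heaviest walker with two copies of half its weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers.extend([(state, weight / 2), (state, weight / 2)])
    # Merge: combine the two lightest walkers; the surviving state is
    # chosen with probability proportional to its weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        survivor = s1 if rng.random() < w1 / (w1 + w2) else s2
        walkers = [(survivor, w1 + w2)] + walkers[2:]
    return walkers


if __name__ == "__main__":
    rng = random.Random(0)
    start = [("a", 0.5), ("b", 0.25), ("c", 0.25)]
    for state, weight in resample_bin(start, 5, rng):
        print(state, weight)
```

Because splitting and merging only redistribute weight, the sum of weights is unchanged by the step, which is what keeps the resulting kinetics statistically unbiased.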
A promising approach for simulating rare events with rigorous kinetics is the weighted ensemble path sampling strategy. One challenge of this strategy is the division of configurational space into bins for sampling. Here we present a minimal adaptive binning (MAB) scheme for the automated, adaptive placement of bins along a progress coordinate within the framework of the weighted ensemble strategy. Results reveal that the MAB scheme, despite its simplicity, is more efficient than a manual, fixed binning scheme in generating transitions over large free energy barriers, generating a diversity of pathways, estimating rate constants, and sampling conformations. The scheme is general and extensible to any rare-events sampling strategy that employs progress coordinates.
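The idea of adaptive bin placement can be sketched as follows, under a simplifying assumption: at each resampling step, a fixed number of equal-width bins is laid out between the current minimum and maximum of a one-dimensional progress coordinate. The abstract above does not spell out the algorithm, so this sketch may omit parts of the published scheme. The function names `adaptive_bin_edges` and `assign_bins` are hypothetical and do not correspond to any WESTPA API.

```python
from bisect import bisect_right


def adaptive_bin_edges(pcoords, n_bins):
    """Place n_bins equal-width bins between the trailing and leading walkers.

    pcoords: current 1D progress-coordinate values, one per walker.
    Simplified illustration; assumes evenly spaced bins spanning the data.
    """
    lo, hi = min(pcoords), max(pcoords)
    if lo == hi:
        return [lo, hi]  # all walkers coincide: a single degenerate bin
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins)] + [hi]


def assign_bins(pcoords, edges):
    """Map each progress-coordinate value to a bin index."""
    last = len(edges) - 2  # index of the final bin
    # The leading walker sits exactly on the upper edge; clamp it into the last bin.
    return [min(bisect_right(edges, x) - 1, last) for x in pcoords]


if __name__ == "__main__":
    pcoords = [2.0, 4.0, 10.0]
    edges = adaptive_bin_edges(pcoords, 4)
    print(edges)
    print(assign_bins(pcoords, edges))
```

Recomputing the edges at every step lets the bins follow the ensemble as it advances, so no bin boundaries need to be chosen by hand in advance.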
ACM Reference Format: Lorenzo Casalino, Abigail Dommer, Zied Gaieb, Emilia P. Barros, Terra Sztain, Surl-Hee Ahn, Anda Trifan, Alexander Brace, Anthony Bogetti, Heng Ma, Hyungro Lee, Matteo Turilli, Syma Khalid, Lillian Chong, Carlos Simmerling, David J. Hardy, Julio D. C. Maia, James C. Phillips, Thorsten Kurth, Abraham Stern, Lei Huang, John McCalpin, Mahidhar Tatineni, Tom Gibbs, John E. Stone, Shantenu Jha, Arvind Ramanathan, and Rommie E. Amaro. 2020. AI-Driven Multiscale Simulations Illuminate Mechanisms of SARS-CoV-2 Spike Dynamics. In Supercomputing ’20: International Conference for High Performance Computing, Networking, Storage, and Analysis. ACM, New York, NY, USA, 14 pages. https://doi.org/finalDOI
We present a new force field, AMBER ff15ipq-m, for simulations of protein mimetics in applications from therapeutics to biomaterials. This force field is an expansion of the AMBER ff15ipq force field that was developed for canonical proteins and enables the modeling of four classes of artificial backbone units that are commonly used alongside natural α residues in blended or “heterogeneous” backbones: chirality-reversed D-α-residues, the Cα-methylated α-residue Aib, homologated β-residues (β3) bearing proteinogenic side chains, and two cyclic β residues (βcyc; APC and ACPC). The ff15ipq-m force field includes 472 unique atomic charges and 148 unique torsion terms. Consistent with the AMBER IPolQ lineage of force fields, the charges were derived using the Implicitly Polarized Charge (IPolQ) scheme in the presence of explicit solvent. To our knowledge, no general force field reported to date models the combination of artificial building blocks examined here. In addition, we have derived Karplus coefficients for the calculation of backbone amide J-coupling constants for β3Ala and ACPC β residues. The AMBER ff15ipq-m force field reproduces experimentally observed J-coupling constants in simple tetrapeptides and maintains the expected conformational propensities in reported structures of proteins/peptides containing the artificial building blocks of interest—all on the μs timescale. These encouraging results demonstrate the power and robustness of the IPolQ lineage of force fields in modeling the structure and dynamics of natural proteins as well as mimetics with protein-inspired artificial backbones in atomic detail.
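Karplus coefficients such as those mentioned above relate a dihedral angle to a three-bond J-coupling constant through the general Karplus form J(θ) = A cos²θ + B cos θ + C. The sketch below evaluates that relation in Python. The coefficient values and the −60° phase offset are placeholders chosen for illustration; they are assumptions, not the coefficients derived for ff15ipq-m, and the function name `karplus_j` is hypothetical.

```python
import math


def karplus_j(phi_deg, a, b, c, phase_deg=-60.0):
    """Evaluate J = a*cos^2(theta) + b*cos(theta) + c, with theta = phi + phase.

    phi_deg: backbone dihedral angle in degrees.
    phase_deg: offset between the backbone dihedral and the coupled-nuclei
    dihedral (assumed value; the appropriate offset depends on the residue type).
    """
    theta = math.radians(phi_deg + phase_deg)
    cos_theta = math.cos(theta)
    return a * cos_theta ** 2 + b * cos_theta + c


if __name__ == "__main__":
    # Placeholder coefficients in Hz, for illustration only.
    A, B, C = 6.5, -1.8, 1.6
    for phi in (-120.0, -60.0, 60.0):
        print(phi, round(karplus_j(phi, A, B, C), 2))
```

In practice, such a function would be averaged over the dihedral angles sampled in a simulation and the result compared with the experimentally measured coupling constant.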