Pragmatic Coarse-Graining of Proteins: Models and Applications
Luís Borges-Araújo,
Ilias Patmanidis,
Akhil P. Singh
et al.
Abstract: The molecular details involved in the folding, dynamics, organization, and interaction of proteins with other molecules are often difficult to assess by experimental techniques. Consequently, computational models play an ever-increasing role in the field. However, biological processes involving large-scale protein assemblies or long time scale dynamics are still computationally expensive to study in atomistic detail. For these applications, employing coarse-grained (CG) modeling approaches has become a key str…
“…The first step in running a simulation with SIRAH is to map an atomistic structure to its CG representation. This procedure is performed using a script provided in the SIRAH package. A correct mapping absolutely requires residue names to be written according to the GLYCAM nomenclature. It is important to emphasize that our CG model requires the position of the hydrogen atoms in the atomistic structure.…”
Section: Methods
mentioning
confidence: 99%
“…For instance, the intricate glycosylation pattern recently reported for the Spike protein of SARS-CoV-2 [53] is perfectly amenable to the set of CG parameters reported here. Owing to the pragmatic nature of our force field [60], generating a general recipe for arbitrary topologies for new residues, for instance, furanoses, might not be straightforward. Nevertheless, the availability of parameter files (https://github.com/SIRAHFF) and the information reported in the Supporting Information (Figure S1 and Tables S1−S3) can provide general rules for creating CG parameters for other glycans.…”
Glycans constitute one of the most complex families of biological molecules. Despite their crucial role in a plethora of biological processes, they remain largely uncharacterized because of their high complexity. Their intrinsic flexibility and the vast variability associated with the many combination possibilities have hampered their experimental determination. Although theoretical methods have proven to be a valid alternative to the study of glycans, the large size associated with polysaccharides, proteoglycans, and glycolipids poses significant challenges to a fully atomistic description of biologically relevant glycoconjugates. On the other hand, the exquisite dependence on hydrogen bonds to determine glycans' structure makes the development of simplified or coarse-grained (CG) representations extremely challenging. This is particularly the case when glycan representations are expected to be compatible with CG force fields that include several molecular types. We introduce a CG representation able to simulate a wide variety of polysaccharides and common glycosylation motifs in proteins, which is fully compatible with the CG SIRAH force field. Examples of application to N-glycosylated proteins, including antibody recognition and calcium-mediated glycan−protein interactions, highlight the versatility of the enlarged set of CG molecules provided by SIRAH.
“…Backbone-local interactions are very likely to presculpt the protein free-energy landscape to favor regular helix and sheet structures. Therefore, the development of local-interaction-energy terms in both all-atom and coarse-grained force fields has received a lot of attention. The torsional and improper-torsional terms that describe the energy of rotation about a bond (in all-atom models) or about a virtual bond (in coarse-grained models) or the geometry of the surrounding of a central atom or site, respectively, are particularly important because a dihedral angle is a collective reaction coordinate that describes the concerted motion of four atoms or sites.…”
Section: Torsional Terms Imported From All-atom Force Fields Are Insu...
mentioning
A reliable representation of local interactions is critical for the accuracy of modeling protein structure and dynamics at both the all-atom and coarse-grained levels. The development of local (mainly torsional) potentials was focused on careful parametrization of the predetermined (usually Fourier) formulas rather than on their physics-based derivation. In this Perspective we discuss the state-of-the-art methods for modeling local interactions, including the scale-consistent theory developed in our laboratory, which implies that the coarse-grained torsional potentials inseparably depend on the virtual-bond angles adjacent to a given dihedral and that multitorsional terms should be considered. We extend the treatment to split the residue-based torsional potentials into the site-based regular and improper torsional potentials. These considerations are illustrated with the revised torsional potentials and improper-torsional potentials involving the L-alanine residue and the improper-torsional potential corresponding to serine-residue enantiomerization. Applications of the new approach in coarse-grained modeling and revising all-atom force fields are discussed.
“…Other recent reviews and perspectives have discussed extensively the current status of biomolecular simulation and modeling, including coarse-grained methods. Here, we focus on atomic-level simulations and both the enormous potential and the resulting challenges created by the emergence of exascale computing.…”
Exascale supercomputers have opened the door to dynamic simulations, facilitated by AI/ML techniques, that model biomolecular motions over unprecedented length and time scales. This new capability holds the potential to revolutionize our understanding of fundamental biological processes. Here we report on some of the major advances that were discussed at a recent CECAM workshop in Pisa, Italy, on the topic with a primary focus on atomic-level simulations. First, we highlight examples of current large-scale biomolecular simulations and the future possibilities enabled by crossing the exascale threshold. Next, we discuss challenges to be overcome in optimizing the usage of these powerful resources. Finally, we close by listing several grand challenge problems that could be investigated with this new computer architecture.