Hearing loss is associated with ~8100 mutations in 152 genes, and within the coding regions of these genes are over 60,000 missense variants. The majority of these variants are classified as "variants of uncertain significance" to reflect our inability to ascribe a phenotypic effect to the observed amino acid change. A promising source of pathogenicity information is biophysical simulation, although input protein structures often contain defects because of limitations in experimental data and/or only distant homology to a template. Here, we combine the polarizable atomic multipole optimized energetics for biomolecular applications force field, many-body optimization theory, and graphical processing unit acceleration to repack all deafness-associated proteins and thereby improve average structure MolProbity score from 2.2 to 1.0. We then used these optimized wild-type models to create over 60,000 structures for missense variants in the Deafness Variation Database, which are being incorporated into the Deafness Variation Database to inform deafness pathogenicity prediction. Finally, this work demonstrates that advanced polarizable atomic multipole force fields are efficient enough to repack the entire human proteome.
Hearing loss is associated with ~8100 mutations in 152 genes, and within the coding regions of these genes are over 60,000 missense variants. The majority of these variants are classified as "variants of uncertain significance" to reflect our inability to ascribe a phenotypic effect to the observed amino acid change. A promising source of pathogenicity information is atomic-resolution simulation, although input protein structures often contain defects due to limitations in experimental data and/or only distant homology to a template. Here we combine the polarizable AMOEBA force field, many-body optimization theory, and GPU acceleration to repack all deafness-associated proteins and thereby improve average structure resolution from 2.2 Å to 1.0 Å based on assessment with MolProbity. We incorporate these data into the Deafness Variation Database to inform deafness pathogenicity prediction, and show that advanced polarizable force fields could now be used to repack the entire human proteome using the Force Field X software.
Computational protein design, ab initio protein/RNA folding, and protein-ligand screening can be too computationally demanding for explicit treatment of solvent. For these applications, implicit solvent offers a compelling alternative, which we describe here for the polarizable atomic multipole AMOEBA force field based on three treatments of continuum electrostatics: numerical solutions to the Poisson-Boltzmann equation (PBE), the domain-decomposition Conductor-like Screening Model (ddCOSMO) approximation to the PBE, and the analytic generalized Kirkwood (GK) approximation. The continuum electrostatic models are combined with a nonpolar estimator based on novel cavitation and dispersion terms. Electrostatic model parameters are numerically optimized using a least-squares-style target function based on a library of 103 small molecule solvation free energy differences. Mean signed errors for the APBS, ddCOSMO, and GK models are 0.05, 0.00, and 0.00 kcal/mol, respectively, while the mean unsigned errors are 0.70, 0.63, and 0.51 kcal/mol, respectively. Validation of the electrostatic response of the resulting implicit solvents, which are available in the Tinker (or Tinker-HP), OpenMM, and Force Field X software packages, is based on comparisons to explicit solvent simulations for a series of proteins and nucleic acids. Overall, the emergence of performant implicit solvent models for polarizable force fields will open the door to their use for folding and design applications.
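The error statistics quoted in this abstract can be illustrated with a minimal Python sketch. It shows why a mean signed error near zero can coexist with a larger mean unsigned error: positive and negative deviations cancel in the former but not in the latter. The function names, the sign convention (predicted minus reference), and the plain sum-of-squares target are assumptions made for illustration only; the abstract does not specify the exact form of the target function, and the numbers below are invented, not taken from the paper.

```python
import numpy as np


def signed_and_unsigned_errors(predicted, reference):
    """Return (mean signed error, mean unsigned error).

    Assumed convention: error = predicted - reference.
    """
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(diff.mean()), float(np.abs(diff).mean())


def least_squares_target(predicted, reference):
    """Plain sum of squared residuals (a hypothetical stand-in for the
    'least-squares-style' target function mentioned in the abstract)."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sum(diff ** 2))


if __name__ == "__main__":
    # Invented solvation free energies in kcal/mol, for illustration only.
    reference = [-6.0, -4.0, -2.0, 1.0]
    predicted = [-5.5, -4.5, -2.5, 1.5]
    mse, mue = signed_and_unsigned_errors(predicted, reference)
    print(f"mean signed error:   {mse:.2f} kcal/mol")
    print(f"mean unsigned error: {mue:.2f} kcal/mol")
    print(f"sum of squares:      {least_squares_target(predicted, reference):.2f}")
```

In this toy data set the deviations are +0.5, -0.5, -0.5, and +0.5, so the signed error averages to zero while the unsigned error is 0.5 kcal/mol.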
Some recent advances in biomolecular simulation and global optimization have used hybrid restraint potentials, where harmonic restraints that penalize conformations inconsistent with experimental data are combined with molecular mechanics force fields. These hybrid potentials can be used to improve the performance of molecular dynamics, structure prediction, energy landscape sampling, and other computational methods that rely on the accuracy of the underlying force field. Here, we develop a hybrid restraint potential based on NapShift, an artificial neural network trained to predict protein nuclear magnetic resonance (NMR) chemical shifts from sequence and structure. In addition to providing accurate predictions of experimental chemical shifts, NapShift is fully differentiable with respect to atomic coordinates, which allows us to use it for structural refinement. By employing NapShift to predict chemical shifts from the protein conformation at each simulation step, we can compute an energy penalty and the corresponding hybrid restraint forces based on the difference between the predicted values and the experimental chemical shifts. The performance of the hybrid restraint potential was benchmarked using both basin-hopping global optimization and molecular dynamics simulations. In each case, the NapShift hybrid potential improved the accuracy, leading to better structure prediction via basin-hopping and increased local stability in molecular dynamics simulations. Our results suggest that neural network hybrid potentials based on NMR observables can enhance a broad range of molecular simulation methods, and the prediction accuracy will improve as more experimental training data become available.
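The restraint idea described in this abstract can be sketched in a few lines of Python: a harmonic penalty on the difference between predicted and experimental chemical shifts, with restraint forces obtained from the gradient of that penalty with respect to atomic coordinates. This is a sketch under stated assumptions, not the authors' implementation. The linear `toy_shift_predictor` is a hypothetical stand-in for a differentiable predictor such as NapShift (whose actual interface is not shown in the source), and the force constant `k` and the exact harmonic form are illustrative choices.

```python
import numpy as np


def toy_shift_predictor(coords, weights, bias):
    """Hypothetical differentiable predictor: shifts depend linearly on
    the flattened coordinates. This is NOT NapShift, only a placeholder."""
    return weights @ coords.ravel() + bias


def restraint_energy_and_forces(coords, weights, bias, exp_shifts, k=1.0):
    """Harmonic penalty E = (k/2) * sum((predicted - experimental)^2)
    and the corresponding forces F = -dE/dx."""
    delta = toy_shift_predictor(coords, weights, bias) - exp_shifts
    energy = 0.5 * k * float(np.sum(delta ** 2))
    # Chain rule: the Jacobian of the linear toy predictor is `weights`.
    grad = k * (weights.T @ delta)
    forces = -grad.reshape(coords.shape)
    return energy, forces


def hybrid_energy_and_forces(ff_energy, ff_forces, restraint_energy, restraint_forces):
    """Combine force-field and restraint terms by simple addition."""
    return ff_energy + restraint_energy, ff_forces + restraint_forces


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = rng.normal(size=(4, 3))    # 4 atoms, Cartesian coordinates
    weights = rng.normal(size=(5, 12))  # 5 predicted shifts
    bias = rng.normal(size=5)
    exp_shifts = rng.normal(size=5)     # invented "experimental" values
    e, f = restraint_energy_and_forces(coords, weights, bias, exp_shifts, k=2.0)
    print("restraint energy:", e)
    print("restraint forces shape:", f.shape)
```

With a neural network predictor, the analytic Jacobian used here would be replaced by automatic differentiation, which is what full differentiability with respect to atomic coordinates makes possible.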