The Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges focus the computational modeling community on areas in need of improvement for rational drug design. The SAMPL7 physical property challenge dealt with prediction of octanol-water partition coefficients and pKa for 22 compounds. The dataset was composed of a series of N-acylsulfonamides and related bioisosteres. Seventeen research groups participated in the log P challenge, submitting 33 blind submissions in total. For the pKa challenge, 7 groups participated, submitting 9 blind submissions in total. Overall, the accuracy of octanol-water log P predictions in the SAMPL7 challenge was lower than that of octanol-water log P predictions in SAMPL6, likely due to a more diverse dataset. Compared to the SAMPL6 pKa challenge, accuracy remained unchanged in SAMPL7. Interestingly, although macroscopic pKa values were often predicted with reasonable accuracy, there was dramatically more disagreement among participants as to which microscopic transitions produced these values (with methods often disagreeing even as to the sign of the free energy change associated with certain transitions), indicating that far more work needs to be done on pKa prediction methods.
Water molecules can be found interacting with the surface of proteins and within protein cavities. However, water exchange between bulk and buried hydration sites can be slow compared to simulation timescales, leading to inefficient sampling of water locations. This can pose problems for free energy calculations in computer-aided drug design. Here, we apply a hybrid method that combines nonequilibrium candidate Monte Carlo (NCMC) simulations and molecular dynamics (MD) to enhance sampling of water in specific areas of a system, such as the binding site of a protein. Our approach uses NCMC to gradually remove interactions between a selected water molecule and its environment, then translates the water to a new region before turning the interactions back on. This gradual removal of interactions, followed by a move and then reintroduction of interactions, allows the environment to relax in response to the proposed water translation, improving acceptance of moves and thereby accelerating water exchange and sampling. We validate this approach on several test systems, including the ligand-bound MUP-1 and HSP90 proteins with buried crystallographic waters removed. We show that our NCMC/MD method enhances water sampling relative to normal MD when applied to these systems. Thus, this approach provides a strategy to improve water sampling in molecular simulations, which may be useful in practical applications in drug discovery and biomolecular design.
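The switch-off, translate, switch-on cycle described above can be illustrated with a minimal toy sketch. This is not the authors' implementation: the function `ncmc_move`, the coupling parameter `lam`, the linear schedule, and the one-dimensional toy energy are all hypothetical names and choices made for illustration. It assumes the move is accepted with probability min(1, exp(-W/kT)), where W is the protocol work accumulated from the coupling changes, which is the usual criterion when the propagation step preserves the equilibrium distribution at each value of the coupling.

```python
import math
import random


def ncmc_move(state, energy, propagate, translate, n_steps, kT, rng):
    """Attempt one toy NCMC water move; return (state, accepted, work).

    All callables are hypothetical placeholders:
      energy(state, lam)    -> potential energy at coupling lam
      propagate(state, lam) -> state after letting the environment relax
      translate(state, rng) -> state with the decoupled water moved
    """
    half = n_steps // 2
    # lam = 1: water fully interacting; lam = 0: interactions switched off.
    lambdas = [1.0 - i / half for i in range(1, half + 1)]  # switch off
    lambdas += [i / half for i in range(1, half + 1)]       # switch back on
    trial, lam, work = dict(state), 1.0, 0.0
    for step, new_lam in enumerate(lambdas):
        # Perturbation: change the coupling at fixed coordinates; the
        # resulting energy change is added to the protocol work.
        work += energy(trial, new_lam) - energy(trial, lam)
        lam = new_lam
        # Propagation: the environment relaxes at the current coupling.
        trial = propagate(trial, lam)
        if step == half - 1:
            # Interactions are fully off here, so translate the water.
            trial = translate(trial, rng)
    # Accept or reject the whole nonequilibrium trajectory based on the work.
    p_accept = 1.0 if work <= 0.0 else math.exp(-work / kT)
    accepted = rng.random() < p_accept
    return (trial if accepted else dict(state)), accepted, work


# Toy system (assumption): the water sits at x = 0 and its energy grows
# quadratically with distance from that site; the environment does not relax.
def toy_energy(state, lam):
    return lam * state["x"] ** 2


def no_relaxation(state, lam):
    return state


def jump_to_far_site(state, rng):
    return {"x": 10.0}
```

In this toy the environment never relaxes, so moving the water to a high-energy position costs the full energy difference in work and the move is almost always rejected. In the hybrid scheme described above, the propagation steps would let the surroundings adapt while the interactions are switched on gradually, which lowers the work and is what improves acceptance.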
The SAMPL Challenges aim to focus the biomolecular and physical modeling community on issues that limit the accuracy of predictive modeling of protein-ligand binding for rational drug design. In the SAMPL5 log D Challenge, designed to benchmark the accuracy of methods for predicting drug-like small molecule transfer free energies from aqueous to nonpolar phases, participants found it difficult to make accurate predictions due to the complexity of protonation state issues. In the SAMPL6 log P Challenge, we asked participants to make blind predictions of the octanol-water partition coefficients of neutral species of 11 compounds and assessed how well these methods performed absent the complication of protonation state effects. This challenge builds on the SAMPL6 pKa Challenge, which asked participants to predict pKa values of a superset of the compounds considered in this log P challenge. Blind prediction sets of 91 prediction methods were collected from 27 research groups, spanning a variety of quantum mechanics (QM) or molecular mechanics (MM)-based physical methods, knowledge-based empirical methods, and mixed approaches. There was a 50% increase in the number of participating groups and a 20% increase in the number of submissions compared to the SAMPL5 log D Challenge. Overall, the accuracy of octanol-water log P predictions in the SAMPL6 Challenge was higher than cyclohexane-water log D predictions in SAMPL5, likely because modeling only the neutral species was necessary for log P and several categories of method benefited from the vast amounts of experimental octanol-water log P data. There were many highly accurate methods: 10 diverse methods achieved RMSE less than 0.5 log P units. These included QM-based methods, empirical methods, and mixed methods with physical modeling supported with empirical corrections. A comparison of physical modeling methods showed that QM-based methods outperformed MM-based methods.
The average RMSEs of the five most accurate MM-based, QM-based, empirical, and mixed approach methods were 0.92±0.13, 0.48±0.06, 0.47±0.05, and 0.50±0.06, respectively.
0.1 Keywords
octanol-water partition coefficient ⋅ log P ⋅ blind prediction challenge ⋅ SAMPL ⋅ free energy calculations ⋅ solvation modeling
0.2 Abbreviations
SAMPL: Statistical Assessment of the Modeling of Proteins and Ligands
log P: log10 of the organic solvent-water partition coefficient of neutral species
log D: log10 of the organic solvent-water distribution coefficient
pKa: −log10 of the acid dissociation equilibrium constant
SEM: Standard error of the mean
RMSE: Root mean squared error
MAE: Mean absolute error
τ: Kendall's rank correlation coefficient (Tau)
R²: Coefficient of determination (R-Squared)
QM: Quantum Mechanics
MM: Molecular Mechanics
1 Introduction
The development of computational biomolecular modeling methodologies is motivated by the goal of enabling quantitative molecular design, pred...