Predicting accurate protein-ligand binding affinities is an important task in drug discovery but remains a challenge even with computationally expensive biophysics-based energy scoring methods and state-of-the-art deep learning approaches. Despite the recent advances in the application of deep convolutional and graph neural network-based approaches, it remains unclear what the relative advantages of each approach are and how they compare with physics-based methodologies that have found more mainstream success in virtual screening pipelines. We present fusion models that combine features and inference from complementary representations to improve binding affinity prediction. This, to our knowledge, is the first comprehensive study that uses a common series of evaluations to directly compare the performance of three-dimensional (3D)-convolutional neural networks (3D-CNNs), spatial graph neural networks (SG-CNNs), and their fusion. We use temporal and structure-based splits to assess performance on novel protein targets. To test the practical applicability of our models, we examine their performance in cases that assume that the crystal structure is not available. In these cases, binding free energies are predicted using docking pose coordinates as the inputs to each model. In addition, we compare these deep learning approaches to predictions based on docking scores and molecular mechanics/generalized Born surface area (MM/GBSA) calculations. Our results show that the fusion models make more accurate predictions than their constituent neural network models as well as docking scoring and MM/GBSA rescoring, with the benefit of greater computational efficiency than the MM/GBSA method. Finally, we provide the code to reproduce our results and the parameter files of the trained models used in this work. The software is available as open source at https://github.com/llnl/fast.
Model parameter files are available at ftp://gdobioinformatics.ucllnl.org/fast/pdbbind2016_model_checkpoints/.
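The idea of fusing predictions from two complementary models can be illustrated with a minimal sketch. It assumes the simplest possible scheme, a weighted average of per-complex predictions, and is not the authors' implementation, which may combine features in a learned way. The function names (`late_fusion`, `rmse`) and all numeric values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def late_fusion(pred_a, pred_b, weight_a=0.5):
    """Weighted average of two models' predictions (an assumed, simplest-case fusion)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]


# Hypothetical binding affinities (e.g. pK units); not data from the study.
experimental = [6.0, 7.5, 5.0, 8.0]
cnn3d_pred = [6.5, 7.0, 5.5, 7.0]   # stand-in for a 3D-CNN's output
sgcnn_pred = [5.5, 8.0, 4.0, 8.5]   # stand-in for an SG-CNN's output

fused = late_fusion(cnn3d_pred, sgcnn_pred)

for name, pred in [("3D-CNN", cnn3d_pred), ("SG-CNN", sgcnn_pred), ("fusion", fused)]:
    print(f"{name}: RMSE = {rmse(pred, experimental):.3f}")
```

The toy values were picked so that the two models err in opposite directions, which is the situation in which averaging helps; the printout says nothing about real model performance.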
We present an extensive study of a novel class of de novo designed tetrahedral M₄L₆ (M = Ni, Zn) cage receptors, wherein internal decoration of the cage cavities with urea anion-binding groups, via functionalization of the organic components L, led to selective encapsulation of tetrahedral oxoanions EO₄ⁿ⁻ (E = S, Se, Cr, Mo, W, n = 2; E = P, n = 3) from aqueous solutions, based on shape, size, and charge recognition. External functionalization with tBu groups led to enhanced solubility of the cages in aqueous methanol solutions, thereby allowing for their thorough characterization by multinuclear (¹H, ¹³C, ⁷⁷Se) and diffusion NMR spectroscopies. Additional experimental characterization by electrospray ionization mass spectrometry, UV-vis spectroscopy, and single-crystal X-ray diffraction, as well as theoretical calculations, led to a detailed understanding of the cage structures, self-assembly, and anion encapsulation. We found that the cage self-assembly is templated by EO₄ⁿ⁻ oxoanions (n ≥ 2), and upon removal of the templating anion the tetrahedral M₄L₆ cages rearrange into different coordination assemblies. The exchange selectivity among EO₄ⁿ⁻ oxoanions has been investigated with ⁷⁷Se NMR spectroscopy using ⁷⁷SeO₄²⁻ as an anionic probe, which revealed the following selectivity trend: PO₄³⁻ ≫ CrO₄²⁻ > SO₄²⁻ > SeO₄²⁻ > MoO₄²⁻ > WO₄²⁻. In addition to the complementarity and flexibility of the cage receptor, a combination of factors has been found to contribute to the observed anion selectivity, including the anions' charge, size, hydration, basicity, and hydrogen-bond acceptor abilities.
Background: A high-throughput virtual screening pipeline has been extended from single energetically minimized structure Molecular Mechanics/Generalized Born Surface Area (MM/GBSA) rescoring to ensemble-average MM/GBSA rescoring. The correlation coefficient (R²) of calculated and experimental binding free energies for a series of antithrombin ligands improved from 0.36 to 0.69 when switching from single-structure MM/GBSA rescoring to ensemble-average rescoring. Decomposition of the calculated binding free energy shows that electrostatic interactions in both solute and solvent play an important role in determining the binding free energy. The increasing negative charge of the compounds provides a more favorable electrostatic energy change but creates a higher penalty for the solvation free energy. Such a penalty is compensated by the electrostatic energy change, which results in a better binding affinity. A highly hydrophobic ligand is determined by the docking program to be a non-specific binder. Results: Our results have demonstrated that it is very important to keep a few top poses for rescoring if the binding is non-specific or the binding mode is not well determined by the docking calculation.
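The difference between single-structure and ensemble-average rescoring, and the R² used to compare them with experiment, can be sketched as follows. This is a minimal illustration that assumes each ligand already has a list of per-snapshot MM/GBSA energies and that R² is the squared Pearson correlation; the helper names, ligand labels, and numbers are hypothetical and not taken from the study.

```python
from statistics import mean


def ensemble_average(snapshot_energies):
    """Average the per-snapshot energies of each ligand (dict: ligand -> list of floats)."""
    return {ligand: mean(values) for ligand, values in snapshot_energies.items()}


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical per-snapshot MM/GBSA energies (kcal/mol); illustrative only.
snapshots = {
    "lig1": [-30.1, -34.8, -33.0],
    "lig2": [-41.5, -38.2, -39.9],
    "lig3": [-25.0, -29.3, -27.1],
}
# Hypothetical experimental binding free energies (kcal/mol).
experimental = {"lig1": -7.9, "lig2": -9.4, "lig3": -6.8}

ligands = sorted(snapshots)
single = [snapshots[name][0] for name in ligands]       # first snapshot only
averaged = ensemble_average(snapshots)
ensemble = [averaged[name] for name in ligands]
observed = [experimental[name] for name in ligands]

print("single-structure R^2:", round(r_squared(single, observed), 3))
print("ensemble-average R^2:", round(r_squared(ensemble, observed), 3))
```

Taking the first snapshot as the "single structure" is a simplification; in the abstract the single structure is an energetically minimized one.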
Biologists have observed that the presence of divalent metal is essential for the binding of the hormone oxytocin (OT) to its cellular receptor. However, this interaction is not understood on the molecular level. Because conformation is a key factor controlling ligand binding in biomolecule systems, we have used ion mobility experiments and molecular modeling to probe the conformation of the oxytocin-zinc complex. Results show that Zn2+ occupies an octahedral site in the interior of the OT peptide that frees the N-terminus and creates a structured hydrophobic binding site on the peptide exterior; both factors are conducive to binding oxytocin to its receptor.
In this work we announce and evaluate a high-throughput virtual screening pipeline for in silico screening of virtual compound databases using high performance computing (HPC). A notable feature of this pipeline is an automated receptor preparation scheme with unsupervised binding site identification. The pipeline includes receptor/target preparation, ligand preparation, VinaLC docking calculation, and molecular mechanics/generalized Born surface area (MM/GBSA) rescoring using the GB model by Onufriev and co-workers [J. Chem. Theory Comput. 2007, 3, 156-169]. Furthermore, we leverage HPC resources to perform an unprecedented, comprehensive evaluation of MM/GBSA rescoring when applied to the DUD-E data set (Directory of Useful Decoys: Enhanced), in which we selected 38 protein targets and a total of ∼0.7 million actives and decoys. The computer wall time for virtual screening has been reduced drastically on HPC machines, which increases the feasibility of extremely large ligand database screening with more accurate methods. HPC resources allowed us to rescore 20 poses per compound and evaluate the optimal number of poses to rescore. We find that keeping 5-10 poses is a good compromise between accuracy and computational expense. Overall, the results demonstrate that MM/GBSA rescoring has higher average receiver operating characteristic (ROC) area under curve (AUC) values and consistently better early recovery of actives than Vina docking alone. The enrichment performance is, however, target-dependent. MM/GBSA rescoring significantly outperforms Vina docking for the folate enzymes, kinases, and several other enzymes. The more accurate energy function and solvation terms of the MM/GBSA method allow MM/GBSA to achieve better enrichment, but the rescoring is still limited by the ability of the docking method to generate poses with the correct binding modes.
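The evaluation described above, keeping the top N docking poses per compound, rescoring them, and measuring enrichment by ROC AUC, can be sketched in a few lines. The sketch assumes that each compound's rescored energies are listed in docking-rank order, that the compound's final score is the lowest rescored energy among the poses kept, and that lower energy means a more likely binder; these choices, the function names, and the numbers are illustrative assumptions rather than the pipeline's actual code.

```python
def best_of_top_poses(pose_scores, n_poses):
    """Lowest rescored energy among the first n_poses poses (docking-rank order)."""
    return min(pose_scores[:n_poses])


def roc_auc(active_scores, decoy_scores):
    """ROC AUC as the probability that a random active scores lower (better)
    than a random decoy, counting ties as one half."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Hypothetical rescored energies per compound, one entry per docking pose.
actives = [[-20.0, -35.0, -31.0], [-28.0, -26.0, -40.0]]
decoys = [[-25.0, -22.0, -24.0], [-30.0, -27.0, -21.0]]

for n in (1, 2, 3):
    a = [best_of_top_poses(p, n) for p in actives]
    d = [best_of_top_poses(p, n) for p in decoys]
    print(f"poses kept = {n}: AUC = {roc_auc(a, d):.2f}")
```

The pairwise AUC is quadratic in the number of compounds, which is fine for a sketch; a rank-based formulation would be the usual choice for a data set with hundreds of thousands of decoys.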