With increasing interest in RNA therapeutics and in targeting RNA to treat disease, there is a need for the tools used in protein-based drug design, particularly DOCKing algorithms, to be extended or adapted for nucleic acids. Here, we have compiled a test set of RNA-ligand complexes to validate the ability of the DOCK suite of programs to successfully recreate experimentally determined binding poses. With the optimized parameters and a minimal scoring function, 70% of the test set with fewer than seven rotatable ligand bonds and 26% of the test set with fewer than 13 rotatable bonds can be successfully recreated within 2 Å heavy-atom RMSD. When DOCKed conformations are rescored with the implicit solvent models AMBER generalized Born with solvent-accessible surface area (GB/SA) and Poisson-Boltzmann with solvent-accessible surface area (PB/SA) in combination with explicit water molecules and sodium counterions, the success rate increases to 80% with PB/SA for fewer than seven rotatable bonds and 58% with AMBER GB/SA and 47% with PB/SA for fewer than 13 rotatable bonds. These results indicate that DOCK can indeed be useful for structure-based drug design aimed at RNA. Our studies also suggest that RNA-directed ligands often differ from typical protein-ligand complexes in their electrostatic properties, but these differences can be accommodated through the choice of potential function. In addition, in the course of the study, we explore a variety of newly added DOCK functions, demonstrating the ease with which new functions can be added to address new scientific questions.
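The 2 Å heavy-atom RMSD criterion used above can be illustrated with a minimal Python sketch. The function names, the `(element, x, y, z)` atom format, and the assumption that both poses list atoms in the same order are illustrative choices, not part of DOCK.

```python
import math


def heavy_atom_rmsd(ref, pose):
    """RMSD in Å between two poses, ignoring hydrogens.

    ref, pose: lists of (element, x, y, z) tuples in the same atom order
    (an assumed input format, chosen only for this sketch).
    """
    if len(ref) != len(pose):
        raise ValueError("poses must contain the same number of atoms")
    sq_sum = 0.0
    n_heavy = 0
    for atom_r, atom_p in zip(ref, pose):
        if atom_r[0] != atom_p[0]:
            raise ValueError("atom order differs between poses")
        if atom_r[0] == "H":
            continue  # heavy-atom RMSD skips hydrogens
        sq_sum += sum((a - b) ** 2 for a, b in zip(atom_r[1:], atom_p[1:]))
        n_heavy += 1
    if n_heavy == 0:
        raise ValueError("no heavy atoms to compare")
    return math.sqrt(sq_sum / n_heavy)


def success_rate(rmsds, cutoff=2.0):
    """Percentage of complexes whose pose lies within the cutoff (Å)."""
    return 100.0 * sum(r <= cutoff for r in rmsds) / len(rmsds)
```

Applied to a list of per-complex RMSD values, `success_rate` gives the kind of percentage quoted in the abstract; the numbers fed to it here would be hypothetical, not the published data.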
This manuscript presents the latest algorithmic and methodological developments to the structure-based design program DOCK 6.7 focused on an updated internal energy function, new anchor selection control, enhanced minimization options, a footprint similarity scoring function, a symmetry-corrected RMSD algorithm, a database filter, and docking forensic tools. An important strategy during development involved use of three orthogonal metrics for assessment and validation: pose reproduction over a large database of 1043 protein-ligand complexes (SB2012 test set), cross-docking to 24 drug-target protein families, and database enrichment using large active and decoy data sets (DUD-E test set) for 5 important proteins including HIV protease and IGF-1R. Relative to earlier versions, a key outcome of the work is a significant increase in pose reproduction success in going from DOCK 4.0.2 (51.4%) → 5.4 (65.2%) → 6.7 (73.3%) as a result of significant decreases in failure arising from both sampling (24.1% → 13.6% → 9.1%) and scoring (24.4% → 21.1% → 17.5%). Companion cross-docking and enrichment studies with the new version highlight other strengths and remaining areas for improvement, especially for systems containing metal ions. The source code for DOCK 6.7 is available for download and free for academic users at http://dock.compbio.ucsf.edu/.
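To illustrate why a symmetry-corrected RMSD matters, the following Python sketch takes the minimum RMSD over all reassignments of atoms that share a label. It is a brute-force illustration under simplifying assumptions (atoms are treated as interchangeable whenever their labels match, and bonding is ignored); the abstract does not describe how DOCK 6.7 implements the correction, and the function name and input format are hypothetical.

```python
import math
from itertools import permutations


def _sq_dist(a, b):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def symmetry_corrected_rmsd(ref, pose):
    """Minimum RMSD over all pairings of atoms that share a label.

    ref, pose: lists of (label, (x, y, z)). Brute force: cost grows
    factorially with the size of each label group, so this is only
    suitable for small illustrative inputs.
    """
    if not ref or len(ref) != len(pose):
        raise ValueError("poses must be non-empty and equally sized")
    ref_groups, pose_groups = {}, {}
    for label, xyz in ref:
        ref_groups.setdefault(label, []).append(xyz)
    for label, xyz in pose:
        pose_groups.setdefault(label, []).append(xyz)
    total = 0.0
    for label, ref_xyz in ref_groups.items():
        candidates = pose_groups.get(label, [])
        if len(candidates) != len(ref_xyz):
            raise ValueError("label counts differ between poses")
        # Groups are independent, so each can be minimized separately.
        total += min(
            sum(_sq_dist(r, p) for r, p in zip(ref_xyz, perm))
            for perm in permutations(candidates)
        )
    return math.sqrt(total / len(ref))
```

For a carboxylate whose two oxygens are listed in swapped order, an order-dependent RMSD reports a spurious deviation, whereas this function pairs each oxygen with its nearest equivalent and reports zero.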
We report unrestrained, all-atom molecular dynamics simulations of HIV-1 protease that sample large conformational changes of the active site flaps. In particular, the unliganded protease undergoes multiple conversions between the "closed" and "semiopen" forms observed in crystal structures of inhibitor-bound and unliganded protease, respectively, including reversal of flap "handedness." Simulations in the presence of a cyclic urea inhibitor yield stable closed flaps. Furthermore, we observe several events in which the flaps of the unliganded protease open to a much greater degree than observed in crystal structures and subsequently return to the semiopen state. Our data strongly support the hypothesis that the unliganded protease predominantly populates the semiopen conformation, with closed and fully open structures being a minor component of the overall ensemble. The results also provide a model for the flap opening and closing that is considered to be essential to enzyme function.
We report on the development and validation of a new version of DOCK. The algorithm has been rewritten in a modular format, which allows for easy implementation of new scoring functions, sampling methods and analysis tools. We validated the sampling algorithm with a test set of 114 protein-ligand complexes. Using an optimized parameter set, we are able to reproduce the crystal ligand pose to within 2 Å of the crystal structure for 79% of the test cases using our rigid ligand docking algorithm with an average run time of 1 min per complex and for 72% of the test cases using our flexible ligand docking algorithm with an average run time of 5 min per complex. Finally, we perform an analysis of the docking failures in the test set and determine that the sampling algorithm is generally sufficient for the binding pose prediction problem for up to 7 rotatable bonds; i.e., 99% of the rigid ligand docking cases and 95% of the flexible ligand docking cases are sampled successfully. We point out that success rates could be improved through more advanced modeling of the receptor prior to docking and through improvement of the force field parameters, particularly for structures containing metal-based cofactors.
Absolute free energies of hydration (ΔGhyd) for more than 500 neutral and charged compounds have been computed, using Poisson-Boltzmann (PB) and Generalized Born (GB) continuum methods plus a solvent-accessible surface area (SA) term, to evaluate the accuracy of eight simple point-charge models used in molecular modeling. The goal is to develop improved procedures and protocols for protein-ligand binding calculations and virtual screening (docking). The best overall PBSA and GBSA results, in comparison with experimental ΔGhyd values for small molecules, were obtained using MSK, RESP, or ChelpG charges obtained from ab initio calculations using 6-31G* wave functions. Correlations using semiempirical (AM1BCC, AM1CM2, and PM3CM2) or empirical (Gasteiger-Marsili and MMFF94) methods yielded mixed results, particularly for charged compounds. For neutral compounds, the AM1BCC method yielded the best agreement with experimental results. In all cases, the PBSA and GBSA results are highly correlated (overall r² = 0.94), which highlights the fact that various partial charge models influence the final results much more than which continuum method is used to compute hydration free energies. Overall improved agreement with experimental results was demonstrated using atom-based constants in place of a single surface area term. Sets of optimized SA constants, suitable for use with a given charge model, were derived by fitting to the difference in experimental free energies and polar continuum results. The use of optimized atom-based SA constants for the computation of ΔGhyd can fine-tune already reasonable agreement with experimental results, ameliorate gross deficiencies in any particular charge model, account for nonoptimal radii, or correct for systematic errors.
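The fitting step described above (atom-based SA constants fitted to the difference between experimental free energies and polar continuum results) can be sketched as an ordinary least-squares problem. This is a minimal sketch assuming a linear nonpolar term, ΔGhyd ≈ ΔGpolar + Σ γ_k · SA_k, with one constant γ_k per atom type; the functional form, function names, and input layout are assumptions for illustration and may differ from the published procedure.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: surface-area columns are degenerate")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_sa_constants(areas, dg_exp, dg_polar):
    """Least-squares atom-type SA constants.

    areas: one row per molecule, one column per atom type (Å^2).
    dg_exp, dg_polar: experimental and polar continuum energies (kcal/mol).
    The fit target is the residual dg_exp - dg_polar.
    """
    k = len(areas[0])
    resid = [e - p for e, p in zip(dg_exp, dg_polar)]
    # Normal equations: (A^T A) gamma = A^T resid
    ata = [[sum(row[i] * row[j] for row in areas) for j in range(k)] for i in range(k)]
    atr = [sum(row[i] * r for row, r in zip(areas, resid)) for i in range(k)]
    return solve_linear(ata, atr)
```

Solving the normal equations directly is adequate for a handful of atom types; a production fit would more likely use a numerically robust library routine.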
A database consisting of 780 ligand-receptor complexes, termed SB2010, has been derived from the Protein Databank to evaluate the accuracy of docking protocols for regenerating bound ligand conformations. The goal is to provide easily accessible community resources for development of improved procedures to aid virtual screening for ligands with a wide range of flexibilities. Three core experiments using the program DOCK, which employ rigid (RGD), fixed anchor (FAD), and flexible (FLX) protocols, were used to gauge performance by several different metrics: (1) global results, (2) ligand flexibility, (3) protein family, and (4) crossdocking. Global spectrum plots of successes and failures vs RMSD reveal well-defined inflection regions, which suggest the commonly used 2 Å criterion is a reasonable choice for defining success. Across all 780 systems, success tracks with the relative difficulty of the calculations: RGD (82.3%) > FAD (78.1%) > FLX (63.8%). In general, failures due to scoring strongly outweigh those due to sampling. Subsets of SB2010 grouped by ligand flexibility (7-or-less, 8-to-15, and 15-plus rotatable bonds) reveal success degrades linearly for FAD and FLX protocols, in contrast to RGD which remains constant. Despite the challenges associated with FLX anchor orientation and on-the-fly flexible growth, success rates for the 7-or-less (74.5%), and in particular the 8-to-15 (55.2%) subset, are encouraging. Poorer results for the very flexible 15-plus set (39.3%) indicate substantial room for improvement. Family-based success appears largely independent of ligand flexibility, suggesting a strong dependence on the binding site environment. For example, zinc-containing proteins are generally problematic despite moderately flexible ligands. Finally, representative crossdocking examples, for carbonic anhydrase, thermolysin, and neuraminidase families, show the utility of family-based analysis for rapid identification of particularly good or bad docking trends, and the type of failures involved (scoring/sampling), which will likely be of interest to researchers making specific receptor choices for virtual screening. SB2010 is available for download at http://rizzolab.org.
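The distinction between scoring and sampling failures can be made concrete with a short Python sketch. The definitions used here are an assumption, since the abstract does not spell them out: a success means the best-scored pose is within 2 Å of the crystal pose, a scoring failure means some sampled pose is within 2 Å but is not ranked best, and a sampling failure means no sampled pose is within 2 Å. The function names and the `(score, rmsd)` input format are hypothetical.

```python
from collections import Counter


def classify_docking(poses, cutoff=2.0):
    """Classify one docking run.

    poses: list of (score, rmsd) pairs, where a lower score is better and
    rmsd is measured in Å against the crystal ligand pose.
    """
    if not poses:
        return "sampling failure"  # nothing was generated at all
    best_score, best_rmsd = min(poses, key=lambda p: p[0])
    if best_rmsd <= cutoff:
        return "success"
    if any(rmsd <= cutoff for _, rmsd in poses):
        return "scoring failure"  # a good pose exists but was not ranked first
    return "sampling failure"


def outcome_percentages(runs, cutoff=2.0):
    """Percentage of runs in each outcome class, keyed by outcome name."""
    counts = Counter(classify_docking(poses, cutoff) for poses in runs)
    return {name: 100.0 * n / len(runs) for name, n in counts.items()}
```

Tallying outcomes this way over a test set yields three percentages that sum to 100, which is how success, scoring-failure, and sampling-failure rates can be compared across protocols or protein families.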
Fatty acid binding proteins (FABPs), in particular FABP5 and FABP7, have recently been identified by us as intracellular transporters for the endocannabinoid anandamide (AEA). Furthermore, animal studies by others have shown that elevated levels of endocannabinoids resulted in beneficial pharmacological effects on stress, pain and inflammation and also ameliorate the effects of drug withdrawal. Based on these observations, we hypothesized that FABP5 and FABP7 would provide excellent pharmacological targets. Thus, we performed a virtual screening of over one million compounds using DOCK and employed a novel footprint similarity scoring function to identify lead compounds with binding profiles similar to oleic acid, a natural FABP substrate. Forty-eight compounds were purchased based on their footprint similarity scores (FPS) and assayed for biological activity against purified human FABP5 employing a fluorescent displacement-binding assay. Four compounds were found to exhibit approximately 50% inhibition or greater at 10 µM, as good as or better inhibitors of FABP5 than BMS309403, a commercially available inhibitor. The most potent inhibitor, γ-truxillic acid 1-naphthyl ester (ChemDiv 8009-2334), was determined to have a Ki value of 1.19±0.01 µM. Accordingly, a novel α-truxillic acid 1-naphthyl mono-ester (SB-FI-26) was synthesized and assayed for its inhibitory activity against FABP5, wherein SB-FI-26 exhibited strong binding (Ki 0.93±0.08 µM). Additionally, we found SB-FI-26 to act as a potent anti-nociceptive agent with mild anti-inflammatory activity in mice, which strongly supports our hypothesis that the inhibition of FABPs and subsequent elevation of anandamide is a promising new approach to drug discovery. Truxillic acids and their derivatives were also shown by others to have anti-inflammatory and anti-nociceptive effects in mice and to be the active component of a Chinese herbal medicine (Incarvillea sinensis) used to treat rheumatism and pain in humans. Our results provide a likely mechanism by which these compounds exert their effects.
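The idea of ranking compounds by footprint similarity to a reference ligand can be sketched as follows. The sketch assumes a footprint is a vector of per-residue interaction energies and that similarity is measured as a Euclidean distance between the candidate and reference vectors (smaller means more similar); the abstract does not give the exact scoring formula, and the names and input format below are hypothetical.

```python
import math


def footprint_distance(ref, cand):
    """Euclidean distance between two per-residue energy footprints.

    ref, cand: equal-length lists of interaction energies (kcal/mol),
    one entry per receptor residue in the same order.
    """
    if len(ref) != len(cand):
        raise ValueError("footprints must cover the same residues")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, cand)))


def rank_by_footprint(ref, candidates):
    """Return candidate names sorted from most to least similar to ref.

    candidates: dict mapping a compound name to its footprint vector.
    """
    return sorted(candidates, key=lambda name: footprint_distance(ref, candidates[name]))
```

In a screen of this kind, `ref` would hold the footprint of the reference ligand (oleic acid in the study above) and the top-ranked names would be the compounds whose interaction pattern most closely resembles it.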
A new linear interaction energy (LIE) method based on a continuum solvent surface generalized Born (SGB) model is proposed for protein-ligand binding affinity calculations. The new method SGB-LIE is about 1 order of magnitude faster than previously published LIE methods based on explicit solvents. It has been applied to several binding sets: HEPT analogues binding to HIV-1 reverse transcriptase (20 ligands), sulfonamide inhibitors binding to human thrombin (seven ligands), and various ligands binding to coagulation factor Xa (eight ligands). The SGB-LIE predictions and cross-validation results show that about 1.0 kcal/mol accuracy is achievable for binding sets with as many as 20 ligands, e.g., for the HIV-1 RT binding set, RMS errors of 1.07 and 1.20 kcal/mol are achieved for LIE fitting and leave-one-out cross validation, respectively, with correlation coefficients r² equal to 0.774 and 0.717. We have also explored various techniques for the LIE underlying conformation space sampling, including molecular dynamics and hybrid Monte Carlo methods, and the final results show that comparable binding energies can be obtained no matter which sampling technique is used.
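The difference between the LIE fitting error and the leave-one-out cross-validation error can be illustrated with a small Python sketch. It assumes a simplified two-term LIE model, ΔG ≈ α·ΔU_vdw + β·ΔU_elec, fitted by least squares; the published SGB-LIE model may include further terms, and the function names and inputs are hypothetical.

```python
import math


def fit_lie(vdw, elec, dg):
    """Least-squares alpha, beta for dG ~ alpha*vdw + beta*elec (no intercept)."""
    s_vv = sum(v * v for v in vdw)
    s_ee = sum(e * e for e in elec)
    s_ve = sum(v * e for v, e in zip(vdw, elec))
    s_vg = sum(v * g for v, g in zip(vdw, dg))
    s_eg = sum(e * g for e, g in zip(elec, dg))
    det = s_vv * s_ee - s_ve ** 2
    if abs(det) < 1e-12:
        raise ValueError("vdW and electrostatic terms are collinear")
    # Cramer's rule on the 2x2 normal equations
    alpha = (s_vg * s_ee - s_ve * s_eg) / det
    beta = (s_vv * s_eg - s_ve * s_vg) / det
    return alpha, beta


def rms_error(pred, obs):
    """Root-mean-square deviation between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def fit_predictions(vdw, elec, dg):
    """Predictions from a model fitted on all ligands."""
    alpha, beta = fit_lie(vdw, elec, dg)
    return [alpha * v + beta * e for v, e in zip(vdw, elec)]


def loo_predictions(vdw, elec, dg):
    """Predict each ligand from a model fitted on all the other ligands."""
    preds = []
    for i in range(len(dg)):
        keep = [j for j in range(len(dg)) if j != i]
        alpha, beta = fit_lie(
            [vdw[j] for j in keep], [elec[j] for j in keep], [dg[j] for j in keep]
        )
        preds.append(alpha * vdw[i] + beta * elec[i])
    return preds
```

Comparing `rms_error(fit_predictions(...), dg)` with `rms_error(loo_predictions(...), dg)` mirrors the two RMS errors reported in the abstract; for a linear least-squares model the leave-one-out error is never smaller than the fitting error.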