With increasing interest in RNA therapeutics and in targeting RNA to treat disease, there is a need for the tools used in protein-based drug design, particularly DOCKing algorithms, to be extended or adapted for nucleic acids. Here, we have compiled a test set of RNA-ligand complexes to validate the ability of the DOCK suite of programs to successfully recreate experimentally determined binding poses. With the optimized parameters and a minimal scoring function, 70% of the test set with fewer than seven rotatable ligand bonds and 26% of the test set with fewer than 13 rotatable bonds can be successfully recreated within 2 Å heavy-atom RMSD. When DOCKed conformations are rescored with the implicit solvent models AMBER generalized Born with solvent-accessible surface area (GB/SA) and Poisson-Boltzmann with solvent-accessible surface area (PB/SA) in combination with explicit water molecules and sodium counterions, the success rate increases to 80% with PB/SA for fewer than seven rotatable bonds, and to 58% with AMBER GB/SA and 47% with PB/SA for fewer than 13 rotatable bonds. These results indicate that DOCK can indeed be useful for structure-based drug design aimed at RNA. Our studies also suggest that RNA-directed ligands often differ from typical protein-ligand complexes in their electrostatic properties, but these differences can be accommodated through the choice of potential function. In addition, in the course of the study, we explore a variety of newly added DOCK functions, demonstrating the ease with which new functions can be added to address new scientific questions.
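The success criterion quoted above (a pose within 2 Å heavy-atom RMSD of the experimental structure) can be illustrated with a minimal sketch. The function names, the `(element, (x, y, z))` atom representation, and the sample coordinates are assumptions made for illustration; they are not taken from DOCK. The sketch also assumes both structures list atoms in the same order and share one coordinate frame.

```python
import math

def heavy_atom_rmsd(ref, pose):
    """RMSD over atoms paired by position in the list, skipping hydrogens.

    ref, pose: lists of (element, (x, y, z)) in the same atom order
    (hypothetical representation, chosen for this sketch only).
    """
    squared = [
        sum((a - b) ** 2 for a, b in zip(ref_xyz, pose_xyz))
        for (ref_el, ref_xyz), (_, pose_xyz) in zip(ref, pose)
        if ref_el != "H"
    ]
    return math.sqrt(sum(squared) / len(squared))

def success_rate(rmsds, cutoff=2.0):
    """Fraction of complexes whose pose lies within `cutoff` (in Å)."""
    return sum(r <= cutoff for r in rmsds) / len(rmsds)

# Made-up three-atom example: both heavy atoms are shifted by 1 Å along z,
# and the hydrogen is ignored no matter where it sits.
ref = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 0.0, 0.0)), ("H", (5.0, 5.0, 5.0))]
pose = [("C", (0.0, 0.0, 1.0)), ("O", (1.0, 0.0, 1.0)), ("H", (0.0, 0.0, 0.0))]
print(heavy_atom_rmsd(ref, pose))
print(success_rate([0.5, 1.9, 2.5, 3.0]))
```

Whether the cutoff is inclusive is a convention; the sketch treats exactly 2.0 Å as a success.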
This manuscript presents the latest algorithmic and methodological developments to the structure-based design program DOCK 6.7 focused on an updated internal energy function, new anchor selection control, enhanced minimization options, a footprint similarity scoring function, a symmetry-corrected RMSD algorithm, a database filter, and docking forensic tools. An important strategy during development involved use of three orthogonal metrics for assessment and validation: pose reproduction over a large database of 1043 protein-ligand complexes (SB2012 test set), cross-docking to 24 drug-target protein families, and database enrichment using large active and decoy data sets (DUD-E test set) for 5 important proteins including HIV protease and IGF-1R. Relative to earlier versions, a key outcome of the work is a significant increase in pose reproduction success in going from DOCK 4.0.2 (51.4%) → 5.4 (65.2%) → 6.7 (73.3%) as a result of significant decreases in failure arising from both sampling 24.1% → 13.6% → 9.1% and scoring 24.4% → 21.1% → 17.5%. Companion cross-docking and enrichment studies with the new version highlight other strengths and remaining areas for improvement, especially for systems containing metal ions. The source code for DOCK 6.7 is available for download and free for academic users at http://dock.compbio.ucsf.edu/.
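To illustrate why a symmetry-corrected RMSD matters for pose reproduction, the sketch below compares a naive RMSD with one that lets chemically equivalent atoms swap labels. It is not the DOCK 6.7 algorithm: it brute-forces permutations within each element, which is only practical for very small ligands, and matching by element alone ignores bonding topology, so the result is a lower bound on a topology-aware value. All names and coordinates are assumptions for illustration.

```python
import itertools
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def naive_rmsd(ref, pose):
    """RMSD with atoms paired strictly by list position."""
    total = sum(sq_dist(r, p) for (_, r), (_, p) in zip(ref, pose))
    return math.sqrt(total / len(ref))

def symmetry_corrected_rmsd(ref, pose):
    """Smallest RMSD over all relabelings that keep elements matched.

    ref, pose: lists of (element, (x, y, z)) with identical composition.
    Brute force over permutations; a sketch for tiny molecules only.
    """
    if sorted(el for el, _ in ref) != sorted(el for el, _ in pose):
        raise ValueError("ref and pose must have the same composition")
    total = 0.0
    for el in sorted({el for el, _ in ref}):
        r = [xyz for e, xyz in ref if e == el]
        p = [xyz for e, xyz in pose if e == el]
        # Element groups are independent, so minimize each one separately.
        total += min(
            sum(sq_dist(a, b) for a, b in zip(r, perm))
            for perm in itertools.permutations(p)
        )
    return math.sqrt(total / len(ref))

# Carboxylate-like fragment whose two oxygens are listed in swapped order.
ref = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, 1.0, 0.0)), ("O", (1.0, -1.0, 0.0))]
pose = [("C", (0.0, 0.0, 0.0)), ("O", (1.0, -1.0, 0.0)), ("O", (1.0, 1.0, 0.0))]
print(naive_rmsd(ref, pose), symmetry_corrected_rmsd(ref, pose))
```

In this made-up case the naive value would count an identical pose as displaced, while the corrected value is zero, which is the kind of false failure a symmetry correction is meant to remove.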
In conjunction with the recent American Chemical Society symposium titled “Docking and Scoring: A Review of Docking Programs” the performance of the DOCK6 program was evaluated through (1) pose reproduction and (2) database enrichment calculations on a common set of organizer-specified systems and datasets (ASTEX, DUD, WOMBAT). Representative baseline grid score results averaged over five docking runs yield a relatively high pose identification success rate of 72.5% (symmetry-corrected RMSD) and sampling rate of 91.9% for the multisite ASTEX set (N = 147) using organizer-supplied structures. Numerous additional docking experiments showed that ligand starting conditions, symmetry, multiple binding sites, clustering, and receptor preparation protocols all affect success. Encouragingly, in some cases, use of more sophisticated scoring and sampling methods yielded results which were comparable to (Amber score ligand movable protocol) or exceeded (LMOD score) analogous baseline grid-score results. The analysis highlights the potential benefit and challenges associated with including receptor flexibility and indicates that different scoring functions have system-dependent strengths and weaknesses. Enrichment studies with the DUD database prepared using the SB2010 preparation protocol and native ligand pairings yielded individual area under the curve (AUC) values derived from receiver operating characteristic curve analysis ranging from 0.29 (bad enrichment) to 0.96 (good enrichment) with an average value of 0.60 (27/38 have AUC ≥ 0.5). Strong early enrichment was also observed in the critically important 1.0–2.0% region. Somewhat surprisingly, an alternative receptor preparation protocol yielded comparable results. As expected, semi-random pairings yielded poorer enrichments, in particular, for unrelated receptors. Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.
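The AUC values quoted in this abstract come from receiver operating characteristic analysis of ranked actives and decoys. As a rough sketch of what such a number measures, the function below computes AUC as the probability that a randomly chosen active is ranked ahead of a randomly chosen decoy. The function name, the `lower_is_better` flag, and the scores are assumptions for illustration, with more negative scores taken to be more favorable; none of this is drawn from the DOCK6 tooling.

```python
def roc_auc(active_scores, decoy_scores, lower_is_better=True):
    """ROC AUC via pairwise comparison of actives against decoys.

    Each (active, decoy) pair contributes 1 if the active ranks ahead,
    0.5 on a tie, and 0 otherwise. Cost is O(len(actives) * len(decoys)),
    which is fine for a sketch but slow for very large decoy sets.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a == d:
                wins += 0.5
            elif (a < d) == lower_is_better:
                wins += 1.0
    return wins / (len(active_scores) * len(decoy_scores))

# Made-up scores: one decoy (-45) outranks two of the three actives.
actives = [-50.0, -40.0, -30.0]
decoys = [-45.0, -20.0, -10.0, -5.0]
print(roc_auc(actives, decoys))
```

An AUC of 0.5 corresponds to random ranking, which is why the abstract reports how many of the 38 systems reach AUC ≥ 0.5.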
Actinyl complexes are shown, on the basis of known theoretical and experimental results, to be weak-field complexes in 4/7 of the 5f orbital space, the other 3/7 of this space being strongly affected by bonding to the -yl oxygens. The interactions present in these complexes are placed in order of size so that a coupling scheme (Λ-S), including the choice of quantum numbers of varying quality, can be specified. Electronic spectra in the near-infrared and visible regions are discussed in general terms, including different choices of both the lower and upper orbitals (or spin−orbitals) involved in the excitations. For the isolated ions, all transitions in this region are forbidden by electric-dipole selection rules, but the interactions with equatorial ligands can make such transitions allowed.
The basic formulation for the multifacet generalization of the graphically contracted function (MFGCF) electronic structure method is presented. The analysis includes the discussion of linear dependency and redundancy of the arc factor parameters, the computation of reduced density matrices, Hamiltonian matrix construction, spin-density matrix construction, the computation of optimization gradients for single-state and state-averaged calculations, graphical wave function analysis, and the efficient computation of configuration state function and Slater determinant expansion coefficients. Timings are given for Hamiltonian matrix element and analytic optimization gradient computations for a range of model problems for full-CI Shavitt graphs, and it is observed that both the energy and the gradient computation scale as O(N²n⁴) for N electrons and n orbitals. The important arithmetic operations are within dense matrix-matrix product computational kernels, resulting in a computationally efficient procedure. An initial implementation of the method is used to present applications to several challenging chemical systems, including N2 dissociation, cubic H8 dissociation, the symmetric dissociation of H2O, and the insertion of Be into H2. The results are compared to the exact full-CI values and also to those of the previous single-facet GCF expansion form.
The core part of the program system COLUMBUS allows highly efficient calculations using variational multireference (MR) methods in the framework of configuration interaction with single and double excitations (MR-CISD) and averaged quadratic coupled-cluster calculations (MR-AQCC), based on uncontracted sets of configurations and the graphical unitary group approach (GUGA). The availability of analytic MR-CISD and MR-AQCC energy gradients and analytic nonadiabatic couplings for MR-CISD enables exciting applications including, e.g., investigations of π-conjugated biradicaloid compounds, calculations of multitudes of excited states, development of diabatization procedures, and furnishing the electronic structure information for on-the-fly surface nonadiabatic dynamics. With fully variational uncontracted spin-orbit MRCI, COLUMBUS provides a unique possibility of performing high-level calculations on compounds containing heavy atoms up to lanthanides and actinides. Crucial for carrying out all of these calculations effectively is the availability of an efficient parallel code for the CI step. Configuration spaces of several billion in size now can be treated quite routinely on standard parallel computer clusters. Emerging developments in COLUMBUS, including the all configuration mean energy (ACME) multiconfiguration self-consistent field (MCSCF) method and the Graphically Contracted Function method, promise to allow practically unlimited configuration space dimensions. Spin density based on the GUGA approach, analytic spin-orbit energy gradients, possibilities for local electron correlation MR calculations, the development of
Basis sets developed for use with effective core potentials describe pseudo-orbitals rather than orbitals. The primitive Gaussian functions and the contraction coefficients in the basis set must therefore both describe the valence region effectively and allow the pseudo-orbital to be small in the core region. The latter is particularly difficult using 1s primitive functions, which have their maxima at the nucleus. Several methods of choosing contraction coefficients are tried, and it is found that natural orbitals give the best results. The number and optimization of primitive functions are done following Dunning's correlation-consistent procedure. Optimization of orbital exponents for larger atoms frequently results in coalescence of adjacent exponents; use of orbitals with higher principal quantum number is one alternative. Actinide atoms or ions provide the most difficult cases in that basis sets must be optimized for valence shells of different radial size simultaneously considering correlation energy and spin-orbit energy.
Practical algorithms are presented for the parameterization of orthogonal matrices Q ∈ R^(m×n) in terms of the minimal number of essential parameters {φ}. Both square n = m and rectangular n < m situations are examined. Two separate kinds of parameterizations are considered, one in which the individual columns of Q are distinct, and the other in which only Span(Q) is significant. The latter is relevant to chemical applications such as the representation of the arc factors in the multifacet graphically contracted function method and the representation of orbital coefficients in SCF and DFT methods. The parameterizations are represented formally using products of elementary Householder reflector matrices. Standard mathematical libraries, such as LAPACK, may be used to perform the basic low-level factorization, reduction, and other algebraic operations. Some care must be taken with the choice of phase factors in order to ensure stability and continuity. The transformation of gradient arrays between the Q and {φ} parameterizations is also considered. Operation counts for all factorizations and transformations are determined. Numerical results are presented which demonstrate the robustness, stability, and accuracy of these algorithms.
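As a rough illustration of the idea of representing Q through products of elementary Householder reflectors, the sketch below builds an m×n matrix with orthonormal columns as Q = H_1 H_2 ... H_n E, where E is the first n columns of the identity and each H_k = I - 2 v_k v_kᵀ / (v_kᵀ v_k) acts only on rows k..m. It does not reproduce the paper's minimal {φ} parameterization or its phase conventions: the reflector vectors used here are redundant (each can be rescaled without changing H_k), and all function names and numbers are assumptions for illustration.

```python
def apply_reflector(v, A):
    """Return H @ A for H = I - 2 v v^T / (v^T v), without forming H."""
    vv = sum(x * x for x in v)
    if vv == 0.0:
        return [row[:] for row in A]  # zero vector: treat H as the identity
    ncols = len(A[0])
    w = [sum(v[i] * A[i][j] for i in range(len(v))) for j in range(ncols)]
    return [
        [A[i][j] - 2.0 * v[i] * w[j] / vv for j in range(ncols)]
        for i in range(len(v))
    ]

def orthonormal_from_reflectors(vectors, m):
    """Build Q (m x n) from n reflector vectors; vectors[k] has length m - k."""
    n = len(vectors)
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(m)]
    for k in reversed(range(n)):
        v = [0.0] * k + list(vectors[k])  # pad so H_k leaves rows 0..k-1 alone
        Q = apply_reflector(v, Q)
    return Q

def gram(Q):
    """Return Q^T Q, which should equal the n x n identity."""
    m, n = len(Q), len(Q[0])
    return [
        [sum(Q[i][a] * Q[i][b] for i in range(m)) for b in range(n)]
        for a in range(n)
    ]

# Made-up reflector vectors for a rectangular case with m = 4, n = 2.
vectors = [[1.0, 2.0, -1.0, 0.5], [0.3, -1.0, 2.0]]
Q = orthonormal_from_reflectors(vectors, m=4)
G = gram(Q)
print(G)
```

Because each reflector is exactly orthogonal, QᵀQ stays at the identity to rounding error for any nonzero choice of vectors, which is what makes this form attractive as an unconstrained parameterization.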