Sudipto Mukherjee scite author profile

With an increasing interest in RNA therapeutics and for targeting RNA to treat disease, there is a need for the tools used in protein-based drug design, particularly DOCKing algorithms, to be extended or adapted for nucleic acids. Here, we have compiled a test set of RNA-ligand complexes to validate the ability of the DOCK suite of programs to successfully recreate experimentally determined binding poses. With the optimized parameters and a minimal scoring function, 70% of the test set with less than seven rotatable ligand bonds and 26% of the test set with less than 13 rotatable bonds can be successfully recreated within 2 Å heavy-atom RMSD. When DOCKed conformations are rescored with the implicit solvent models AMBER generalized Born with solvent-accessible surface area (GB/SA) and Poisson-Boltzmann with solvent-accessible surface area (PB/SA) in combination with explicit water molecules and sodium counterions, the success rate increases to 80% with PB/SA for less than seven rotatable bonds and 58% with AMBER GB/SA and 47% with PB/SA for less than 13 rotatable bonds. These results indicate that DOCK can indeed be useful for structure-based drug design aimed at RNA. Our studies also suggest that RNA-directed ligands often differ from typical protein-ligand complexes in their electrostatic properties, but these differences can be accommodated through the choice of potential function. In addition, in the course of the study, we explore a variety of newly added DOCK functions, demonstrating the ease with which new functions can be added to address new scientific questions.


DOCK 6: Impact of new features and current docking performance

Allen

et al. 2015


This manuscript presents the latest algorithmic and methodological developments to the structure-based design program DOCK 6.7 focused on an updated internal energy function, new anchor selection control, enhanced minimization options, a footprint similarity scoring function, a symmetry-corrected RMSD algorithm, a database filter, and docking forensic tools. An important strategy during development involved use of three orthogonal metrics for assessment and validation: pose reproduction over a large database of 1043 protein-ligand complexes (SB2012 test set), cross-docking to 24 drug-target protein families, and database enrichment using large active and decoy data sets (DUD-E test set) for 5 important proteins including HIV protease and IGF-1R. Relative to earlier versions, a key outcome of the work is a significant increase in pose reproduction success in going from DOCK 4.0.2 (51.4%) → 5.4 (65.2%) → 6.7 (73.3%) as a result of significant decreases in failure arising from both sampling 24.1% → 13.6% → 9.1% and scoring 24.4% → 21.1% → 17.5%. Companion cross-docking and enrichment studies with the new version highlight other strengths and remaining areas for improvement, especially for systems containing metal ions. The source code for DOCK 6.7 is available for download and free for academic users at http://dock.compbio.ucsf.edu/.


Inherited and Somatic Defects in DDX41 in Myeloid Neoplasms

et al. 2015


Most cases of adult myeloid neoplasms are routinely assumed to be sporadic. Here, we describe an adult familial acute myeloid leukemia (AML) syndrome caused by germline mutations in the DEAD/H-box helicase gene DDX41. DDX41 was also found to be affected by somatic mutations in sporadic cases of myeloid neoplasms as well as in a biallelic fashion in 50% of patients with germline DDX41 mutations. Moreover, corresponding deletions on 5q35.3 present in 6% of cases led to haploinsufficient DDX41 expression. DDX41 lesions caused altered pre-mRNA splicing and RNA processing. DDX41 is exemplary of other RNA helicase genes also affected by somatic mutations, suggesting that they constitute a family of tumor suppressor genes.


International, evidence-based consensus treatment guidelines for idiopathic multicentric Castleman disease

et al. 2018


Castleman disease (CD) describes a group of heterogeneous hematologic disorders with characteristic histopathological features. CD can present with unicentric or multicentric (MCD) regions of lymph node enlargement. Some cases of MCD are caused by human herpesvirus-8 (HHV-8), whereas others are HHV-8–negative/idiopathic (iMCD). Treatment of iMCD is challenging, and outcomes can be poor because no uniform treatment guidelines exist, few systematic studies have been conducted, and no agreed-upon response criteria have been described. The purpose of this paper is to establish consensus, evidence-based treatment guidelines based on the severity of iMCD to improve outcomes. An international Working Group of 42 experts from 10 countries was convened by the Castleman Disease Collaborative Network to establish consensus guidelines for the management of iMCD based on published literature, review of treatment effectiveness for 344 cases, and expert opinion. The anti–interleukin-6 monoclonal antibody siltuximab (or tocilizumab, if siltuximab is not available) with or without corticosteroids is the preferred first-line therapy for iMCD. In the most severe cases, adjuvant combination chemotherapy is recommended. Additional agents are recommended, tailored by disease severity, as second- and third-line therapies for treatment failures. Response criteria were formulated to facilitate the evaluation of treatment failure or success. These guidelines should help treating physicians to stratify patients based on disease severity in order to select the best available therapeutic option. An international registry for patients with CD (ACCELERATE, #NCT02817997) was established in October 2016 to collect patient outcomes to increase the evidence base for selection of therapies in the future.


ClusterGAN: Latent Space Clustering in Generative Adversarial Networks

Mukherjee

Asnani

Lin

et al. 2019

AAAI

238

220


Generative Adversarial Networks (GANs) have obtained remarkable success in many unsupervised learning tasks and unarguably, clustering is an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon that GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
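The latent sampling scheme described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_latent`, the dimensions, and the noise scale `sigma` are assumptions chosen only for illustration. The sketch draws a continuous Gaussian part and concatenates it with a one-hot cluster indicator.

```python
import random

def sample_latent(n_clusters, cont_dim, sigma=0.1, rng=random):
    """Draw one latent vector as [continuous part | one-hot part].

    sigma and the dimensions are illustrative assumptions, not values
    taken from the abstract.
    """
    k = rng.randrange(n_clusters)  # cluster index chosen uniformly
    one_hot = [1.0 if i == k else 0.0 for i in range(n_clusters)]
    z_cont = [rng.gauss(0.0, sigma) for _ in range(cont_dim)]
    return z_cont + one_hot, k

# Seeded generator so the example is reproducible.
rng = random.Random(0)
z, k = sample_latent(n_clusters=10, cont_dim=30, rng=rng)
print(len(z), k)
```

In the full method, a generator maps such vectors to data and an inverse network maps data back to the latent space, so that the recovered one-hot part can serve as a cluster assignment; neither network is shown here.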


Docking Validation Resources: Protein Family and Ligand Flexibility Experiments

Mukherjee

Balius

Rizzo

2010

J. Chem. Inf. Model.

161

201


A database consisting of 780 ligand-receptor complexes, termed SB2010, has been derived from the Protein Databank to evaluate the accuracy of docking protocols for regenerating bound ligand conformations. The goal is to provide easily accessible community resources for development of improved procedures to aid virtual screening for ligands with a wide range of flexibilities. Three core experiments using the program DOCK, which employ rigid (RGD), fixed anchor (FAD), and flexible (FLX) protocols, were used to gauge performance by several different metrics: (1) global results, (2) ligand flexibility, (3) protein family, and (4) crossdocking. Global spectrum plots of successes and failures vs rmsd reveal well-defined inflection regions, which suggest the commonly used 2 Å criterion is a reasonable choice for defining success. Across all 780 systems, success tracks with the relative difficulty of the calculations: RGD (82.3%) > FAD (78.1%) > FLX (63.8%). In general, failures due to scoring strongly outweigh those due to sampling. Subsets of SB2010 grouped by ligand flexibility (7-or-less, 8-to-15, and 15-plus rotatable bonds) reveal success degrades linearly for FAD and FLX protocols, in contrast to RGD which remains constant. Despite the challenges associated with FLX anchor orientation and on-the-fly flexible growth, success rates for the 7-or-less (74.5%), and in particular the 8-to-15 (55.2%) subset, are encouraging. Poorer results for the very flexible 15-plus set (39.3%) indicate substantial room for improvement. Family-based success appears largely independent of ligand flexibility suggesting a strong dependence on the binding site environment. For example, zinc-containing proteins are generally problematic despite moderately flexible ligands. Finally, representative crossdocking examples, for carbonic anhydrase, thermolysin, and neuraminidase families, show the utility of family-based analysis for rapid identification of particularly good or bad docking trends, and the type of failures involved (scoring/sampling), which will likely be of interest to researchers making specific receptor choices for virtual screening. SB2010 is available for download at http://rizzolab.org
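The 2 Å success criterion used in pose reproduction can be sketched as follows. This is a simplified illustration, not the DOCK implementation: the helper names `heavy_atom_rmsd` and `success_rate` are hypothetical, atoms are assumed to be already matched one-to-one in the receptor frame, no symmetry correction is applied, and treating "within 2 Å" as less than or equal to the cutoff is an assumption.

```python
import math

def heavy_atom_rmsd(pose, reference):
    """RMSD between matched (x, y, z) heavy-atom coordinates.

    Assumes both lists are in the same order and frame; no
    superposition or symmetry correction is performed.
    """
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, reference)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))

def success_rate(rmsds, cutoff=2.0):
    """Percentage of systems whose RMSD is at or below the cutoff (in Å)."""
    return 100.0 * sum(r <= cutoff for r in rmsds) / len(rmsds)

# Toy ligand: the docked pose is the crystal pose shifted 1 Å along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
pose = [(x + 1.0, y, z) for x, y, z in reference]
rmsd = heavy_atom_rmsd(pose, reference)
rate = success_rate([0.8, 1.9, 2.4, 5.0])
print(rmsd, rate)
```

The RMSD values passed to `success_rate` are made-up numbers for illustration only and are not results from SB2010.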


Evaluation of DOCK 6 as a pose generation and database enrichment tool

Brozell

Mukherjee

Balius

et al. 2012

J Comput Aided Mol Des

139

162


In conjunction with the recent American Chemical Society symposium titled “Docking and Scoring: A Review of Docking Programs” the performance of the DOCK6 program was evaluated through (1) pose reproduction and (2) database enrichment calculations on a common set of organizer-specified systems and datasets (ASTEX, DUD, WOMBAT). Representative baseline grid score results averaged over five docking runs yield a relatively high pose identification success rate of 72.5% (symmetry corrected rmsd) and sampling rate of 91.9% for the multi site ASTEX set (N = 147) using organizer-supplied structures. Numerous additional docking experiments showed that ligand starting conditions, symmetry, multiple binding sites, clustering, and receptor preparation protocols all affect success. Encouragingly, in some cases, use of more sophisticated scoring and sampling methods yielded results which were comparable (Amber score ligand movable protocol) or exceeded (LMOD score) analogous baseline grid-score results. The analysis highlights the potential benefit and challenges associated with including receptor flexibility and indicates that different scoring functions have system dependent strengths and weaknesses. Enrichment studies with the DUD database prepared using the SB2010 preparation protocol and native ligand pairings yielded individual area under the curve (AUC) values derived from receiver operating characteristic curve analysis ranging from 0.29 (bad enrichment) to 0.96 (good enrichment) with an average value of 0.60 (27/38 have AUC ≥ 0.5). Strong early enrichment was also observed in the critically important 1.0–2.0% region. Somewhat surprisingly, an alternative receptor preparation protocol yielded comparable results. As expected, semi-random pairings yielded poorer enrichments, in particular, for unrelated receptors. Overall, the breadth and number of experiments performed provide a useful snapshot of current capabilities of DOCK6 as well as starting points to guide future development efforts to further improve sampling and scoring.
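The enrichment AUC mentioned in this abstract can be illustrated with a small sketch. It uses the rank-statistic form of the ROC AUC (the probability that a randomly chosen active is ranked ahead of a randomly chosen decoy) rather than any DOCK tooling. The function name `roc_auc`, the convention that a lower (more negative) score is better, and the scores themselves are assumptions for illustration.

```python
def roc_auc(active_scores, decoy_scores):
    """ROC AUC as the fraction of active/decoy pairs ranked correctly.

    Assumes lower scores are better; tied pairs count as half.
    """
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a < d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))

# Made-up docking scores for three actives and four decoys.
actives = [-50.0, -42.0, -30.0]
decoys = [-45.0, -35.0, -20.0, -10.0]
auc = roc_auc(actives, decoys)
print(auc)
```

An AUC of 0.5 corresponds to random ranking, which is why the abstract reports how many systems reach AUC ≥ 0.5. The pairwise loop is quadratic in the number of molecules; for large decoy sets a sort-based computation would be preferable.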


Bridging Microscopic and Macroscopic Mechanisms of p53-MDM2 Binding with Kinetic Network Models

Zhou

Pantelopulos

Mukherjee

et al. 2017

Biophysical Journal

126


Under normal cellular conditions, the tumor suppressor protein p53 is kept at low levels in part due to ubiquitination by MDM2, a process initiated by binding of MDM2 to the intrinsically disordered transactivation domain (TAD) of p53. Many experimental and simulation studies suggest that disordered domains such as p53 TAD bind their targets nonspecifically before folding to a tightly associated conformation, but the microscopic details are unclear. Toward a detailed prediction of binding mechanisms, pathways, and rates, we have performed large-scale unbiased all-atom simulations of p53-MDM2 binding. Markov state models (MSMs) constructed from the trajectory data predict p53 TAD binding pathways and on-rates in good agreement with experiment. The MSM reveals that two key bound intermediates, each with a nonnative arrangement of hydrophobic residues in the MDM2 binding cleft, control the overall on-rate. Using microscopic rate information from the MSM, we parameterize a simple four-state kinetic model to 1) determine that induced-fit pathways dominate the binding flux over a large range of concentrations, and 2) predict how modulation of residual p53 helicity affects binding, in good agreement with experiment. These results suggest new ways in which microscopic models of peptide binding, coupled with simple few-state binding flux models, can be used to understand biological function in physiological contexts.
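The core step in building a Markov state model, estimating transition probabilities from a trajectory of discrete states, can be sketched as below. This is a bare-bones illustration under strong assumptions: the states are taken as already assigned, the function name `estimate_transition_matrix` and the toy trajectory are hypothetical, and simple row normalization of counts stands in for the more careful estimators and lag-time validation used in practice.

```python
def estimate_transition_matrix(traj, n_states, lag=1):
    """Row-normalized transition counts at the given lag time (in frames)."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Pair each frame with the frame `lag` steps later.
    for i, j in zip(traj[:-lag], traj[lag:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States never visited as a starting point get an all-zero row.
        matrix.append([c / total for c in row] if total else [0.0] * n_states)
    return matrix

# Toy trajectory over three states (e.g. unbound, intermediate, bound).
traj = [0, 0, 1, 1, 2, 1, 0, 0]
T = estimate_transition_matrix(traj, n_states=3)
for row in T:
    print(row)
```

Rates for a few-state kinetic model, such as the four-state model described in the abstract, would be derived from quantities like these, but that parameterization is not shown here.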

