The recently emerged 2019 novel coronavirus (SARS-CoV-2) and the associated COVID-19 disease cause serious or even fatal respiratory tract infection, yet no approved therapeutic or effective treatment is currently available to combat the outbreak. This urgent situation is pressing the world to respond with the development of novel vaccines or small-molecule therapeutics for SARS-CoV-2. As part of these efforts, the structure of the SARS-CoV-2 main protease (Mpro) has been rapidly resolved and made publicly available to facilitate global efforts to develop novel drug candidates. Recently, our group developed a novel deep learning platform, Deep Docking (DD), which provides fast prediction of docking scores of Glide (or any other docking program) and hence enables structure-based virtual screening of billions of purchasable molecules in a short time. In the current study we applied DD to all 1.3 billion compounds from the ZINC15 library to identify the top 1,000 potential ligands for the SARS-CoV-2 Mpro protein. The compounds are made publicly available for further characterization and development by the scientific community.
Drug discovery is a rigorous process that requires billions of dollars of investment and decades of research to bring a molecule "from bench to bedside". While virtual docking can significantly accelerate the process of drug discovery, it ultimately lags behind the current rate of expansion of chemical databases, which already exceed billions of molecular records. This recent surge in small-molecule availability presents great drug discovery opportunities, but also demands much faster screening protocols. To address this challenge, we herein introduce Deep Docking (DD), a novel deep learning platform that is suitable for docking billions of molecular structures in a rapid, yet accurate fashion. The DD approach utilizes quantitative structure-activity relationship (QSAR) deep models trained on docking scores of subsets of a chemical library to approximate the docking outcome for yet unprocessed entries and, therefore, to remove unfavorable molecules in an iterative manner. The use of the DD methodology in conjunction with the FRED docking program allowed rapid and accurate calculation of docking scores for 1.36 billion molecules from the ZINC15 library against 12 prominent target proteins and demonstrated up to 100-fold data reduction and 6000-fold enrichment of high-scoring molecules (without notable loss of favorably docked entities). The DD protocol can readily be used in conjunction with any docking program and was made publicly available.
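The iterative idea in this abstract (dock a small sample, train a surrogate on those scores, predict the rest, discard molecules predicted to dock poorly) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: `toy_dock` is a hypothetical stand-in for a real docking program, molecules are reduced to two made-up descriptors, and a nearest-neighbour regressor replaces the deep QSAR model used by DD. It is not the authors' implementation.

```python
import random


def toy_dock(mol):
    """Hypothetical stand-in for a docking program; lower scores are better."""
    x, y = mol
    return -10.0 + 8.0 * ((x - 0.3) ** 2 + (y - 0.7) ** 2)


def knn_predict(train, mol, k=5):
    """Surrogate model: mean docking score of the k nearest docked molecules."""
    nearest = sorted(
        train,
        key=lambda t: (t[0][0] - mol[0]) ** 2 + (t[0][1] - mol[1]) ** 2,
    )[:k]
    return sum(score for _, score in nearest) / len(nearest)


def deep_docking_sketch(library, n_iterations=3, sample_size=100,
                        keep_fraction=0.25, seed=0):
    """Iteratively shrink the library, docking only a small sample per round."""
    rng = random.Random(seed)
    remaining = list(library)
    train = []        # (molecule, docking score) pairs gathered so far
    docked = set()    # molecules that have actually been docked
    for _ in range(n_iterations):
        # Dock a random sample of molecules that have not been docked yet.
        pool = [m for m in remaining if m not in docked]
        batch = rng.sample(pool, min(sample_size, len(pool)))
        train.extend((m, toy_dock(m)) for m in batch)
        docked.update(batch)
        # Predict scores for every remaining molecule and keep the best part.
        ranked = sorted(remaining, key=lambda m: knn_predict(train, m))
        remaining = ranked[:max(1, int(len(ranked) * keep_fraction))]
    return remaining, len(docked)
```

In a real campaign the molecules that survive the last iteration would then be docked explicitly; here that would amount to sorting the returned list by `toy_dock`.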
The recently emerged 2019 novel coronavirus (SARS-CoV-2) and the associated COVID-19 disease cause serious or even fatal respiratory tract infection, yet no FDA-approved therapeutic or effective treatment is currently available to combat the outbreak. This urgent situation is pressing the world to respond with the development of novel vaccines or small-molecule therapeutics for SARS-CoV-2. As part of these efforts, the structure of the SARS-CoV-2 main protease (Mpro) has been rapidly resolved and made publicly available to facilitate global efforts to develop novel drug candidates. In recent months, our group has developed a novel deep learning platform, Deep Docking (DD), which enables very fast docking of billions of molecular structures and provides up to 6,000-fold enrichment of the top-predicted ligands compared to a conventional docking workflow (without notable loss of information on potential hits). In the current work we applied DD to all 1.3 billion compounds from the ZINC15 library to identify the top 1,000 potential ligands for SARS-CoV-2 Mpro. The compounds are made publicly available for further characterization and development by the scientific community.
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the pathogen that causes the disease COVID-19, produces replicase polyproteins 1a and 1ab that contain, respectively, 11 or 16 nonstructural proteins (nsp). Nsp5 is the main protease (Mpro) responsible for cleavage at eleven positions along these polyproteins, including at its own N- and C-terminal boundaries, representing essential processing events for subsequent viral assembly and maturation. We have determined X-ray crystallographic structures of this cysteine protease in its wild-type free active site state at 1.8 Å resolution, in its acyl-enzyme intermediate state with the native C-terminal autocleavage sequence at 1.95 Å resolution, and in its product-bound state at 2.0 Å resolution by employing an active site mutation (C145A). We characterize the stereochemical features of the acyl-enzyme intermediate, including critical hydrogen bonding distances underlying catalysis in the Cys/His dyad and oxyanion hole. We also identify a highly ordered water molecule in a position compatible with a role as the deacylating nucleophile in the catalytic mechanism, and characterize the binding groove conformational changes and dimerization interface that occur upon formation of the acyl-enzyme. Collectively, these crystallographic snapshots provide valuable mechanistic and structural insights for future antiviral therapeutic development, including revised molecular docking strategies based on Mpro inhibition.
With the recent explosion of chemical libraries beyond a billion molecules, more efficient virtual screening approaches are needed. The Deep Docking (DD) platform enables up to 100-fold acceleration of structure-based virtual screening by docking only a subset of a chemical library, iteratively synchronized with a ligand-based prediction of the remaining docking scores. This method results in hundreds- to thousands-fold virtual hit enrichment (without significant loss of potential drug candidates) and hence enables the screening of billion-molecule-sized chemical libraries without using extraordinary computational resources. Herein, we present and discuss the generalized DD protocol that has been proven successful in various computer-aided drug discovery (CADD) campaigns and can be applied in conjunction with any conventional docking program. The protocol encompasses eight consecutive stages: molecular library preparation, receptor preparation, random sampling of a library, ligand preparation, molecular docking, model training, model inference and residual docking. The standard DD workflow enables iterative application of stages 3-7 with continuous augmentation of the training set, and the number of such iterations can be adjusted by the user. A predefined recall value allows for control of the percentage of top-scoring molecules that are retained by DD and can be adjusted to control the library size reduction. The procedure takes 1-2 weeks (depending on the available resources) and can be completely automated on computing clusters managed by job schedulers. This open-source protocol, at https://github.com/jamesgleave/DD_protocol, can be readily deployed by CADD researchers and can significantly accelerate the effective exploration of ultra-large portions of chemical space.
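The role of the predefined recall value can be shown with a small sketch. It assumes a validation set in which each molecule has a model-predicted hit probability and a true label saying whether its docked score fell among the top-scoring molecules. The function names `threshold_for_recall` and `retained_fraction` are hypothetical and are not taken from the DD_protocol repository.

```python
import math


def threshold_for_recall(probs, is_hit, recall=0.9):
    """Return the probability cutoff that retains at least `recall` of true hits.

    probs  : predicted hit probabilities for validation molecules
    is_hit : booleans, True where the docked score marks a top-scoring molecule
    """
    hit_probs = sorted((p for p, h in zip(probs, is_hit) if h), reverse=True)
    if not hit_probs:
        raise ValueError("validation set contains no top-scoring molecules")
    # Number of true hits that must sit at or above the cutoff.
    n_needed = max(1, math.ceil(recall * len(hit_probs)))
    return hit_probs[n_needed - 1]


def retained_fraction(probs, threshold):
    """Fraction of molecules whose predicted probability passes the cutoff."""
    return sum(p >= threshold for p in probs) / len(probs)


# Illustrative, made-up validation data: four true hits among ten molecules.
probs = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
is_hit = [True, True, False, True, False, True, False, False, False, False]

cutoff = threshold_for_recall(probs, is_hit, recall=0.75)
print(cutoff, retained_fraction(probs, cutoff))  # 0.7 0.4
```

On this made-up data, raising the recall to 1.0 lowers the cutoff to 0.4 and retains 60% of the molecules instead of 40%, which is the trade-off between keeping top-scoring molecules and reducing library size that the abstract describes.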
Deep learning-accelerated docking coupled with computational hit selection strategies enables the identification of inhibitors for the SARS-CoV-2 main protease from a chemical library of 40 billion small molecules.
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), the pathogen that causes COVID-19, produces polyproteins 1a and 1ab that contain, respectively, 11 or 16 non-structural proteins (nsp). Nsp5 is the main protease (Mpro) responsible for cleavage at eleven positions along these polyproteins, including at its own N- and C-terminal boundaries, representing essential processing events for viral assembly and maturation. Using C-terminally substituted Mpro chimeras, we have determined X-ray crystallographic structures of Mpro in complex with 10 of its 11 viral cleavage sites, bound at full occupancy intermolecularly in trans, within the active site of the native enzyme and/or a catalytic mutant (C145A). Capture of both acyl-enzyme intermediate and product-like complex forms of a P2(Leu) substrate in the native active site provides direct comparative characterization of these mechanistic steps and further informs the basis for enhanced product release of Mpro’s own unique C-terminal P2(Phe) cleavage site to prevent autoinhibition. We characterize the underlying noncovalent interactions governing binding and specificity for this diverse set of substrates, showing remarkable plasticity for subsites beyond the anchoring P1(Gln)-P2(Leu/Val/Phe), together representing a near-complete analysis of a multiprocessing viral protease. Collectively, these crystallographic snapshots provide valuable mechanistic and structural insights for antiviral therapeutic development.