Traditional drug discovery is laborious, expensive, and time-consuming because of the huge combinatorial complexity of the discrete molecular search space. Researchers have turned to machine learning methods to tackle this problem. However, most existing methods either perform virtual screening of available compound databases via protein-ligand affinity prediction, or generate molecules unconditionally, without taking the protein target into account. In this paper, we propose a protein target-oriented de novo drug design method called AlphaDrug. Our method automatically generates molecular drug candidates in an autoregressive way, and the candidates dock well into the given target protein. To this end, we devise a modified transformer network for the joint embedding of the protein target and the molecule, and a Monte Carlo Tree Search (MCTS) algorithm for conditional molecular generation. In the transformer variant, we impose a hierarchy of skip connections from the protein encoder to the molecule decoder for efficient feature transfer. The transformer variant computes the probabilities of the next atoms based on the protein target and the intermediate molecule. We use these probabilities to guide the look-ahead search by MCTS to enhance or correct the next-atom selection. Moreover, MCTS is also guided by a value function implemented by a docking program, such that paths with many low docking values are seldom chosen. Experiments on diverse protein targets demonstrate the effectiveness of our method, indicating that AlphaDrug is a potentially promising solution to target-specific de novo drug design.
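The interplay described in this abstract, where next-token probabilities act as search priors and a docking-based value steers the tree, can be illustrated with a toy sketch. This is not the authors' implementation. The names `policy`, `docking_score`, `Node`, `select_child`, `simulate`, and `generate` are hypothetical; the uniform `policy` stands in for the transformer, and `docking_score` is a made-up scoring rule (higher is better) standing in for a real docking program. The PUCT-style selection formula and the greedy rollout are also assumptions chosen for brevity, since the abstract does not specify them.

```python
import math
from dataclasses import dataclass, field

VOCAB = ["C", "N", "O", "<eos>"]  # toy token set (assumption)
EOS = "<eos>"
MAX_LEN = 6                        # toy length limit (assumption)


def policy(prefix):
    """Hypothetical stand-in for the transformer: uniform next-token probabilities."""
    return {tok: 1.0 / len(VOCAB) for tok in VOCAB}


def docking_score(tokens):
    """Hypothetical stand-in for a docking program; here, higher is better."""
    return sum(1 for t in tokens if t == "N") / MAX_LEN


def is_terminal(prefix):
    return len(prefix) >= MAX_LEN or (len(prefix) > 0 and prefix[-1] == EOS)


@dataclass
class Node:
    prefix: tuple
    prior: float
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)

    def q(self):
        # Mean value of simulations through this node (0 if unvisited).
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node, c_puct=1.0):
    """PUCT-style rule: mean value plus a prior-weighted exploration bonus."""
    def score(child):
        bonus = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
        return child.q() + bonus
    return max(node.children.values(), key=score)


def simulate(root):
    # 1. Selection: walk down the tree until reaching a leaf.
    path, node = [root], root
    while node.children:
        node = select_child(node)
        path.append(node)
    # 2. Expansion: add one child per token, with the policy output as prior.
    if not is_terminal(node.prefix):
        for tok, p in policy(node.prefix).items():
            node.children[tok] = Node(node.prefix + (tok,), p)
    # 3. Rollout: complete the sequence greedily with the policy, then score it.
    tokens = list(node.prefix)
    while not is_terminal(tokens):
        probs = policy(tuple(tokens))
        tokens.append(max(probs, key=probs.get))
    value = docking_score(tokens)
    # 4. Backup: propagate the score to every node on the path.
    for n in path:
        n.visits += 1
        n.value_sum += value


def generate(n_sims=50):
    """Build a sequence token by token, running a fresh search at each step."""
    prefix = ()
    while not is_terminal(prefix):
        root = Node(prefix, prior=1.0)
        for _ in range(n_sims):
            simulate(root)
        # Commit to the most-visited child as the next token.
        prefix = max(root.children.values(), key=lambda ch: ch.visits).prefix
    return list(prefix)
```

Calling `generate()` returns a list of tokens from `VOCAB`. In a real setting, `policy` would be the trained model conditioned on the protein target and `docking_score` would wrap an external docking tool; both are placeholders here.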
Insertions and deletions (indels) are low-frequency deleterious genomic DNA alterations. Despite their rarity, indels are common, and insertions leading to long complementarity-determining region 3 (CDR3) are vital for antigen-binding functions in broadly neutralizing and polyreactive antibodies targeting viruses. Because of challenges in detecting indels, the mechanism that generates indels during immunoglobulin diversification processes remains poorly understood. We carried out ultra-deep profiling of indels and systematically dissected the underlying mechanisms using passenger-immunoglobulin mouse models. We found that activation-induced cytidine deaminase–dependent ±1–base pair (bp) indels are the most prevalent indel events, biasing deleterious outcomes, whereas longer in-frame indels, especially insertions that can extend the CDR3 length, are rare outcomes. The ±1-bp indels are channeled by base excision repair, but longer indels require additional DNA-processing factors. Ectopic expression of a DNA exonuclease or perturbation of the balance of DNA polymerases can increase the frequency of longer indels, thus paving the way for models that can generate antibodies with long CDR3. Our study reveals the mechanisms that generate beneficial and deleterious indels during the process of antibody somatic hypermutation and has implications in understanding the detrimental genomic alterations in various conditions, including tumorigenesis.