A growing number of deep learning (DL) methodologies have recently
been developed to design novel compounds and expand the chemical space
within virtual libraries. Most of these neural network approaches
design molecules to specifically bind a target based on its structural
information and/or knowledge of previously identified binders. Fewer
attempts have been made to develop approaches for de novo
design of virtual libraries, as synthesizability of generated molecules
remains a challenge. In this work, we developed a new Monte Carlo
Search (MCS) algorithm, DrugSynthMC (Drug Synthesis using Monte Carlo),
in conjunction with DL and statistical-based priors
to generate thousands of interpretable chemical structures and novel
drug-like molecules per second. DrugSynthMC produces drug-like compounds
using an atom-based search model that builds molecules as SMILES,
character by character. Designed molecules follow Lipinski’s
“rule of 5”, show a high proportion of highly water-soluble,
nontoxic, predicted-to-be-synthesizable compounds, and efficiently
expand the chemical space within the libraries, without reliance on
training data sets, synthesizability metrics, or constraints enforced during
SMILES generation. Our approach can function with or without an underlying
neural network and is thus easily explainable and versatile. This
ease in drug-like molecule generation allows for future integration
of score functions aimed at different target- or job-oriented goals.
Thus, DrugSynthMC is expected to enable the functional assessment
of large compound libraries covering an extensive novel chemical space,
overcoming the limitations of existing drug collections. The software
is available at
.
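
To illustrate what building a molecule as a SMILES string "character by character" can look like, the following is a minimal sketch of a single random rollout, the basic step of a Monte Carlo search. It is not the DrugSynthMC implementation: the three-atom alphabet, the `END` marker, the function names `legal_moves` and `rollout`, and the optional `weights` dictionary (a stand-in for a prior) are all assumptions made for this example. The sketch only keeps branches and bond symbols syntactically well-formed; it does not model rings, valence, drug-likeness, or any scoring.

```python
import random

ATOMS = ("C", "N", "O")  # toy alphabet (assumption of this sketch)
END = "$"                # hypothetical end-of-string marker, never emitted


def legal_moves(smiles, open_branches):
    """Return the characters that keep a partial SMILES well-formed."""
    last = smiles[-1] if smiles else ""
    if last in ("", "(", "="):
        return list(ATOMS)        # an atom must come next
    moves = list(ATOMS) + ["("]   # extend the chain or open a branch
    if last != ")":
        moves.append("=")         # double bond only directly after an atom
    if open_branches > 0:
        moves.append(")")         # close the innermost open branch
    else:
        moves.append(END)         # stop only when every branch is closed
    return moves


def rollout(rng, soft_max_len=20, weights=None):
    """Sample one string, one character at a time.

    `weights` optionally maps characters to relative probabilities
    (a placeholder for a learned or statistical prior); by default
    every legal character is equally likely. Assumes soft_max_len >= 1.
    """
    smiles, open_branches = "", 0
    while len(smiles) < soft_max_len:
        moves = legal_moves(smiles, open_branches)
        w = [weights.get(m, 1.0) for m in moves] if weights else None
        move = rng.choices(moves, weights=w)[0]
        if move == END:
            return smiles
        smiles += move
        open_branches += (move == "(") - (move == ")")
    # Length budget used up: finish a dangling "(" or "=" with a carbon,
    # then close whatever branches are still open.
    if smiles[-1] in "(=":
        smiles += "C"
    return smiles + ")" * open_branches


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        print(rollout(rng))
```

In a full search, many such rollouts would be scored and the choice of each next character biased toward high-scoring continuations; here the strings are only guaranteed to have balanced parentheses and no dangling bond symbols, so a chemistry toolkit would still be needed to check that they are chemically valid.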