HyFactor: Hydrogen-count labelled graph-based defactorization Autoencoder

Akhmetshin, Tagir; Lin, Arkadii; Mazitov, Daniyar; Ziaikin, Evgenii; Madzhidov, Timur; Varnek, Alexandre

doi:10.26434/chemrxiv-2021-18x0d

Cited by 2 publications

(1 citation statement)

References 21 publications

“…This is significantly superior to other methods, which showed validity scores ranging from 85% for generative autoencoders to about 96% for RNN-based models [41][42][43]. The ability to generate novel compounds was determined by measuring the percentage of molecules in a library of 10,000 generated SMILES which was not present within ZINC-250K [44] (containing nearly 250,000 molecules) (Table 2). In all cases, we see a high level of novelty.…”

Section: Validation Of the Generated Drug-like Molecules (mentioning)

confidence: 99%

DrugSynthMC: an atom based generation of drug-like molecules with Monte Carlo Search

Roucairol, Georgiou, Cazenave et al. 2024

Preprint

A growing number of Deep Learning (DL) methodologies have recently been developed to design novel compounds and expand the chemical space within virtual libraries. Most of these Neural Network approaches design molecules to specifically bind a target, based on its structural information and/or knowledge of previously identified binders. Fewer attempts have been made to develop approaches for de novo design of virtual libraries, as synthesizability of generated molecules remains a challenge. In this work, we developed a new Monte Carlo Search (MCS) algorithm, DrugSynthMC (Drug Synthetise using Monte Carlo), in conjunction with DL and statistical-based priors to generate thousands of interpretable chemical structures and novel drug-like molecules per second. DrugSynthMC produces drug-like compounds using an atom-based search model that builds molecules as SMILES, character by character. Designed molecules follow Lipinski’s “rule of 5”, show a high proportion of predicted-to-be synthesisable compounds and efficiently expand the chemical space within the libraries, without reliance on training datasets, synthesizability metrics or enforcing during SMILES generation. Our approach can function with or without an underlying Neural Network and is thus easily explainable and versatile. This ease in drug-like molecule generation allows for future integration of score functions aimed at different target- or job-oriented goals. Thus, DrugSynthMC is expected to enable the functional assessment of large compound libraries covering an extensive novel chemical space, overcoming the limitations of existing drug collections. The software is available at https://github.com/RoucairolMilo/DrugSynthMC
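
The novelty measure quoted in the citation statement (the percentage of generated SMILES not present in a reference set such as ZINC-250K) can be illustrated with a minimal sketch. This is not the DrugSynthMC authors' code: the function name `novelty_percent` and the toy inputs are hypothetical, and the sketch assumes that both collections already hold SMILES canonicalized the same way, since it compares strings exactly.

```python
def novelty_percent(generated, reference):
    """Return the percentage of generated SMILES absent from the reference set.

    Assumption: both inputs contain SMILES canonicalized by the same tool,
    because membership is tested by exact string comparison.
    """
    generated = list(generated)
    if not generated:
        raise ValueError("generated library is empty")
    reference_set = set(reference)  # set lookup keeps each membership test O(1)
    novel = sum(1 for smiles in generated if smiles not in reference_set)
    return 100.0 * novel / len(generated)


# Toy example: two of the four generated strings are missing from the reference.
toy_generated = ["CCO", "c1ccccc1", "CCN", "CC"]
toy_reference = {"CCO", "CC"}
toy_novelty = novelty_percent(toy_generated, toy_reference)
```

Every entry of the generated library is counted, duplicates included, which matches a "percentage of molecules in a library" reading of the statement; deduplicating first would be a different, equally defensible convention.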

DrugSynthMC: An Atom-Based Generation of Drug-like Molecules with Monte Carlo Search

Roucairol, Georgiou, Cazenave et al. 2024

J. Chem. Inf. Model.

A growing number of deep learning (DL) methodologies have recently been developed to design novel compounds and expand the chemical space within virtual libraries. Most of these neural network approaches design molecules to specifically bind a target based on its structural information and/or knowledge of previously identified binders. Fewer attempts have been made to develop approaches for de novo design of virtual libraries, as synthesizability of generated molecules remains a challenge. In this work, we developed a new Monte Carlo Search (MCS) algorithm, DrugSynthMC (Drug Synthesis using Monte Carlo), in conjunction with DL and statistical-based priors to generate thousands of interpretable chemical structures and novel drug-like molecules per second. DrugSynthMC produces drug-like compounds using an atom-based search model that builds molecules as SMILES, character by character. Designed molecules follow Lipinski’s “rule of 5”, show a high proportion of highly water-soluble nontoxic predicted-to-be synthesizable compounds, and efficiently expand the chemical space within the libraries, without reliance on training data sets, synthesizability metrics, or enforcing during SMILES generation. Our approach can function with or without an underlying neural network and is thus easily explainable and versatile. This ease in drug-like molecule generation allows for future integration of score functions aimed at different target- or job-oriented goals. Thus, DrugSynthMC is expected to enable the functional assessment of large compound libraries covering an extensive novel chemical space, overcoming the limitations of existing drug collections. The software is available at .
