2020
DOI: 10.26434/chemrxiv.12058026.v3
Preprint

REINVENT 2.0 – an AI Tool for De Novo Drug Design

Abstract: With this application note we aim to offer the community a production-ready tool for de novo design. It can be effectively applied to drug discovery projects that are striving to resolve either exploration or exploitation problems while navigating the chemical space. By releasing the code, we aim to facilitate research on applying generative methods to drug discovery problems and to promote collaborative efforts in this area, so that it can serve as an interaction point for future scientific collab…


Cited by 57 publications (94 citation statements)
References 0 publications
“…Recurrent neural network-based models of SMILES strings were trained on canonical SMILES or noncanonical SMILES after varying degrees of data augmentation, using either LSTM or GRU architectures. The Python source code used to train the model was derived from our recent benchmarking analysis of generative models of molecules in the low-data regime 27 (https://github.com/skinnider/low-data-generative-models), which was itself adapted from the REINVENT package 24,48 (http://github.com/MarcusOlivecrona/REINVENT). Briefly, each SMILES was converted into a sequence of tokens by splitting the SMILES string into its constituent characters, except for atomic symbols composed of two characters (Br, Cl) and environments within square brackets, such as [nH].…”
Section: Methods
confidence: 99%
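The tokenization scheme quoted above can be sketched in a few lines of Python. The regex and function name below are illustrative, not REINVENT's actual code: each character becomes its own token, except the two-character atomic symbols Br and Cl and square-bracket environments such as [nH], which are kept whole.

```python
import re

# Tokenize a SMILES string as described: bracketed environments first,
# then the two-character halogens Br and Cl, then any single character.
# Alternation order matters: earlier alternatives win, so "[nH]" and
# "Cl" are matched before the catch-all single-character case.
TOKEN_PATTERN = re.compile(r"(\[[^\]]*\]|Br|Cl|.)")

def tokenize_smiles(smiles: str) -> list[str]:
    """Split a SMILES string into its constituent tokens."""
    return TOKEN_PATTERN.findall(smiles)

print(tokenize_smiles("c1cc(Cl)c[nH]1"))
# ['c', '1', 'c', 'c', '(', 'Cl', ')', 'c', '[nH]', '1']
```

The resulting token sequence is what the recurrent network (LSTM or GRU) is trained on, one token per step.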
“…Generative models that have only been pretrained with prior data (e.g. from STD or AGN) and have not been subjected to RL training (or transfer learning [29]) are referred to as priors or prior agents. STD and AGN are used interchangeably to refer either to the datasets themselves or to the priors resulting from pre-training with the respective dataset.…”
Section: Conventions and Notation
confidence: 99%
“…To explore the effects of the scoring function composition and the prior, all combinations of the respective parameters shown in Table 1 were considered. Each run with an activated diversity filter (DF) [23], [29] was repeated three times to assess the stochasticity of the neural network training.…”
Section: D Similarity Query
confidence: 99%
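The experimental design quoted above — a full grid over scoring-function and prior settings, with DF-enabled runs repeated to gauge training stochasticity — can be sketched as follows. The parameter names and values here are hypothetical placeholders, since Table 1 is not reproduced in this report:

```python
from itertools import product

# Hypothetical parameter grid: the actual values come from Table 1 of
# the cited paper and are not shown here.
scoring_functions = ["similarity_only", "similarity_plus_qed"]
priors = ["STD", "AGN"]
diversity_filter = [False, True]

runs = []
for sf, prior, df in product(scoring_functions, priors, diversity_filter):
    # Runs with the diversity filter enabled are repeated three times
    # (different seeds) to estimate the variance from stochastic training.
    repeats = 3 if df else 1
    for seed in range(repeats):
        runs.append({"scoring": sf, "prior": prior, "df": df, "seed": seed})

print(len(runs))  # 2 * 2 * (1 + 3) = 16 runs
```

The point of the repetition is that only the DF runs are treated as stochastic enough to need replicate estimates; the grid itself is exhaustive over the listed parameters.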