2020
DOI: 10.1021/acs.jcim.0c00120

Combining Cloud-Based Free-Energy Calculations, Synthetically Aware Enumerations, and Goal-Directed Generative Machine Learning for Rapid Large-Scale Chemical Exploration and Optimization

Abstract: The hit identification process usually involves the profiling of millions to, more recently, billions of compounds, either via traditional experimental high-throughput screens (HTS) or computational virtual high-throughput screens (vHTS). We have previously demonstrated that, by coupling reaction-based enumeration, active learning, and free energy calculations, a similarly large-scale exploration of chemical space can be extended to the hit-to-lead process. In this work, we augment that approach by coupling large s…

Citations: cited by 44 publications (40 citation statements)
References: 65 publications (110 reference statements)
“…We built a goal-directed generative model using the REINVENT (Olivecrona et al., 2017) protocol, which has shown success in drug discovery applications (Ghanakota et al., 2020) by generating tens of thousands of unique structures with targeted properties while only requiring a few hours of computing time. The deep reinforcement learning methodology applied in the REINVENT protocol is a robust solution for chemical enumeration while not consuming a prohibitive amount of computation cost (Olivecrona et al., 2017; Ghanakota et al., 2020).…”
Section: Methods and Principles | Citation type: mentioning
confidence: 99%
“…The second stage shifts the distribution based on a utility function, encoding the desired property ranges. In previous studies, we trained the REINVENT algorithm with a group of structures generated by the PathFinder algorithm (Ghanakota et al., 2020). This work aims to use a design space for REINVENT to cover organic electronics, represented by the popular structural motifs shared among successful hole transport materials.…”
Section: Methods and Principles | Citation type: mentioning
confidence: 99%
“…To our knowledge, few previous studies exist which have incorporated structural data into deep generative model scoring functions, compared to the ligand-based counterpart. Firstly, Ghanakota et al. [48] combined high throughput free energy perturbation (FEP) with REINVENT to identify potential CDK2 inhibitors. To achieve this, they trained an AutoQSAR model [49] on a subset of 1,000 enumerated analogues of a potent inhibitor with the corresponding FEP predictions, which was subsequently used as the REINVENT scoring function.…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…(3) We directly use a physics-based scoring function (i.e., molecular docking) to obtain scores during the generative model training process, as opposed to predicting the outcome of said function via machine learning as in [48, 52]. (4) In our approach, the model actively learns the conditional probability distribution of SMILES symbols that are associated with better docking scores and as such variable size distributions can be sampled (up to billions [57]) of molecules, as opposed to sampling a finite latent space as in [51].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…31 This approach was extended in their recent follow-up to train a goal-directed generative model to generate promising ligands for further screening. 32 We propose formulating the entire goal of multi-target chemical optimization as an active learning problem. Rather than attempting to determine optimal regions of chemical space to sample or train on, we query the model to obtain its suggestions for compounds with the desired properties.…”
Section: Introduction | Citation type: mentioning
confidence: 99%