2021
DOI: 10.48550/arxiv.2112.00905
Preprint

HelixMO: Sample-Efficient Molecular Optimization in Scene-Sensitive Latent Space

Abstract: Efficiently discovering molecules that meet various property requirements can significantly benefit the drug discovery industry. Since it is infeasible to search over the entire chemical space, recent works adopt generative models for goal-directed molecular generation. They tend to utilize the iterative processes, optimizing the parameters of the molecular generative models at each iteration to produce promising molecules for further validation. Assessments are exploited to evaluate the generated molecules at…

Citations: cited by 1 publication (1 citation statement)
References: 28 publications (40 reference statements)

“…To get a training set that is representative of the chemical space, more molecules were generated from the initial 46 seed molecules using the sequence variational auto-encoder (SeqVAE) method provided by the computational biology platform of PaddleHelix [31]. Those compounds having a considerable molecular weight (Mw > 200), high synthesis difficulty (synthesis accessibility score, SA > 4), or low water solubility (lipid−water partition coefficient, logP > 2.5) were excluded from the initial data set. Finally, 342 molecules were added to the training data set.…”
Section: Methods
Citation type: mentioning
confidence: 99%
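
The exclusion rule described in the citation statement above can be illustrated with a minimal Python sketch. Only the three thresholds (Mw > 200, SA > 4, logP > 2.5) come from the quoted text; the `Candidate` record, the function names, and the example property values are hypothetical placeholders, and the properties are assumed to have been computed beforehand by some cheminformatics tool. This is not the citing authors' code.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Thresholds taken from the quoted citation statement.
MAX_MOL_WEIGHT = 200.0  # compounds with Mw > 200 are excluded
MAX_SA_SCORE = 4.0      # compounds with SA > 4 are excluded
MAX_LOGP = 2.5          # compounds with logP > 2.5 are excluded


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one generated molecule with precomputed properties."""
    name: str
    mol_weight: float
    sa_score: float
    logp: float


def passes_filters(c: Candidate) -> bool:
    """Return True if the candidate violates none of the three exclusion criteria."""
    return (
        c.mol_weight <= MAX_MOL_WEIGHT
        and c.sa_score <= MAX_SA_SCORE
        and c.logp <= MAX_LOGP
    )


def filter_candidates(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only the candidates that pass every criterion, preserving input order."""
    return [c for c in candidates if passes_filters(c)]


# Placeholder values invented for illustration; they are not data from the paper.
example_candidates = [
    Candidate("mol_a", mol_weight=150.0, sa_score=2.0, logp=1.0),  # passes all three
    Candidate("mol_b", mol_weight=250.0, sa_score=2.0, logp=1.0),  # too heavy
    Candidate("mol_c", mol_weight=150.0, sa_score=4.5, logp=1.0),  # hard to synthesize
    Candidate("mol_d", mol_weight=150.0, sa_score=2.0, logp=3.0),  # logP too high
]

kept = filter_candidates(example_candidates)
print([c.name for c in kept])
```

A candidate sitting exactly on a threshold is kept here because the quoted criteria use strict inequalities for exclusion.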