2021
DOI: 10.1021/acs.jcim.1c00600

MolGPT: Molecular Generation Using a Transformer-Decoder Model

Abstract: Application of deep learning techniques for de novo generation of molecules, termed inverse molecular design, has been gaining enormous traction in drug design. The representation of molecules in SMILES notation as a string of characters enables the usage of state-of-the-art models in natural language processing, such as Transformers, for molecular design in general. Inspired by generative pre-training (GPT) models that have been shown to be successful in generating meaningful text, we train a transformer-d…

Cited by 206 publications (185 citation statements)
References 48 publications
“…De novo drug design (Hartenfeller and Schneider, 2010) through existing computer technology can speed up drug development and save research costs. The tasks involved in de novo drug design include molecular generation (Gómez-Bombarelli et al, 2016; Cao and Kipf, 2018; Jin et al, 2018; You et al, 2018; Madhawa et al, 2019; Popova et al, 2019; Zhang et al, 2019; Hong et al, 2020; Zang and Wang, 2020; Bagal et al, 2021), drug and drug interactions (DDI) (Li et al, 2021; Lin et al, 2021; Lyu et al, 2021; Zhao et al, 2021), disease associations (Ding et al, 2020; Lei and Zhang, 2020; Mudiyanselage et al, 2020; Lei X.-J. et al, 2021; Lei X. et al, 2021; Wang Y. et al, 2021; Lei and Zhang, 2021; Yang and Lei, 2021; Zhang et al, 2021), and so on.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Deep language models have also been used for the generation of protein sequences [27,28] and molecules [29,30]. In [31], Kim et al combined a transformer encoder with a conditional variational autoencoder (cVAE) to achieve high performance molecule generation.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Many generative model techniques and architectures applied to de novo molecule generation exist. These models range from purely symbolic approaches such as genetic algorithms [1,2] to more recent machine learning (ML) approaches such as recurrent neural networks (RNNs) [3–7], transformers [8–10], variational autoencoders [11–14], generative adversarial networks [15–17], graph neural networks [18,19] and hybrid approaches that use ML to guide reinforcement learning (RL) in a heuristic action space [20]. These generative models can produce valid and novel molecules [21,22] and condition molecule generation towards a particular endpoint [21] (e.g., predicted bioactivity towards a protein target [4]) via optimization techniques such as RL [4,20,23], Bayesian optimization [11,13], molecular swarm optimization [14] and Monte Carlo tree search [2,6].…”
Section: Introduction
Type: mentioning
confidence: 99%