2020
DOI: 10.1186/s13321-020-00423-w

Transformer-CNN: Swiss knife for QSAR modeling and interpretation

Abstract: We present SMILES-embeddings derived from the internal encoder state of a Transformer [1] model trained to canonize SMILES as a Seq2Seq problem. Using a CharNN [2] architecture upon the embeddings results in higher quality interpretable QSAR/QSPR models on diverse benchmark datasets including regression and classification tasks. The proposed Transformer-CNN method uses SMILES augmentation for training and inference, and thus the prognosis is based on an internal consensus. That both the augmentation and transf…
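The "internal consensus" described in the abstract can be read as test-time augmentation: the model scores several randomized SMILES of the same molecule, and the predictions are averaged. A minimal sketch with a stub predictor (the stub function and the variant list are illustrative, not taken from the paper):

```python
def consensus_predict(model, smiles_variants):
    """Average model outputs over several SMILES strings of the same
    molecule (test-time augmentation / internal consensus)."""
    preds = [model(s) for s in smiles_variants]
    return sum(preds) / len(preds)

# Stub model: pretends the prediction depends on the string form,
# so different SMILES of one molecule give different raw outputs.
stub = lambda s: float(len(s))

# Three valid SMILES spellings of ethanol.
variants = ["OCC", "C(O)C", "CCO"]
print(consensus_predict(stub, variants))  # → 3.666...
```

Averaging over representations makes the final prediction invariant to the arbitrary choice of atom ordering in the input string.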

Cited by 147 publications (131 citation statements)
References 47 publications
“…Training the model to learn different representations of the same reaction by distorting the initial canonical data eliminated the effect of memorization and increased the generalization performance of models. These ideas are intensively used, e.g., for image recognition 39 , and have been already successfully used in the context of several chemical problems 27 – 30 , including reaction predictions 18 , 31 , but were limited to the input data. For the first time we showed that application of augmentation to the target data significantly boosts the quality of the reaction prediction.…”
Section: Discussion
confidence: 99%
“…The SMILES representation of molecules is ambiguous. Though the canonicalization procedure exists 26 , it has been shown that models benefit from using a batch of random SMILES (augmentation) during training and inference 27 – 30 . Recently, such augmentation was also applied to reaction modeling 11 , 18 , 31 , 32 .…”
Section: Introduction
confidence: 99%
“…Augmentation was done offline prior to training the network. Randomized SMILES were generated using RDKit by setting option doRandom = True, which was recently introduced to improve regression and classification models for physico-chemical properties [22,23]. As expected, the augmentation improved the percentage of generated valid SMILES while lowering the number of training epochs.…”
Section: Table 2 Comparison of Architectures A, B, C and D
confidence: 81%
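The RDKit option mentioned in the quoted passage can be sketched as follows (a minimal illustration, assuming a recent RDKit build; the example molecule is arbitrary):

```python
from rdkit import Chem

mol = Chem.MolFromSmiles("Cc1ccccc1")  # toluene, an arbitrary example

# Generate several randomized (non-canonical) SMILES for augmentation,
# as described in the citing paper, via doRandom=True.
randomized = [Chem.MolToSmiles(mol, doRandom=True, canonical=False)
              for _ in range(5)]

# Canonicalization maps every randomized variant back to one string.
canonical = {Chem.MolToSmiles(Chem.MolFromSmiles(s)) for s in randomized}
print(canonical)  # a single canonical SMILES
```

Each randomized string parses back to the same molecule, which is what makes this a label-preserving augmentation for QSAR training.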
“…Our calculations based on the publicly available dataset PubChem [21], clearly demonstrate that the use of bidirectional layers systematically improves the capability of the GEN to generate a vast set of new SMILES within the property space of the training set. Following excellent results of SMILES augmentation for smaller datasets to predict physicochemical properties [22][23][24] and generators [25], we have used SMILES augmentation to increase both the number and diversity of SMILES in the training set.…”
Section: Introduction
confidence: 99%
“…When making use of feature attribution approaches, it is advisable to choose comprehensible molecular descriptors or representations for model construction (Box 2). Recently, architectures borrowed from the natural language processing field, such as long short-term memory networks 76 and transformers 77 , have been used as feature attribution techniques to identify portions of simplified molecular input line entry systems (SMILES) 78 strings that are relevant for bioactivity or physicochemical properties 79,80 . These approaches constitute a first attempt to bridge the gap between the deep learning and medicinal chemistry communities, by relying on representations (atom and bond types, and molecular connectivity 78 ) that bear direct chemical meaning and need no posterior descriptor-to-molecule decoding.…”
Section: Relevance of Input Features
confidence: 99%
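The cited works attribute predictions to SMILES substrings with model-specific techniques (e.g., layer-wise relevance propagation); a much simpler stand-in that conveys the same idea is character-level occlusion, sketched here with a hypothetical stub model:

```python
def occlusion_attribution(model, smiles, mask="*"):
    """Score each SMILES character by how much the prediction drops
    when that character is replaced by a mask token."""
    base = model(smiles)
    return [base - model(smiles[:i] + mask + smiles[i + 1:])
            for i in range(len(smiles))]

# Stub model: counts oxygens, so only 'O' positions matter.
stub = lambda s: float(s.count("O"))
print(occlusion_attribution(stub, "CCO"))  # → [0.0, 0.0, 1.0]
```

Because the scores land on individual atoms and bonds of the input string, they can be mapped back onto the molecule without any descriptor-to-molecule decoding, which is the interpretability advantage the passage describes.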