2020
DOI: 10.1038/s41467-020-18671-7

Transfer learning enables the molecular transformer to predict regio- and stereoselective reactions on carbohydrates

Abstract: Organic synthesis methodology enables the synthesis of complex molecules and materials used in all fields of science and technology and represents a vast body of accumulated knowledge optimally suited for deep learning. While most organic reactions involve distinct functional groups and can readily be learned by deep learning models and chemists alike, regio- and stereoselective transformations are more challenging because their outcome also depends on functional group surroundings. Here, we challenge the Mole…

Cited by 176 publications (194 citation statements)
References 40 publications
“…Transfer learning, an important tool in AI, can be utilized to surmount the restriction of limited amounts of data. [29][30][31][32] With transfer learning, the knowledge of solving one task can be applied to another task. For example, general chemical knowledge from the former large chemical dataset can be applied to the latter relative but different reaction prediction task with limited labeled data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Note that in the best transformer model using MTL on full sentences, there was a clear association of the prediction confidence score with accuracy, as observed with other transformer models (Figure 3D). [22] Since the subset of the test set containing the word "lipase" performed best (Figure 3C), we evaluated this subset exhaustively with all models (Figure 3D). While models trained on the USPTO or ENZR dataset without enzyme information performed poorly (Fig.…”
Section: Analyzing the Prediction Performance Of The Enzymatic Transf… (mentioning)
confidence: 99%
“…Molecular transformer, adapted from Vaswani's original transformer [8], is the state-of-art SMILES-based seq2seq model [8,9]. Meanwhile, transfer learning has been equipped with the molecular transformer in the form of an additional fine-tuning step, and this combination is found to be beneficial for the forward prediction of complex reactions that involve regioselectivity and stereoselectivity such as carbohydrate reactions [10], Heck reaction [11], and Baeyer-Villiger reaction [12]. For example, the regio- and stereoselective carbohydrate reactions, a finetuning step with 20k carbohydrate reaction data gives a 30% increase in the prediction accuracy [10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Meanwhile, transfer learning has been equipped with the molecular transformer in the form of an additional fine-tuning step, and this combination is found to be beneficial for the forward prediction of complex reactions that involve regioselectivity and stereoselectivity such as carbohydrate reactions [10], Heck reaction [11], and Baeyer-Villiger reaction [12]. For example, the regio- and stereoselective carbohydrate reactions, a finetuning step with 20k carbohydrate reaction data gives a 30% increase in the prediction accuracy [10]. While molecular transformer equipped with transfer learning has exciting prediction performance, it still requires thousands of samples in the fine-tuning step to have the model specialized in predict reactions in certain specific chemical space.…”
Section: Introduction (mentioning)
confidence: 99%