2019
DOI: 10.26434/chemrxiv.8168354.v1
Preprint

Automatic Retrosynthetic Pathway Planning Using Template-free Models

Abstract: We present an attention-based Transformer model for automatic retrosynthetic route planning. Our approach first predicts the reactants of single-step organic reactions for given products, then performs Monte Carlo tree search-based retrosynthetic pathway prediction. Trained on two datasets from the United States patent literature, our models achieved top-1 prediction accuracies of 54.6% and 63.0%, with SMILES validity rates above 95% and 99.6%, respectively, the best reported to date …
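The two-stage pipeline the abstract describes (a single-step model proposing reactant sets, wrapped in a Monte Carlo tree search over routes) can be sketched in outline. The single-step model below is a hypothetical stand-in (a lookup table), not the paper's Transformer, and the molecule names and purchasable set are invented for illustration; only the MCTS pattern itself (UCB selection, expansion, terminal reward, backpropagation) is the point.

```python
import math
import random

# Hypothetical stand-in for the single-step retrosynthesis model:
# maps a product to candidate reactant sets. The paper uses a
# Transformer for this step; here it is a toy lookup table.
SINGLE_STEP = {
    "target": [("intermediate", "reagentA"), ("dead_end",)],
    "intermediate": [("buyable1", "buyable2")],
}
PURCHASABLE = {"reagentA", "buyable1", "buyable2"}  # assumed stock set

class Node:
    """A search state: the set of molecules still to be disconnected."""
    def __init__(self, molecules, parent=None):
        self.molecules = frozenset(molecules)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

    def solved(self):
        return not self.molecules  # every molecule reduced to stock

    def expand(self):
        # Apply each candidate disconnection to one open molecule.
        mol = next(iter(self.molecules))
        rest = self.molecules - {mol}
        for reactants in SINGLE_STEP.get(mol, []):
            new_open = rest | {r for r in reactants if r not in PURCHASABLE}
            self.children.append(Node(new_open, parent=self))

def ucb(node, c=1.4):
    # Upper confidence bound: exploit average reward, explore rare nodes.
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def mcts(root, iterations=200):
    for _ in range(iterations):
        node = root
        # Selection: descend by UCB to a leaf.
        while node.children:
            node = max(node.children, key=ucb)
        # Expansion: grow an already-visited, unsolved leaf.
        if not node.solved() and node.visits > 0:
            node.expand()
            if node.children:
                node = random.choice(node.children)
        # Reward: 1 if every molecule is purchasable, else 0.
        reward = 1.0 if node.solved() else 0.0
        # Backpropagation.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda n: n.visits)

root = Node({"target"})
root.expand()
best = mcts(root)
print(sorted(best.molecules))  # the more promising first disconnection
```

The search concentrates visits on the disconnection that leads to purchasable reactants; a real planner would replace the lookup table with model calls and add a depth limit and a value network or rollout policy.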

Cited by 12 publications (15 citation statements). References 44 publications (47 reference statements).
“…Duan et al [37] increased the batch size and the training time for their Transformer model and were able to achieve a top-1 accuracy of 54.1% on the 50k USPTO data set [44]. Later on, the same architecture was reported to achieve a top-1 accuracy of 43.8% [36], in line with the three earlier Transformer-based approaches [32,33,35] but significantly lower than the accuracy reported by Duan et al [37]. Interestingly, the Transformer model was also trained on a proprietary data set [36] that included only reactions with two reactants, with a Tanimoto similarity distribution peaking at 0.75, indicative of an excessive degree of similarity (roughly twice that of the USPTO).…”
Section: Introduction (supporting)
confidence: 60%
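The Tanimoto similarity cited above is the ratio of shared to total fingerprint features. A minimal sketch, using Python sets of invented feature identifiers as stand-ins for molecular fingerprints (real pipelines derive these with a cheminformatics toolkit such as RDKit):

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto (Jaccard) similarity: |A ∩ B| / |A ∪ B|."""
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)

# Toy "fingerprints" as sets of feature identifiers (hypothetical values).
reactant_1 = {1, 2, 3, 4}
reactant_2 = {2, 3, 4, 5}
print(tanimoto(reactant_1, reactant_2))  # 3 shared / 5 total = 0.6
```

A distribution of pairwise values peaked at 0.75 thus means most reactant pairs share three quarters of their fingerprint features, the high-redundancy regime the excerpt criticizes.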
“…While this extensive production of AI models for organic chemistry was made possible by the availability of public data [28,29], the noise contained in these data, generated by the text-mining extraction process, heavily reduces their potential. In fact, while rule-based systems [30] have demonstrated, through wet-lab experiments, the ability to design routes to target molecules with fewer purification steps, leading to savings in time and cost [31], the AI approaches [6,9,12,16,32-38] still have a long way to go.…”
Section: Introduction (mentioning)
confidence: 99%