2020
DOI: 10.1039/c9sc04944d

Datasets and their influence on the development of computer assisted synthesis planning tools in the pharmaceutical domain

Abstract: Computer Assisted Synthesis Planning (CASP), datasets and their performance.

Cited by 150 publications (237 citation statements)
References 44 publications (67 reference statements)
“…[7] As such, it is unlikely that predictions beyond the top 50 templates will be enumerated even if considered due to their large cumulative probability, and there is no guarantee a predicted template can be successfully applied to yield a set of reactants [7,29]. Therefore, this reflects to our measure of comparison outlined previously, the quantity of ring formations in the top 50 templates predicted and their rank. Across the substrates examined (Figure 2) we found that 'Ring Breaker' was able to suggest a ring-forming template in 98% of cases using a model trained on Reaxys, compared to 45% of cases when using the general model (Table 2).…”
Section: Dataset Ring (mentioning)
confidence: 76%
“…The predictions were made by two models trained on the USPTO or Reaxys data respectively, to determine whether a difference in performance could be observed between the two datasets, considering their differing size and coverage as determined in a previous study [29]. We found that for the 20 substrates tested in this part of the study, 'Ring Breaker' performed better for the prediction of ring formations on average (Table 2). For each molecule, the top 50 predictions were restricted to those describing ring formations to compare the models.…”
Section: Prediction Of Well-known Ring Formations (mentioning)
confidence: 78%
“…[7] As such, it is unlikely that predictions beyond the top 50 templates will be enumerated even if considered due to their large cumulative probability, and there is no guarantee a predicted template can be successfully applied to yield a set of reactants [7,29]. Therefore, this reflects to our measure of comparison outlined previously, the quantity of ring formations in the top 50 templates predicted and their rank. Legend: number of ring formations predicted (rank of first applicable ring formation) e.g.…”
Section: Dataset Ring (mentioning)
confidence: 76%
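The comparison measure quoted above (the number of ring formations among the top 50 predicted templates, plus the rank of the first applicable one) can be illustrated with a minimal sketch. The `Prediction` record, its field names, and the `ring_formation_summary` helper are hypothetical and introduced only for illustration; they are not taken from the cited papers or from any library.

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class Prediction(NamedTuple):
    """Hypothetical record for one predicted template."""
    template_id: str    # illustrative identifier
    probability: float  # model score for the template
    forms_ring: bool    # assumed flag: template describes a ring formation
    applicable: bool    # assumed flag: applying it yields a set of reactants


def ring_formation_summary(
    predictions: Iterable[Prediction], top_k: int = 50
) -> Tuple[int, Optional[int]]:
    """Return (ring formations in the top_k, rank of first applicable one).

    Ranks are 1-based; the rank is None when no applicable ring-forming
    template appears in the top_k.
    """
    # Rank templates by descending model probability and keep the top_k.
    ranked = sorted(predictions, key=lambda p: p.probability, reverse=True)[:top_k]
    count = sum(1 for p in ranked if p.forms_ring)
    first_rank = next(
        (rank for rank, p in enumerate(ranked, start=1)
         if p.forms_ring and p.applicable),
        None,
    )
    return count, first_rank


# Toy input, entirely made up for the example.
toy = [
    Prediction("t1", 0.50, forms_ring=False, applicable=True),
    Prediction("t2", 0.30, forms_ring=True, applicable=False),
    Prediction("t3", 0.15, forms_ring=True, applicable=True),
    Prediction("t4", 0.05, forms_ring=True, applicable=True),
]
print(ring_formation_summary(toy))
```

Whether a template counts as "applicable" would in practice depend on actually applying it to the target molecule; here it is reduced to a precomputed flag to keep the sketch self-contained.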
“…This is in contrast to our previous observations, where we reported that the ability to generate synthetic routes for the general model did not depend on the training dataset [29]. We have now determined that for the domain specific case of ring formations there is a clear effect arising from the training set used, attributed to the number and diversity of the samples available to the network for training. The performance of the model on ring systems classed as 'rare' in the ZINC database, is surprising (Figure 6).…”
Section: Prediction Of Fragments (mentioning)
confidence: 99%
“…Solving a retrosynthetic problem is equivalent to exploring a directed acyclic graph of all possible retrosyntheses of a given target and finding the optimal route based on the optimization of specific cost functions (price of synthesis, raw materials availability, efficacy, etc.). Monte-Carlo Tree Search (MCTS) algorithms were the method of choice to explore retrosynthetic graphs in previous works [12,35,38]. Here, we use a hypergraph exploration strategy (see Section 4.5).…”
Section: Multi-step Workflow (mentioning)
confidence: 99%
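The framing in the statement above, finding the optimal route through an acyclic graph of retrosyntheses under a cost function, can be sketched on a toy example. This is a plain memoized exhaustive recursion, not Monte-Carlo Tree Search and not the hypergraph exploration strategy the quoted paper refers to. The molecule names, prices, reaction costs, and the `cheapest_route_cost` helper are all hypothetical.

```python
from functools import lru_cache

# Hypothetical purchasable building blocks with made-up prices.
PURCHASABLE = {"A": 1.0, "B": 2.0, "C": 5.0}

# Hypothetical retrosynthetic disconnections:
# product -> list of (reaction cost, reactants). Assumed to be acyclic.
REACTIONS = {
    "T": [(3.0, ("I", "C")), (10.0, ("A", "B"))],
    "I": [(1.0, ("A", "B"))],
}


@lru_cache(maxsize=None)
def cheapest_route_cost(molecule: str) -> float:
    """Minimum total cost to make `molecule` from purchasable materials.

    A purchasable molecule costs its price. Otherwise the cost is the
    cheapest disconnection: reaction cost plus the cost of every reactant.
    Returns infinity when no route exists.
    """
    if molecule in PURCHASABLE:
        return PURCHASABLE[molecule]
    options = REACTIONS.get(molecule, [])
    if not options:
        return float("inf")
    return min(
        step_cost + sum(cheapest_route_cost(r) for r in reactants)
        for step_cost, reactants in options
    )


print(cheapest_route_cost("T"))
```

Exhaustive recursion like this is only feasible for tiny graphs; the search methods named in the statement exist precisely because realistic retrosynthetic graphs are too large to enumerate.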