Zhen Guo scite author profile

Automated prediction of reaction impurities is useful in early‐stage reaction development, synthesis planning and optimization. Existing reaction predictors are catered towards main product prediction, and are often black‐box, making it difficult to troubleshoot erroneous outcomes. This work aims to present an automated, interpretable impurity prediction workflow based on data mining large chemical reaction databases. A 14‐step workflow was implemented in Python and RDKit using Reaxys® data. Evaluation of potential chemical reactions between functional groups present in the same reaction environment in the user‐supplied query species can be accurately performed by directly mining the Reaxys® database for similar or ‘analogue’ reactions involving these functional groups. Reaction templates can then be extracted from analogue reactions and applied to the relevant species in the original query to return impurities and transformations of interest. Three proof‐of‐concept case studies (paracetamol, agomelatine and lersivirine) were conducted, with the workflow correctly suggesting impurities within the top two outcomes. At all stages, suggested impurities can be traced back to the originating template and analogue reaction in the literature, allowing for closer inspection and user validation. Ultimately, this work could be useful as a benchmark for more sophisticated algorithms or models since it is interpretable, as opposed to purely black‐box solutions.

Discovering Circular Process Solutions through Automated Reaction Network Optimization

Weber, Guo, Lapkin. 2022. ACS Eng. Au

The transition toward a circular and biobased chemical industry is needed to cut global CO₂ emissions and limit the chemical industry's overall impact on the environment. However, the development of circular chemical reaction systems is challenging as it requires symbiotic sets of novel chemical reaction pathways and involves unconventional processing steps. We present a methodological pipeline for automated reaction network optimization. The tools can guide the development of circular processes on the reaction pathway level. Chemical big data combined with energetic assessment metrics and state-of-the-art decision-making has the potential to efficiently identify the most promising reaction systems. We mine large-scale chemical reaction data from the Reaxys database and automate the screening of pathways based on chemical rules. We then approximate thermodynamic properties for exergy calculations of the prescreened pathways and formulate the optimization problem as linear programming and mixed-integer linear programming problems. The methodological workflow is illustrated in a case study on the conversion of β-pinene to citral. Our results show that the tools are well suited to model circular process interactions within different environment scenarios.

Reaction impurity prediction using a data mining approach

Arun, Guo, Sung, et al. 2022. Preprint

Automated prediction of reaction impurities can be useful in facilitating rapid early-stage reaction development, synthesis planning and optimization. Existing reaction predictors are catered towards main product prediction, and are often black-box, making it difficult to troubleshoot erroneous outcomes. This work presents an automated, interpretable impurity prediction workflow based on data mining large chemical reaction databases. A 14-step workflow was implemented in Python and RDKit using Reaxys® data. Evaluation of potential chemical reactions between functional groups present in the same reaction environment in the user-supplied query species can be accurately performed by directly mining the Reaxys® database for similar or ‘analogue’ reactions involving these functional groups. Reaction templates can then be extracted from analogue reactions and applied to the relevant species in the original query to return impurities and transformations of interest. Three proof-of-concept case studies based on active pharmaceutical ingredients (paracetamol, agomelatine and lersivirine) were conducted, with the workflow able to suggest the correct impurities within the top two outcomes. At all stages, suggested impurities can be traced back to the originating template and analogue reaction in the literature, allowing for closer inspection and user validation. Ultimately, this work could be useful as a benchmark for more sophisticated algorithms or models since it is interpretable, as opposed to purely black-box solutions, and illustrates the potential of chemical data in impurity prediction.

Zhen Guo

Chemical data intelligence for sustainable chemistry

Towards circular economy: integration of bio-waste into chemical supply chain

Phase-field crystal simulation of evolution of liquid pools in grain boundary pre-melting regions

Multi-component phase-field simulation of microstructural evolution and elemental distribution in Fe–Cu–Mn–Ni–Al alloy

Exploring a Sustainable Business Routing for China’s New Energy Vehicles: BYD as an Example

Reaction Impurity Prediction using a Data Mining Approach

Discovering Circular Process Solutions through Automated Reaction Network Optimization

Reaction impurity prediction using a data mining approach
