Data-driven and knowledge-driven methods are two approaches used in studying reaction kinetics. This article proposes a hybrid-modeling framework for homogeneous synthesis reactions that combines the high level of automation of the data-driven approach with the improved accuracy of the knowledge-driven approach. A constrained enumeration method is proposed to generate candidate stoichiometries, and dynamic response surface methodology, target factor analysis, and mass balance are used together to identify stoichiometries one-by-one, without the need for an expert-generated candidate list. The screened stoichiometries are then combined into groups that represent candidate reaction systems, and the group (or groups) with the greatest likelihood is identified based on kinetic fitting and reaction dynamic criteria. The framework is demonstrated on several examples of different reaction systems. In every example the true reaction stoichiometries are correctly identified and accurate kinetic models are obtained, showing satisfactory performance of the proposed method.
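To illustrate what a constrained enumeration of candidate stoichiometries can look like, the following is a minimal sketch and not the article's actual procedure. It assumes that the constraints are an element balance (used here as a stand-in for the mass balance mentioned above), a bound on the integer coefficients, and a cap on the number of participating species. The function name `enumerate_stoichiometries`, its parameters, and the four-species example are hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd


def enumerate_stoichiometries(composition, max_coeff=2, max_species=4):
    """Enumerate element-balanced integer stoichiometries (illustrative only).

    composition maps each species name to a dict of element counts.
    Negative coefficients denote reactants, positive ones denote products.
    """
    species = list(composition)
    elements = sorted({e for comp in composition.values() for e in comp})
    candidates = []
    coeff_range = range(-max_coeff, max_coeff + 1)
    for nu in product(coeff_range, repeat=len(species)):
        nonzero = [c for c in nu if c != 0]
        # Assumed constraint: limit how many species take part.
        if not nonzero or len(nonzero) > max_species:
            continue
        # A reaction needs at least one reactant and one product.
        if all(c > 0 for c in nonzero) or all(c < 0 for c in nonzero):
            continue
        # Keep one direction only: the first participating species is a reactant.
        if nonzero[0] > 0:
            continue
        # Skip integer multiples of a simpler stoichiometry.
        if reduce(gcd, (abs(c) for c in nonzero)) != 1:
            continue
        # Element balance: every element must be conserved.
        balanced = all(
            sum(composition[s].get(e, 0) * c for s, c in zip(species, nu)) == 0
            for e in elements
        )
        if balanced:
            candidates.append(dict(zip(species, nu)))
    return candidates


# Hypothetical species set: ethylene, hydrogen, ethane, butene.
example = {
    "C2H4": {"C": 2, "H": 4},
    "H2": {"H": 2},
    "C2H6": {"C": 2, "H": 6},
    "C4H8": {"C": 4, "H": 8},
}
result = enumerate_stoichiometries(example)
for candidate in result:
    print(candidate)
```

In the framework described above, a list like this would only be the starting point: each candidate is then screened against measured data before the surviving stoichiometries are grouped into candidate reaction systems.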
As one of the most influential industries in public health and the global economy, the pharmaceutical industry is facing multiple challenges in drug research, development, and manufacturing. With recent developments in artificial intelligence and machine learning, data-driven modeling methods and techniques have enabled fast and accurate modeling for drug molecular design, retrosynthetic analysis, chemical reaction outcome prediction, manufacturing process optimization, and many other aspects of the pharmaceutical industry. This article reviews data-driven methods applied in pharmaceutical processes, organized by the mathematical and algorithmic principles behind the modeling methods. Different statistical tools, such as multivariate tools and Bayesian inference, and machine learning approaches, i.e., unsupervised learning, supervised learning (including deep learning), and reinforcement learning, are presented. Various applications in pharmaceutical processes, as well as the connections between statistics and machine learning methods, are discussed as each type of data-driven model is introduced. Afterwards, two case studies, dynamic reaction data modeling and catalyst-kinetics prediction of cross-coupling reactions, are presented to illustrate the power and advantages of different data-driven models. We also discuss current challenges and future perspectives of data-driven modeling methods, emphasizing the integration of data-driven and mechanistic models, as well as multi-scale modeling.