We present a data-driven approach
to identifying the reaction network
of the dominant chemistry in complex mixtures using model compounds
representative of cellulose and lignin chemistry that are processed
using hydrous pyrolysis. We present two methods for the identification
of pseudocomponents: self-modeling multivariate curve resolution,
which is a non-negative matrix factorization method, and Bayesian
hierarchical clustering. The pseudocomponents are identified from
spectroscopic data from two sources: Fourier transform infrared spectroscopy
and ¹H NMR spectroscopy. The data from the two sources
are combined using a simple data combination method. Once pseudocomponents
have been identified, Bayesian networks are used to identify directed
pathways between the components, resulting in a proposed hypothesis
for the reaction network or mechanism. We validate the methods by
showing consistency of the derived reaction networks with the known
chemistry of cellulose, lignin, and their derivatives and demonstrate
the importance of data fusion in developing believable reaction networks.
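
The fusion and curve-resolution steps described above can be illustrated with a minimal, dependency-free Python sketch. It rests on several assumptions that are not taken from the paper: the "simple data combination method" is stood in for by scaling each spectral block to unit Frobenius norm and concatenating the blocks column-wise, the non-negative factorization uses plain Lee-Seung multiplicative updates rather than a full self-modeling curve resolution routine, and the spectra are synthetic. All function names (`fuse_blocks`, `nmf`, `residual`) are hypothetical.

```python
import math
import random


def frobenius(M):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in M for v in row))


def transpose(M):
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def fuse_blocks(blocks):
    """Scale each block to unit Frobenius norm, then join the blocks column-wise.

    Assumption: this fusion rule is illustrative only; the abstract does not
    say which combination method was used.
    """
    fused = [[] for _ in blocks[0]]
    for B in blocks:
        norm = frobenius(B)
        for fused_row, row in zip(fused, B):
            fused_row.extend(v / norm for v in row)
    return fused


def nmf(X, k, iters=500, seed=0, eps=1e-12):
    """Factor X (samples x channels) as C @ S with C, S >= 0.

    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    C holds pseudocomponent concentrations, S the pseudocomponent spectra.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    C = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    S = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Ct = transpose(C)
        num, den = matmul(Ct, X), matmul(matmul(Ct, C), S)
        S = [[s * a / (b + eps) for s, a, b in zip(rs, rn, rd)]
             for rs, rn, rd in zip(S, num, den)]
        St = transpose(S)
        num, den = matmul(X, St), matmul(C, matmul(S, St))
        C = [[c * a / (b + eps) for c, a, b in zip(rc, rn, rd)]
             for rc, rn, rd in zip(C, num, den)]
    return C, S


def residual(X, C, S):
    """Frobenius norm of X - C @ S."""
    R = matmul(C, S)
    return frobenius([[x - r for x, r in zip(rx, rr)] for rx, rr in zip(X, R)])


# Synthetic example: two pseudocomponents observed in five samples.
conc = [[1.0, 0.0], [0.8, 0.2], [0.5, 0.5], [0.2, 0.8], [0.0, 1.0]]
ftir_spectra = [[1.0, 0.5, 0.0, 0.2], [0.0, 0.3, 1.0, 0.4]]  # 4 FTIR channels
nmr_spectra = [[0.9, 0.1, 0.0], [0.1, 0.2, 0.8]]             # 3 NMR channels

ftir = matmul(conc, ftir_spectra)
nmr = matmul(conc, nmr_spectra)

X = fuse_blocks([ftir, nmr])   # 5 samples x 7 fused channels
C, S = nmf(X, k=2)
print("residual after fitting:", residual(X, C, S))
```

Scaling each block before concatenation keeps the source with more channels or larger raw intensities from dominating the factorization; other weightings are equally plausible and would change the recovered pseudocomponents.
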
In this work, we analyze the hydrous pyrolysis of a physical mixture of model compounds representing cellulose (levoglucosan) and lignin (2-phenoxyethyl benzene). Fourier transform infrared (FTIR) and proton nuclear magnetic resonance (¹H NMR) spectroscopy were used to characterize the reaction products. The main objective was to use data-driven methods to develop a reaction network for this system from the spectroscopic data. This was achieved by using Bayesian hierarchical clustering to identify pseudocomponents and Bayesian networks to infer a reaction network among them. The data-driven reaction network was shown to be consistent with the known pyrolysis chemistry of cellulose and lignin, and the chemistry of the physical mixture combined elements of the reaction mechanisms of both.