In this study, 13 machine learning (ML) models were employed to
predict the phase equilibrium temperatures of the CO2 hydrate
in systems with salts and organic inhibitors: Multiple Linear Regression
(MLR), Support Vector Regression (SVR), k-Nearest Neighbors (KNN),
Multi-Layer Perceptron (MLP), Decision Tree (DT), Random Forest (RF),
Extra Trees (ET), Adaptive Boosting (AdaBoost), Categorical Boosting
(CatBoost), Gradient Boosting Machine (GBM), Light Gradient Boosting
Machine (LGBM), Histogram-Based Gradient Boosting (HistGB), and eXtreme
Gradient Boosting (XGBoost). A dataset consisting of 1801 experimentally
measured equilibrium data points was gathered, which included both
pure water systems and systems with thermodynamic inhibitors. After
data preprocessing, 1402 data points were selected for ML training
and validation. Boosting algorithms generally yielded high predictive
accuracy, and the MLP demonstrated notably superior accuracy. The
predicted equilibrium temperatures for complex systems containing
both salts and organic inhibitors were also compared with those calculated
by the widely used CSMGem software. With the exception of SVR, KNN,
and DT, all models outperformed CSMGem in the statistical assessment.
In particular, the CatBoost model accurately predicted equilibrium
temperatures for most test sets combining salts and organic
inhibitors. This study underscores the viability of ML models for
predicting the phase equilibria of the CO2 hydrate in the
presence of single and mixed thermodynamic inhibitors.
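
The statistical comparison between model predictions and a reference calculation such as CSMGem can be illustrated with a minimal sketch. The abstract does not name the metrics used, so the choice of mean absolute error (MAE), root-mean-square error (RMSE), and the coefficient of determination (R^2) is an assumption here, and the function name `regression_metrics` and all temperature values are hypothetical placeholders, not data from the study.

```python
import math


def regression_metrics(measured, predicted):
    """Return MAE, RMSE, and R^2 for paired measured/predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(measured)
    residuals = [p - m for m, p in zip(measured, predicted)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    # R^2 is undefined when all measured values are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Placeholder equilibrium temperatures in kelvin, invented for illustration
# only (assumption: these are NOT values from the study).
measured_T = [272.0, 274.0, 276.0, 278.0]
predictions = {
    "model_a": [272.5, 273.5, 276.5, 277.5],
    "model_b": [273.0, 275.0, 277.0, 279.0],
}

# Score every candidate against the same measured values.
scores = {name: regression_metrics(measured_T, pred)
          for name, pred in predictions.items()}

# Rank by RMSE: the lowest error is treated as the best performer.
best = min(scores, key=lambda name: scores[name]["rmse"])
```

Scoring every model and the reference calculation against the same held-out measurements keeps the comparison like for like; ranking on a single metric such as RMSE is one possible convention, and a study may weigh several metrics together.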