Metal–organic
frameworks (MOFs) are an emerging class of
porous materials and promising candidates for gas storage
and separation owing to their high porosity and high surface area.
Searching a large pool of candidates for the optimal methane
storage materials remains time consuming with traditional methods such
as molecular simulations and quantum mechanical calculations. Recently, machine
learning (ML) algorithms have increasingly been used to accelerate the discovery
of high-performance MOFs. In this work, Henry’s coefficient,
along with other characteristic parameters, was computed and appended
to a previously reported data set of hypothetical metal–organic
frameworks (hMOFs) for methane storage. The new data set with 37 features
and 130 397 samples was then randomly split into a training
set and a test set in a 7:3 ratio, which were used to train
and test three ML algorithms: support
vector machine, random forest regression (RFR), and gradient boosting
regression tree (GBRT). The results indicate that the GBRT model
generalizes best to data not seen during training, whereas
the RFR model has the best predictive power on the training
set. The analysis of feature importance from the machine learning algorithms
confirms that the high generalization ability of the GBRT model is
attributable to the model extracting more information from a wider range
of features. The RFR model achieves the highest prediction accuracy
with a squared Pearson correlation coefficient (r²) of 0.9984 and a root
mean square error (RMSE) of 3.93 on the training
set for absolute gravimetric uptake. The GBRT model achieves the
highest prediction accuracy on the test set for absolute gravimetric
uptake, with r² of 0.9908 and RMSE of 9.40,
which is the highest prediction accuracy among reports to date.
Based on volumetric capacities for methane storage, the optimal
hMOFs exhibit ϕ of 0.65–0.88, a largest cavity diameter (LCD)
of ∼7.5 Å, VSA of ∼2250 m² cm⁻³, etc.
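
The evaluation protocol described above (a random 7:3 split scored by r² and RMSE) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `split_indices`, `pearson_r2`, and `rmse`, the fixed seed, and the toy uptake values are all introduced here for illustration, and the regression model itself is left out.

```python
import math
import random


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Randomly split range(n_samples) into train and test index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


def pearson_r2(y_true, y_pred):
    """Square of the Pearson correlation coefficient between two sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov * cov / (var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


if __name__ == "__main__":
    # 7:3 split over the data set size quoted in the text.
    train_idx, test_idx = split_indices(130397)
    print(len(train_idx), len(test_idx))

    # Toy uptake values (hypothetical, for illustration only).
    observed = [120.0, 150.0, 180.0, 210.0]
    predicted = [118.0, 155.0, 176.0, 214.0]
    print(pearson_r2(observed, predicted), rmse(observed, predicted))
```

Computing both metrics separately on the training and test indices is what exposes the difference between fitting power and generalization that the text reports for RFR and GBRT.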