At present, more than 100 000 metal–organic frameworks (MOFs)
have been synthesized, and it is challenging to identify the best
candidate for a specific application. In this study, MOFs are rapidly
screened via a hierarchical approach for propane/propylene (C3H8/C3H6) separation. First,
the adsorption capacity and selectivity of a C3H8/C3H6 mixture in “Computation-Ready,
Experimental” (CoRE) MOFs are predicted by molecular simulation
(MS). The relationships between separation metrics and structural
factors are established, and top-performing CoRE MOFs are identified.
Then, machine learning (ML) models are trained on
the CoRE MOFs using pore size, pore geometry, and framework chemistry
as feature descriptors. By introducing binned pore size distributions
and geometric descriptors, the accuracy of ML models is substantially
improved. The feature importance of the descriptors is physically
interpreted using Gini impurity and Shapley Additive Explanations.
Subsequently, the ML models are used to rapidly screen experimental
“Cambridge Structural Database” (CSD) MOFs and hypothetical
MOFs for C3H8/C3H6 separation.
For the CSD MOFs, the out-of-sample predictions agree
well with simulation results, demonstrating the excellent transferability
of the ML models from the CoRE to CSD MOFs. Moreover, nine CSD MOFs
are identified with separation performance superior to that of the top-performing
CoRE MOFs. Finally, the similarity and diversity among experimental
and hypothetical MOFs are visualized and compared using t-Distributed
Stochastic Neighbor Embedding (t-SNE) feature projections. Remarkably,
the CoRE and CSD MOFs are revealed to share a close similarity in
both chemical and geometric feature spaces. By synergizing MS and
ML, the hierarchical approach developed in this study would advance
the rapid screening of MOFs across different databases toward industrially
important separation processes.
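
As an illustration of two quantities mentioned above, the sketch below computes an adsorption selectivity and a binned pore size distribution descriptor. It is a minimal sketch under stated assumptions, not the procedure used in this study: the selectivity follows the commonly used definition (ratio of adsorbed amounts divided by ratio of bulk mole fractions), and the function names, bin edges, and sample values are hypothetical.

```python
from bisect import bisect_right


def adsorption_selectivity(n_a, n_b, y_a, y_b):
    """Selectivity of component A over B in a binary mixture.

    Assumed definition (not stated in the abstract):
    S = (n_a / n_b) / (y_a / y_b), where n is the adsorbed amount and
    y is the bulk-phase mole fraction of each component.
    """
    if n_b <= 0 or y_a <= 0 or y_b <= 0:
        raise ValueError("n_b, y_a and y_b must be positive")
    return (n_a / n_b) / (y_a / y_b)


def binned_psd(diameters, weights, edges):
    """Fraction of total weight falling in each bin [edges[i], edges[i+1]).

    diameters: pore diameters (e.g. in angstroms)
    weights:   weight of each diameter (e.g. pore volume contribution)
    edges:     sorted bin edges; values outside the range are ignored
    """
    totals = [0.0] * (len(edges) - 1)
    for d, w in zip(diameters, weights):
        i = bisect_right(edges, d) - 1  # index of the bin containing d
        if 0 <= i < len(totals):
            totals[i] += w
    total = sum(totals)
    # Normalize so the descriptor sums to 1 (all zeros if nothing was binned)
    return [t / total if total > 0 else 0.0 for t in totals]


# Hypothetical example values, for illustration only
selectivity = adsorption_selectivity(n_a=3.0, n_b=1.5, y_a=0.5, y_b=0.5)
psd_features = binned_psd(
    diameters=[3.5, 5.0, 5.5, 9.0],
    weights=[1.0, 2.0, 1.0, 4.0],
    edges=[0.0, 4.0, 6.0, 8.0, 12.0],
)
```

Each entry of the binned descriptor can then serve as one input feature alongside scalar geometric and chemical descriptors; the choice of bin edges here is arbitrary.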