The ability to predict transport properties of liquids quickly
and accurately will greatly improve our understanding of fluid properties
in bulk liquids and complex mixtures, as well as in confined environments.
Such information could then be used in the design of materials and
processes for applications ranging from energy production and storage
to manufacturing. As a first step, we consider the use of
machine learning (ML) methods to predict the diffusion properties
of pure liquids. Recent results have shown that artificial neural
networks (ANNs) can effectively predict the diffusion of pure compounds
using experimental properties as model inputs. In
the current study, a similar ANN approach is applied to modeling diffusion
of pure liquids using fluid properties obtained exclusively from molecular
simulations. A diverse set of 102 pure liquids is considered, ranging
from small polar molecules (e.g., water) to large nonpolar molecules
(e.g., octane). Self-diffusion coefficients were obtained from classical
molecular dynamics (MD) simulations. Since nearly all the molecules
are organic compounds, a general set of force field parameters for
organic molecules was used. The MD methods were validated by comparing
physical and thermodynamic properties with experiment. Computational
input features for the ANN include physical properties obtained from
the MD simulations as well as molecular properties from quantum calculations
of individual molecules. Fluid properties describing the local liquid
structure were obtained from center-of-mass radial distribution functions
(COM-RDFs). Feature sensitivity analysis revealed that isothermal
compressibility, heat of vaporization, and the thermal expansion coefficient
were the most impactful properties used as input for the ANN model
to predict the MD-simulated self-diffusion coefficients. The MD-based
ANN successfully predicts the MD self-diffusion coefficients using
only a small subset (two to three) of the available computationally
determined input features. A separate ANN model was developed using
literature experimental self-diffusion coefficients as model targets.
Although this second ML model was less successful because of the limited
number of data points, a good correlation is still observed between
experimental and ML-predicted self-diffusion coefficients.
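
As an illustration of how a self-diffusion coefficient can be extracted from an MD trajectory, the sketch below applies the Einstein relation, MSD(t) ≈ 6Dt at long times in three dimensions, to a mean-squared displacement curve. This is one common route and is an assumption here: the abstract does not state which estimator was used. The function names, the `fit_start_fraction` parameter, the units (ps, nm^2), and the synthetic data are all hypothetical and chosen only for the example.

```python
def fit_slope(x, y):
    """Ordinary least-squares slope of y against x (pure Python)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def self_diffusion_from_msd(times_ps, msd_nm2, fit_start_fraction=0.2):
    """Estimate D in m^2/s from the long-time slope of a 3D MSD curve.

    Assumes the Einstein relation MSD(t) ~ 6 D t and skips the first
    `fit_start_fraction` of the points, where motion may not yet be
    diffusive. Both the relation and the cutoff are illustrative choices.
    """
    if len(times_ps) != len(msd_nm2):
        raise ValueError("times and MSD must have the same length")
    start = int(len(times_ps) * fit_start_fraction)
    if len(times_ps) - start < 2:
        raise ValueError("need at least two points in the fit window")
    slope_nm2_per_ps = fit_slope(times_ps[start:], msd_nm2[start:])
    d_nm2_per_ps = slope_nm2_per_ps / 6.0
    return d_nm2_per_ps * 1.0e-6  # 1 nm^2/ps = 1e-6 m^2/s


# Synthetic, perfectly linear MSD built from an assumed D of 2.0e-9 m^2/s
# (2.0e-3 nm^2/ps); this is not data from the study.
times = [float(t) for t in range(0, 101)]        # ps
msd = [6.0 * 2.0e-3 * t for t in times]          # nm^2
d_estimate = self_diffusion_from_msd(times, msd)
print(f"Estimated D = {d_estimate:.3e} m^2/s")
```

With real trajectories the MSD is noisy and the fit window matters, so the start fraction would need to be checked against the onset of the linear regime.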