Abstract: A neural network-based first-principles method for predicting heat of formation (HOF) was previously demonstrated to achieve chemical accuracy across a broad spectrum of target molecules [L. H. Hu et al., J. Chem. Phys. 119, 11501 (2003)]. However, its accuracy deteriorates as molecular size increases. A closer inspection reveals a systematic correlation between the prediction error and the molecular size, which appears correctable by further statistical analysis, calling for a more sophisticated …
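The size-dependent error described above can, in principle, be removed by fitting a simple statistical correction. The following is a minimal sketch of that idea, assuming the systematic bias is roughly linear in the number of atoms; the data here are purely illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical residuals (predicted - experimental HOF, kcal/mol) that grow
# systematically with molecule size, mimicking the trend the abstract describes.
n_atoms = np.array([4, 8, 12, 20, 30, 44, 60], dtype=float)
residual = 0.15 * n_atoms + np.array([0.3, -0.2, 0.4, -0.1, 0.2, -0.3, 0.1])

# Fit a linear size-dependent bias: residual ~ a * n_atoms + b
a, b = np.polyfit(n_atoms, residual, 1)

def corrected_hof(raw_pred, n):
    """Subtract the fitted systematic size bias from a raw NN prediction."""
    return raw_pred - (a * n + b)
```

A more sophisticated scheme could replace the linear fit with any regression on size-related features, but even this two-parameter correction illustrates how a systematic size trend is removable after the fact.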
“…Obviously, in all these special-purpose approaches, ML's role is relegated to only predicting or improving the prediction of enthalpy of formation for a given chemical structure, and some ML-based approaches have the further limitation that they were developed only for certain classes of compounds such as acyclic hydrocarbons, 33 cyclic hydrocarbons, 34 energetic materials, 8 or fuels. 39 Such special-purpose ML approaches also rely on molecular structures and other descriptors derived from structures, which are provided to an ML model, with the consequence that the ML model itself can neither generate a new molecular geometry nor improve upon it. An alternative to both QM and special-purpose ML approaches comes from a parallel development of general-purpose, data-driven methods based on ML, which target accurate predictions of QM potential energies for a wide range of compounds and can be used as a drop-in replacement for QM or force-field methods in many simulations such as molecular dynamics and geometry optimizations.…”
Enthalpies of formation and reaction are important thermodynamic properties that have a crucial impact on the outcome of chemical transformations. Here we implement the calculation of enthalpies of formation with the general-purpose ANI-1ccx neural network atomistic potential. We demonstrate on a wide range of benchmark sets that both ANI-1ccx and our other general-purpose data-driven method AIQM1 approach the coveted chemical accuracy of 1 kcal/mol with the speed of semiempirical quantum mechanical methods (AIQM1) or faster (ANI-1ccx). Remarkably, this is achieved without specifically training the machine learning parts of ANI-1ccx or AIQM1 on formation enthalpies. Importantly, we show that these data-driven methods provide statistical means for uncertainty quantification of their predictions, which we use to detect and eliminate outliers and revise reference experimental data. Uncertainty quantification may also help in the systematic improvement of such data-driven methods.
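Two ingredients of the workflow described above can be sketched compactly: deriving an enthalpy of formation from predicted energies via an atomization scheme, and ensemble-based uncertainty quantification (the usual statistical route for neural network potentials such as ANI-1ccx). This is a simplified sketch, not the authors' implementation; the atomic heats of formation below are standard gas-phase 298 K values from common compilations and should be checked against a current reference (e.g., ATcT) before use.

```python
import numpy as np

# Experimental gas-phase atomic heats of formation at 298 K (kcal/mol).
# Illustrative standard values; verify against an authoritative compilation.
ATOM_HOF = {"H": 52.103, "C": 171.29, "N": 112.97, "O": 59.56}

def hof_from_atomization(e_mol, atom_energies, atoms):
    """Atomization-based enthalpy of formation:
    Hf(mol) = E(mol) - sum E(atom) + sum Hf_expt(atom)."""
    return (e_mol
            - sum(atom_energies[a] for a in atoms)
            + sum(ATOM_HOF[a] for a in atoms))

def ensemble_predict(member_preds):
    """Ensemble mean as the prediction; the standard deviation across
    ensemble members serves as an uncertainty estimate, which can be
    thresholded to flag possible outliers or suspect reference data."""
    preds = np.asarray(member_preds, dtype=float)
    return preds.mean(), preds.std()
```

In practice, a prediction whose ensemble standard deviation exceeds a chosen threshold would be flagged for closer inspection rather than trusted at face value.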
“…Simultaneously, machine learning (ML) has been added to the quantum chemical toolbox, leading to a significant decrease in the computational cost and/or increase in the accuracy of the corresponding calculated properties. The success of a given ML model depends on its chosen set of molecular descriptors, as the representation must fully describe patterns in the desired output values.…”
Section: Machine Learning Models in Thermochemistry
Recent advances in theoretical thermochemistry have allowed the study of small organic and bio-organic molecules with high accuracy. However, applications to larger molecules are still impeded by the steep scaling problem of highly accurate quantum mechanical (QM) methods, forcing the use of approximate, more cost-effective methods at a greatly reduced accuracy. One of the most successful strategies to mitigate this error is the use of systematic error-cancellation schemes, in which highly accurate QM calculations can be performed on small portions of the molecule to construct corrections to an approximate method. Herein, we build on ideas from fragmentation and error-cancellation to introduce a new family of molecular descriptors for machine learning modeled after the Connectivity-Based Hierarchy (CBH) of generalized isodesmic reaction schemes. The best-performing descriptor ML(CBH-2) is constructed from fragments preserving only the immediate connectivity of all heavy (non-H) atoms of a molecule, along with overlapping regions of fragments in accordance with the inclusion-exclusion principle. Our proposed approach offers a simple, chemically intuitive grouping of atoms, tuned with an optimal amount of error-cancellation, and outperforms previous structure-based descriptors using a much smaller input vector length. For a wide variety of density functionals, DFT+ΔML(CBH-2) models, trained on a set of small- to medium-sized organic HCNOSCl-containing molecules, achieved an out-of-sample MAE within 0.5 kcal/mol and a 2σ (95%) confidence interval of <1.5 kcal/mol compared to accurate G4 reference values at DFT cost.
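The CBH-2-style fragmentation described above can be illustrated with a small sketch: one fragment per heavy atom (the atom plus its bonded heavy neighbors), with the overlapping bond fragments subtracted per the inclusion-exclusion principle. This is a simplified, hypothetical rendition that ignores hydrogens and bond orders (the actual CBH-2 scheme caps fragments with hydrogens); fragment labels here are ad hoc canonical tuples.

```python
from collections import Counter

def cbh2_descriptor(atoms, bonds):
    """Toy CBH-2-style fragment counts for a heavy-atom graph.
    atoms: list of element symbols; bonds: list of (i, j) index pairs."""
    nbrs = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        nbrs[i].append(j)
        nbrs[j].append(i)

    # One fragment per heavy atom: central symbol + sorted neighbor symbols.
    frags = Counter()
    for i in range(len(atoms)):
        frags[(atoms[i], tuple(sorted(atoms[k] for k in nbrs[i])))] += 1

    # Overlaps (shared bonds) enter with a negative sign (inclusion-exclusion).
    overlaps = Counter()
    for i, j in bonds:
        overlaps[tuple(sorted((atoms[i], atoms[j])))] -= 1
    return frags, overlaps
```

For the three heavy atoms of propane (C-C-C), this yields two terminal CH3-like fragments, one central fragment with two carbon neighbors, and two subtracted C-C overlaps; concatenating such signed counts over a fixed fragment vocabulary gives the ML input vector.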
“…Yang et al. 30 introduce a size-independent NN model of heats of formation trained on small organic molecules that can be applied to large molecules. For these, the MAE from reference B3LYP numbers is reduced to 1.7 kcal/mol.…”
Section: A Prediction of Energies and Other Properties Throughout Ch…
A survey of the contributions to the Special Topic on Data-enabled Theoretical Chemistry is given, including a glossary of relevant machine learning terms.