Abstract: Density functionals are often used
in ab initio thermochemistry
to provide optimized geometries for single-point evaluations at a
high level and to supply estimates of anharmonic zero-point energies
(ZPEs). Their use is motivated by relatively high accuracy at a modest
computational expense, but a thorough assessment of geometry-related
error seems to be lacking. We have benchmarked 53 density functionals,
focusing on approximations of the first four rungs and on relatively
small basis sets for computational e…
“…I note below PTCn (Polynomial Trend Correction of order n) the correction of the errors trend vs. V by a polynomial of degree n. For this study, the PTCn model's parameters and their uncertainty are estimated by standard least-squares, which is known to be robust to non-normal error distributions 89 . More complex PU models could be used for datasets presenting complex trends, requiring weighting schemes (e.g., in the case of heterogeneous reference values uncertainties 6 ) or the consideration of heteroscedasticity 21,42 . More specific trend correction models are defined in their application case.…”
Section: A-posteriori UQ Methods (mentioning)
confidence: 99%
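The polynomial trend correction described in the statement above can be illustrated with a short sketch. It assumes NumPy's ordinary least-squares `polyfit` as the fitting routine; the function name `polynomial_trend_correction`, the residual standard deviation it returns, and the synthetic data are illustrative assumptions, not the cited author's implementation.

```python
import numpy as np

def polynomial_trend_correction(v, errors, degree):
    """Fit a degree-n polynomial to errors vs. V and subtract the trend.

    Hypothetical helper: returns the fitted coefficients (highest power
    first), the trend-corrected errors, and the residual standard deviation.
    """
    v = np.asarray(v, dtype=float)
    errors = np.asarray(errors, dtype=float)
    coeffs = np.polyfit(v, errors, deg=degree)   # ordinary least squares
    residuals = errors - np.polyval(coeffs, v)   # trend-corrected errors
    dof = len(v) - (degree + 1)                  # degrees of freedom
    resid_sd = float(np.sqrt(np.sum(residuals ** 2) / dof))
    return coeffs, residuals, resid_sd

# Synthetic example (made-up numbers): errors with a purely linear trend.
v = np.linspace(0.0, 10.0, 21)
errors = 0.5 + 2.0 * v
coeffs, residuals, resid_sd = polynomial_trend_correction(v, errors, degree=1)
```

With real data the residuals would not vanish, and their spread is one possible starting point for a prediction uncertainty. Weighting or heteroscedastic models, which the statement mentions, are outside this sketch.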
“…They did not identify any concern about the predicted prediction intervals and conclude that their method "provides a fair estimate of 95 % confidence" 21 . This is a small dataset (N = 99), and I find interesting to check how the validation tools perform in this context.…”
Section: The Bak2021 Dataset (mentioning)
confidence: 99%
“…Bakowies and von Lilienfeld 21 proposed a new method to estimate zero-point energies (ZPE) and the corresponding expanded U 95 uncertainty in the framework of the composite ATOMIC method. Interestingly, they observed a quadratic dependence of the ZPE scaling factors and their dispersion with the fraction of heteroatoms in a set of 279 molecules.…”
Section: The Bak2021 Dataset (mentioning)
confidence: 99%
“…• Embedded UQ methods produce PU concurrently with property predictions. This is an heterogeneous class which presently encompasses the above-mentioned BEEF or Bayesian Ensemble approach 4 and several bottom-up correction methods with detailed uncertainty budgets (the Type B methods of Ruscic 5 ), such as the Feller-Peterson-Dixon method 17,18 or the ATOMIC protocol [19][20][21] . Some machine learning methods, such as Bayesian neural networks which are designed to provide PUs along with prediction, also belong here 22,23 .…”
Section: Introduction (mentioning)
confidence: 99%
“…When quantitative tools have been used, notably for the validation of a-posteriori methods, a popular test statistic is the prediction interval coverage probability (PICP) 23,41 . PICP tests have been done either on the full dataset 21,29,42 , or using a splitting scheme between calibration and validation sets 6,42 . Globally, there does not seem to be (yet) a consensus in the community on the adequate validation vocabulary, concepts and tools, which would be necessary to compare the merits of different CC-UQ strategies.…”
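As a minimal illustration of the PICP statistic named in the statement above: the sketch below counts the fraction of prediction errors that fall inside their reported intervals. It assumes symmetric intervals centered on the prediction, with half-widths such as an expanded U95 uncertainty; the function name `picp` and the numbers are made up for illustration.

```python
import numpy as np

def picp(errors, half_widths):
    """Prediction interval coverage probability.

    Hypothetical helper: fraction of errors (reference minus prediction)
    lying within +/- the reported interval half-width.
    """
    errors = np.asarray(errors, dtype=float)
    half_widths = np.asarray(half_widths, dtype=float)
    inside = np.abs(errors) <= half_widths
    return float(np.mean(inside))

# Made-up errors and interval half-widths for five predictions.
errors = [0.1, -0.4, 1.2, 0.0, -2.5]
u95 = [0.5, 0.5, 1.0, 0.3, 3.0]
coverage = picp(errors, u95)  # four of the five errors are covered
```

For well-calibrated 95 % intervals the coverage should be close to 0.95. On small datasets the statistic is itself uncertain, so it is best read with a confidence interval on the proportion.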
Uncertainty quantification (UQ) in computational chemistry (CC) is still in its infancy. Very few CC methods are designed to provide a confidence level on their predictions, and most users still rely improperly on the mean absolute error as an accuracy metric. The development of reliable uncertainty quantification methods is essential, notably for computational chemistry to be used confidently in industrial processes. A review of the CC-UQ literature shows that there is no common standard procedure to report or validate prediction uncertainty. I consider here analysis tools using concepts (calibration and sharpness) developed in meteorology and machine learning for the validation of probabilistic forecasters. These tools are adapted to CC-UQ and applied to datasets of prediction uncertainties provided by composite methods, Bayesian ensemble methods, machine learning and a posteriori statistical methods.
In this work, an efficient and generally applicable scheme for the automatic generation of the minimal set of independent model reactions to be used for the calculation of enthalpies of formation is presented. A post‐processing procedure that selects the most suitable model reactions by assigning them larger weights is suggested. The developed computational protocol, which exploits high‐level ab initio calculations and accurate reference enthalpies of formation retrieved from the Active Thermochemical Tables, reproduces with chemical accuracy the well‐established enthalpies of formation of 15 relatively small (C10–C24) polycyclic aromatic hydrocarbons (PAHs). A promising alternative single‐reaction strategy encompassing all reference species is outlined. Both methods are then applied to predict enthalpies of formation for the set of 43 larger (C32) PAHs, revealing a significant deviation (Mean Unsigned Error, MUE = 23.2 kJ mol−1) from the earlier theoretical results obtained from B3LYP calculations with subsequent group‐based empirical corrections. In the absence of experimental data, these evaluations of the enthalpies of formation are expected to be more realistic, as the approach employed in this work performed better on the benchmark set of 15 smaller PAHs, with an MUE of 1.3–1.5 versus 6.2 kJ mol−1.
The impact of complete basis set (CBS) extrapolation schemes, diffuse functions, and tight weighted‐core functions on enthalpies of formation predicted via the DLPNO‐CCSD(T1) reduced Feller‐Peterson‐Dixon approach has been examined for neutral H,C,O‐compounds. All tested three‐point (TZ/QZ/5Z) extrapolation schemes result in a mean unsigned deviation (MUD) below 2 kJ mol−1 relative to experiment. The two‐point QZ/5Z and TZ/QZ CBS 1/lmax^3 extrapolation schemes are inferior to their inverse power counterpart, 1/(lmax + 1/2)^4, by 1.3 and 4.3 kJ mol−1, respectively. The CBS‐extrapolated frozen‐core atomization energies are insensitive (within 1 kJ mol−1) to augmentation of the basis set with tight weighted‐core functions. The core‐valence correlation effects converge already at triple‐ζ, although double‐ζ/triple‐ζ CBS extrapolation performs better and is recommended. The effect of diffuse function augmentation converges slowly and cannot be reproduced with double‐ζ or triple‐ζ calculations, as these are plagued by basis set superposition and incompleteness errors.
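The two two‐point schemes compared in this abstract can be sketched generically. The sketch assumes the energy at cardinal number l behaves as E(l) = E_CBS + A·g(l), with g(l) = 1/l^3 or g(l) = 1/(l + 1/2)^4. The abstract does not say which energy component is extrapolated, and the function names and synthetic energies below are hypothetical.

```python
def two_point_cbs(e_small, e_large, l_small, l_large, decay):
    """Two-point extrapolation assuming E(l) = E_CBS + A * decay(l)."""
    g_small = decay(l_small)
    g_large = decay(l_large)
    a = (e_small - e_large) / (g_small - g_large)  # fitted prefactor A
    return e_large - a * g_large                   # extrapolated E_CBS

def inverse_cubic(l):
    """Decay function 1/l^3."""
    return l ** -3

def inverse_power(l):
    """Decay function 1/(l + 1/2)^4."""
    return (l + 0.5) ** -4

# Synthetic check (made-up numbers): build TZ/QZ energies from a known limit.
e_limit, prefactor = -100.0, 0.5
e_tz = e_limit + prefactor * inverse_cubic(3)
e_qz = e_limit + prefactor * inverse_cubic(4)
e_cbs = two_point_cbs(e_tz, e_qz, 3, 4, inverse_cubic)
```

Passing `inverse_power` instead of `inverse_cubic` switches the assumed convergence law without changing the algebra, which is why the decay function is a parameter.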