2020
DOI: 10.1063/5.0006202

Probabilistic performance estimators for computational chemistry methods: Systematic improvement probability and ranking probability matrix. I. Theory

Abstract: The comparison of benchmark error sets is an essential tool for the evaluation of theories in computational chemistry. The standard ranking of methods by their Mean Unsigned Error is unsatisfactory for several reasons linked to the non-normality of the error distributions and the presence of underlying trends. Complementary statistics have recently been proposed to palliate such deficiencies, such as quantiles of the absolute errors distribution or the mean prediction uncertainty. We introduce here a new score…
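As a rough illustration of the abstract's point, here is a minimal Python sketch (synthetic error sets, not data from the paper) contrasting the Mean Unsigned Error with the 95th percentile of the absolute errors; a heavy-tailed error set can look competitive by MUE while its large-error quantile tells a different story:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical benchmark error sets (arbitrary units) for two methods.
# Method B is drawn from a heavy-tailed distribution to mimic the
# non-normal error distributions the abstract warns about.
errors_a = rng.normal(loc=0.0, scale=1.0, size=200)
errors_b = 0.8 * rng.standard_t(df=3, size=200)

for name, e in (("A", errors_a), ("B", errors_b)):
    mue = np.mean(np.abs(e))            # Mean Unsigned Error
    q95 = np.quantile(np.abs(e), 0.95)  # 95th percentile of absolute errors
    print(f"method {name}: MUE = {mue:.2f}, Q95 = {q95:.2f}")
```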

Cited by 19 publications (28 citation statements): 2 supporting, 26 mentioning, 0 contrasting
References 51 publications

“…In some instances, prediction errors due to model inadequacy can be handled by statistical correction of predictions, which may provide a reliable uncertainty measure [20]. Various surrogate methods have been developed for the estimation of prediction uncertainty, such as bootstrap-based methods, Gaussian process regression, neural networks and deep learning ensembles [21][22][23]. Gaussian process regression has been employed to identify particular calculations within a given dataset for which the uncertainties exceed a given threshold [24,25].…”
Section: Introduction (mentioning)
confidence: 99%
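The bootstrap approach named in this excerpt is easy to illustrate. Below is a minimal sketch on entirely synthetic data, with an ordinary linear least-squares fit standing in for whatever surrogate model is actually used; the spread of predictions across resampled fits serves as the uncertainty estimate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic calibration data: y = 2x + noise (purely illustrative).
x = rng.uniform(0.0, 10.0, size=50)
y = 2.0 * x + rng.normal(scale=1.5, size=50)

# Bootstrap: resample the data with replacement, refit, and collect the
# prediction at a new point from each refitted model.
x_new = 5.0
preds = []
for _ in range(1000):
    idx = rng.integers(0, len(x), size=len(x))
    slope, intercept = np.polyfit(x[idx], y[idx], deg=1)
    preds.append(slope * x_new + intercept)

preds = np.asarray(preds)
print(f"prediction at x = {x_new}: {preds.mean():.2f} +/- {preds.std():.2f}")
```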
“…Overall, we consider the Gaussian fit a reasonable approximation to the empirical distribution, which is also reflected by the respective 95% confidence intervals: σ_.95 = 1.70×10⁻¹ (normal distribution), Q_.95 = 1.76×10⁻¹ (empirical distribution). The latter quantity refers to the distribution of absolute values of the residuals [28,29].…”
Section: Model Dispersion vs. Measurement Uncertainty (mentioning)
confidence: 99%
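For concreteness, the comparison made in this excerpt can be reproduced in a few lines. The sketch below assumes σ_.95 denotes the normal-model 95th percentile of the absolute residuals (about 1.96σ for zero-centered residuals) and uses synthetic residuals whose scale is chosen only so the magnitudes resemble the quoted values:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Synthetic residuals, not the data behind the quoted numbers.
residuals = rng.normal(loc=0.0, scale=0.087, size=500)

sigma = residuals.std(ddof=1)
# Normal model: the 95th percentile of |r| for zero-centered residuals is
# sigma * Phi^{-1}(0.975), roughly 1.96 * sigma.
sigma_95 = sigma * stats.norm.ppf(0.975)
# Empirical: direct 95% quantile of the absolute residuals.
q_95 = np.quantile(np.abs(residuals), 0.95)

print(f"sigma_.95 (normal model) = {sigma_95:.3f}")
print(f"Q_.95 (empirical)        = {q_95:.3f}")
```

Agreement between the two numbers, as in the excerpt, indicates that the Gaussian model captures the tail of the residual distribution reasonably well.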
“…This lack of correlation supports the main message of this work: the number of fitted parameters does not represent an effective measure of the transferability of a functional. More reliable statistical criteria, such as those developed in this work or, alternatively, the probabilistic performance estimator recently introduced by Pernot and Savin [91,92], should be used to evaluate the reliability of new and existing xc functionals.…”
Section: Evaluation of 60 Exchange-Correlation Functionals (mentioning)
confidence: 99%
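The "probabilistic performance estimator" cited here as Pernot and Savin [91,92] is the systematic improvement probability (SIP) introduced in the paper above. A minimal sketch of one plausible reading, on made-up error sets: SIP is estimated as the fraction of benchmark systems for which the new method has a smaller absolute error than the reference method.

```python
import numpy as np

def sip(errors_new, errors_ref):
    # Estimated systematic improvement probability: the fraction of
    # benchmark systems on which the new method has a strictly smaller
    # absolute error than the reference method. Ties count as
    # non-improvements here, a simplification of the paper's treatment.
    e_new = np.abs(np.asarray(errors_new, dtype=float))
    e_ref = np.abs(np.asarray(errors_ref, dtype=float))
    return float(np.mean(e_new < e_ref))

# Made-up error sets over the same six benchmark systems.
e_ref = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1]
e_new = [0.4, -0.9, 0.5, 1.5, -0.2, 1.3]
print(f"SIP(new over ref) = {sip(e_new, e_ref):.2f}")  # 4/6, about 0.67
```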