2023
DOI: 10.1039/d2dd00146b

Calibration and generalizability of probabilistic models on low-data chemical datasets with DIONYSUS

Abstract: A toolkit for the study of the calibration, performance, and generalizability of probabilistic models and molecular featurizations for low-data chemical datasets.

Cited by 16 publications (26 citation statements)
References 81 publications
“…Empirically, GPs tend to be effective surrogate models for Bayesian optimization of molecules in the small-data regime. 109…”
Section: Results (mentioning)
confidence: 99%
“…, require a small number of examples to learn to make accurate predictions), and (iii) express well-calibrated uncertainty. 109,113,114 The acquisition function for experimental planning must balance exploration, exploitation, and cost. The surrogate model and acquisition function must be cheap to train and evaluate, respectively, relative to the simulations/experiments to evaluate the material property.…”
Section: Discussion (mentioning)
confidence: 99%
“…Severity of the Distribution Shifts. The ultimate severity measure of the training to deployment covariate shift is the gap in performance and uncertainty calibration 17 (see Section 4.6.1). Due to the cyclic nature of the drug discovery process, uncertainty calibration is specifically important during deployment to effectively balance between exploration and exploitation.…”
Section: Results (mentioning)
confidence: 99%
“…Existing research, however, suggests that the application of BO can still help reach promising results even in those scenarios. 49 Despite these challenges, we demonstrate that augmenting BO with adequate reaction representations, initialisation schemes and appropriate surrogate models results in an efficient search towards the best-performing additives in less than 100 evaluations while using as little as ten initialisation reactions.…”
Section: Introduction (mentioning)
confidence: 97%