2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP) 2021
DOI: 10.1109/mlsp52302.2021.9596494

Preferential Batch Bayesian Optimization

Abstract: Most research in Bayesian optimization (BO) has focused on direct feedback scenarios, where one has access to exact, or perturbed, values of some expensive-to-evaluate objective. This direction has been mainly driven by the use of BO in machine learning hyper-parameter configuration problems. However, in domains such as modelling human preferences, A/B tests or recommender systems, there is a need for methods that are able to replace direct feedback with preferential feedback, obtained via rankings or pairwise …

Cited by 12 publications (17 citation statements)
References 14 publications
“…Therefore, it can propose examples using an acquisition function, ask the human to provide a choice, and then infer the underlying preference iteratively. When dealing with choices from pairwise comparisons, preferential Bayesian optimization (PBO) as a specialized category of BO has received increasing development in recent studies [24,33,40,49,62]. While BO learns based on absolute rating utility (rate and assign a score to an option), PBO learns from human choice in pairwise comparisons according to Thurstone's law of comparative judgment [65].…”
Section: Modeling Preference From Human Feedback
confidence: 99%
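Thurstone's law of comparative judgment, mentioned in the excerpt above, can be made concrete: each option's perceived utility is its latent value plus independent Gaussian noise, so the probability of preferring one option over another is a Gaussian CDF of the utility gap. A minimal sketch — the function name and noise parameter are illustrative assumptions, not taken from the cited papers:

```python
import math

def pref_prob(f_a: float, f_b: float, noise_sd: float = 1.0) -> float:
    """Thurstone-style choice model (illustrative): probability that option a
    is chosen over option b, given latent utilities f_a and f_b that are each
    perturbed by independent Gaussian noise with standard deviation noise_sd."""
    # The difference of two independent N(0, noise_sd^2) noises has
    # standard deviation sqrt(2) * noise_sd.
    z = (f_a - f_b) / (math.sqrt(2.0) * noise_sd)
    # Standard normal CDF expressed via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Equal utilities give probability 0.5, and the choice becomes near-deterministic as the utility gap grows relative to the noise scale.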
“…While BO learns based on absolute rating utility (rate and assign a score to an option), PBO learns from human choice in pairwise comparisons according to Thurstone's law of comparative judgment [65]. To avoid the mentioned violations of the transitivity axiom, recent extensions [7,40,62] to PBO have transitioned from using a binary pairwise comparison to using a larger set of options. These extensions can largely prevent violation of the transitivity axiom and infer more information at a time because they either consider choosing a set of options as winners among all given options [7,40]; or provide a ranking of all given options, where options may share the same level of rank [62].…”
Section: Modeling Preference From Human Feedback
confidence: 99%
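The set-based extensions described in the excerpt above — choosing winners among several options rather than between a pair — can be illustrated by generalizing the Thurstone model from pairs to a set: each option's latent utility is perturbed by Gaussian noise, and the noisy maximizer is the observed choice. The following Monte-Carlo sketch is an illustrative assumption, not the cited papers' inference method:

```python
import numpy as np

def winner_probs(utilities, noise_sd=1.0, n_samples=20000, seed=0):
    """Monte-Carlo estimate (illustrative) of the probability that each of k
    options is chosen as the single winner, when perceived utilities are the
    latent values plus independent Gaussian noise — a Thurstone-style model
    extended from pairwise comparisons to a set of options."""
    rng = np.random.default_rng(seed)
    u = np.asarray(utilities, dtype=float)
    # Draw noisy utility vectors and record which option wins each draw.
    noisy = u[None, :] + rng.normal(0.0, noise_sd, size=(n_samples, len(u)))
    winners = np.argmax(noisy, axis=1)
    return np.bincount(winners, minlength=len(u)) / n_samples
```

Options with equal latent utilities each win about 1/k of the time, while an option whose utility clearly exceeds the rest wins almost always.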
“…Gonzalez et al. proposed a BO method for pairwise comparison and an improved acquisition function by focusing on the characteristics of the bandit problem [7]. Siivola et al. proposed a PbBO method under batched query settings and discussed some approximate inferences [8].…”
Section: B Preference Learning
confidence: 99%
“…By following [8], a Bayesian optimization (BO) [17] method finds w* = arg max_w f(w) from as few queries as possible for a black-box function f : W → R that is difficult to evaluate directly. Therefore, BO first constructs a surrogate model that is relatively easy to evaluate with a probabilistic model, and then regresses f on it from a set of queries {w_i}_{i=0}^{K−1} and the corresponding values {f(w_i)}_{i=0}^{K−1}.…”
Section: Preference-based Bayesian Optimization
confidence: 99%
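The BO loop summarized in this excerpt — fit a probabilistic surrogate to the queries so far, pick the next query with an acquisition function, return the incumbent arg max — can be sketched end-to-end. This is a generic Gaussian-process sketch under assumed details (RBF kernel with fixed hyper-parameters, upper-confidence-bound acquisition, a 1-D grid of candidates), not the batch preferential method of the paper itself:

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.2):
    """Squared-exponential kernel matrix between 1-D point sets A and B."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def bo_maximize(f, n_iter=20, noise=1e-5, seed=0):
    """Minimal BO loop sketch: GP surrogate + UCB acquisition on a grid.
    All hyper-parameters here are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)            # candidate set W
    X = list(rng.uniform(0.0, 1.0, size=2))      # initial queries w_i
    y = [f(x) for x in X]                        # corresponding f(w_i)
    for _ in range(n_iter):
        Xa = np.array(X)
        K = rbf_kernel(Xa, Xa) + noise * np.eye(len(Xa))
        Ks = rbf_kernel(grid, Xa)
        # GP posterior mean and variance on the candidate grid.
        alpha = np.linalg.solve(K, np.array(y))
        mu = Ks @ alpha
        v = np.linalg.solve(K, Ks.T)
        var = np.clip(1.0 - np.sum(Ks * v.T, axis=1), 0.0, None)
        # Upper-confidence-bound acquisition picks the next query.
        ucb = mu + 2.0 * np.sqrt(var)
        x_next = grid[int(np.argmax(ucb))]
        X.append(x_next)
        y.append(f(x_next))
    return X[int(np.argmax(y))]                  # incumbent w*
```

On a smooth toy objective such as f(w) = −(w − 0.3)², the returned incumbent lands close to the true maximizer after a handful of queries; the UCB rule trades off exploring high-variance regions against exploiting the current posterior mean.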