Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/291

High Dimensional Bayesian Optimization using Dropout

Abstract: Scaling Bayesian optimization to high dimensions is a challenging task, as the global optimization of a high-dimensional acquisition function can be expensive and often infeasible. Existing methods depend either on a limited set of "active" variables or on an additive form of the objective function. We propose a new method for high-dimensional Bayesian optimization that uses a dropout strategy to optimize only a subset of variables at each iteration. We derive theoretical bounds for the regret and show how it can inform the …
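As a concrete illustration of the dropout strategy described in the abstract, the sketch below performs a single iteration in which only a random subset of d out of the D variables is optimized and the remaining coordinates are copied from the best point observed so far (a "copy" fill-in; other fill-in choices are possible). The function names, the GP-UCB acquisition, and the random-search inner optimizer are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a dropout-style BO step, assuming a GP-UCB acquisition
# and a random-search inner optimizer (both are assumptions for illustration).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def dropout_bo_step(f, X, y, bounds, d, beta=2.0, n_cand=2000, rng=None):
    """One iteration: X is (n, D) observed inputs, y their values, bounds is (D, 2)."""
    rng = rng or np.random.default_rng()
    D = bounds.shape[0]
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)

    subset = rng.choice(D, size=d, replace=False)   # "dropout": keep only d dimensions
    x_best = X[np.argmax(y)]                        # incumbent used to fill the rest

    # Random-search maximization of UCB restricted to the chosen subset.
    cand = np.tile(x_best, (n_cand, 1))
    cand[:, subset] = rng.uniform(bounds[subset, 0], bounds[subset, 1], (n_cand, d))
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(mu + np.sqrt(beta) * sigma)]
    return x_next, f(x_next)
```

A full run would simply loop this step, appending each new point and its evaluation to (X, y) before the next iteration.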

Cited by 90 publications (75 citation statements)
References 3 publications (5 reference statements)
“…To evaluate the performance of our MS-UCB, we have conducted a set of experiments involving the optimization of four benchmark functions and two real applications. We compare our approach against six baselines: (1) Standard GP-UCB (Srinivas et al 2012), (2) DropoutUCB (Li et al 2017), (3) LineBO (Kirschner et al 2019), which restricts the search space to a one-dimensional subspace, (4) SRE (Qian, Hu, and Yu 2016), which uses sequential random embeddings several times, (5) REMBO (Wang et al 2013), and (6) HeSBO (Nayebi, Munteanu, and Poloczek 2019), which uses hashing-enhanced embedded subspaces. Among these baselines, the first three do not make assumptions on the structure of the objective function, while SRE assumes a tiny effect for some of the dimensions, i.e.…”
Section: Methods
confidence: 99%
“…Thus, the convergence analysis of (Oh, Gavves, and Welling 2018) depends on whether their assumptions hold. Some other methods are based on subspaces (Qian, Hu, and Yu 2016; Li et al 2017; Kirschner et al 2019), which are more amenable to convergence analysis. Among them, LineBO (Kirschner et al 2019) is the first to provide a complete analysis.…”
Section: Introduction
confidence: 99%
“…Batch normalization subtracts the batch mean and divides by the batch standard deviation to stabilize training. It can also eliminate the need for dropout, so we add dropout only to the last hidden layer (Li et al, 2018).…”
Section: Model Settings
confidence: 99%
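For readers unfamiliar with the transform referenced in the quote above, here is a minimal, illustrative batch-normalization sketch using training-time, per-feature statistics only; gamma, beta, and eps are the usual learnable scale/shift and numerical-stability constant, named here purely for illustration.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """x: (batch, features). Subtract the batch mean, divide by the batch std."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # normalize each feature
    return gamma * x_hat + beta               # learnable scale and shift
```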
“…Nonetheless, for tasks with fewer than approximately 25 dimensions, it is often cheaper than evaluating an expensive-to-compute objective function, and consequently the BO method is often well suited to tasks of this type. For higher-dimensional tasks, BO implementations that keep the cost of maximizing the acquisition function reasonable are still being developed, for example Li et al [21]. Algorithm 2 presents pseudo-code outlining BO and is further described in the appendix.…”
Section: Bayesian Optimization
confidence: 99%
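The quoted passage refers to the citing paper's Algorithm 2; as a rough, generic stand-in (not that algorithm), a standard BO loop looks like the sketch below: fit a GP surrogate to the data seen so far, maximize an acquisition function (here expected improvement via random search, an assumption), evaluate the objective at the chosen point, and repeat.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def bayes_opt(f, bounds, n_init=5, n_iter=30, n_cand=5000, rng=None):
    """Minimize f over a box; bounds is a (D, 2) array of [low, high] per dimension."""
    rng = rng or np.random.default_rng()
    D = bounds.shape[0]
    X = rng.uniform(bounds[:, 0], bounds[:, 1], (n_init, D))
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], (n_cand, D))
        mu, sigma = gp.predict(cand, return_std=True)
        best = y.min()
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        x_next = cand[np.argmax(ei)]
        X, y = np.vstack([X, x_next]), np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```

The per-iteration cost is dominated by fitting the surrogate and maximizing the acquisition function, which is exactly what becomes expensive as the number of dimensions grows.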
“…Although the BO implementation tested here cannot be scaled to tasks representing the whole GB network in this way, that is, with every individual line represented by an optimization variable, there are opportunities at regional and individual train-operator scales. While these tasks present a computational challenge at present, new developments, such as BO implementations allowing more optimization variables, might be used [21].…”
Section: The Test-tasks
confidence: 99%