Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/291

High Dimensional Bayesian Optimization using Dropout

Abstract: Scaling Bayesian optimization to high dimensions is a challenging task, as the global optimization of a high-dimensional acquisition function can be expensive and often infeasible. Existing methods depend either on a limited set of "active" variables or on an additive form of the objective function. We propose a new method for high-dimensional Bayesian optimization that uses a dropout strategy to optimize only a subset of variables at each iteration. We derive theoretical bounds for the regret and show how it can inform the …
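As a concrete illustration of the dropout strategy described in the abstract, the sketch below performs a single iteration in which only a random subset of d out of the D variables is optimized and the remaining coordinates are copied from the best point observed so far (a "copy" fill-in; other fill-in choices are possible). The function names, the GP-UCB acquisition, and the random-search inner optimizer are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a dropout-style BO step, assuming a GP-UCB acquisition
# and a random-search inner optimizer (both are assumptions for illustration).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def dropout_bo_step(f, X, y, bounds, d, beta=2.0, n_cand=2000, rng=None):
    """One iteration: X is (n, D) observed inputs, y their values, bounds is (D, 2)."""
    rng = rng or np.random.default_rng()
    D = bounds.shape[0]
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)

    subset = rng.choice(D, size=d, replace=False)   # "dropout": keep only d dimensions
    x_best = X[np.argmax(y)]                        # incumbent used to fill the rest

    # Random-search maximization of UCB restricted to the chosen subset.
    cand = np.tile(x_best, (n_cand, 1))
    cand[:, subset] = rng.uniform(bounds[subset, 0], bounds[subset, 1], (n_cand, d))
    mu, sigma = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(mu + np.sqrt(beta) * sigma)]
    return x_next, f(x_next)
```

A full run would simply loop this step, appending each new point and its evaluation to (X, y) before the next iteration.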

Cited by 90 publications (75 citation statements)
References 3 publications (5 reference statements)
“…To evaluate the performance of our MS-UCB, we have conducted a set of experiments involving the optimization of four benchmark functions and two real applications. We compare our approach against six baselines: (1) Standard GP-UCB (Srinivas et al 2012), (2) DropoutUCB (Li et al 2017), (3) LineBO (Kirschner et al 2019), which restricts the search space to a one-dimensional subspace, (4) SRE (Qian, Hu, and Yu 2016), which uses sequential random embeddings several times, (5) REMBO (Wang et al 2013), and (6) HeSBO (Nayebi, Munteanu, and Poloczek 2019), which uses hashing-enhanced embedded subspaces. Among these baselines, the first three do not make assumptions on the structure of the objective function, while SRE assumes a tiny effect for some of the dimensions, i.e.…”
Section: Methods
confidence: 99%
“…Thus, the convergence analysis of (Oh, Gavves, and Welling 2018) depends on whether their assumptions hold. Some other methods are based on subspaces (Qian, Hu, and Yu 2016; Li et al 2017; Kirschner et al 2019), which are more amenable to convergence analysis. Among them, LineBO (Kirschner et al 2019) is the first to provide a complete analysis.…”
Section: Introduction
confidence: 99%
“…Batch normalization subtracts the batch mean and divides by the batch standard deviation to stabilize training. It can also eliminate the need for dropout, so we add dropout only to the last hidden layer (Li et al, 2018).…”
Section: Model Settings
confidence: 99%
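For readers unfamiliar with the transform referenced in the quote above, here is a minimal, illustrative batch-normalization sketch using training-time, per-feature statistics only; gamma, beta, and eps are the usual learnable scale/shift and numerical-stability constant, named here purely for illustration.

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """x: (batch, features). Subtract the batch mean, divide by the batch std."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)   # normalize each feature
    return gamma * x_hat + beta               # learnable scale and shift
```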
“…Nonetheless, for tasks with fewer than approximately 25 dimensions, it is often cheaper than evaluating an expensive-to-compute objective function, and consequently the BO method is often well suited to tasks of this type. For higher-dimensional tasks, BO implementations that keep the cost of maximizing the acquisition function reasonable are still being developed, for example Li et al [21]. Algorithm 2 presents pseudo-code outlining BO and is further described in the appendix.…”
Section: Bayesian Optimization
confidence: 99%
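The quoted passage refers to the citing paper's Algorithm 2; as a rough, generic stand-in (not that algorithm), a standard BO loop looks like the sketch below: fit a GP surrogate to the data seen so far, maximize an acquisition function (here expected improvement via random search, an assumption), evaluate the objective at the chosen point, and repeat.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def bayes_opt(f, bounds, n_init=5, n_iter=30, n_cand=5000, rng=None):
    """Minimize f over a box; bounds is a (D, 2) array of [low, high] per dimension."""
    rng = rng or np.random.default_rng()
    D = bounds.shape[0]
    X = rng.uniform(bounds[:, 0], bounds[:, 1], (n_init, D))
    y = np.array([f(x) for x in X])
    for _ in range(n_iter):
        gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
        cand = rng.uniform(bounds[:, 0], bounds[:, 1], (n_cand, D))
        mu, sigma = gp.predict(cand, return_std=True)
        best = y.min()
        z = (best - mu) / np.maximum(sigma, 1e-9)
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)   # expected improvement
        x_next = cand[np.argmax(ei)]
        X, y = np.vstack([X, x_next]), np.append(y, f(x_next))
    return X[np.argmin(y)], y.min()
```

The per-iteration cost is dominated by fitting the surrogate and maximizing the acquisition function, which is exactly what becomes expensive as the number of dimensions grows.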
“…Although the BO implementation tested here cannot be scaled to tasks representing the whole GB network in this way, that is, with every individual line represented by an optimization variable, there are opportunities at regional and individual train-operator scales. While these tasks present a computational challenge at present, new developments, such as BO implementations allowing more optimization variables, might be used [21].…”
Section: The Test-tasks
confidence: 99%