Conventional variational autoencoders fail to model correlations between data points due to their use of factorized priors. Amortized Gaussian process inference through GP-VAEs has led to significant improvements in this regard, but is still inhibited by the intrinsic complexity of exact GP inference. We improve the scalability of these methods through principled sparse inference approaches. We propose a new scalable GP-VAE model that outperforms existing approaches in terms of runtime and memory footprint, is easy to implement, and allows for joint end-to-end optimization of all components.
Introduction

Variational autoencoders (VAEs) are among the most widely used models in representation learning and generative modeling (Kingma & Welling, 2013, 2019; Rezende et al., 2014). As VAEs typically make use of factorized priors, they fall short when modeling correlations between different data points. However, more expressive priors that capture such correlations enable useful applications. Casale et al. (2018), for instance, showed that by modeling prior correlations between the data, one can generate a rotated image of a digit based on rotations of the same digit at different angles.

Gaussian process VAEs (GP-VAEs) have been designed to overcome this shortcoming (Casale et al., 2018). These models introduce a Gaussian process (GP) prior over the latent variables that correlates pairs of latent variables through a kernel function. While GP-VAEs have outperformed standard VAEs on many tasks (Casale et al., 2018; Pearce, 2020), combining GPs and VAEs brings along fundamental computational challenges. On the one hand, neural networks reveal their full power in conjunction with large datasets, making mini-batching a practical necessity. GPs, on the other hand, are traditionally restricted to medium-scale datasets due to their unfavorable scaling. In GP-VAEs, these contradictory demands must be reconciled, preferably by reducing the O(N^3) complexity of exact GP inference, where N is the number of data points (the first sketch below makes this bottleneck concrete).

Despite recent attempts to improve the scalability of GP-VAE models by using specifically designed kernels and inference methods (Casale et al., 2018; Fortuin et al., 2020), a generic way to scale these models, regardless of data type or kernel choice, has remained elusive. This limits current GP-VAE implementations to small-scale datasets. In this work, we introduce the first generically scalable method for training GP-VAEs based on inducing points. We thereby improve the computational complexity from O(N^3) to O(bm^2 + m^3), where m is the number of inducing points and b is the batch size.
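To make the computational bottleneck concrete, the following is a minimal NumPy sketch (not the implementation used in this work) of a GP prior over a single latent channel. The kernel choice (RBF), the input range, and all variable names are illustrative assumptions; the point is that exact inference must factorize the dense N x N kernel matrix, which is the source of the O(N^3) cost.

```python
import numpy as np

def rbf_kernel(x, y, lengthscale=1.0, variance=1.0):
    """RBF kernel k(x, y) = variance * exp(-(x - y)^2 / (2 * lengthscale^2))."""
    sq_dists = (x[:, None] - y[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

N = 500
x = np.linspace(0.0, 10.0, N)   # auxiliary inputs, e.g. time stamps or angles
K = rbf_kernel(x, x)            # dense N x N prior covariance over the latents

# Exact GP inference must factorize this N x N matrix -- the O(N^3) step.
L = np.linalg.cholesky(K + 1e-6 * np.eye(N))  # jitter for numerical stability

# One latent channel drawn from the correlated prior z ~ N(0, K):
z = L @ np.random.randn(N)
```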
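Likewise, the following hedged sketch illustrates the generic inducing-point idea behind the stated O(bm^2 + m^3) complexity, not this paper's exact inference scheme: per mini-batch of size b, only an m x m kernel matrix is factorized, and a Nyström approximation stands in for the full kernel. The inducing locations and names are again illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_triangular

def rbf_kernel(x, y, lengthscale=1.0, variance=1.0):
    sq_dists = (x[:, None] - y[None, :]) ** 2
    return variance * np.exp(-0.5 * sq_dists / lengthscale ** 2)

m, b = 20, 64
x_batch = np.random.uniform(0.0, 10.0, size=b)  # inputs of one mini-batch
x_ind = np.linspace(0.0, 10.0, m)               # m inducing locations

K_mm = rbf_kernel(x_ind, x_ind) + 1e-6 * np.eye(m)  # m x m (jittered)
K_bm = rbf_kernel(x_batch, x_ind)                   # b x m cross-covariance

L = np.linalg.cholesky(K_mm)                 # O(m^3), independent of N
A = solve_triangular(L, K_bm.T, lower=True)  # O(b m^2) triangular solve
K_bb_approx = A.T @ A                        # Nystrom: K_bm K_mm^{-1} K_mb
```

No quantity of size N is ever formed, so the per-batch cost stays at O(bm^2 + m^3) regardless of the dataset size.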