2020
DOI: 10.3390/e22080888
Data-Dependent Conditional Priors for Unsupervised Learning of Multimodal Data

Abstract: One of the major shortcomings of variational autoencoders is the inability to produce generations from the individual modalities of data originating from mixture distributions. This is primarily due to the use of a simple isotropic Gaussian as the prior for the latent code in the ancestral sampling procedure for data generation. In this paper, we propose a novel formulation of variational autoencoders, conditional prior VAE (CP-VAE), with a two-level generative process for the observed data where continuous z…
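The abstract is cut off at the source, but the generative process it outlines is concrete enough to sketch: a discrete code first selects a mixture component, and the continuous latent is then drawn from a learnable component-specific conditional prior rather than a single isotropic Gaussian. Below is a minimal PyTorch sketch of that two-level ancestral sampling; the class, the parameter names, and the Gaussian form of p(z|c) are our illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ConditionalPriorVAE(nn.Module):
    """Minimal sketch of a CP-VAE-style generator: a discrete code c picks a
    mixture component, a continuous latent z is drawn from a learnable
    component-specific conditional prior p(z|c), and a decoder maps z to x."""

    def __init__(self, n_components=10, latent_dim=32, data_dim=784):
        super().__init__()
        # Learnable parameters of the K conditional priors p(z|c) = N(mu_c, diag(sigma_c^2)).
        self.prior_mu = nn.Parameter(torch.zeros(n_components, latent_dim))
        self.prior_logvar = nn.Parameter(torch.zeros(n_components, latent_dim))
        # Mixture weights p(c), uniform at initialization.
        self.logits = nn.Parameter(torch.zeros(n_components))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim)
        )

    @torch.no_grad()
    def sample(self, n):
        # Level 1: draw the discrete component c ~ p(c).
        c = torch.distributions.Categorical(logits=self.logits).sample((n,))
        # Level 2: draw z from the component-specific conditional prior p(z|c).
        mu, std = self.prior_mu[c], (0.5 * self.prior_logvar[c]).exp()
        z = mu + std * torch.randn_like(std)
        # Decode z into observation space (here, logits of p(x|z)).
        return self.decoder(z), c

model = ConditionalPriorVAE()
x_logits, c = model.sample(16)  # fixing c yields generations from one modality
```

Because each modality gets its own prior component, sampling with a fixed c produces generations from that modality alone, which is exactly the capability the abstract says a single isotropic Gaussian prior lacks.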

Cited by 4 publications (2 citation statements)
References 10 publications
“…However, its performance in terms of the test log-likelihood and quality of generated samples is often unsatisfactory, and thus many modifications have been proposed. In general, one can obtain a tighter lower bound and, thus, a more powerful and flexible model, by advancing over the following three components: the encoder [1][2][3][4], the prior (or marginal over latents) [5][6][7][8][9], and the decoder [10]. Recent studies have shown that, by employing deep hierarchical architectures and by carefully designing the building blocks of the neural networks, VAEs can successfully model high-dimensional data and reach state-of-the-art test likelihoods [11][12][13].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, as its performance in terms of test likelihood and quality of generated samples fell short of what was desired, many modifications were proposed to improve its performance on high-dimensional data such as natural images. In general, one can obtain a tighter lower bound, and, thus, a more powerful and flexible model, by advancing over the following three elements: the encoder (Rezende et al., 2014; van den Berg et al., 2018; Hoogeboom et al., 2020; Maaløe et al., 2016), the prior (or marginal over latents) (Chen et al., 2016; Habibian et al., 2019; Lavda et al., 2020; Lin & Clark, 2020; Tomczak & Welling, 2017) and the decoder (Gulrajani et al., 2016). Nevertheless, recent studies have shown that, by employing deep hierarchical architectures and by carefully designing the building blocks of the neural networks, VAEs can successfully model large high-dimensional data and reach state-of-the-art test likelihoods (Zhao et al., 2017; Maaløe et al., 2019; Vahdat & Kautz, 2020).…”
Section: Introduction (mentioning)
confidence: 99%
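Both citing passages single out the prior as one of the three levers for tightening the variational lower bound. As a hedged illustration of what that swap costs in practice (PyTorch assumed; the function and argument names are ours, not from any cited paper), the sketch below estimates the ELBO's KL term once the fixed isotropic Gaussian is replaced by a learnable mixture-of-Gaussians prior: unlike the standard-normal case, this KL has no closed form and is approximated by Monte Carlo from the posterior sample.

```python
import torch
import torch.distributions as D

def elbo_kl_term(z, mu_q, logvar_q, prior_mu, prior_logvar, prior_logits):
    """Monte Carlo estimate of KL(q(z|x) || p(z)) for a learnable
    mixture-of-Gaussians prior. With a mixture prior the KL has no
    closed form, so we estimate it from the sampled z as
    E_q[log q(z|x) - log p(z)]."""
    # Log-density of the sample under the diagonal-Gaussian posterior q(z|x).
    log_q = D.Normal(mu_q, (0.5 * logvar_q).exp()).log_prob(z).sum(-1)
    # Learnable mixture prior p(z) = sum_k pi_k N(z; mu_k, diag(sigma_k^2)).
    mix = D.Categorical(logits=prior_logits)
    comp = D.Independent(D.Normal(prior_mu, (0.5 * prior_logvar).exp()), 1)
    log_p = D.MixtureSameFamily(mix, comp).log_prob(z)
    return (log_q - log_p).mean()

# Illustrative shapes only: batch of 8, latent dim 4, K = 5 components.
z = torch.randn(8, 4)
kl = elbo_kl_term(z, torch.zeros(8, 4), torch.zeros(8, 4),
                  torch.randn(5, 4), torch.zeros(5, 4), torch.zeros(5))
```

The estimator is unbiased and differentiable in the prior parameters, so the mixture components can be trained jointly with the encoder and decoder by maximizing the usual ELBO.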