2019
DOI: 10.48550/arxiv.1901.02739
Preprint

Dirichlet Variational Autoencoder

Cited by 4 publications (10 citation statements)
References 0 publications

“…Building on the original VAE (Kingma & Welling, 2013), Nalisnick et al. (2016) utilise a latent mixture of Gaussians, aiming to capture class structure in an unsupervised fashion, and propose a Bayesian non-parametric prior, further developed in Nalisnick & Smyth (2017). Similarly, Joo et al. (2019) suggest a Dirichlet posterior in latent space to avoid some of the previously observed component-collapsing phenomena. Lastly, Jiang et al. (2017) propose Variational Deep Embedding (VaDE), focused on the goal of clustering in an i.i.d. setting.…”
Section: Related Work
confidence: 97%
“…Interestingly, the result is also better than DGR, suggesting that by holistically incorporating the generative process and classifier into the same model, and focusing on the broader unsupervised, task-agnostic perspective, CURL is still effective in the supervised domain. (Kingma & Welling, 2013), DirichletVAE (Joo et al., 2019), SBVAE (Nalisnick & Smyth, 2017), and VaDE (Jiang et al., 2017). We utilise the same architecture and hyperparameter settings as in Joo et al. (2019) for consistency, with latent spaces of dimension 50 and 100 for MNIST and Omniglot respectively; full details of the experimental setup can be found in Appendix C.3.…”
Section: External Benchmarks
confidence: 99%
“…The expectation values for the sampled vector components are r_i = α_i / Σ_j α_j, so as a prior it will create a hierarchy among different mixture components. In our application, it imposes a compact latent space, whose latent dimensions can be interpreted as mixture weights in a multinomial mixture model [49,50].…”
Section: Dirichlet-VAE
confidence: 99%
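The Dirichlet mean quoted above is easy to verify numerically. The following is a minimal sketch (not taken from the cited work) that draws Dirichlet samples for a hypothetical, deliberately asymmetric concentration vector and checks that the empirical mean matches r_i = α_i / Σ_j α_j, illustrating the hierarchy it induces among mixture weights:

# Minimal sketch: empirical check of the Dirichlet mean r_i = alpha_i / sum_j alpha_j.
# The concentration vector below is a hypothetical example, not from the cited paper.
import numpy as np

rng = np.random.default_rng(0)

alpha = np.array([5.0, 1.0, 0.5, 0.5])        # asymmetric concentration parameters
samples = rng.dirichlet(alpha, size=100_000)   # each row lies on the probability simplex

empirical_mean = samples.mean(axis=0)
analytic_mean = alpha / alpha.sum()            # r_i = alpha_i / sum_j alpha_j

print("empirical:", np.round(empirical_mean, 3))
print("analytic: ", np.round(analytic_mean, 3))
# Rows sum to 1, so each latent dimension can be read as a mixture weight in a
# multinomial mixture model; components with larger alpha_i dominate on average.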
“…We then study a generalisation, in which the Gaussian prior is replaced by a Gaussian-Mixture prior (GM-VAE) [42][43][44][45][46][47][48] in Section 3. In Section 4 we introduce the Dirichlet-VAE (DVAE), which uses a compact latent space with a Dirichlet prior [49,50]. Through a specific choice of decoder architecture we can interpret the decoder weights as the parameters of the mixture distributions in the probabilistic model, and can visualise these to directly interpret what the neural network is learning.…”
Section: Introduction
confidence: 99%
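To make the "decoder weights as mixture-component parameters" idea concrete, here is a minimal sketch under assumed settings (a 50-dimensional simplex-valued latent, 784-dimensional binarised images, a Bernoulli output); MixtureDecoder and component_logits are hypothetical names, not the quoted paper's exact architecture. Because the decoder is a single linear map applied to mixture weights, each row of its weight matrix can be visualised as one component template:

# Minimal sketch (assumed dimensions and output distribution, not the authors' code):
# a decoder whose weight rows act as per-component templates over pixels.
import torch
import torch.nn as nn

class MixtureDecoder(nn.Module):
    def __init__(self, latent_dim: int = 50, data_dim: int = 784):
        super().__init__()
        # component_logits[k] is the template associated with latent dimension k;
        # sigmoid(component_logits) can be visualised as latent_dim prototype images.
        self.component_logits = nn.Parameter(torch.zeros(latent_dim, data_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # z: (batch, latent_dim), rows on the probability simplex (mixture weights).
        # Blends the component templates with weights z, then maps to Bernoulli means.
        return torch.sigmoid(z @ self.component_logits)

# Hypothetical usage: inspect what each latent dimension has learned.
decoder = MixtureDecoder()
prototypes = torch.sigmoid(decoder.component_logits)  # (50, 784), one image per component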
“…For example, Stick-Breaking VAE (Nalisnick & Smyth, 2017) assumes a Griffiths-Engen-McCloskey (GEM) prior (Pitman, 2002), and the authors utilized the Kumaraswamy distribution (Kumaraswamy, 1980) to approximate the Beta distribution used by the GEM distribution. Dirichlet VAE (Joo et al., 2019) assumes a Dirichlet prior, and the authors utilized the approximation by the inverse Gamma cumulative distribution function (Knowles, 2015) and the composition of Gamma random variables to form the Dirichlet distribution.…”
Section: Introduction
confidence: 99%
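As a rough illustration of the Dirichlet construction mentioned above, the sketch below is an assumption-laden reimplementation, not the authors' code: it draws Gamma variates via the small-shape approximation to the inverse Gamma CDF, F⁻¹(u; a, b) ≈ (u · a · Γ(a))^(1/a) / b, and normalises them to compose a Dirichlet sample. The transform stays differentiable in the concentration parameters, which is what the pathwise (reparameterisation) gradient needs; the approximation is only reasonable for small shape parameters.

# Minimal sketch (assumptions noted above): reparameterised Dirichlet sampling by
# composing approximately reparameterised Gamma draws and normalising them.
import torch

def approx_gamma_rsample(alpha: torch.Tensor, beta: float = 1.0) -> torch.Tensor:
    # u ~ Uniform(0, 1); the transform below is differentiable in alpha, but the
    # inverse-CDF approximation is accurate only for small shape parameters.
    u = torch.rand_like(alpha)
    return (u * alpha * torch.exp(torch.lgamma(alpha))) ** (1.0 / alpha) / beta

def approx_dirichlet_rsample(alpha: torch.Tensor) -> torch.Tensor:
    # Composition property: if g_k ~ Gamma(alpha_k, 1), then g / g.sum() ~ Dirichlet(alpha).
    g = approx_gamma_rsample(alpha)
    return g / g.sum(dim=-1, keepdim=True)

# Hypothetical usage with a batch of small concentration parameters, e.g. as produced
# by an encoder network.
alpha = torch.tensor([[0.9, 0.5, 0.3], [0.7, 0.4, 0.6]], requires_grad=True)
z = approx_dirichlet_rsample(alpha)   # rows lie (approximately) on the simplex
z.sum().backward()                    # gradients flow back to the concentration parameters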