Proceedings of the 3rd Workshop on Neural Generation and Translation 2019
DOI: 10.18653/v1/d19-5612

On the Importance of the Kullback-Leibler Divergence Term in Variational Autoencoders for Text Generation

Abstract: Variational Autoencoders (VAEs) are known to suffer from learning uninformative latent representation of the input due to issues such as approximated posterior collapse, or entanglement of the latent space. We impose an explicit constraint on the Kullback-Leibler (KL) divergence term inside the VAE objective function. While the explicit constraint naturally avoids posterior collapse, we use it to further understand the significance of the KL term in controlling the information transmitted through the VAE chann…
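
The constraint the abstract refers to can be pictured with a short sketch: one common way to pin the KL term to a target value C is a penalty of the form β·|KL − C|; the exact formulation and hyperparameter values below are illustrative, not taken from the paper.

```python
import torch

def vae_loss_with_kl_constraint(recon_log_prob, mu, logvar, target_c=15.0, beta=1.0):
    """Negative ELBO with an explicit constraint on the KL term.

    Rather than letting KL(q(z|x) || p(z)) shrink towards zero (posterior
    collapse), the KL is pushed towards a target value C in nats.
    `target_c` and `beta` are illustrative hyperparameters.
    """
    # KL between a diagonal Gaussian posterior and a standard normal prior
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
    nll = -recon_log_prob                      # reconstruction term per example
    return (nll + beta * (kl - target_c).abs()).mean()
```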

Cited by 22 publications (14 citation statements, 2020–2024) · References 12 publications
“…The decoding part then strove to reconstruct the (A_i, T_i) from the latent coordinate z_i using two parallel networks. The VAE would be trained to minimize the VAE loss, including a reconstruction term and a Kullback-Leibler divergence term (Prokhorov et al., 2019). After that, a Gaussian process regression (GPR) surrogate model was used to predict the fitness function f_i of all unsimulated sequences depending on their local positions in the VAE latent space.…”
Section: Application of ML for Understanding and Design of Polymer Chains (mentioning)
confidence: 99%
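
A minimal sketch of the pipeline this statement describes, assuming a standard Gaussian VAE loss and scikit-learn's GaussianProcessRegressor for the surrogate; all variable names (recon_a, z_simulated, etc.) are hypothetical.

```python
import torch
import torch.nn.functional as F
from sklearn.gaussian_process import GaussianProcessRegressor

def vae_loss(recon_a, recon_t, a_true, t_true, mu, logvar, beta=1.0):
    # Reconstruction of the (A_i, T_i) pair by the two parallel decoder heads
    recon = F.mse_loss(recon_a, a_true) + F.mse_loss(recon_t, t_true)
    # Kullback-Leibler divergence between q(z|x) and a standard normal prior
    kl = -0.5 * torch.mean(torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1))
    return recon + beta * kl

# After training, a GPR surrogate maps latent coordinates to fitness values,
# so unsimulated sequences can be scored from their position in latent space.
gpr = GaussianProcessRegressor()
# gpr.fit(z_simulated, f_simulated)      # z_simulated: (n, d), f_simulated: (n,)
# f_pred = gpr.predict(z_unsimulated)
```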
“…[footnote 10: https://huggingface.co/transformers/model_doc/bert.html] Coupling Encoder with Decoder. To connect the encoder with the decoder we concatenate the latent variable, sampled from the posterior distribution, to the word embeddings of the decoder at each time step (Prokhorov et al., 2019). Also, for GRU encoders we take the last hidden state to parameterise the posterior distribution.…”
Section: KL-collapse (mentioning)
confidence: 99%
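
The coupling described above is easy to sketch. The modules below are a hypothetical PyTorch version in which the encoder's last GRU hidden state parameterises the posterior and the sampled latent is concatenated to every decoder word embedding; dimensions are illustrative.

```python
import torch
import torch.nn as nn

class GRUEncoder(nn.Module):
    """Hypothetical encoder: last GRU hidden state parameterises q(z|x)."""
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, z_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)

    def forward(self, tokens):
        _, h_last = self.gru(self.emb(tokens))           # h_last: (1, B, hid)
        h_last = h_last.squeeze(0)
        return self.to_mu(h_last), self.to_logvar(h_last)

class GRUDecoder(nn.Module):
    """Decoder that sees z at every step: z is concatenated to each word embedding."""
    def __init__(self, vocab_size, emb_dim=256, hid_dim=512, z_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim + z_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens, z):
        e = self.emb(tokens)                              # (B, T, emb)
        z_rep = z.unsqueeze(1).expand(-1, e.size(1), -1)  # (B, T, z)
        h, _ = self.gru(torch.cat([e, z_rep], dim=-1))
        return self.out(h)                                # logits per time step
```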
“…In parallel, Variational Autoencoders (VAEs) (Kingma and Welling, 2014) have been effective in capturing semantic closeness of sentences in the learned representation space (Bowman et al., 2016; Prokhorov et al., 2019; Balasubramanian et al., 2020). Furthermore, methods have been developed… [footnote 2: This, for example, may allow us to cluster sentences' representations not only based on similarity of their active features (as is the case for dense vectors) but also on active/inactive dimensions.]”
Section: Introduction (mentioning)
confidence: 99%
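
The footnote's idea of clustering on active/inactive dimensions can be made concrete with a small sketch; the activity threshold and clustering choice below are illustrative assumptions, not taken from the cited work.

```python
import numpy as np
from sklearn.cluster import KMeans

def active_dimensions(z_means, threshold=0.01):
    """Mark a latent dimension as 'active' if its posterior mean varies
    across inputs (a common active-units heuristic; threshold is illustrative)."""
    return z_means.var(axis=0) > threshold

# z_means: (num_sentences, z_dim) posterior means for a corpus
# patterns = (np.abs(z_means) > 0.1).astype(float)   # per-sentence on/off pattern
# labels = KMeans(n_clusters=5, n_init=10).fit_predict(patterns)
```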
“…However, this is not so simple, because increasing capacity leads to a worse model fit, as was noted by Alemi et al. (2018). More specifically, on text data, Prokhorov et al. (2019) noted that the coherence of samples decreases as the target rate increases. Pelsmaeker and Aziz (2019) reported similar findings, and also that more complex priors or posteriors do not help.…”
Section: The Problem with Memorization (mentioning)
confidence: 99%
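
For concreteness, a target (or minimum) rate can be enforced during training with a Lagrange multiplier, in the spirit of the constrained objectives these papers discuss; the update scheme and values below are an illustrative sketch, not the exact method of any of the cited works.

```python
import torch
import torch.nn.functional as F

# Illustrative minimum-rate constraint: keep KL >= min_rate via a Lagrange
# multiplier that is updated by gradient ascent while the model descends.
u = torch.zeros(1, requires_grad=True)               # unconstrained multiplier param
multiplier_opt = torch.optim.SGD([u], lr=1e-2)

def constrained_step(model_opt, nll, kl, min_rate=10.0):
    lam = F.softplus(u)                               # lambda >= 0
    loss = nll + kl + lam * (min_rate - kl)           # ELBO plus rate constraint
    model_opt.zero_grad()
    multiplier_opt.zero_grad()
    loss.backward()
    model_opt.step()
    u.grad.neg_()                                     # flip sign: ascent on lambda
    multiplier_opt.step()
    return loss.item()
```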