Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1407

Implicit Deep Latent Variable Models for Text Generation

Abstract: Deep latent variable models (LVMs) such as the variational auto-encoder (VAE) have recently played an important role in text generation. One key factor is the exploitation of smooth latent structures to guide the generation. However, the representation power of VAEs is limited for two reasons: (1) a Gaussian assumption is often made on the variational posteriors; and (2) a notorious "posterior collapse" issue occurs. In this paper, we advocate sample-based representations of variational distributions …
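As a rough illustration of the "sample-based representation" idea in the abstract, the sketch below draws posterior samples by pushing the input together with noise through an encoder network, so no Gaussian form is ever imposed on q(z|x). This is a minimal PyTorch sketch; the class name, layer sizes, and noise dimension are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ImplicitPosterior(nn.Module):
    # Hypothetical sample-based posterior: z = G(x, eps) with eps ~ N(0, I).
    # The density q(z|x) is never written down; only samples are produced.
    def __init__(self, x_dim, noise_dim, z_dim, hidden=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + noise_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, z_dim),
        )

    def forward(self, x):
        eps = torch.randn(x.size(0), self.noise_dim, device=x.device)
        return self.net(torch.cat([x, eps], dim=-1))  # one posterior sample per input

Because the density of such a sampler is intractable, training requires an alternative to the analytic KL term, which is where the dual formulation discussed in the citation statements below comes in.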

Citations: cited by 53 publications (62 citation statements)
References: 29 publications (49 reference statements)
“…In order to deal with the implicit variational density, it may be worthwhile to consider optimizing the Fenchel dual of the KL divergence, as in [31]. However, this requires the use of an auxiliary neural network, which may entail a large computational price compared with our simpler particle approximation.…”
Section: Discussion (mentioning)
confidence: 99%
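For context, the "Fenchel dual of the KL divergence" mentioned above refers to the standard variational (f-divergence) dual form, sketched here in LaTeX; the exact parameterization used in [31] may differ:

\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big) \;=\; \sup_{\nu}\; \Big\{ \mathbb{E}_{z \sim q(z \mid x)}\big[\nu(x, z)\big] \;-\; \mathbb{E}_{z \sim p(z)}\big[e^{\nu(x, z) - 1}\big] \Big\}

Here \nu is an auxiliary critic (in practice a neural network), which is exactly the extra model that the quoted authors contrast with their simpler particle approximation.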
“…We thus feel that there is room for implicit methods that perform optimization in the primal space (besides this, they are easier to implement). Moreover, the previous dual optimization approach requires the use of an additional neural network (see the paper on the Coupled Variational Bayes (CVB) approach or [31]). This adds a large number of parameters and requires another architecture decision.…”
Section: Introduction (mentioning)
confidence: 99%
“…LIC is also a transformer-based generation method, fine-tuned upon the pre-trained model of GPT (Radford et al., 2018). For the DailyDialog dataset, its best results are reported by the recently developed method iVAE_MI (Fang et al., 2019), which generates diverse responses with sample-based latent representations. In DSTC7-AVSD, the team from CMU (Sanabria et al., 2019) obtains the best performance across all the evaluation metrics.…”
Section: Compared Methods (mentioning)
confidence: 99%
“…To model this one-to-many relationship, CVAE (Zhao et al., 2017) employs a Gaussian distribution to capture the discourse-level variations of responses. To alleviate the issue of posterior collapse in VAEs, several extensions have been developed, including the conditional Wasserstein auto-encoder DialogWAE (Gu et al., 2019) and the implicit feature learning of iVAE_MI (Fang et al., 2019). SpaceFusion aims to jointly optimize diversity and relevance in the latent space, which are roughly matched by the distance and direction from the predicted response vector.…”
Section: Related Work (mentioning)
confidence: 99%
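For contrast with the implicit sampler sketched earlier, here is a minimal reparameterized Gaussian posterior of the kind CVAE-style models assume; class name and layer sizes are illustrative, not taken from the cited papers.

import torch
import torch.nn as nn

class GaussianPosterior(nn.Module):
    # Standard reparameterized Gaussian q(z|x) = N(mu(x), diag(sigma(x)^2)),
    # the assumption that sample-based posteriors such as iVAE_MI relax.
    def __init__(self, x_dim, z_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return z, mu, logvar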
“…Neural language models parameterized by autoregressive architectures are widely used for NLG. To improve global control over generated sentences, variational auto-encoders have been considered for language generation (Bowman et al., 2016; Fu et al., 2019; Fang et al., 2019; Li et al., 2020a). Recently, GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) improve generation fluency via pre-training on massive text corpora.…”
Section: Related Work (mentioning)
confidence: 99%