Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1407

Implicit Deep Latent Variable Models for Text Generation

Abstract: Deep latent variable models (LVMs) such as the variational auto-encoder (VAE) have recently played an important role in text generation. One key factor is the exploitation of smooth latent structures to guide the generation. However, the representation power of VAEs is limited for two reasons: (1) a Gaussian assumption is often made on the variational posteriors; and (2) a notorious "posterior collapse" issue occurs. In this paper, we advocate sample-based representations of variational distributions …
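As a rough illustration of the "sample-based representation" idea in the abstract, the sketch below draws posterior samples by pushing the input together with noise through an encoder network, so no Gaussian form is ever imposed on q(z|x). This is a minimal PyTorch sketch; the class name, layer sizes, and noise dimension are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ImplicitPosterior(nn.Module):
    # Hypothetical sample-based posterior: z = G(x, eps) with eps ~ N(0, I).
    # The density q(z|x) is never written down; only samples are produced.
    def __init__(self, x_dim, noise_dim, z_dim, hidden=256):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + noise_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, z_dim),
        )

    def forward(self, x):
        eps = torch.randn(x.size(0), self.noise_dim, device=x.device)
        return self.net(torch.cat([x, eps], dim=-1))  # one posterior sample per input

Because the density of such a sampler is intractable, training requires an alternative to the analytic KL term, which is where the dual formulation discussed in the citation statements below comes in.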

Citations: cited by 53 publications (62 citation statements)
References: 29 publications (49 reference statements)
“…In order to deal with the implicit variational density, it may be worthwhile to consider optimizing the Fenchel dual of the KL divergence, as in [31]. However, this requires the use of an auxiliary neural network, which may entail a large computational price compared with our simpler particle approximation.…”
Section: Discussion (mentioning)
confidence: 99%
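For context, the "Fenchel dual of the KL divergence" mentioned above refers to the standard variational (f-divergence) dual form, sketched here in LaTeX; the exact parameterization used in [31] may differ:

\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big) \;=\; \sup_{\nu}\; \Big\{ \mathbb{E}_{z \sim q(z \mid x)}\big[\nu(x, z)\big] \;-\; \mathbb{E}_{z \sim p(z)}\big[e^{\nu(x, z) - 1}\big] \Big\}

Here \nu is an auxiliary critic (in practice a neural network), which is exactly the extra model that the quoted authors contrast with their simpler particle approximation.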
“…We thus feel that there is room for implicit methods that perform optimization in the primal space (besides this, they are easier to implement). Moreover, the previous dual optimization approach requires the use of an additional neural network (see the paper on the Coupled Variational Bayes (CVB) approach or [31]). This adds a large number of parameters and requires another architecture decision.…”
Section: Introduction (mentioning)
confidence: 99%
“…LIC is also a transformer-based generation method, fine-tuned upon the pre-trained model of GPT (Radford et al., 2018). For the DailyDialog dataset, its best results are reported by the recently developed method iVAE_MI (Fang et al., 2019), which generates diverse responses with sample-based latent representations. In DSTC7-AVSD, the team from CMU (Sanabria et al., 2019) obtains the best performance across all the evaluation metrics.…”
Section: Compared Methods (mentioning)
confidence: 99%
“…To model this one-to-many relationship, CVAE (Zhao et al., 2017) employs a Gaussian distribution to capture the discourse-level variations of responses. To alleviate the issue of posterior collapse in VAEs, several extensions have been developed, including the conditional Wasserstein auto-encoder DialogWAE (Gu et al., 2019) and the implicit feature learning of iVAE_MI (Fang et al., 2019). SpaceFusion aims to jointly optimize diversity and relevance in the latent space, which are roughly matched by the distance and direction from the predicted response vector.…”
Section: Related Work (mentioning)
confidence: 99%
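For contrast with the implicit sampler sketched earlier, here is a minimal reparameterized Gaussian posterior of the kind CVAE-style models assume; class name and layer sizes are illustrative, not taken from the cited papers.

import torch
import torch.nn as nn

class GaussianPosterior(nn.Module):
    # Standard reparameterized Gaussian q(z|x) = N(mu(x), diag(sigma(x)^2)),
    # the assumption that sample-based posteriors such as iVAE_MI relax.
    def __init__(self, x_dim, z_dim, hidden=256):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)

    def forward(self, x):
        h = self.body(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        return z, mu, logvar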
“…Neural language models parameterized by autoregressive architectures are widely used for NLG. To improve global control over generated sentences, variational auto-encoders have been considered for language generation (Bowman et al., 2016; Fu et al., 2019; Fang et al., 2019; Li et al., 2020a). Recently, GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) improve generation fluency via pre-training on massive text corpora.…”
Section: Related Work (mentioning)
confidence: 99%