Adversarial Learning on the Latent Space for Diverse Dialog Generation

Khan, Kashif Hesham; Sahu, Gaurav; Mou, Lili; Vechtomova, Olga

doi:10.18653/v1/2020.coling-main.441

Cited by 8 publications

(6 citation statements)

References 14 publications

“…1), and the class predicted by the discriminator network, D(G(x)) used on Exp2Sim network applied on the experimental data. This loss function is inspired by 41,42 …”

Section: Exp2simgan and Previous Work (mentioning)

confidence: 99%

Using generative adversarial networks to match experimental and simulated inelastic neutron scattering data

Anker, Butler, et al. 2023

Digital Discovery

“…Several studies (Serban et al., 2017a; Zhao et al., 2018; Gao et al., 2019; Cai and Cai, 2022) introduce discrete latent variables to improve the complexity of these distributions. Further studies use more advanced generative models like Generative Adversarial Network (Goodfellow et al., 2020; Gu et al., 2019; Khan et al., 2020) or Normalizing Flows (Rezende and Mohamed, 2015; Luo and Chien, 2021).…”

Section: Related Work (mentioning)

confidence: 99%

Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation

Yang, Tran, Gurevych 2023

Findings of the Association for Computational Linguistics: EMNLP 2023

Current variational dialog models have employed pre-trained language models (PLMs) to parameterize the likelihood and posterior distributions. However, the Gaussian assumption made on the prior distribution is incompatible with these distributions, thus restricting the diversity of generated responses. These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism. In this work, we propose Dior-CVAE, a hierarchical conditional variational autoencoder (CVAE) with diffusion priors to address these challenges. We employ a diffusion model to increase the complexity of the prior distribution and its compatibility with the distributions produced by a PLM. Also, we propose memory dropout to the cross-attention mechanism, which actively encourages the use of latent variables for response generation. Our method requires parameters that are comparable to those of previous studies while maintaining comparable inference time, despite the integration of the diffusion model. Overall, experiments across two commonly used open-domain dialog datasets show that our method can generate more diverse responses even without large-scale dialog pre-training. Code is available at https://github.com/UKPLab/dior-cvae.

“…where D_train is the training data, and each sample x = {x^(s), x^(t)}. We also add an auxiliary MSE loss to the objective function as it is found to stabilize GAN training (Khan et al. 2020). The overall loss for the GAN is:…”

Section: Training Stage 2: Text-cvae (mentioning)

confidence: 99%

“…We use 128 latent dimensions for the mean and sigma vectors. During training, we use a batch size of 32, learning rate of 1e-4, and Adam optimizer (Kingma and Ba 2015). The sampling temperature is 1.0 for both training and inference.…”

Section: Implementation Details (mentioning)

confidence: 99%

LyricJam: A system for generating lyrics for live instrumental music

Vechtomova, Sahu, Kumar 2021

Preprint

Self Cite

We describe a real-time system that receives a live audio stream from a jam session and generates lyric lines that are congruent with the live music being played. Two novel approaches are proposed to align the learned latent spaces of audio and text representations that allow the system to generate novel lyric lines matching live instrumental music. One approach is based on adversarial alignment of latent representations of audio and lyrics, while the other approach learns to transfer the topology from the music latent space to the lyric latent space. A user study with music artists using the system showed that the system was useful not only in lyric composition, but also encouraged the artists to improvise and find new musical expressions. Another user study demonstrated that users preferred the lines generated using the proposed methods to the lines generated by a baseline model.
