Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016
DOI: 10.18653/v1/n16-1014

A Diversity-Promoting Objective Function for Neural Conversation Models

Abstract: Sequence-to-sequence neural network models for generation of conversational responses tend to generate safe, commonplace responses (e.g., I don't know) regardless of the input. We suggest that the traditional objective function, i.e., the likelihood of output (response) given input (message), is unsuited to response generation tasks. Instead we propose using Maximum Mutual Information (MMI) as the objective function in neural models. Experimental results demonstrate that the proposed MMI models produce more diverse, interesting, and appropriate responses, yielding substantive gains in BLEU scores on two conversational datasets and in human evaluations.
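As context for the citation statements below, the paper's MMI objective can be written as follows (notation: S is the source message, T the target response):

```latex
% Pairwise mutual information between message S and response T replaces
% the plain conditional likelihood p(T|S) as the decoding criterion.
\hat{T} = \arg\max_{T} \left\{ \log \frac{p(S,T)}{p(S)\,p(T)} \right\}
        = \arg\max_{T} \left\{ \log p(T \mid S) - \log p(T) \right\}
```

In practice the paper weights the language-model penalty, scoring candidates by log p(T|S) − λ log p(T) (MMI-antiLM), or combines the two conditional directions, log p(T|S) and log p(S|T) (MMI-bidi).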

Cited by 1,656 publications (1,697 citation statements). References 27 publications.
“…• MMI-anti: a Seq2Seq model with a Maximum Mutual Information (MMI) criterion (implemented as an anti-language model) (Li et al, 2016a) in the decoding process, which reduces the probability of generating "safe responses".…”
Section: Baselines
Citation type: mentioning
confidence: 99%
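A minimal sketch of how such MMI-anti reranking could look at decoding time, assuming hypothetical `log_p_response_given_message` and `log_p_response` scoring functions standing in for a trained seq2seq model and a language model; the penalty weight `lam` and the candidate list are illustrative:

```python
# Sketch: rerank candidate responses with an MMI anti-language-model criterion.
# score(T) = log p(T | S) - lam * log p(T), which penalizes generic,
# high-frequency responses such as "I don't know".
# The two scoring functions are hypothetical stand-ins for real models.

def mmi_anti_rerank(message, candidates, log_p_response_given_message,
                    log_p_response, lam=0.5):
    def score(response):
        return (log_p_response_given_message(response, message)
                - lam * log_p_response(response))
    return max(candidates, key=score)
```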
“…Diversity Metrics: To measure the informativeness and diversity of the generated responses, we follow the dist-1 and dist-2 metrics proposed by Li et al. (2016a), and introduce a Novelty metric. The dist-1 (dist-2) is defined as the number of unique unigrams (bigrams for dist-2).…”
Section: Evaluation Metrics
Citation type: mentioning
confidence: 99%
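As a concrete illustration, dist-n can be computed as below. Note that Li et al. (2016a) scale the distinct n-gram count by the total number of generated tokens to avoid favoring long outputs, so this sketch returns both the raw count and the scaled ratio:

```python
# Sketch: dist-n diversity metric over a corpus of generated responses.
# dist-1 / dist-2 count distinct unigrams / bigrams; the count is scaled
# by the total number of generated tokens, following Li et al. (2016a).

def dist_n(responses, n):
    distinct, total_tokens = set(), 0
    for tokens in responses:                 # each response is a list of tokens
        total_tokens += len(tokens)
        distinct.update(zip(*(tokens[i:] for i in range(n))))
    ratio = len(distinct) / total_tokens if total_tokens else 0.0
    return len(distinct), ratio

# Example: dist_n([["i", "do", "not", "know"], ["i", "see"]], 2)
```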
“…The TSM can be described as a conditional probability [32]. It predicts the probability of C being conditioned by A and B, as described in (7).…”
Section: Triple-seq2seq Model
Citation type: mentioning
confidence: 99%
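Equation (7) of the citing paper is not reproduced in this excerpt, so the following is only a plausible reconstruction of such a conditional probability, with A and B as the two conditioning sequences and C = (c_1, …, c_T) the predicted sequence, factorized token by token as in standard seq2seq models:

```latex
% Hypothetical reconstruction of the TSM conditional probability:
% C is predicted conditioned on both A and B.
P(C \mid A, B) = \prod_{t=1}^{T} P\!\left(c_t \mid c_{<t},\, A,\, B\right)
```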
“…We use 10-fold cross-validation, and only two types of features: n-grams and Word2Vec word embeddings. We expect Word2Vec to be able to capture semantic generalizations that n-grams do not (Socher et al., 2013; Li et al., 2016). The n-gram features include unigrams, bigrams, and trigrams, including sequences of punctuation (for example, ellipses or "!!!…”
Section: Learning Experiments
Citation type: mentioning
confidence: 99%
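A minimal sketch of such a feature setup, assuming scikit-learn for the n-gram counts and a precomputed word-vector lookup (e.g., from a trained Word2Vec model); the function name, the loose token pattern, and the embedding dimension are all illustrative assumptions, not the citing paper's actual pipeline:

```python
# Sketch: combine n-gram counts (unigrams..trigrams, punctuation kept)
# with averaged word embeddings as document features.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def build_features(texts, word_vectors, dim=300):
    # n-gram features: unigrams, bigrams, trigrams; a whitespace token
    # pattern keeps punctuation sequences like "!!!" as tokens.
    vec = CountVectorizer(ngram_range=(1, 3), token_pattern=r"\S+")
    ngram_feats = vec.fit_transform(texts).toarray()

    # Word2Vec features: average the vectors of in-vocabulary tokens.
    emb_feats = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        vecs = [word_vectors[w] for w in text.split() if w in word_vectors]
        if vecs:
            emb_feats[i] = np.mean(vecs, axis=0)
    return np.hstack([ngram_feats, emb_feats])
```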