Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2015.
DOI: 10.3115/v1/n15-1020
A Neural Network Approach to Context-Sensitive Generation of Conversational Responses

Abstract: We present a novel response generation system that can be trained end to end on large quantities of unstructured Twitter conversations. A neural network architecture is used to address sparsity issues that arise when integrating contextual information into classic statistical models, allowing the system to take into account previous dialog utterances. Our dynamic-context generative models show consistent gains over both context-sensitive and non-context-sensitive Machine Translation and Information Retrieval b…

Cited by 690 publications (594 citation statements)
References 23 publications
“…It can locally produce perfectly sensible language fragments, but it fails to take the meaning of the broader discourse context into account. Much research effort has consequently focused on designing systems able to keep information from the broader context in memory, and possibly even perform simple forms of reasoning about it (Hermann et al., 2015; Hochreiter and Schmidhuber, 1997; Sordoni et al., 2015; Sukhbaatar et al., 2015; Wang and Cho, 2015, a.o.). In this paper, we introduce the LAMBADA dataset (LAnguage Modeling Broadened to Account for Discourse Aspects).…”
Section: Introduction
confidence: 99%
“…On the other hand, many recent works focus on models based on recurrent neural network language models (RNNLMs) [20]. Sordoni et al. [15] employ an RNN architecture to generate responses from a social media corpus, and Vinyals et al. [16] present a long short-term memory (LSTM) encoder-decoder that generates dialog responses from movie subtitles or IT support line chats. More recently, Wen et al. [17] demonstrate a more advanced LSTM that is able to control a response semantically by conditioning on dialogue act features.…”
Section: Related Work
confidence: 99%
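The RNN encoder-decoder pattern these citing works describe can be sketched minimally as follows. This is an untrained, illustrative stdlib-Python implementation with hypothetical names (`TinySeq2Seq`, `respond`), not the cited systems' actual code: the encoder folds the input tokens into a fixed-size hidden state, and the decoder greedily emits tokens from it until an end-of-sequence symbol or a length cap.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def vadd(a, b):
    return [ai + bi for ai, bi in zip(a, b)]

class TinySeq2Seq:
    """Untrained, illustrative RNN encoder-decoder for response generation."""

    def __init__(self, vocab_size, hidden=8, seed=0):
        rng = random.Random(seed)
        def mat(r, c):
            return [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]
        self.H = hidden
        self.emb = mat(vocab_size, hidden)    # token embeddings
        self.W_enc = mat(hidden, hidden)      # input-to-hidden (encoder)
        self.U_enc = mat(hidden, hidden)      # hidden-to-hidden (encoder)
        self.W_dec = mat(hidden, hidden)      # input-to-hidden (decoder)
        self.U_dec = mat(hidden, hidden)      # hidden-to-hidden (decoder)
        self.W_out = mat(vocab_size, hidden)  # hidden-to-vocab logits

    def encode(self, tokens):
        """Fold the input token sequence into one hidden state vector."""
        h = [0.0] * self.H
        for t in tokens:
            pre = vadd(matvec(self.W_enc, self.emb[t]), matvec(self.U_enc, h))
            h = [math.tanh(v) for v in pre]
        return h

    def decode(self, h, bos=0, eos=1, max_len=10):
        """Greedily emit tokens from the encoded state until EOS or max_len."""
        out, t = [], bos
        for _ in range(max_len):
            pre = vadd(matvec(self.W_dec, self.emb[t]), matvec(self.U_dec, h))
            h = [math.tanh(v) for v in pre]
            logits = matvec(self.W_out, h)
            t = max(range(len(logits)), key=logits.__getitem__)
            if t == eos:
                break
            out.append(t)
        return out

    def respond(self, tokens, **kw):
        return self.decode(self.encode(tokens), **kw)
```

Real systems of this kind use learned LSTM weights, large vocabularies, and beam search rather than greedy argmax decoding; the sketch only shows the encode-then-decode data flow.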
“…However, in the case that no response in the database can adequately respond to a given utterance, this approach will fail. Response generation [15]-[17], which has the ability to generate new responses, is arguably more robust in handling user input than the retrieval approach; however, it sometimes generates unnatural responses that are incomprehensible to the user [9]. There have been a number of works on response generation for data-driven dialog systems.…”
Section: Related Work
confidence: 99%
“…Partially observable Markov decision processes (POMDPs) have been applied to this task (Young et al., 2013), but they typically require handcrafted features. Sordoni et al. (2015) used a recurrent encoder-decoder model for response generation, training the model with two posts as input and the following response as target. Serban et al. (2016) presented a dialog system built as a hierarchical recurrent LSTM encoder-decoder, where the dialogue is seen as a sequence of utterances and each utterance is modelled as a sequence of words.…”
Section: User Channel
confidence: 99%
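The hierarchical structure this last citing work attributes to Serban et al. (2016) can likewise be sketched. This is an untrained stdlib-Python illustration with hypothetical names (`TinyHRED`, `encode_dialogue`), not the HRED authors' code: a word-level RNN summarises each utterance into a vector, and a context-level RNN runs over those utterance vectors, so the final state reflects the whole dialogue history.

```python
import math
import random

def _mat(rng, r, c):
    return [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]

def _step(W, U, x, h):
    """One tanh RNN step: h' = tanh(W x + U h)."""
    return [math.tanh(sum(w * xi for w, xi in zip(Wr, x)) +
                      sum(u * hi for u, hi in zip(Ur, h)))
            for Wr, Ur in zip(W, U)]

class TinyHRED:
    """Untrained sketch of a hierarchical recurrent encoder: a word-level
    RNN encodes each utterance, and a context-level RNN runs over the
    resulting utterance vectors."""

    def __init__(self, vocab_size, hidden=8, seed=0):
        rng = random.Random(seed)
        self.H = hidden
        self.emb = _mat(rng, vocab_size, hidden)
        self.W_word = _mat(rng, hidden, hidden)
        self.U_word = _mat(rng, hidden, hidden)
        self.W_ctx = _mat(rng, hidden, hidden)
        self.U_ctx = _mat(rng, hidden, hidden)

    def encode_utterance(self, tokens):
        """Word-level RNN: fold one utterance into a vector."""
        h = [0.0] * self.H
        for t in tokens:
            h = _step(self.W_word, self.U_word, self.emb[t], h)
        return h

    def encode_dialogue(self, utterances):
        """Context-level RNN over the utterance vectors."""
        c = [0.0] * self.H
        for utt in utterances:
            c = _step(self.W_ctx, self.U_ctx, self.encode_utterance(utt), c)
        return c
```

In the full HRED model a decoder RNN conditioned on this context state generates the next utterance; the sketch shows only the two-level encoding that distinguishes the hierarchical approach from a flat encoder-decoder.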