Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1004
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study

Abstract: Neural generative models have become increasingly popular when building conversational agents. They offer flexibility, can be easily adapted to new domains, and require minimal domain engineering. A common criticism of these systems is that they seldom understand or use the available dialog history effectively. In this paper, we take an empirical approach to understanding how these models use the available dialog history by studying the sensitivity of the models to artificially introduced unnatural change…
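As a rough illustration of the probing setup the abstract describes, the sketch below perturbs a dialog history in an "unnatural" way and measures how much the perplexity of the gold response changes. The perturbation functions and the `model.perplexity(history, response)` interface are assumptions for illustration, not the paper's released code.

```python
import random
from typing import Callable, List

def shuffle_utterances(history: List[str]) -> List[str]:
    """Unnatural change: randomly reorder the utterances in the history."""
    perturbed = history[:]
    random.shuffle(perturbed)
    return perturbed

def drop_early_utterances(history: List[str], k: int = 1) -> List[str]:
    """Unnatural change: remove the k earliest utterances from the history."""
    return history[k:]

def ppl_increase(model, history: List[str], response: str,
                 perturb: Callable[[List[str]], List[str]]) -> float:
    """Increase in gold-response perplexity after perturbing the history.

    A small increase suggests the model largely ignores the history;
    a large one suggests it actually conditions on it.
    """
    return (model.perplexity(perturb(history), response)
            - model.perplexity(history, response))
```

Averaging this quantity over a test set, per perturbation type, gives a simple sensitivity profile of a trained dialog model.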

Cited by 84 publications (92 citation statements)
References 29 publications
“…Results show that all of the REDfull models get larger PPL increases under most kinds of perturbations than the original models and are thus more sensitive to history utterance perturbations. This indicates that dynamic (order) information is used more effectively by RED models, following the premise in [21] that the more sensitive a model is to perturbations, the stronger its ability to model dynamics.…”
Section: 3.2 (mentioning)
confidence: 94%
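A hedged sketch of the comparison this citing work describes: average the perplexity increase per perturbation type for a baseline model and its RED variant, then read the larger increase as stronger use of history dynamics. The `model.perplexity` scoring interface and the result layout are assumptions for illustration, not the cited work's implementation.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

HistoryResponse = Tuple[List[str], str]
Perturbation = Callable[[List[str]], List[str]]

def mean_ppl_increase(model, pairs: Sequence[HistoryResponse],
                      perturb: Perturbation) -> float:
    """Average increase in gold-response perplexity after perturbing the history."""
    return mean(
        model.perplexity(perturb(history), response)
        - model.perplexity(history, response)
        for history, response in pairs
    )

def compare_sensitivity(baseline, red_model,
                        pairs: Sequence[HistoryResponse],
                        perturbations: Dict[str, Perturbation]) -> Dict[str, Dict[str, float]]:
    """Per-perturbation-type sensitivity for a baseline model and its RED variant.

    Under the cited premise, the model with the larger PPL increase is read as
    using the history (and its utterance order) more effectively.
    """
    return {
        name: {
            "baseline": mean_ppl_increase(baseline, pairs, perturb),
            "red_full": mean_ppl_increase(red_model, pairs, perturb),
        }
        for name, perturb in perturbations.items()
    }
```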
“…We conduct experiments on three multi-turn dialogue datasets with different styles: the bAbI dialog dataset [4], PersonaChat [35], and the Chinese customer service dataset (JDC) [34]. Each dataset is split into train/valid/test sets following previous work [21,34]. Note that each multi-turn dialogue in the three datasets is processed into many history-response pairs with different history lengths.…”
Section: Methods (mentioning)
confidence: 99%
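The preprocessing this citing work describes can be illustrated with a short sketch that expands one multi-turn dialogue into history-response pairs of increasing history length. The function name and the example dialogue are hypothetical, not taken from the cited papers' code.

```python
from typing import List, Tuple

def dialogue_to_pairs(dialogue: List[str],
                      min_history: int = 1) -> List[Tuple[List[str], str]]:
    """Expand one multi-turn dialogue into (history, response) pairs.

    For a dialogue [u1, u2, u3, u4] this yields
    ([u1], u2), ([u1, u2], u3), and ([u1, u2, u3], u4).
    """
    return [(dialogue[:t], dialogue[t]) for t in range(min_history, len(dialogue))]

# A four-turn exchange produces three pairs with history lengths 1, 2, and 3.
example = [
    "Hi, I need help with my order.",
    "Sure, what is the order number?",
    "It is 12345.",
    "Thanks, I can see it now.",
]
print(len(dialogue_to_pairs(example)))  # -> 3
```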