Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1188

Adaptive Parameterization for Neural Dialogue Generation

Abstract: Neural conversation systems generate responses based on the sequence-to-sequence (SEQ2SEQ) paradigm. Typically, the model is equipped with a single set of learned parameters to generate responses for given input contexts. When confronted with diverse conversations, its adaptability is limited and it is prone to generating generic responses. In this work, we propose an Adaptive Neural Dialogue generation model, ADAND, which manages various conversations with conversation-specific parameterization.…
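To make "conversation-specific parameterization" concrete, here is a minimal sketch assuming a hypernetwork-style design, where a pooled context vector produces the weights of a decoder projection on the fly. The class and variable names are hypothetical illustrations, not taken from the ADAND paper.

```python
# Sketch: per-conversation parameters generated from a context vector.
# Shapes and names are illustrative only; this is not the paper's code.
import torch
import torch.nn as nn

class AdaptiveProjection(nn.Module):
    def __init__(self, ctx_dim: int, in_dim: int, out_dim: int):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Hypernetwork: maps a pooled context vector to a flat weight matrix.
        self.weight_gen = nn.Linear(ctx_dim, in_dim * out_dim)
        self.bias_gen = nn.Linear(ctx_dim, out_dim)

    def forward(self, context: torch.Tensor, hidden: torch.Tensor) -> torch.Tensor:
        # context: (batch, ctx_dim) pooled conversation representation
        # hidden:  (batch, in_dim)  decoder state to be projected
        W = self.weight_gen(context).view(-1, self.out_dim, self.in_dim)
        b = self.bias_gen(context)
        # Per-example projection: each conversation gets its own parameters.
        return torch.bmm(W, hidden.unsqueeze(-1)).squeeze(-1) + b

ctx = torch.randn(4, 128)   # pooled context embeddings
h = torch.randn(4, 256)     # decoder hidden states
logits = AdaptiveProjection(128, 256, 512)(ctx, h)  # shape (4, 512)
```

The design choice this illustrates: instead of one global weight matrix shared by all inputs, the effective parameters vary with the conversation, which is one plausible way to reduce generic responses.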

Cited by 8 publications (7 citation statements). References: 28 publications.
“…Second, replay approaches [34,2,25] (or rehearsal approaches) replay examples of previous tasks while training the model on a new one. Third, architecture-based approaches [4,20,40] rely on decomposition of the inference function. For instance, new approaches leveraging neural architecture search techniques [20,40] have been proposed.…”
Section: Related Work
confidence: 99%
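The replay idea in the excerpt above can be captured in a few lines: keep a bounded sample of earlier tasks' examples and mix them into each new task's batches. A minimal sketch follows, assuming a reservoir-sampled buffer; the class name and interface are hypothetical.

```python
# Sketch of experience replay for continual learning. Reservoir sampling
# keeps the buffer an unbiased sample of the stream seen so far.
import random

class ReplayBuffer:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = []
        self.seen = 0

    def add(self, example) -> None:
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append(example)
        else:
            # Keep each new example with probability capacity / seen.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.data[j] = example

    def sample(self, k: int):
        return random.sample(self.data, min(k, len(self.data)))

buffer = ReplayBuffer(capacity=2)
for ex in ["task1-a", "task1-b", "task2-a", "task2-b"]:
    buffer.add(ex)
# A training step on a new task would mix buffer.sample(k) into each batch.
print(buffer.sample(2))
```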
“…Ha et al. (2017) propose the general idea of generating the parameters of a network by another network. The model proposed in Cai et al. (2019) generates the parameters of an encoder-decoder architecture by referring to the context-aware and topic-aware input. Suarez (2017) uses a hypernetwork to scale the weights of the main recurrent network.…”
Section: Related Work
confidence: 99%
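The weight-scaling variant mentioned last in this excerpt is lighter than generating full weight matrices: the hypernetwork emits only per-row scaling factors applied to a fixed base weight. A hedged sketch in that spirit, with hypothetical names:

```python
# Sketch of hypernetwork weight scaling (in the spirit of Suarez, 2017):
# the hypernetwork output scales rows of the main network's fixed weights,
# rather than producing them outright. Illustrative only.
import torch
import torch.nn as nn

class ScaledLinear(nn.Module):
    def __init__(self, hyper_dim: int, in_dim: int, out_dim: int):
        super().__init__()
        self.base = nn.Linear(in_dim, out_dim)
        # One scale factor per output row, predicted from the hyper state.
        self.scale_gen = nn.Linear(hyper_dim, out_dim)

    def forward(self, z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        # z: (batch, hyper_dim) hypernetwork state; x: (batch, in_dim) input
        scale = self.scale_gen(z)        # (batch, out_dim)
        return scale * self.base(x)      # row-scaled projection

out = ScaledLinear(32, 256, 512)(torch.randn(4, 32), torch.randn(4, 256))
```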
“…Lifelong learning [6,41] tackles this issue by giving models the ability to learn continuously over time and accumulate knowledge from streams of information sampled across domains, whether previously observed or not. The three common lifelong learning approaches are [41]: 1) regularization, which constrains the objective function with a forget cost term [22,26,48]; 2) network expansion, which adapts the network architecture to new tasks by adding neurons and layers [5,43]; and 3) memory models, which retrain the network on instances selected from a memory drawn from different data distributions [2,32].…”
Section: Background and Related Work
confidence: 99%
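The "forget cost term" in approach 1) above typically penalizes drift from parameters learned on earlier tasks, weighted by a per-parameter importance estimate (as in elastic weight consolidation). A minimal sketch, with hypothetical names:

```python
# Sketch of a forget-cost regularizer in the spirit of EWC: penalize
# movement away from old-task parameters, weighted by importance.
import torch

def forget_cost(model, old_params, importance, lam: float = 1.0) -> torch.Tensor:
    # old_params / importance: dicts of tensors keyed by parameter name,
    # snapshotted after training on the previous task, e.g.
    #   old = {n: p.detach().clone() for n, p in model.named_parameters()}
    # with importance given by a Fisher-information estimate.
    penalty = torch.zeros(())
    for name, p in model.named_parameters():
        penalty = penalty + (importance[name] * (p - old_params[name]) ** 2).sum()
    return lam / 2.0 * penalty

# Total loss on the new task would then be:
#   loss = task_loss + forget_cost(model, old_params, importance)
```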