Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.517
Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Abstract: Training the generative models with minimal corpus is one of the critical challenges for building open-domain dialogue systems. Existing methods tend to use the meta-learning framework, which pre-trains the parameters on all non-target tasks and then fine-tunes on the target task. However, fine-tuning distinguishes tasks from the parameter perspective but ignores the model-structure perspective, resulting in similar dialogue models for different tasks. In this paper, we propose an algorithm that can customize a uni…
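The pre-train-then-fine-tune meta-learning loop the abstract describes can be sketched on a toy regression problem. This is a first-order, Reptile-style illustration, not the paper's algorithm; the task slopes, learning rates, and step counts are all invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(slope):
    """A toy 1-D regression task: y = slope * x."""
    x = rng.uniform(-1, 1, size=20)
    return x, slope * x

def sgd_step(w, x, y, lr):
    """One gradient step on squared error for the model y_hat = w * x."""
    grad = np.mean(2 * (w * x - y) * x)
    return w - lr * grad

# Meta-training on non-target tasks: adapt on each task for a few steps,
# then move the shared initialization toward the adapted weight
# (a Reptile-style first-order update).
w_meta = 0.0
tasks = [make_task(s) for s in (1.5, 2.0, 2.5)]
for _ in range(200):
    for x, y in tasks:
        w_task = w_meta
        for _ in range(5):
            w_task = sgd_step(w_task, x, y, lr=0.1)
        w_meta += 0.1 * (w_task - w_meta)

# Fine-tuning on the target task (slope 2.2) with only 3 examples:
# the learned initialization is already close, so few steps suffice.
x_new = rng.uniform(-1, 1, size=3)
y_new = 2.2 * x_new
w = w_meta
for _ in range(10):
    w = sgd_step(w, x_new, y_new, lr=0.1)

print(w_meta, w)  # w_meta lands near the mean task, w is pulled toward 2.2
```

Note that every task ends up with the same model form here; only the scalar parameter differs, which is exactly the limitation the paper attributes to fine-tuning-only adaptation.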

Cited by 28 publications (19 citation statements)
References 24 publications
“…Data-driven models can also be combined with graphical models (Zhou et al., 2020; Song et al., 2019; Moon et al., 2019; Shi et al., 2020; Wu B. et al., 2020; Xu et al., 2020), rule-based or slot-filling systems (Tammewar et al., 2018; Zhang Z. et al., 2019), a knowledge base (Ganhotra and Polymenakos, 2018; Ghazvininejad et al., 2018; Luo et al., 2019; Yavuz et al., 2019; Moon et al., 2019; Wu et al., 2019; Lian et al., 2019; Zhang B. et al., 2020; Majumder et al., 2020; Tuan et al., 2021) or with automatic extraction of attributes from dialogue (Tigunova et al., 2019, 2020; Wu C.-S. et al., 2020, 2021; Ma et al., 2021) to improve the personalised entity selection in responses. Methods that adopt transfer learning (Genevay and Laroche, 2016; Lopez-Paz and Ranzato, 2017; Mo et al., 2017, 2018; Yang et al., 2017, 2018; Wolf et al., 2019; Golovanov et al., 2020), meta-learning (Finn et al., 2017; Santoro et al., 2016; Vinyals et al., 2016; Munkhdalai and Yu, 2017; Madotto et al., 2019; Zhang W.-N. et al., 2019; Song et al., 2020; Tian et al., 2021) and key-value memory structures (Xu et al., 2017; Kaiser et al., 2017; Zhu and Yang, 2018, 2020; de Masson d’Autume et al., 2019) could provide effective insights to alleviate data scarcity and enable quick adaptation to various users through improving few-shot and lifelong learning capabilities of the dialogue models (Wang et al., 2020b).…”
Section: Discussion
confidence: 99%
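The key-value memory structures mentioned in the statement above can be illustrated with a minimal retrieval sketch, loosely in the spirit of Kaiser et al. (2017): store (key embedding, value) pairs and answer a query with the value of the most similar key. The class name, dimensions, and stored values here are all hypothetical:

```python
import numpy as np

class KeyValueMemory:
    """Minimal key-value memory: nearest-key lookup by cosine similarity."""

    def __init__(self, dim):
        self.keys = np.empty((0, dim))
        self.values = []

    def write(self, key, value):
        # Append a new (key, value) pair to the memory.
        self.keys = np.vstack([self.keys, key])
        self.values.append(value)

    def read(self, query):
        # Cosine similarity of the query against all stored keys.
        sims = self.keys @ query / (
            np.linalg.norm(self.keys, axis=1) * np.linalg.norm(query) + 1e-9
        )
        return self.values[int(np.argmax(sims))]

mem = KeyValueMemory(dim=3)
mem.write(np.array([1.0, 0.0, 0.0]), "greeting")
mem.write(np.array([0.0, 1.0, 0.0]), "farewell")
print(mem.read(np.array([0.9, 0.1, 0.0])))  # → greeting
```

Because new pairs can be written without retraining, this kind of structure is one way such systems support the lifelong-learning behaviour the statement refers to.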
“…Another solution could be to combine a data-driven model with another approach to compensate for the deficiencies in the models, such as combining a generative model (e.g., Sequence-to-Sequence) with a Memory Network (Madotto et al., 2018; Zhang B. et al., 2020) or with transformers (Vaswani et al., 2017), such as in the work of Roller et al. (2020), Generative Pre-trained Transformer (GPT) (Radford et al., 2018, 2019; Brown et al., 2020; Zhang Y. et al., 2020), Bidirectional Encoder Representations from Transformers (BERT) (Devlin et al., 2019; Song et al., 2021), and Poly-encoders (Humeau et al., 2020; Li et al., 2020). Data-driven models can also be combined with graphical models (Zhou et al., 2020; Song et al., 2019; Moon et al., 2019; Shi et al., 2020; Wu B. et al., 2020; Xu et al., 2020), rule-based or slot-filling systems (Tammewar et al., 2018; Zhang Z. et al., 2019), a knowledge base (Ganhotra and Polymenakos, 2018; Ghazvininejad et al., 2018; Luo et al., 2019; Yavuz et al., 2019; Moon et al., 2019; Wu et al., 2019; Lian et al., 2019; Zhang B. et al., 2020; Majumder et al., 2020; Tuan et al., 2021) or with automatic extraction of attributes from dialogue (Tigunova et al., 2019, 2020; Wu C.-S. et al., 2020, 2021; Ma et al., 2021) to improve the personalised entity selection in responses.…”
Section: Discussion
confidence: 99%
“…Meta-learning has recently been explored in addressing the limited personalized data issue. CMAML (Song et al., 2020c) is a meta-learning-based method that learns from few-shot personas by customizing the model structures. Besides the meta-learning methods, GDR (Song et al., 2020a) introduces inference ability on PersonaChat with a generate-refine framework.…”
Section: Compared Methods
confidence: 99%
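The idea of "customizing the model structures" per task, as the statement above describes CMAML doing, can be loosely illustrated with per-task pruning masks over shared weights: each task keeps only the connections most important to it, so different tasks end up with different sub-networks. This is a sketch of the general idea only, not CMAML's actual algorithm; the importance scores and keep ratio are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
shared_w = rng.normal(size=(4, 4))  # weight matrix shared across tasks

def task_mask(importance, keep_ratio=0.5):
    """Keep only the connections with the largest task-specific importance."""
    flat = np.sort(importance.ravel())
    threshold = flat[int(flat.size * (1 - keep_ratio))]
    return (importance >= threshold).astype(float)

# Two tasks whose (hypothetical) importance scores favor opposite
# connections end up with different sub-network structures.
importance_a = np.arange(16.0).reshape(4, 4)
importance_b = np.arange(16.0)[::-1].reshape(4, 4)
w_task_a = shared_w * task_mask(importance_a)
w_task_b = shared_w * task_mask(importance_b)

print(int(task_mask(importance_a).sum()))  # → 8 (half the connections kept)
```

The point of contrast with plain fine-tuning is visible here: the two tasks differ not just in parameter values but in which connections exist at all.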
“…This problem becomes even more severe in emerging research topics (Baig, 2020; Baines et al., 2020), such as COVID-19, where curated definitions could be imprecise and do not scale to rapidly proposed terminologies. Neural text generation (Bowman et al., 2016; Vaswani et al., 2017; Sutskever et al., 2014; Song et al., 2020b) could be a plausible solution to this problem by generating definition text based on the terminology text. Encouraging results by neural text generation have been observed on related tasks, such as paraphrase generation (Li et al., 2020), description generation (Cheng et al., 2020), synonym generation (Gupta et al., 2015) and data augmentation (Malandrakis et al., 2019).…”
Section: Introduction
confidence: 99%