Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.740

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

Abstract: Structured belief states are crucial for user goal tracking and database query in task-oriented dialog systems. However, training belief trackers often requires expensive turn-level annotations of every user utterance. In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs.
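
The latent-variable formulation suggests a variational treatment on unlabeled dialogs. As a minimal sketch, in our own notation rather than the paper's: with user input u, discrete belief state b, and system response r, a variational posterior bounds the marginal response likelihood, so unlabeled turns can be trained by marginalizing b while labeled turns supervise it directly:

```latex
% Hedged sketch of a semi-supervised objective with latent belief states.
% Labeled turns: b is observed, so log p(b, r | u) is maximized directly.
% Unlabeled turns: b is marginalized via the variational posterior q.
\log p_\theta(r \mid u) \;\ge\;
\mathbb{E}_{q_\phi(b \mid u, r)}\!\left[ \log p_\theta(r \mid u, b) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(b \mid u, r) \,\middle\|\, p_\theta(b \mid u) \right)
```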

Cited by 35 publications (36 citation statements)
References 36 publications
“…Recently, large-scale multi-domain task-oriented datasets were proposed (Budzianowski et al., 2018; Byrne et al., 2019; Rastogi et al., 2020). To address multiple domains, Zhang et al. (2020a) introduce the LABES-S2S model, which, in addition to following a two-stage seq2seq approach, models belief states as discrete latent variables. Zhang et al. (2020b) present DAMD, a three-stage seq2seq architecture which explicitly decodes the system action.…”
Section: Related Work
confidence: 99%
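
To make the "two-stage seq2seq" shape concrete: the context is encoded once, a first decoder emits the belief-state token sequence, and a second decoder emits the response conditioned on the belief summary. The PyTorch sketch below is illustrative only; module names and sizes are our assumptions, not the LABES-S2S or DAMD implementations.

```python
# Illustrative two-stage seq2seq pipeline for task-oriented dialog.
import torch
import torch.nn as nn

class TwoStageSeq2Seq(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.context_encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.belief_decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.response_decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, context_ids, belief_ids, response_ids):
        # Encode the dialog context into a summary state.
        _, ctx_state = self.context_encoder(self.embed(context_ids))
        # Stage 1: decode the belief-state token sequence (teacher forcing).
        belief_hidden, belief_state = self.belief_decoder(
            self.embed(belief_ids), ctx_state)
        belief_logits = self.out(belief_hidden)
        # Stage 2: decode the response, initialized from the belief summary.
        response_hidden, _ = self.response_decoder(
            self.embed(response_ids), belief_state)
        response_logits = self.out(response_hidden)
        return belief_logits, response_logits
```

A three-stage variant in the spirit of DAMD would insert a system-action decoder between the belief and response stages.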
“…It uses multi-action data augmentation and multiple GRU (Cho et al., 2014) decoders. Similarly, LABES (Zhang et al., 2020a) employs several GRU-based decoders, but it represents the dialog state as a latent variable. DoTS (Jeon and Lee, 2021) also uses GRUs, but the model makes use of a BERT encoder (Devlin et al., 2019) to obtain a context representation.…”
Section: Systems Evaluating on MultiWOZ
confidence: 99%
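
The pairing described above, a pretrained BERT encoder supplying the context representation that a GRU decoder consumes, can be sketched as follows. This assumes the Hugging Face transformers API and is a generic illustration of the pattern, not the DoTS code.

```python
# Generic sketch: BERT encodes the dialog context; a GRU decodes from it.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
decoder = nn.GRU(input_size=bert.config.hidden_size,
                 hidden_size=bert.config.hidden_size,
                 batch_first=True)

inputs = tokenizer("i need a cheap hotel in the centre", return_tensors="pt")
with torch.no_grad():
    context = bert(**inputs).last_hidden_state            # (1, seq_len, hidden)
    # Seed the decoder with the [CLS] vector as its initial hidden state.
    h0 = context[:, 0:1, :].transpose(0, 1).contiguous()  # (1, 1, hidden)
    # One illustrative decoding step driven by the mean context vector.
    step, _ = decoder(context.mean(dim=1, keepdim=True), h0)
```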
“…To reduce deployment cost and error propagation, end-to-end trainable networks [22] were introduced into TOD systems and have been studied continually since. Typical end-to-end TOD systems include those with structured fusion networks [3], [23], and those with a multi-stage sequence-to-sequence framework [2], [4], [24], [25]. With the boom of Transformers [26] and their large-scale pretraining [11], [12], TOD systems based on auto-regressive language modeling have also been developed [5]-[7], which achieve strong TOD modeling performance.…”
Section: Related Work
confidence: 99%
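
The auto-regressive formulation mentioned above reduces a dialog turn to a single token sequence: context, belief state, database result, and response concatenated in order, modeled by one causal language model. Below is a hedged sketch with GPT-2 via transformers; the segment markers and serialization format are illustrative, not the exact tokens used by SimpleTOD or related systems.

```python
# Sketch: fine-tune a causal LM on one serialized TOD turn.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

turn = ("<context> user: i need a cheap hotel in the centre "
        "<belief> hotel { pricerange = cheap ; area = centre } "
        "<db> 3 matches "
        "<response> there are 3 cheap hotels in the centre .")

inputs = tokenizer(turn, return_tensors="pt")
# Standard causal-LM loss over the whole serialized turn; in practice the
# loss is often restricted to the belief and response segments.
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()
```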
“…1) E2E TOD Models and DST Models: We consider two lightweight baseline E2E TOD models with different types of structures: UniConv [3] uses a structured fusion [23] design, while LABES-S2S [4] is based on a multi-stage Seq2Seq [24] architecture. Besides, we consider two large-scale baseline E2E TOD models developed from the pretrained GPT-2 [12]: SimpleTOD [5] directly finetunes GPT-2 to model TOD in an auto-regressive manner, while AuGPT [7] further pretrains GPT-2 on a large TOD corpus before finetuning and applies training data augmentation based on back-translation [42]-[44].…”
Section: B. Baselines
confidence: 99%
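
The back-translation augmentation attributed to AuGPT can be sketched with off-the-shelf translation models: translate a user utterance to a pivot language and back, keeping the round-trip paraphrase as extra training data. The MarianMT checkpoints and the choice of German as pivot below are our assumptions, not AuGPT's exact setup.

```python
# Sketch: paraphrase user utterances by round-trip (back-)translation.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_de_tok, en_de = load("Helsinki-NLP/opus-mt-en-de")
de_en_tok, de_en = load("Helsinki-NLP/opus-mt-de-en")

def back_translate(utterance: str) -> str:
    pivot_ids = en_de.generate(**en_de_tok(utterance, return_tensors="pt"))
    pivot = en_de_tok.batch_decode(pivot_ids, skip_special_tokens=True)[0]
    back_ids = de_en.generate(**de_en_tok(pivot, return_tensors="pt"))
    return de_en_tok.batch_decode(back_ids, skip_special_tokens=True)[0]

print(back_translate("i am looking for a cheap restaurant in the centre"))
```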