Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.740

A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning

Abstract: Structured belief states are crucial for user goal tracking and database query in task-oriented dialog systems. However, training belief trackers often requires expensive turn-level annotations of every user utterance. In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs.
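
The latent-variable formulation suggests a variational treatment on unlabeled dialogs. As a minimal sketch, in our own notation rather than the paper's: with user input u, discrete belief state b, and system response r, a variational posterior bounds the marginal response likelihood, so unlabeled turns can be trained by marginalizing b while labeled turns supervise it directly:

```latex
% Hedged sketch of a semi-supervised objective with latent belief states.
% Labeled turns: b is observed, so log p(b, r | u) is maximized directly.
% Unlabeled turns: b is marginalized via the variational posterior q.
\log p_\theta(r \mid u) \;\ge\;
\mathbb{E}_{q_\phi(b \mid u, r)}\!\left[ \log p_\theta(r \mid u, b) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(b \mid u, r) \,\middle\|\, p_\theta(b \mid u) \right)
```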

Cited by 35 publications (36 citation statements)
References 36 publications
“…Recently, large-scale multi-domain task-oriented datasets were proposed (Budzianowski et al., 2018; Byrne et al., 2019; Rastogi et al., 2020). To address multiple domains, Zhang et al. (2020a) introduce the LABES-S2S model, which, in addition to following a two-stage seq2seq approach, models belief states as discrete latent variables. Zhang et al. (2020b) present DAMD, a three-stage seq2seq architecture which explicitly decodes the system action.…”
Section: Related Work
confidence: 99%
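
To make the "two-stage seq2seq" shape concrete: the context is encoded once, a first decoder emits the belief-state token sequence, and a second decoder emits the response conditioned on the belief summary. The PyTorch sketch below is illustrative only; module names and sizes are our assumptions, not the LABES-S2S or DAMD implementations.

```python
# Illustrative two-stage seq2seq pipeline for task-oriented dialog.
import torch
import torch.nn as nn

class TwoStageSeq2Seq(nn.Module):
    def __init__(self, vocab_size: int, hidden: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.context_encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.belief_decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.response_decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, context_ids, belief_ids, response_ids):
        # Encode the dialog context into a summary state.
        _, ctx_state = self.context_encoder(self.embed(context_ids))
        # Stage 1: decode the belief-state token sequence (teacher forcing).
        belief_hidden, belief_state = self.belief_decoder(
            self.embed(belief_ids), ctx_state)
        belief_logits = self.out(belief_hidden)
        # Stage 2: decode the response, initialized from the belief summary.
        response_hidden, _ = self.response_decoder(
            self.embed(response_ids), belief_state)
        response_logits = self.out(response_hidden)
        return belief_logits, response_logits
```

A three-stage variant in the spirit of DAMD would insert a system-action decoder between the belief and response stages.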
“…It uses multi-action data augmentation and multiple GRU (Cho et al., 2014) decoders. Similarly, LABES (Zhang et al., 2020a) employs several GRU-based decoders, but it represents the dialog state as a latent variable. DoTS (Jeon and Lee, 2021) also uses GRUs, but the model makes use of a BERT encoder (Devlin et al., 2019) to obtain a context representation.…”
Section: Systems Evaluating on MultiWOZ
confidence: 99%
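
The pairing described above, a pretrained BERT encoder supplying the context representation that a GRU decoder consumes, can be sketched as follows. This assumes the Hugging Face transformers API and is a generic illustration of the pattern, not the DoTS code.

```python
# Generic sketch: BERT encodes the dialog context; a GRU decodes from it.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")
decoder = nn.GRU(input_size=bert.config.hidden_size,
                 hidden_size=bert.config.hidden_size,
                 batch_first=True)

inputs = tokenizer("i need a cheap hotel in the centre", return_tensors="pt")
with torch.no_grad():
    context = bert(**inputs).last_hidden_state            # (1, seq_len, hidden)
    # Seed the decoder with the [CLS] vector as its initial hidden state.
    h0 = context[:, 0:1, :].transpose(0, 1).contiguous()  # (1, 1, hidden)
    # One illustrative decoding step driven by the mean context vector.
    step, _ = decoder(context.mean(dim=1, keepdim=True), h0)
```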
“…To reduce deployment cost and error propagation, end-to-end trainable networks [22] were introduced into TOD systems and have been studied continually since. Typical end-to-end TOD systems include those with structured fusion networks [3], [23], and those with a multi-stage sequence-to-sequence framework [2], [4], [24], [25]. With the boom of Transformers [26] and their large-scale pretraining [11], [12], TOD systems based on auto-regressive language modeling have also been developed [5]-[7], which achieve strong TOD modeling performance.…”
Section: Related Work
confidence: 99%
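
The auto-regressive formulation mentioned above reduces a dialog turn to a single token sequence: context, belief state, database result, and response concatenated in order, modeled by one causal language model. Below is a hedged sketch with GPT-2 via transformers; the segment markers and serialization format are illustrative, not the exact tokens used by SimpleTOD or related systems.

```python
# Sketch: fine-tune a causal LM on one serialized TOD turn.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

turn = ("<context> user: i need a cheap hotel in the centre "
        "<belief> hotel { pricerange = cheap ; area = centre } "
        "<db> 3 matches "
        "<response> there are 3 cheap hotels in the centre .")

inputs = tokenizer(turn, return_tensors="pt")
# Standard causal-LM loss over the whole serialized turn; in practice the
# loss is often restricted to the belief and response segments.
loss = model(**inputs, labels=inputs["input_ids"]).loss
loss.backward()
```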
“…1) E2E TOD Models and DST Models: We consider two lightweight baseline E2E TOD models with different types of structures: UniConv [3] uses a structured fusion [23] design, while LABES-S2S [4] is based on a multi-stage Seq2Seq [24] architecture. Besides, we consider two large-scale baseline E2E TOD models developed from the pretrained GPT-2 [12]: SimpleTOD [5] directly finetunes GPT-2 to model TOD in an auto-regressive manner, while AuGPT [7] further pretrains GPT-2 on a large TOD corpus before finetuning and applies training data augmentation based on back-translation [42]-[44].…”
Section: B. Baselines
confidence: 99%
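
The back-translation augmentation attributed to AuGPT can be sketched with off-the-shelf translation models: translate a user utterance to a pivot language and back, keeping the round-trip paraphrase as extra training data. The MarianMT checkpoints and the choice of German as pivot below are our assumptions, not AuGPT's exact setup.

```python
# Sketch: paraphrase user utterances by round-trip (back-)translation.
from transformers import MarianMTModel, MarianTokenizer

def load(name):
    return MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name)

en_de_tok, en_de = load("Helsinki-NLP/opus-mt-en-de")
de_en_tok, de_en = load("Helsinki-NLP/opus-mt-de-en")

def back_translate(utterance: str) -> str:
    pivot_ids = en_de.generate(**en_de_tok(utterance, return_tensors="pt"))
    pivot = en_de_tok.batch_decode(pivot_ids, skip_special_tokens=True)[0]
    back_ids = de_en.generate(**de_en_tok(pivot, return_tensors="pt"))
    return de_en_tok.batch_decode(back_ids, skip_special_tokens=True)[0]

print(back_translate("i am looking for a cheap restaurant in the centre"))
```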