GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-supervised Learning and Explicit Policy Injection

He, Wanwei; Dai, Yinpei; Zheng, Yinhe; Wu, Yu-Chuan; Cao, Zheng; Liu, Dermot; Yang, Min; Huang, Fei; Si, Luo; Sun, Jian; Li, Yongbin

doi:10.1609/aaai.v36i10.21320

Cited by 52 publications

(42 citation statements)

References 67 publications


“…Finally, we encourage the establishment of better dialog act taxonomies that are backed by learning sciences research. As outlined in §5.6 and in He et al. (2022), a unified taxonomy may also strongly aid in transfer learning.…”

Section: User Study With a Learning Interface (mentioning)

confidence: 99%

Opportunities and Challenges in Neural Dialog Tutoring

Jakub, Nico, Wang, et al. 2023

Preprint


Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors. Although there have been significant recent advances in neural conversational systems using large language models and growth in available dialog corpora, dialog tutoring has largely remained unaffected by these advances. In this paper, we rigorously analyze various generative language models on two dialog tutoring datasets for language learning using automatic and human evaluations to understand the new opportunities brought by these advances as well as the challenges we must overcome to build models that would be usable in real educational settings. We find that although current approaches can model tutoring in constrained learning scenarios when the number of concepts to be taught and possible teacher strategies are small, they perform poorly in less constrained scenarios. Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring, which measures learning opportunities for students and how engaging the dialog is. To understand the behavior of our models in a real tutoring setting, we conduct a user study using expert annotators and find a significantly large number of model reasoning errors in 45% of conversations. Finally, we connect our findings to outline future work. https://github.com/eth-nlped/dialog-tutoring


“…SOLOIST [70] parameterized a task bot using a Transformer-based auto-regressive language model, which subsumed different dialog modules into a single neural model and was pre-trained on two TOD datasets. SPACE [31] proposed to use consistency regularization loss to learn dialog policy from labeled and unlabeled dialog corpora via a semi-supervised manner. To exploit more heterogeneous TOD corpora, PPTOD [88] converted different TOD tasks into the text-to-text generation task with task-specific prompts on T5 model [76].…”

Section: Pre-trained Conversation Models (mentioning)

confidence: 99%

Unified Dialog Model Pre-training for Task-Oriented Dialog Understanding and Generation

Dai, Yang, et al. 2022

Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Self Cite


Recently, pre-training methods have shown remarkable success in task-oriented dialog (TOD) systems. However, most existing pretrained models for TOD focus on either dialog understanding or dialog generation, but not both. In this paper, we propose SPACE-3, a novel unified semi-supervised pre-trained conversation model learning from large-scale dialog corpora with limited annotations, which can be effectively fine-tuned on a wide range of downstream dialog tasks. Specifically, SPACE-3 consists of four successive components in a single transformer to maintain a task-flow in TOD systems: (i) a dialog encoding module to encode dialog history, (ii) a dialog understanding module to extract semantic vectors from either user queries or system responses, (iii) a dialog policy module to generate a policy vector that contains high-level semantics of the response, and (iv) a dialog generation module to produce appropriate responses. We design a dedicated pre-training objective for each component. Concretely, we pre-train the dialog encoding module with span mask language modeling to learn contextualized dialog information. To capture the structured dialog semantics, we pre-train the dialog understanding module via a novel tree-induced semi-supervised contrastive learning objective with the help of extra dialog annotations. In addition, we pre-train the dialog policy module by minimizing the L2 distance between its output policy …

* Equal Contribution. † Wanwei He is also with the University of Chinese Academy of Sciences. This work was conducted when Wanwei He was interning at Alibaba.


“…One possible solution is to employ a regularisation to encourage HSEMEC to consider intact post topics and emotion information to generate both coherent and emotional responses. In addition, dialogue pre-training techniques [40,41] may help the dialogue systems learn better dialogue patterns.…”

Section: Incoherent Responses (mentioning)

confidence: 99%

Leveraging hierarchical semantic‐emotional memory in emotional conversation generation

Yang, Wang, et al. 2022

CAAI Trans on Intel Tech

Self Cite


Handling emotions in human‐computer dialogues has emerged as a challenging task which requires artificial intelligence systems to generate emotional responses by jointly perceiving the emotion involved in the input posts and incorporating it into the generation of semantically coherent and emotionally reasonable responses. However, most previous works generate emotional responses solely from input posts, which do not take full advantage of the training corpus and suffer from generating generic responses. In this study, we introduce a hierarchical semantic‐emotional memory module for emotional conversation generation (called HSEMEC), which can learn abstract semantic conversation patterns and emotional information from the large training corpus. The learnt semantic and emotional knowledge helps to enrich the post representation and assist the emotional conversation generation. Comprehensive experiments on a large real‐world conversation corpus show that HSEMEC can outperform the strong baselines on both automatic and manual evaluation. For reproducibility, we release the code and data publicly at: https://github.com/siat-nlp/HSEMEC-code-data.
