Proceedings of the 3rd Workshop on Neural Generation and Translation 2019
DOI: 10.18653/v1/d19-5602

Hello, It’s GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

Abstract: Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data. In this paper, we demonstrate that recent progress in language modeling pretraining and transfer learning shows promise to overcome this problem. We propose a task-oriented dialogue model tha…

Cited by 224 publications (156 citation statements) · References 23 publications

Citation statements, ordered by relevance:
“…This idea can also be applied to task-oriented dialog systems to transfer general natural language knowledge from large-scale corpora to a specific dialog task. Some early studies have shown the possibility of using pre-training models to model task-oriented dialogs [46,99,100,130,131].…”
Section: Discussion and Future Trends
confidence: 99%
“…Wolf et al [99] followed this approach by first pre-training a transformer model on large-scale dialog data and then fine-tuning it on a personalized dialog task with multi-task learning. Budzianowski et al [100] further extended this idea to task-oriented dialog without explicit standalone dialogue policy and generation modules. In this work, the belief state and database state are first converted to natural language text and then fed to the transformer decoder alongside the dialogue context.…”
Section: Unsupervised Methods
confidence: 99%
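The serialization described in this citation statement can be illustrated with a short sketch. The following is a minimal, hypothetical example (not the cited paper's code) using the Hugging Face transformers GPT-2 classes; the segment markers and the flattened belief/database strings are illustrative assumptions, and a usable model would first need fine-tuning on dialogues serialized in the same format.

# Sketch: condition a GPT-2-style decoder on context, belief state, and
# database state, all serialized as plain text. The prompt format below is
# an assumption for illustration, not the paper's exact scheme.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

context = "User: I need a cheap hotel in the north of town."
belief_state = "hotel price=cheap area=north"   # flattened belief state (assumed format)
db_state = "hotel matches=3"                    # flattened database result (assumed format)

# Everything becomes one text sequence fed to the decoder.
prompt = f"<|context|> {context} <|belief|> {belief_state} <|db|> {db_state} <|response|>"

input_ids = tokenizer.encode(prompt, return_tensors="pt")
output_ids = model.generate(
    input_ids,
    max_length=input_ids.shape[1] + 40,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
# Decode only the newly generated system response.
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)

Because belief and database information enter the model as ordinary text, no separate policy or generation module is required; the decoder treats state tracking output as just another part of its input sequence.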
“…Recent research [38] demonstrates that huge language models may outperform specialized dialogue systems, at the cost of computation. Even so, the research points towards the use of these pre-trained models and transfer learning [39] to achieve outstanding performance with very little data. This also adds robustness against breakdowns when encountering unanticipated user needs [40].…”
Section: Related Work
confidence: 99%
“…After performing experiments by training directly on math word problem corpora, we perform a different set of experiments by pretraining on a general language corpus. The success of pretrained models such as ELMo [17], GPT-2 [18], and BERT [19] on many natural language tasks suggests that pre-training is likely to produce better learning by our system. We use pre-training so that the system has some foundational knowledge of English before we train it on the domain-specific text of math word problems.…”
Section: Approach
confidence: 99%
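As a rough illustration of this pretrain-then-finetune recipe (a minimal sketch under assumed toy data, not the cited system's implementation), one could continue training a general-purpose GPT-2 checkpoint on domain-specific text with a standard language-modeling objective:

# Sketch: start from a GPT-2 checkpoint pretrained on general English, then
# fine-tune it on domain-specific text (toy math word problems here).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")   # general-language pretraining
model.train()

# Hypothetical domain corpus; a real setup would load a math word problem dataset.
domain_corpus = [
    "Sam has 3 apples and buys 4 more. How many apples does Sam have? 7",
    "A train travels 60 miles in 2 hours. What is its speed in miles per hour? 30",
]

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

for epoch in range(2):
    for text in domain_corpus:
        inputs = tokenizer(text, return_tensors="pt")
        # Causal language-modeling loss: labels are the input tokens themselves.
        outputs = model(**inputs, labels=inputs["input_ids"])
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

The point of the recipe is simply that the weights are initialized from broad-coverage language pretraining rather than from scratch before the domain-specific fine-tuning begins.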