Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing 2018
DOI: 10.18653/v1/d18-1547

MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling

Abstract: Even though machine learning has become the major scene in dialogue research community, the real breakthrough has been blocked by the scale of data available. To address this fundamental obstacle, we introduce the Multi-Domain Wizard-of-Oz dataset (MultiWOZ), a fully-labeled collection of human-human written conversations spanning over multiple domains and topics. At a size of 10k dialogues, it is at least one order of magnitude larger than all previous annotated task-oriented corpora. The contribution of this…

Citations: cited by 739 publications (1,070 citation statements)
References: 43 publications
“…However, this tool doesn't allow users to specify custom annotations or labels and doesn't support classification or slot-value annotation. This is not compatible with modern dialogue datasets which require such annotations (Budzianowski et al, 2018). INCEpTION (Klie et al, 2018) is a semantic annotation platform for interactive tasks that require semantic resources like entity linking.…”
Section: Main Contributions (mentioning)
confidence: 99%
“…1 https://github.com/Wluper/lida Creating a high-quality dialogue dataset incurs a large annotation cost, which makes good dialogue annotation tools essential to ensure the highest possible quality. Many annotation tools exist for a range of NLP tasks but none are designed specifically for dialogue with modern usability principles in mind - in collecting MultiWOZ, for example, Budzianowski et al (2018) had to create a bespoke annotation interface.…”
Section: Introduction (mentioning)
confidence: 99%
“…For the second set of experiments, we evaluate the proposed model and TL approaches on the multi-turn Google Simulated Dialogues (GSD) 3 [7]. We explore Microsoft Dialogue Challenge (MDC) 4 [31] and MultiWOZ 2.0 (WOZ) 5 [32] datasets as other dialogue corpora for evaluating the proposed TL approaches. We use the same data division as [7].…”
Section: Data (mentioning)
confidence: 99%
“…Task-oriented dialogue systems are primarily designed to search and interact with large databases which contain information pertaining to a certain dialogue domain: the main purpose of such systems is to assist the users in accomplishing a well-defined task such as flight booking (El Asri et al, 2017), tourist information (Henderson et al, 2014), restaurant search (Williams, 2012), or booking a taxi (Budzianowski et al, 2018). These systems are typically constructed around rigid task-specific ontologies (Henderson et al, 2014; Mrkšić et al, 2015) which enumerate the constraints the users can express using a collection of slots (e.g., PRICE RANGE for restaurant search) and their slot values (e.g., CHEAP, EXPENSIVE for the aforementioned slots).…”
Section: Introduction (mentioning)
confidence: 99%