2020
DOI: 10.48550/arxiv.2010.05594
Preprint

MultiWOZ 2.3: A multi-domain task-oriented dialogue dataset enhanced with annotation corrections and co-reference annotation

Abstract: Task-oriented dialogue systems have made unprecedented progress with multiple state-of-the-art (SOTA) models underpinned by a number of publicly available MultiWOZ datasets. Dialogue state annotations are error-prone, leading to sub-optimal performance. Various efforts have been made to rectify the annotation errors present in the original MultiWOZ dataset. In this paper, we introduce MultiWOZ 2.3, in which we differentiate incorrect annotations in dialogue acts from dialogue states, identifying a lack of …

Cited by 15 publications (25 citation statements)
References 27 publications
“…Dialogue state tracking (DST) refers to the task of predicting a formal state of a dialogue at its current turn, as a set of slot-value pairs at every turn. State-of-the-art approaches apply large transformer networks (Peng et al, 2020; Hosseini-Asl et al, 2020) to encode the full dialogue history in order to predict slot values. Other approaches include question-answering models, ontology matching in the finite case, or pointer-generator networks (Wu et al, 2019).…”
Section: Dialogue State Tracking
Citation type: mentioning
confidence: 99%
“…Following prior work with this dataset, we drop hospital and police from the training set as they are not included in the validation and test sets. After the release of MultiWOZ 2.0 (Budzianowski et al, 2018), later iterations (Eric et al, 2019; Zang et al, 2020; Han et al, 2020) corrected some of the misannotations.…”
Section: Datasets
Citation type: mentioning
confidence: 99%
“…Annotation error: Even the recent versions of MultiWOZ still have incorrect labels and inconsistent annotations [3,21,4,20,5]. This noise is the primary reason why it is challenging to accurately evaluate model performance.…”
Section: Data Limitation
Citation type: mentioning
confidence: 99%
“…Whereas MultiWOZ has been used as a standard benchmark dataset for DST, an increasing number of recent studies have reported concerns regarding the inherent limitations of this dataset. First, newer versions of MultiWOZ have been proposed to address certain issues such as annotation errors, typos, standardization, annotation consistency, and other factors [3,21,4,20]. In addition, Qian et al. [12] pointed out an entity bias issue, i.e., only a small number of values in the ontology account for the majority of labels.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…As a matter of fact, massive efforts have already been made to further improve the annotation quality of MultiWOZ 2.1, resulting in MultiWOZ 2.2 and MultiWOZ 2.3 (Han et al, 2020b). Nonetheless, they both have some limitations.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
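
Two of the citation statements above describe concrete operations: representing a dialogue state as a set of slot-value pairs at every turn, and dropping the hospital and police domains from the training set. The following is a minimal Python sketch of both, offered only as an illustration. The `State` type, the `accumulate_states` and `drop_domains` helpers, the `"domains"` key, and all example values are hypothetical and do not reflect the actual MultiWOZ file format or any cited paper's code.

```python
from typing import Dict, List, Set

# Hypothetical representation: a dialogue state maps "domain-slot" names
# to values. This is an assumption, not the MultiWOZ annotation schema.
State = Dict[str, str]


def accumulate_states(turn_updates: List[State]) -> List[State]:
    """Return the full state after each turn, given per-turn slot updates."""
    states: List[State] = []
    current: State = {}
    for update in turn_updates:
        # Build a new dict so earlier snapshots are not mutated;
        # values from later turns overwrite earlier ones.
        current = {**current, **update}
        states.append(current)
    return states


def drop_domains(dialogues: List[dict], excluded: Set[str]) -> List[dict]:
    """Keep only dialogues that touch none of the excluded domains."""
    return [d for d in dialogues if not (set(d["domains"]) & excluded)]


# Made-up per-turn updates for a two-turn dialogue.
turns = [
    {"restaurant-pricerange": "expensive"},
    {"restaurant-day": "wednesday", "restaurant-people": "7"},
]
states = accumulate_states(turns)

# Made-up training dialogues, each tagged with the domains it covers.
train = [
    {"id": "a", "domains": ["restaurant"]},
    {"id": "b", "domains": ["hospital"]},
    {"id": "c", "domains": ["hotel", "police"]},
]
kept = drop_domains(train, {"hospital", "police"})
```

Here `states` holds one cumulative snapshot per turn, and `kept` retains only dialogues with no excluded domain. Whether a mixed-domain dialogue should be dropped entirely or merely stripped of the excluded domain's annotations is a design choice the citation statement does not specify.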