2020
DOI: 10.1162/tacl_a_00314

CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset

Abstract: To advance multi-domain (cross-domain) dialogue modeling as well as alleviate the shortage of Chinese task-oriented datasets, we propose CrossWOZ, the first large-scale Chinese Cross-Domain Wizard-of-Oz task-oriented dataset. It contains 6K dialogue sessions and 102K utterances for 5 domains, including hotel, restaurant, attraction, metro, and taxi. Moreover, the corpus contains rich annotation of dialogue states and dialogue acts on both user and system sides. About 60% of the dialogues have cross-domain user…

Citations: cited by 80 publications (78 citation statements)
References: 16 publications
“…These datasets have higher language variation and task complexity. While most datasets are in English, Zhu et al [72] propose the first large-scale Chinese task-oriented dataset with rich annotations to facilitate the research of Chinese and cross-lingual dialog modeling. An incomplete survey on these dialog datasets is presented in Table 2.…”
Section: Corpora (mentioning)
confidence: 99%
“…ATIS (Hemphill et al, 1990), WOZ 2.0 (Wen et al, 2017), FRAMES (El Asri et al, 2017) and KVRET (Eric et al, 2017) are small-scale datasets built in this way. In contrast, MultiWOZ Budzianowski et al (2018) and Cross-WOZ (Zhu et al, 2020) are two large-scale H2H datasets proposed recently.…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years, we have witnessed that a variety of datasets tailored for task-oriented dialogue have been constructed, such as MultiWOZ (Budzianowski et al, 2018), SGD (Rastogi et al, 2019a) and CrossWOZ (Zhu et al, 2020), along with the increasing interest in conversational AI in both academia and industry (Gao et al, 2018). These datasets have triggered extensive research in either end-to-end or traditional modular task-oriented dialogue modeling (Wen et al, 2019; Dai et al, 2020).…”
Section: Introduction (mentioning)
confidence: 99%