2020
DOI: 10.48550/arxiv.2004.05986
Preprint

CLUE: A Chinese Language Understanding Evaluation Benchmark

Cited by 32 publications (43 citation statements)
References 24 publications
“…Winograd-Style tasks, including CLUEWSC2020 [39]. CLUEWSC2020 is a Chinese Winograd Schema Challenge dataset, which is an anaphora/coreference resolution task.…”
Section: Task Description (mentioning)
confidence: 99%
“…Common sense reasoning tasks, including C3 [39]. C3 is a free-form multiple-choice reading comprehension dataset which can benefit from common sense reasoning.…”
Section: Task Description (mentioning)
confidence: 99%
“…(1) Text Classification. To evaluate the NLU (Natural Language Understanding) capability on short texts, we adopt news classification task (TNEWS) in Chinese dataset CLUE (Xu et al 2020). (2) Text Retrieval. To measure the discriminative ability of text embedding and the zero-shot transfer ability of facing unseen tasks, we evaluate on AIC-ICC (Wu et al 2017) test subset (same as Section 4.5.1, but only use the texts) where each image has 5 corresponding descriptions.…”
Section: Single-modal Evaluation (mentioning)
confidence: 99%
“…Chinese Text Data. For the extra single-modal branch, we adopt the single-modal text dataset from CLUE (Xu et al 2020), which is the largest Chinese language understanding evaluation benchmark. We clean the dataset by removing data with low Chinese character ratio (50% in our case) and meaningless symbols.…”
Section: C1 Public Datasets (mentioning)
confidence: 99%
“…We mainly experiment on three open datasets, TNEWS, BANKING77, and CLINC150. TNEWS, proposed by [12], has identical essence with intent detection. It includes 53360 samples in 15 categories.…”
Section: Experiments 41 Experimental Setup (mentioning)
confidence: 99%