CLUE: A Chinese Language Understanding Evaluation Benchmark

Xu, Liang; Zhang, Xuanwei; Lü, Li; Hu, Hai; Cao, Chenjie; Liu, Weitang; Li, Junyi; Li, Yudong; Sun, Kai; Xu, Yechen; Cui, Yiming; Yu, Cong; Dong, Qianqian; Tian, Yin; Yu, Dian; Shi, Bo; Zeng, Jun; Wang, Rongzhao; Xie, Weijian; Li, Yanting; Patterson, Yina; Tian, Zuoyu; Zhang, Yiwen; He, Zhiwei; Liu, Shaoweihua; Zhao, Qipeng; Yu, Cong; Zhang, Xinrui; Yang, Zhengliang; Lan, Zhenzhong

doi:10.48550/arxiv.2004.05986

Cited by 32 publications

(43 citation statements)

References 24 publications

“…Winograd-Style tasks, including CLUEWSC2020 [39]. CLUEWSC2020 is a Chinese Winograd Schema Challenge dataset, which is an anaphora/coreference resolution task.…”

Section: Task Description (mentioning)

confidence: 99%

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation

Zeng, Ren, et al. 2021

Preprint

Large-scale Pretrained Language Models (PLMs) have become the new paradigm for Natural Language Processing (NLP). PLMs with hundreds of billions of parameters have demonstrated strong performance on natural language understanding and generation with few-shot in-context learning. In this work, we present our practice on training large-scale autoregressive language models named PanGu-α, with up to 200 billion parameters. PanGu-α is developed under MindSpore and trained on a cluster of 2048 Ascend 910 AI processors. The training parallelism strategy is implemented based on MindSpore Auto-parallel, which composes five parallelism dimensions to scale the training task to 2048 processors efficiently, including data parallelism, op-level model parallelism, pipeline model parallelism, optimizer model parallelism and rematerialization. To enhance the generalization ability of PanGu-α, we collect 1.1 TB of high-quality Chinese data from a wide range of domains to pretrain the model. We empirically test the generation ability of PanGu-α in various scenarios including text summarization, question answering, dialogue generation, etc. Moreover, we investigate the effect of model scales on the few-shot performances across a broad range of Chinese NLP tasks. The experimental results demonstrate the superior capabilities of PanGu-α in performing various tasks under few-shot or zero-shot settings.

“…Common sense reasoning tasks, including C³ [39]. C³ is a free-form multiple-choice reading comprehension dataset which can benefit from common sense reasoning.…”

Section: Task Description (mentioning)

confidence: 99%

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation

Zeng, Ren, et al. 2021

Preprint

“…(1) Text Classification. To evaluate the NLU (Natural Language Understanding) capability on short texts, we adopt news classification task (TNEWS) in Chinese dataset CLUE (Xu et al 2020) (2) Text Retrieval. To measure the discriminative ability of text embedding and the zero-shot transfer ability of facing unseen tasks, we evaluate on AIC-ICC (Wu et al 2017) test subset (same as Section 4.5.1, but only use the texts) where each image has 5 corresponding descriptions.…”

Section: Single-modal Evaluation (mentioning)

confidence: 99%

“…Chinese Text Data. For the extra single-modal branch, we adopt the single-modal text dataset from CLUE (Xu et al 2020), which is the largest Chinese language understanding evaluation benchmark. We clean the dataset by removing data with low Chinese character ratio (50% in our case) and meaningless symbols.…”

Section: C1 Public Datasets (mentioning)

confidence: 99%

EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling

Wang, Deng, et al. 2021

Preprint

While large scale pre-training has achieved great achievements in bridging the gap between vision and language, it still faces several challenges. First, the cost for pre-training is expensive. Second, there is no efficient way to handle the data noise which degrades model performance. Third, previous methods only leverage limited image-text paired data, while ignoring richer single-modal data, which may result in poor generalization to single-modal downstream tasks. In this work, we propose an EfficientCLIP method via Ensemble Confident Learning to obtain a less noisy data subset. Extra rich non-paired single-modal text data is used for boosting the generalization of text branch. We achieve the state-of-the-art performance on Chinese cross-modal retrieval tasks with only 1/10 training resources compared to CLIP and WenLan, while showing excellent generalization to single-modal tasks including text retrieval and text classification.

“…We mainly experiment on three open datasets, TNEWS, BANKING77, and CLINC150. TNEWS, proposed by [12], has identical essence with intent detection. It includes 53360 samples in 15 categories.…”

Section: Experiments 4.1 Experimental Setup (mentioning)

confidence: 99%

Density-Based Dynamic Curriculum Learning for Intent Detection

Gong, Cao, Yuan, et al. 2021

Proceedings of the 30th ACM International Conference on Information & Knowledge Management

Pre-trained language models have achieved noticeable performance on the intent detection task. However, due to assigning an identical weight to each sample, they suffer from the overfitting of simple samples and the failure to learn complex samples well. To handle this problem, we propose a density-based dynamic curriculum learning model. Our model defines the sample's difficulty level according to their eigenvectors' density. In this way, we exploit the overall distribution of all samples' eigenvectors simultaneously. Then we apply a dynamic curriculum learning strategy, which pays distinct attention to samples of various difficulty levels and alters the proportion of samples during the training process. Through the above operation, simple samples are well-trained, and complex samples are enhanced. Experiments on three open datasets verify that the proposed density-based algorithm can distinguish simple and complex samples significantly. Besides, our model obtains obvious improvement over the strong baselines.
