2020
DOI: 10.48550/arxiv.2010.15925
Preprint

RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

Abstract: In this paper, we introduce an advanced Russian general language understanding evaluation benchmark, RussianGLUE. Recent advances in the field of universal language models and transformers require the development of a methodology for their broad diagnostics and testing for general intellectual skills: detection of natural language inference, commonsense reasoning, ability to perform simple logical operations regardless of text subject or lexicon. For the first time, a benchmark of nine tasks, collected and org…

Citations: cited by 5 publications (8 citation statements)
References: 6 publications
“…We continue our work on Russian SuperGLUE 5 [6] which follows the general language understanding evaluation methodology. Similarly to the English prototype, Russian benchmark includes a set of NLU tasks and a publicly available leaderboard.…”
Section: Russian Superglue Tasks (mentioning)
confidence: 99%
“…For the obtained test set we re-scored the human benchmark using the same annotation procedure in Yandex.Toloka task as described in [6] but on the new subset of the data. The human performance achieved 80.5% accuracy, while the best model performance on the leaderboard 2 at present is 72.9% (RuBERT conversational).…”
Section: Russe (mentioning)
confidence: 99%
“…In addition, a multilingual version of USE [11] embeddings is used. The mentioned models show state-of-the-art results on a number of NLP benchmarks [13], including those in Russian language [8], so it was natural to test them on the task of selecting the best headline for the cluster.…”
Section: Embeddings (mentioning)
confidence: 99%
“…For example, news aggregators actively use clustering algorithms to generate news feeds from different sources and to select a single headline. The recent progress in designing multilingual models [13], trained for dozens or even hundreds of languages at once, makes it possible to use them for monolingual tasks, particularly for Russian language tasks [8]. At the same time, Russian BERT-based models are actively evolving, and their comparison with more universal multilingual ones may be of interest.…”
Section: Introduction (mentioning)
confidence: 99%