2020
DOI: 10.48550/arxiv.2001.06286
Preprint

RobBERT: a Dutch RoBERTa-based Language Model

Abstract: Pre-trained language models have been dominating the field of natural language processing in recent years, and have led to significant performance gains for various complex natural language tasks. One of the most prominent pre-trained language models is BERT (Bidirectional Encoder Representations from Transformers), which was released as an English as well as a multilingual version. Although multilingual BERT performs well on many tasks, recent studies showed that BERT models trained on a single language significantly outperf…

Cited by 25 publications (35 citation statements)
References 11 publications
“…While shared subword vocabularies proved to be a practical compromise that allows handling multiple languages within the same network, they are suboptimal when targeting a specific language; recent work reports gains from customized single-language vocabularies (Delobelle et al., 2020).…”
Section: Multilingual Models
confidence: 99%
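As an illustration of the vocabulary effect described in the statement above, the short Python sketch below tokenizes the same Dutch sentence with a shared multilingual vocabulary and with RobBERT's Dutch-specific vocabulary. The Hugging Face model identifiers ("bert-base-multilingual-cased" and "pdelobelle/robbert-v2-dutch-base") are assumptions about the published checkpoints, not names taken from the citing paper.

from transformers import AutoTokenizer

# Assumed Hugging Face hub identifiers for the multilingual and Dutch checkpoints.
multilingual = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
dutch = AutoTokenizer.from_pretrained("pdelobelle/robbert-v2-dutch-base")

sentence = "De onderzoekers trainden een Nederlands taalmodel."  # "The researchers trained a Dutch language model."

# A language-specific vocabulary typically splits Dutch words into fewer,
# more meaningful subword units than a shared multilingual vocabulary does.
print("mBERT  :", multilingual.tokenize(sentence))
print("RobBERT:", dutch.tokenize(sentence))

The comparison only shows segmentation granularity; it does not by itself demonstrate downstream gains, which the cited work measures on labeled tasks.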
“…In the second setup, we use BERTje, the Dutch BERT model of de Vries et al. (2019), and RobBERT, the Dutch RoBERTa model of Delobelle et al. (2020), with their corresponding English counterparts, as well as multilingual BERT (mBERT), as sequence classifiers on the Entailment task of SICK(-NL). Here we observe a similar pattern in the results in Table 3: while there are individual differences on the same task, the main surprise is that the Dutch dataset is harder, even when exactly the same model (mBERT) is used.…”
Section: SICK
confidence: 99%
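The statement above describes using the monolingual and multilingual models as sequence classifiers for entailment. Below is a minimal sketch of that kind of setup, assuming the transformers library, the hub identifier "pdelobelle/robbert-v2-dutch-base", and a three-way entailment label set; it is not the citing authors' exact training code.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "pdelobelle/robbert-v2-dutch-base"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Fresh classification head with three labels (entailment / neutral / contradiction).
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Premise and hypothesis are encoded as one sequence pair, as is standard
# for BERT-style entailment classification.
premise = "Een man speelt gitaar."      # "A man plays guitar."
hypothesis = "Iemand maakt muziek."     # "Someone is making music."
inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # meaningless until the head is fine-tuned on SICK(-NL)

In practice the classification head would be fine-tuned on the SICK(-NL) entailment pairs before evaluation; the sketch only shows how the pair encoding and classification head fit together.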
“…Moreover, the syntactically parsed LASSY corpus of written Dutch (van Noord et al., 2013) and the SONAR corpus of written Dutch (Oostdijk et al., 2013) provide rich resources on which NLP systems may be developed. Indeed, Dutch is in the scope of the multilingual BERT models published by Google (Devlin et al., 2019), and two monolingual Dutch BERT models have been published as part of Hugging Face's transformers library (de Vries et al., 2019; Delobelle et al., 2020).…”
Section: Introduction
confidence: 99%
“…Initially, most of the research took place in English, followed by multilingual approaches (Conneau et al., 2019). Although multilingual approaches were trained on large texts in many languages, they were outperformed by single-language models (de Vries et al., 2019; Martin et al., 2020; Le et al., 2020; Delobelle et al., 2020). Single-language models trained on the Open Super-large Crawled ALMAnaCH coRpus (OSCAR) showed good performance due to the size and variance of OSCAR (Martin et al., 2020; Delobelle et al., 2020).…”
Section: Introduction
confidence: 99%