Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020)
DOI: 10.18653/v1/2020.semeval-1.290

Team Rouges at SemEval-2020 Task 12: Cross-lingual Inductive Transfer to Detect Offensive Language

Abstract: With the growing use and availability of social media, many instances of offensive language have been observed across multiple languages and domains. This phenomenon has given rise to a growing need to detect offensive language used in social media cross-lingually. In OffensEval 2020, the organizers released the multilingual Offensive Language Identification Dataset (mOLID), which contains tweets in five different languages, to detect offensive language. In this work, we introduce a cr…

Cited by 12 publications (3 citation statements) · References 11 publications
“…On the other hand, cross-lingual language models like XLM were also widely employed, where we found implementations of XLM-RoBERTa (XLM-R) (Roy et al., 2021a; Bhatia et al., 2021; Wang et al., 2020; De la Peña Sarracén & Rosso, 2022; Zia et al., 2022; Tita & Zubiaga, 2021; Ranasinghe & Zampieri, 2021b; Dadu & Pant, 2020; Eronen et al., 2022; Ranasinghe & Zampieri, 2021a, 2020; Mozafari, Farahbakhsh & Crespi, 2022; Barbieri, Espinosa Anke & Camacho-Collados, 2022; Awal et al., 2024; Stappen, Brunn & Schuller, 2020), and both XLM-R and XLM-T (Montariol, Riabi & Seddah, 2022; Riabi, Montariol & Seddah, 2022). These approaches have all been shown to improve performance on tasks involving multilingual/cross-lingual hate speech detection because they are more likely able to capture semantic and syntactic features across languages thanks to their pre-training on large volumes of multilingual text.…”
Section: Approaches on Multilingual Hate Speech Detection
confidence: 99%
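A minimal sketch of the approach these citing works describe: fine-tune XLM-RoBERTa for offensive-language classification and reuse it across languages. It assumes the HuggingFace transformers library and a toy two-example dataset; the checkpoint name, labels, and hyperparameters are illustrative only, not the setup of any cited system.

```python
# Sketch: fine-tune XLM-RoBERTa for binary offensive-language detection.
# Toy data and hyperparameters; not the exact configuration of any cited paper.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["you are wonderful", "you are an idiot"]   # placeholder tweets
labels = torch.tensor([0, 1])                        # 0 = NOT offensive, 1 = OFF

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):                                   # a few toy epochs
    outputs = model(**batch, labels=labels)          # cross-entropy loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Because XLM-R is pretrained on ~100 languages, the same fine-tuned classifier
# can be applied to tweets in other languages (zero-shot cross-lingual transfer).
model.eval()
with torch.no_grad():
    logits = model(**tokenizer(["eres un idiota"], return_tensors="pt")).logits
print(logits.softmax(-1))
```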
“…The second-best team (Wang et al. 2020) also used an ensemble approach, where they first fine-tuned two multilingual XLM-RoBERTa models, XLM-RoBERTa-base and XLM-RoBERTa-large (Conneau et al. 2019). In comparison, the third-placed team fine-tuned only one multilingual XLM-RoBERTa model (Dadu and Pant 2020).…”
Section: Baselines
confidence: 99%
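The ensemble described above can be illustrated with a short sketch: average the class probabilities of separately fine-tuned XLM-RoBERTa-base and XLM-RoBERTa-large checkpoints. The checkpoint paths and the probability-averaging rule are assumptions for illustration; the cited teams may combine model outputs differently.

```python
# Sketch: two-model ensemble of fine-tuned XLM-RoBERTa-base and -large
# checkpoints by averaging class probabilities (assumed combination scheme).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical paths to checkpoints already fine-tuned on the task data.
CHECKPOINTS = ["./xlmr-base-offenseval", "./xlmr-large-offenseval"]

def ensemble_predict(text: str) -> int:
    probs = []
    for ckpt in CHECKPOINTS:
        tok = AutoTokenizer.from_pretrained(ckpt)
        model = AutoModelForSequenceClassification.from_pretrained(ckpt)
        model.eval()
        with torch.no_grad():
            logits = model(**tok(text, return_tensors="pt")).logits
        probs.append(logits.softmax(-1))
    # Average probabilities across models, then pick the majority class.
    return int(torch.stack(probs).mean(0).argmax(-1))   # 0 = NOT, 1 = OFF
```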
“…The second team, Wang et al. [143], achieved an F1 score of 0.9204 using RoBERTa-large that was fine-tuned on the dataset in an unsupervised way. The third team, Dadu and Pant [33], achieved an F1 score of 0.9198 using an ensemble that combined XLM-RoBERTa-base and XLM-RoBERTa-large trained on Subtask A data for all languages. The top-10 teams were close to each other and employed BERT, RoBERTa, or XLM-RoBERTa models; CNNs and LSTMs were also sometimes mentioned, either for comparison or for hybridization purposes.…”
Section: Overview of Deep-Learning Records
confidence: 99%
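The "fine-tuned with the dataset in an unsupervised way" step mentioned for the second team corresponds to continued masked-language-model training on unlabeled task text before the supervised classification stage. The sketch below illustrates that idea with HuggingFace transformers; the corpus, checkpoint, and hyperparameters are placeholders, not the cited team's actual configuration.

```python
# Sketch: unsupervised domain adaptation via continued masked-language-model
# training on unlabeled tweets, before supervised fine-tuning for classification.
import torch
from torch.optim import AdamW
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

unlabeled_tweets = ["some unlabeled tweet", "another unlabeled tweet"]  # placeholder corpus

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

# Tokenize without padding; the collator pads and masks random tokens.
encodings = tokenizer(unlabeled_tweets, truncation=True)
batch = collator([{"input_ids": ids} for ids in encodings["input_ids"]])

optimizer = AdamW(model.parameters(), lr=1e-5)
model.train()
loss = model(**batch).loss        # MLM loss on the randomly masked tokens
loss.backward()
optimizer.step()

# The adapted checkpoint is then reused as the starting point for the
# supervised offensive-language classifier.
model.save_pretrained("./roberta-large-domain-adapted")
tokenizer.save_pretrained("./roberta-large-domain-adapted")
```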