Proceedings of the Fourteenth Workshop on Semantic Evaluation (SemEval-2020)
DOI: 10.18653/v1/2020.semeval-1.290

Team Rouges at SemEval-2020 Task 12: Cross-lingual Inductive Transfer to Detect Offensive Language

Abstract: With the growing use and availability of social media, many instances of offensive language have been observed across multiple languages and domains. This phenomenon has given rise to a growing need to detect offensive language used in social media cross-lingually. In OffensEval 2020, the organizers released the multilingual Offensive Language Identification Dataset (mOLID), which contains tweets in five different languages, to detect offensive language. In this work, we introduce a cr…

Cited by 12 publications (3 citation statements) · References 11 publications
“…On the other hand, cross-lingual language models like XLM were also widely employed, where we found implementations of XLM-RoBERTa (XLM-R) (Roy et al., 2021a; Bhatia et al., 2021; Wang et al., 2020; De la Peña Sarracén & Rosso, 2022; Zia et al., 2022; Tita & Zubiaga, 2021; Ranasinghe & Zampieri, 2021b; Dadu & Pant, 2020; Eronen et al., 2022; Ranasinghe & Zampieri, 2021a, 2020; Mozafari, Farahbakhsh & Crespi, 2022; Barbieri, Espinosa Anke & Camacho-Collados, 2022; Awal et al., 2024; Stappen, Brunn & Schuller, 2020), and both XLM-R and XLM-T (Montariol, Riabi & Seddah, 2022; Riabi, Montariol & Seddah, 2022). These approaches have all been shown to improve performance on tasks involving multilingual/cross-lingual hate speech detection because they are more likely able to capture semantic and syntactic features across languages thanks to their pre-training on large volumes of multilingual text.…”
Section: Approaches on Multilingual Hate Speech Detection
confidence: 99%
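A minimal sketch of the approach these citing works describe: fine-tune XLM-RoBERTa for offensive-language classification and reuse it across languages. It assumes the HuggingFace transformers library and a toy two-example dataset; the checkpoint name, labels, and hyperparameters are illustrative only, not the setup of any cited system.

```python
# Sketch: fine-tune XLM-RoBERTa for binary offensive-language detection.
# Toy data and hyperparameters; not the exact configuration of any cited paper.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

texts = ["you are wonderful", "you are an idiot"]   # placeholder tweets
labels = torch.tensor([0, 1])                        # 0 = NOT offensive, 1 = OFF

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):                                   # a few toy epochs
    outputs = model(**batch, labels=labels)          # cross-entropy loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Because XLM-R is pretrained on ~100 languages, the same fine-tuned classifier
# can be applied to tweets in other languages (zero-shot cross-lingual transfer).
model.eval()
with torch.no_grad():
    logits = model(**tokenizer(["eres un idiota"], return_tensors="pt")).logits
print(logits.softmax(-1))
```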
“…The second-best team (Wang et al. 2020) also used an ensemble approach, where they first fine-tuned two multilingual XLM-RoBERTa models, XLM-RoBERTa-base and XLM-RoBERTa-large (Conneau et al. 2019). In comparison, the third-placed team fine-tuned only one multilingual XLM-RoBERTa model (Dadu and Pant 2020).…”
Section: Baselines
confidence: 99%
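The ensemble described above can be illustrated with a short sketch: average the class probabilities of separately fine-tuned XLM-RoBERTa-base and XLM-RoBERTa-large checkpoints. The checkpoint paths and the probability-averaging rule are assumptions for illustration; the cited teams may combine model outputs differently.

```python
# Sketch: two-model ensemble of fine-tuned XLM-RoBERTa-base and -large
# checkpoints by averaging class probabilities (assumed combination scheme).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical paths to checkpoints already fine-tuned on the task data.
CHECKPOINTS = ["./xlmr-base-offenseval", "./xlmr-large-offenseval"]

def ensemble_predict(text: str) -> int:
    probs = []
    for ckpt in CHECKPOINTS:
        tok = AutoTokenizer.from_pretrained(ckpt)
        model = AutoModelForSequenceClassification.from_pretrained(ckpt)
        model.eval()
        with torch.no_grad():
            logits = model(**tok(text, return_tensors="pt")).logits
        probs.append(logits.softmax(-1))
    # Average probabilities across models, then pick the majority class.
    return int(torch.stack(probs).mean(0).argmax(-1))   # 0 = NOT, 1 = OFF
```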
“…The second team, Wang et al. [143], achieved an F1 score of 0.9204 using RoBERTa-large that was fine-tuned on the dataset in an unsupervised way. The third team, Dadu and Pant [33], achieved an F1 score of 0.9198 using an ensemble that combined XLM-RoBERTa-base and XLM-RoBERTa-large trained on Subtask A data for all languages. The top-10 teams were close to each other and employed BERT, RoBERTa, or XLM-RoBERTa models; CNNs and LSTMs were also sometimes mentioned, either for comparison or for hybridization purposes.…”
Section: Overview of Deep-Learning Records
confidence: 99%
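The "fine-tuned with the dataset in an unsupervised way" step mentioned for the second team corresponds to continued masked-language-model training on unlabeled task text before the supervised classification stage. The sketch below illustrates that idea with HuggingFace transformers; the corpus, checkpoint, and hyperparameters are placeholders, not the cited team's actual configuration.

```python
# Sketch: unsupervised domain adaptation via continued masked-language-model
# training on unlabeled tweets, before supervised fine-tuning for classification.
import torch
from torch.optim import AdamW
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling)

unlabeled_tweets = ["some unlabeled tweet", "another unlabeled tweet"]  # placeholder corpus

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForMaskedLM.from_pretrained("roberta-large")
collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

# Tokenize without padding; the collator pads and masks random tokens.
encodings = tokenizer(unlabeled_tweets, truncation=True)
batch = collator([{"input_ids": ids} for ids in encodings["input_ids"]])

optimizer = AdamW(model.parameters(), lr=1e-5)
model.train()
loss = model(**batch).loss        # MLM loss on the randomly masked tokens
loss.backward()
optimizer.step()

# The adapted checkpoint is then reused as the starting point for the
# supervised offensive-language classifier.
model.save_pretrained("./roberta-large-domain-adapted")
tokenizer.save_pretrained("./roberta-large-domain-adapted")
```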