2021
DOI: 10.48550/arxiv.2112.09301
Preprint

Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages

Cited by 9 publications (9 citation statements)
“…We test XLM-T (Barbieri et al, 2021), an XLM-R model (Conneau et al, 2020) pre-trained on an additional 198 million Twitter posts in over 30 languages. XLM-R is a widely-used architecture for multilingual language modelling, which has been shown to achieve near state-of-the-art performance on multilingual hate speech detection (Banerjee et al, 2021; Mandl et al, 2021). We chose XLM-T over XLM-R after initial experiments showed the former to outperform the latter on several hate speech detection datasets as well as MHC.…”
Section: Multilingual Transformer Models
Citation type: mentioning
confidence: 99%
“…This third edition of HASOC Mandl et al [36] provided another set of tweets dataset with the same subtasks as HASOC 2020. The English dataset consists of 3843 training samples and 1281 samples in the test set.…”
Section: HASOC 2021
Citation type: mentioning
confidence: 99%
“…The HASOC 2021 Marathi dataset is a dataset presented by the Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages (HASOC 2021) track (Mandl et al, 2021) in the Forum for Information Retrieval Evaluation (FIRE 2021). It contains a total of 2,499 tweets in Marathi manually annotated by native speakers of the language.…”
Section: Downstream Evaluation
Citation type: mentioning
confidence: 99%