2023
DOI: 10.1109/taffc.2022.3219229

Pars-OFF: A Benchmark for Offensive Language Detection on Farsi Social Media

citations
Cited by 5 publications
(6 citation statements)
references
References 19 publications
“…Persian is one of the low-resource languages in this regard. The existing datasets for Persian hate speech detection include Pars-OFF (Ataei et al 2022), and two other non-public datasets introduced by Mozafari, Farahbakhsh, and Crespi (2022), and Alavi, Nikvand, and Shamsfard (2021). Pars-OFF comprises 7,381 normal and 3,182 offensive Persian tweets, organized into a three-level hierarchy as outlined in Zampieri et al (2019). The process of collecting tweets employed a combination of similarity-based and keyword-based data selection strategies.…”
Section: Hate Speech Datasets In Other Languages
mentioning
confidence: 99%
“…The chosen approach for data selection can introduce biases or limitations to the datasets. Common approaches include searching for lists of slurs and derogatory keywords (Waseem and Hovy 2016; Kurrek, Saleem, and Ruths 2020), focusing on specific events or contexts (Grimminger and Klinger 2021), or adopting a mixture of strategies (Basile et al 2019; Ataei et al 2022; Fersini, Nozza, and Rosso 2018).…”
Section: Data Collection
mentioning
confidence: 99%
“…Given the staggering volume of at least 500 million tweets being sent daily, manual detection of such content has become an unfeasible task. Consequently, researchers have turned to leveraging NLP and learning techniques to effectively address this issue [7]-[18], [50].…”
mentioning
confidence: 99%