2018
DOI: 10.48550/arxiv.1805.04661
Preprint

Examining a hate speech corpus for hate speech detection and popularity prediction

Abstract: As research on hate speech becomes more and more relevant every day, most of it is still focused on hate speech detection. By attempting to replicate a hate speech detection experiment performed on an existing Twitter corpus annotated for hate speech, we highlight some issues that arise from doing research in the field of hate speech, which is essentially still in its infancy. We take a critical look at the training corpus in order to understand its biases, while also using it to venture beyond hate speech det…

Cited by 1 publication (1 citation statement) · References 14 publications
“…The authors labeled the tweets as racist, sexist, or neither, following guidelines inspired by critical race theory, and had a domain expert review their labels. However, this dataset has received significant criticism from scholars [23,30], who note that most of the racist tweets are anti-Muslim and that the sexist tweets largely relate to a debate over an Australian television show. The dataset also introduces author bias: two users wrote 70% of the sexist tweets, and 99% of the racist tweets were written by a single other user.…”
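The author-bias figures quoted above can be checked mechanically for any labeled corpus. Below is a minimal sketch of such a check; the file name and the `user_id`/`label` columns are hypothetical placeholders, not taken from the paper or the dataset's actual schema.

```python
import pandas as pd

# Hypothetical corpus layout: one row per tweet, with the author's ID
# and the annotated label (e.g. "racism", "sexism", or "none").
df = pd.read_csv("hate_speech_corpus.csv")  # columns: user_id, label

# For each label, compute the share of tweets contributed by the most
# prolific authors. A handful of users dominating a class is exactly
# the author bias criticized in the citation statement above.
for label, group in df.groupby("label"):
    shares = group["user_id"].value_counts(normalize=True)
    top1 = shares.iloc[0]
    top2 = shares.iloc[:2].sum()
    print(f"{label}: top author wrote {top1:.0%} of tweets, "
          f"top two authors wrote {top2:.0%}")
```

If the top one or two authors account for most of a class, as reported for this corpus, a classifier trained on it may learn author-specific style rather than the target phenomenon.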