2021
DOI: 10.1109/access.2021.3089515

Advances in Machine Learning Algorithms for Hate Speech Detection in Social Media: A Review

Abstract: The aim of this paper is to review machine learning (ML) algorithms and techniques for hate speech detection in social media (SM). The hate speech problem is normally modeled as a text classification task. In this study, we examined the baseline components of hate speech classification using ML algorithms. Five baseline components were reviewed: data collection and exploration, feature extraction, dimensionality reduction, classifier selection and training, and model evaluation. There have been…

Citations: cited by 63 publications (41 citation statements)
References: references 71 publications (75 reference statements)
“…Notably, incorporating sentence-level context has been shown to improve hateful content detection, with several papers deploying transformer-based models such as Bidirectional Encoder Representation from Transformers (BERT) to better distinguish between hateful and non-hateful content, even when they have lexical similarities [37,57]. Indeed, as Mullah and Zainon [46] write in their comprehensive review of ML methods for automated hate speech detection, deep learning techniques leveraging such language models can considerably improve how context-dependent hate speech is detected. Liu et al [36], for example, leverage transfer learning by fine-tuning a pre-trained BERT model to produce the most accurate classifier for offensive language in the SemEval 2019 Task 6 competition.…”
Section: Automated Methods Of Online Hate Detection
Citation type: mentioning
confidence: 99%
“…Studies seldom cross-examine models and evaluate the performance between non-textual network analysis (a 'who-knows-who' approach), textual, and/or multimedia approaches for ERH detection. While we identified ten prior literature reviews for hateful content detection, none consider the similarities and definitional nuance between ERH concepts and what Extremism, Hate Speech, or Radicalisation means in practice by researchers [2,5,20,42,43,51,82,97,110,114]. Evaluating the consensus for ERH definitions, dataset collection and extraction techniques, model choice and performance are all essential to create ethical models without injurious censorship or blowback.…”
Section: Motivation and Contributions
Citation type: mentioning
confidence: 99%
“…The TF_IDF statistic is intended to assess the relevance of a word in a set of texts (or corpus). It is represented by an equation (1). The frequency of the term, TF(t, d), is the frequency of occurrence of the term t in document d.…”
Section: Tf_idf Features
Citation type: mentioning
confidence: 99%
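
The TF_IDF statement above can be illustrated with a minimal sketch. The citing paper's equation (1) is not recoverable from the excerpt, so the code below assumes one common variant: TF(t, d) as the count of t in d divided by the length of d, and IDF(t) as log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing t. The function names and the toy corpus are hypothetical and do not come from the cited or citing papers.

```python
import math


def tf(term, doc):
    """Relative frequency of `term` in a tokenized document (assumed TF variant)."""
    return doc.count(term) / len(doc) if doc else 0.0


def idf(term, corpus):
    """log(N / df) for `term`; returns 0.0 if the term appears in no document (assumed IDF variant)."""
    df = sum(1 for doc in corpus if term in doc)
    return math.log(len(corpus) / df) if df else 0.0


def tf_idf(term, doc, corpus):
    """TF-IDF weight of `term` in `doc`, relative to `corpus`."""
    return tf(term, doc) * idf(term, corpus)


# Toy corpus (hypothetical), already tokenized by whitespace.
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]

weight_cat = tf_idf("cat", corpus[0], corpus)  # rare term: appears in 1 of 3 documents
weight_the = tf_idf("the", corpus[0], corpus)  # common term: appears in 2 of 3 documents
```

Under these assumptions, "cat" receives a higher weight than "the" in the first document even though "the" occurs more often there, because "the" also appears in a second document and is therefore less distinctive.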
“…Nowadays online social networks (OSN) are the most important and fastest means of communication. In fact, it is the popular way to communicate with each other [1]. Offers users freedom of expression.…”
Section: Introduction
Citation type: mentioning
confidence: 99%