2022
DOI: 10.3390/app12136611

Novel Hate Speech Detection Using Word Cloud Visualization and Ensemble Learning Coupled with Count Vectorizer

Abstract: A plethora of negative behavioural activities have recently been found in social media. Incidents such as trolling and hate speech on social media, especially on Twitter, have grown considerably. Therefore, detection of hate speech on Twitter has become an area of interest among many researchers. In this paper, we present a computational framework to (1) examine the computational challenges behind hate speech detection and (2) generate high-performance results. First, we extract features from Twitter data …

Citations: cited by 27 publications (14 citation statements)
References: 47 publications
“…The bigger the term in the cloud, the more often it appears in the original text. Words with more significant font sizes are considered more important or crucial to the overall message [25]. Publications have fluctuated during observation.…”
Section: Methods
Type: mentioning
confidence: 99%
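The frequency-to-size relationship described in this statement can be sketched in a few lines of Python. This is a minimal illustration, not the cited paper's method: the function name `term_font_sizes`, the linear scaling, and the 10 to 48 point range are all assumptions chosen for the example.

```python
import re
from collections import Counter


def term_font_sizes(text, min_size=10, max_size=48):
    """Map each term to a font size that grows linearly with its frequency.

    Hypothetical helper: the size range and linear scaling are assumptions.
    """
    # Lowercase the text and keep simple alphabetic tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for term, n in counts.items():
        # If every term is equally frequent, fall back to the smallest size.
        scale = (n - lowest) / span if span else 0.0
        sizes[term] = round(min_size + scale * (max_size - min_size))
    return sizes


sizes = term_font_sizes("hate speech detection on twitter: hate speech is rising")
```

Here the terms that occur twice receive the largest size and the terms that occur once receive the smallest, which is the behaviour the statement attributes to a word cloud.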
“…In the embedding stage, one of the most state-of-the-art embedding models, Sentence Embeddings using Siamese BERT (SBERT), which is a modified BERT network that incorporates Siamese and triplet networks to produce semantically meaningful sentence embeddings, was used [40]; specifically, the all-mpnet-base-v2 sentence-transformer model was employed based on its outstanding performance scores. Using the embedding results, we built three supervised models, which utilized three different machine learning algorithms (logistic regression, support vector machine (SVM), and random forest), all of which are commonly used in the relevant research literature [e.g., [41][42][43]].…”
Section: Plos One
Type: mentioning
confidence: 99%
“…The second technique used is the count vectorizer, a feature extraction technique that represents a collection of words in matrix form [18]. Research on text classification tends to use this technique, as in earlier work on classifying the sentiment of messages that indicate cyberbullying [19].…”
Section: Techniques Used
Type: unclassified
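To make the matrix representation concrete, the following is a minimal pure-Python sketch of count vectorization. It is an illustration under stated assumptions, not the implementation used by the cited or citing papers: the function name `count_vectorize`, the tokenization rule, and the alphabetical vocabulary order are all hypothetical choices.

```python
import re


def count_vectorize(documents):
    """Return (vocabulary, matrix), where matrix[i][j] counts term j in document i.

    Hypothetical helper: the tokenizer and vocabulary ordering are assumptions.
    """
    # Lowercase each document and split it into alphanumeric tokens.
    tokenized = [re.findall(r"[a-z0-9]+", doc.lower()) for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    index = {term: i for i, term in enumerate(vocabulary)}
    matrix = []
    for doc in tokenized:
        row = [0] * len(vocabulary)
        for tok in doc:
            row[index[tok]] += 1
        matrix.append(row)
    return vocabulary, matrix


vocabulary, matrix = count_vectorize(["you are great", "you are awful awful"])
```

Each row of `matrix` is one document and each column is one vocabulary term, so a classifier can consume the rows directly as feature vectors.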