2022
DOI: 10.14569/ijacsa.2022.0131020
Semi-supervised Text Annotation for Hate Speech Detection using K-Nearest Neighbors and Term Frequency-Inverse Document Frequency

Abstract: Sentiment analysis can detect hate speech using Natural Language Processing (NLP) techniques. This process requires the text to be annotated with labels. When annotation is carried out by people, it must rely on experts in the field of hate speech to avoid subjectivity; moreover, manual annotation of extensive data takes a long time and allows errors in the annotation process. To solve this problem, we propose an automatic annotation process based on semi-supervised learning…
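The title and abstract describe annotating text semi-automatically with TF-IDF features and a K-Nearest Neighbors classifier. A minimal sketch of that idea, assuming scikit-learn as the implementation, an invented toy dataset, and an assumed confidence threshold (none of these details come from the paper):

```python
# Hedged sketch: semi-supervised annotation with TF-IDF + KNN.
# The texts, labels, and the 0.66 threshold are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier

labeled = ["you are awful trash", "have a nice day",
           "hate you idiot", "great work friend"]
labels = [1, 0, 1, 0]                      # 1 = hate speech, 0 = not
unlabeled = ["what an idiot you are", "nice day to you"]

vec = TfidfVectorizer()
X = vec.fit_transform(labeled + unlabeled)  # shared TF-IDF vocabulary
X_lab, X_unl = X[:len(labeled)], X[len(labeled):]

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_lab, labels)

# Pseudo-label an unlabeled text only when the neighbor vote is confident.
proba = knn.predict_proba(X_unl)
threshold = 0.66                            # assumed acceptance threshold
pseudo = [(t, int(p.argmax())) for t, p in zip(unlabeled, proba)
          if p.max() >= threshold]
```

Accepted pseudo-labels would then be appended to the labeled set and the loop repeated, which is the usual self-training pattern for expanding an annotated corpus.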

Cited by 15 publications (32 citation statements)
References 15 publications
“…As a result, it is still difficult to achieve a scoring of an essay from a system similar to that of an expert (teacher). Thus, the next research will be increased by the Latent Semantic Analysis method or NLP [43,44].…”
Section: Comparison of Essay Assessment Test Results with Writing Err… (mentioning)
confidence: 99%
“…This research focuses on hate speech detection using dataset [20] limited to the realm of politics and law in Indonesia. The dataset includes public opinions from YouTube comments on the presidential debate video [5], and opinions about the COVID-19 pandemic [21]. Several reasons justify considering these comments for further research:…”
Section: Datasets (mentioning)
confidence: 99%
“…In skip-grams [46], each neuron specializes in comprehending the context around a single target word, while CBOW predicts the target word from context. The activation function is linear (Equation (4)), and the hidden layer encodes semantic relationships between words (Equation (5)). The output layer employs softmax to convert outputs into probabilities for accurate prediction (Equation (6)).…”
Section: Word Embedding (Word2vec) (mentioning)
confidence: 99%
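The quoted statement describes the standard CBOW forward pass: a linear hidden layer that averages context embeddings, followed by a softmax over the vocabulary. A hedged numpy sketch of that pass (vocabulary size, embedding dimension, and the word indices are illustrative, not values from the cited work):

```python
# Hedged sketch of a CBOW forward pass: linear hidden layer, softmax output.
import numpy as np

V, D = 6, 4                        # vocab size and embedding dim (assumed)
rng = np.random.default_rng(0)
W_in = rng.normal(size=(V, D))     # input -> hidden embedding matrix
W_out = rng.normal(size=(D, V))    # hidden -> output weight matrix

context_ids = [1, 3, 4]            # indices of context words around the target
h = W_in[context_ids].mean(axis=0) # linear hidden layer: average of embeddings

scores = h @ W_out                 # one score per vocabulary word
probs = np.exp(scores - scores.max())
probs /= probs.sum()               # softmax: scores -> probabilities
predicted = int(probs.argmax())    # most probable target word id
```

Training would adjust W_in and W_out so that the true target word receives high probability; skip-gram simply reverses the direction, predicting each context word from the target.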
“…So, we used several strategies to find the SSL-Model. Continuing our previous research in [11] [12], we introduce an SSL model for annotating corpus using Naïve Bayes and Random Forest for the classifier model. In our SSL, we use several classifiers that work together but independently to expand the annotated corpus.…”
Section: Introduction (mentioning)
confidence: 97%
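The last statement describes an SSL scheme where several classifiers (here Naïve Bayes and Random Forest) work together but predict independently to expand the annotated corpus. One plausible reading is an agreement rule: keep a pseudo-label only when both classifiers concur. A sketch under that assumption, with scikit-learn and an invented toy corpus (the agreement rule and data are not confirmed by the source):

```python
# Hedged sketch: two independent classifiers expand the corpus by agreement.
# The seed texts, pool, and agreement rule are illustrative assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier

seed_texts = ["you stupid fool", "lovely weather today",
              "shut up loser", "thanks for helping"]
seed_labels = [1, 0, 1, 0]                 # 1 = hate speech, 0 = not
pool = ["what a stupid loser", "lovely day, thanks"]

vec = TfidfVectorizer()
X_seed = vec.fit_transform(seed_texts)
X_pool = vec.transform(pool)               # same vocabulary as the seed

nb = MultinomialNB().fit(X_seed, seed_labels)
rf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_seed, seed_labels)

# Accept a pseudo-label only when both classifiers independently agree.
expanded = [(text, int(p_nb))
            for text, p_nb, p_rf in zip(pool, nb.predict(X_pool), rf.predict(X_pool))
            if p_nb == p_rf]
```

Texts on which the classifiers disagree stay unlabeled for a later round, which is what keeps the expanded corpus conservative.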