2020 8th International Conference on Information and Communication Technology (ICoICT) 2020
DOI: 10.1109/icoict49345.2020.9166251

Hate Code Detection in Indonesian Tweets using Machine Learning Approach: A Dataset and Preliminary Study

Cited by 18 publications (6 citation statements)
References 7 publications
“…Moreover, some studies focused on techniques to deal with the class imbalance problem, such as oversampling and undersampling. The oversampling technique is applied in the training data to increase the minority class (Chatzakou et al, 2017; Elisabeth et al, 2020), while the undersampling technique reduces the majority class (Miok et al, 2019). However, most of the works did not deal with class imbalance.…”
Section: Automatic Hate Speech Detection
Citation type: mentioning
confidence: 99%
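The statement above describes oversampling as enlarging the minority class in the training data only. A minimal sketch of one common variant, random duplication of minority examples, is given below. The cited works do not say here which oversampling method they used, so the function name `random_oversample`, the label names, and the toy data are illustrative assumptions, not the authors' procedure.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class examples until every
    class has as many examples as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the training examples by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy training split (assumed data): four majority examples, one minority.
train_texts = ["a", "b", "c", "d", "e"]
train_labels = ["neutral", "neutral", "neutral", "neutral", "hate_code"]

balanced_texts, balanced_labels = random_oversample(train_texts, train_labels)
print(Counter(balanced_labels))
```

The sketch is applied to the training split only, as the statement says; resampling a test split would distort the evaluation.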
“…In Plaza-Del-Arco et al (2020) used TF weighting to represent unigrams and bigrams as vectors of numerical features to misogyny and xenophobia detected in Spanish tweets. Several works used TF-IDF weighting features for hate speech detection (Almatarneh et al, 2019; Elisabeth et al, 2020; Mossie & Wang, 2020; Salminen et al, 2020). The TF-IDF provided good classification performance for hate speech detection with the same dataset to train and test the models.…”
Section: Term Frequency
Citation type: mentioning
confidence: 99%
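The statement above mentions TF and TF-IDF weighting over unigrams and bigrams. As a rough illustration, the sketch below computes one textbook variant (raw term count multiplied by log(N / document frequency)) using only the standard library. The exact weighting, tokenization, and smoothing used in the cited works are not given here, so the helper names `ngrams` and `tfidf`, the whitespace tokenizer, and the two example documents are assumptions.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all n-grams of length 1..n_max as space-joined strings."""
    out = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def tfidf(docs, n_max=2):
    """Map each document to a {term: weight} dict, where
    weight = raw count in the document * log(N / document frequency)."""
    term_lists = [ngrams(doc.lower().split(), n_max) for doc in docs]

    # Document frequency: the number of documents containing each term.
    df = Counter()
    for terms in term_lists:
        df.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        tf = Counter(terms)
        vectors.append(
            {t: count * math.log(n_docs / df[t]) for t, count in tf.items()}
        )
    return vectors


# Two toy documents (assumed data, not taken from any cited dataset).
docs = ["good morning all", "good night all"]
vectors = tfidf(docs)
print(sorted(vectors[0].items()))
```

With this unsmoothed variant, a term that occurs in every document receives weight zero, while terms and bigrams specific to one document receive positive weight. Library implementations often add smoothing and normalization, so their values will differ.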
“…We carefully reviewed each document to obtain the key information of each work. In this part, we focus on [11], [30], [17], [23], [12], [28], [27], [21], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40]…”
Section: B What Has Been Done So Far In Indonesian Abusive Language D...
Citation type: mentioning
confidence: 99%
“…General feature representation methods of text mining have been successfully adapted to the problem of hate speech detection, such as Bag-of-Words (BoW) (Burnap & Williams, 2016;Nobata et al, 2016), n-grams (Corazza et al, 2020;Santosh & Aravind, 2019), dictionaries or lexical resources (Gitari et al, 2015;Mathew et al, 2019), etc. Regarding classification perspective, different algorithms have been employed, such as Logistic Regression (Davidson et al, 2017), Support Vector Machine (SVM) (Salminen et al, 2018), Random Forest (Elisabeth et al, 2020), Decision tree (Plaza-Del-Arco et al, 2020). Davidson et al (2017) addressed the problem of hate speech detection on Twitter, focusing on distinguishing between hate speech and offensive language.…”
Section: Automatic Hate Speech Detection
Citation type: mentioning
confidence: 99%
“…For the representations f_4 to f_9, we selected traditional feature extraction methods used for hate speech detection (Almatarneh et al, 2019; Corazza et al, 2020; Elisabeth et al, 2020; Salminen et al, 2020; Senarath & Purohit, 2020; Santosh & Aravind, 2019). These methods are based on the Bag-of-Words (BoW) technique.…”
Section: Pool Generation
Citation type: mentioning
confidence: 99%