Proceedings of the 13th International Workshop on Semantic Evaluation 2019
DOI: 10.18653/v1/s19-2141

UVA Wahoos at SemEval-2019 Task 6: Hate Speech Identification using Ensemble Machine Learning

Abstract: With the growth in social media usage, it has become increasingly common for people to hide behind a mask and abuse others. We attempt to detect tweets and comments that are malicious in intent and that target either an individual or a group. Our best classifier for identifying offensive tweets for SubTask A (classifying offensive vs. non-offensive) has an accuracy of 83.14% and an F1 score of 0.7565 on the actual test data. For SubTask B, to identify if an offensive tweet is targeted (If target…
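The two figures reported in the abstract, accuracy and F1 score, can be illustrated with a minimal sketch. The abstract does not say which F1 averaging was used, so macro-averaging over the two classes is an assumption here; the labels `OFF` and `NOT` and the toy predictions are illustrative only and are not the authors' data.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_class(y_true, y_pred, label) -> float:
    """F1 score (harmonic mean of precision and recall) for one class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        # No true positives: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example with made-up labels, not results from the paper.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "OFF", "NOT"]
print(accuracy(gold, pred))
print(macro_f1(gold, pred))
```

Accuracy alone can look high on imbalanced data when a classifier favours the majority class, which is one reason to report an F1 score alongside it.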

Cited by 8 publications (7 citation statements)
References 11 publications
“…Several NLP approaches have been proposed for the task of hate speech detection (Qian et al, 2018; Indurthi et al, 2019; Vidgen et al, 2021; Fersini et al, 2020a; Attanasio and Pastor, 2020; Kennedy et al, 2020; Attanasio et al, 2022b, inter alia). While ensemble modeling has been proven to be effective for several tasks in NLP (Garmash and Monz, 2016; Nozza et al, 2016; Fadel et al, 2019; Bashmal and AlZeer, 2021), a limited number of research work have investigated its potentiality for hate speech detection (Plaza-del Arco et al, 2019; Ramakrishnan et al, 2019; Zimmerman et al, 2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent papers use word embedding methods more frequently than bag-of-words and n-grams because the former can extract semantic information from the text; consequently, an improvement in the performance is expected. Regarding the classifier, different paradigms have been employed; tree-based algorithms such as decision trees and random forest (RF) [17,8,18,19], artificial neural networks such as multi-layer perceptron (MLP) and convolution neural networks (CNN) [20,21,22,16,23,24,25,26,27,28,29],…”
Section: Related Work (mentioning)
confidence: 99%
“…Bayesian as the naive bayes (NB) [17,8], support vector machines (SVM) [17,8], and ensemble learning, which is marked () in the last column of the
… [20] | Twitter from [30,31] | racism, sexism | characters, words, and both | CNN
2018 Zimmerman et al [21] | Twitter from [30] | racism, sexism | embedding | deep learning
2018 Pitsilis et al [22] | Twitter from [30] | racism, sexism | defined by the authors | LSTM
2018 Montani and Schuller [18] | GermEval 2018 | general | TFIDF, Word2Vec, n-gram | LR, RF, ET
2019 Zhang and Luo [16] | Twitter from [17,30] | [17]: race ethnicity, religion; [30]: racism, sexism | Word2Vec | CNN
2019 Liu et al [32] | Twitter from [17] | race ethnicity, religion | embedding, LDA | fuzzy ensemble
2019 Ramakrishnan et al [19] | OffensEval [33] | general | n-gram, GloVe, others | LR, RF, XG
2020 Paschalides et al [23] | Twitter from [8] | racism, sexism, homophobia
The most common social media used to extract information to compose a dataset for hate speech detection is Twitter. Despite English being the most used language, there are datasets from many other languages, such as the Arabic-Twitter dataset [26] and Hindi-English Twitter dataset [27].…”
Section: Related Work (mentioning)
confidence: 99%
“…The results proved that transfer learning improves offensive language detection performance. Ramakrishnan et al (2019) used an ensemble model based on logistic regression and tree-based models to identify offensive language on SemEval-2019 Task 6. Char n-grams, word n-grams, part of speech and GloVe embeddings were used as features.…”
Section: Related Work (mentioning)
confidence: 99%
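The character and word n-gram features named in the statement above, and the general idea of combining several classifiers, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names are hypothetical, and majority voting is an assumed combination rule, since none of the statements say how the ensemble members were combined.

```python
from collections import Counter
from typing import List


def char_ngrams(text: str, n: int) -> Counter:
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text: str, n: int) -> Counter:
    """Count overlapping word n-grams using whitespace tokenization."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def majority_vote(predictions: List[str]) -> str:
    """Return the most frequent label among the ensemble members.

    Assumed combination rule. On a tie, the label that appears
    first in `predictions` wins (Counter preserves insertion order).
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical usage: three classifiers each predict a label for one tweet.
print(char_ngrams("abab", 2))
print(word_ngrams("you are so rude", 2))
print(majority_vote(["OFF", "NOT", "OFF"]))
```

In practice the n-gram counts would be turned into feature vectors (for example with TF-IDF weighting) before being passed to classifiers such as logistic regression or a random forest.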