2018
DOI: 10.1007/978-3-319-73706-5_15

Automatic Classification of Abusive Language and Personal Attacks in Various Forms of Online Communication

Abstract: The sheer ease with which abusive and hateful utterances can be made online (typically from the comfort of your home and without any immediate negative repercussions) using today's digital communication technologies, especially social media, is responsible for their significant increase and global ubiquity. Natural Language Processing technologies can help in addressing the negative effects of this development. In this contribution we evaluate a set of classification algorithms on two types of u…

Citations: cited by 24 publications (19 citation statements)
References: 16 publications
“…In (Wang et al 2017), naive Bayesian classifier demonstrated the worst classification quality. In another study (Bourgonje et al 2018), the relatively high quality of this model was observed only with a relatively large length of texts.…”
Section: Naive Bayesian Classifier (mentioning)
confidence: 87%
“…It was noted that the average length of texts affects the result of classification. In (Bourgonje et al 2018) the authors deal with publications in social network Twitter and articles from Wikipedia with an average length of 18 and 65 words respectively. As a result of experiments, it became obvious that different classifiers give best result for different average length of texts.…”
Section: Influence Of the Length Of Text On The Quality Of Classifica… (mentioning)
confidence: 99%
“…From an NLP perspective, the challenge of dealing with this problem is further exemplified by the fact that annotated data is hard to find, and, if present, exhibits rather low inter-annotator agreement. Approaching the "abusive language" and "hate speech" problem from an NLP angle (Bourgonje et al, 2017), (Ross et al, 2016) introduce a German corpus of tweets and annotate it for hate speech, resulting in figures for Krippendorff's α between 0.18 and 0.29, (Waseem, 2016) compare amateur (CrowdFlower) annotations and expert annotations on an English corpus of Tweets and report figures for Cohen's Kappa of 0.14, (Van Hee et al, 2015) use a Dutch corpus annotated for cyberbullying and report Kappa scores between 0.19 and 0.69, and (Kwok and Wang, 2013) investigate English racist tweets and report an overall interannotator agreement of only 33%.…”
Section: Related Work (mentioning)
confidence: 99%
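
The citation statement above reports agreement figures such as Cohen's Kappa. As a rough illustration of what that statistic measures, the sketch below computes Cohen's Kappa for two annotators from scratch. The function name `cohen_kappa` and the toy labels are assumptions made for this example; they are not taken from any of the cited studies, and the cited papers may have computed their figures differently.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies. Hypothetical helper written for illustration.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal frequencies, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one and the same label for every item.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy data (invented): two annotators, four tweets.
annotator_1 = ["abusive", "abusive", "neutral", "neutral"]
annotator_2 = ["abusive", "neutral", "neutral", "neutral"]
print(cohen_kappa(annotator_1, annotator_2))  # 0.5
```

In this toy case the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, and Kappa is therefore 0.5. Values near 0, like the 0.14 quoted above, indicate agreement barely better than chance.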
“…3); the concept has been devised in a research and technology transfer project, in which smart technologies for curating large amounts of digital content are being developed and applied by companies that cover different sectors including journalism (Rehm and Sasaki 2015;Bourgonje et al 2016a,b;Rehm et al 2017). Among others, we currently develop services aimed at the detection and classification of abusive language (Bourgonje et al 2017a) and clickbait content (Bourgonje et al 2017b). The proposed hybrid infrastructure combines automatic language technology components and user-generated annotations and is meant to empower internet users better to handle the modern online media phenomena mentioned above.…”
Section: Introduction (mentioning)
confidence: 99%