2021
DOI: 10.1016/j.ipm.2021.102643
Offensive, aggressive, and hate speech analysis: From data-centric to human-centered approach

Cited by 71 publications
(39 citation statements)
References 39 publications
“…The inter-annotator agreement plays a vital role in creating datasets for hate speech, as it affects the performance of an ML algorithm (Kocoń et al., 2021). In the context of fake news and hate speech, Twitter is the preferred social media platform for extracting information and preparing a dataset.…”
Section: Datasets (mentioning)
confidence: 99%
“…Context dependency of whether an utterance is "toxic": the views about what constitutes unacceptable "toxic speech" differ between individuals and social groups (Kocoń et al., 2021). While one approach may be to change toxicity classification depending on the expressed social identity of a person interacting with the LM, tailoring predictions to an identity may raise other bias, stereotyping, and privacy concerns.…”
Section: Additional Considerations (mentioning)
confidence: 99%
“…First, setting such performance thresholds in a clear and accountable way requires participatory input from a broad community of stakeholders, which must be structured and facilitated. Second, views on what level of performance is needed are likely to diverge; for example, people hold different views of what constitutes unacceptable "toxic speech" (Kocoń et al., 2021). This raises political questions about how best to arbitrate conflicting perspectives (Gabriel, 2020a), and knock-on questions such as who constitutes the appropriate reference group in relation to a particular application or product.…”
Section: Benchmarking: When Is a Model "Fair Enough"? (mentioning)
confidence: 99%
“…(2) Creating identity-based pools on pre-existing datasets that look for differences based on markers like age, gender, ESL, and education, e.g. on the Wikipedia Detox Dataset [22]. (3) Creating small, expert-based pools that perform annotations based on certain markers e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…This way, those who are likely to be targeted, and who would be best equipped to label the data, would be the ones to determine the ground truth for models that classify toxicity online. This paper continues to build upon research in this space of creating groups of annotators based on some differentiating factor(s) [3,16,22,41]. More specifically, we explore how raters from two relevant identity groups, African American and LGBTQ, label data that represents those identities, and whether their ratings vary from those provided by a randomly selected pool of raters who do not self-identify with these identity groups.…”
Section: Introduction (mentioning)
confidence: 99%