Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.210

Ruddit: Norms of Offensiveness for English Reddit Comments

Abstract: Warning: This paper contains comments that may be offensive or upsetting. On social media platforms, hateful and offensive language negatively impacts the mental well-being of users and the participation of people from diverse backgrounds. Automatic methods to detect offensive language have largely relied on datasets with categorical labels. However, comments can vary in their degree of offensiveness. We create the first dataset of English language Reddit comments that has fine-grained, real-valued scores between …
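The key property of the dataset described above is that each comment carries a fine-grained, real-valued offensiveness score rather than a categorical label. The sketch below is a hypothetical illustration of how such a resource might be loaded and inspected; the file name and the column names ("comment", "offensiveness_score") are assumptions for illustration, not the dataset's actual schema.

```python
# Minimal, hypothetical sketch (not from the paper): working with a
# Ruddit-style dataset of comments annotated with real-valued
# offensiveness scores. Column names and file path are assumptions.
import pandas as pd


def load_ruddit_style(path: str) -> pd.DataFrame:
    """Load comments annotated with fine-grained, real-valued offensiveness scores."""
    # Real-valued labels allow thresholding at any operating point,
    # unlike categorical offensive / not-offensive labels.
    return pd.read_csv(path)


def most_offensive(df: pd.DataFrame, k: int = 10) -> pd.DataFrame:
    """Return the k comments with the highest offensiveness scores."""
    return df.nlargest(k, "offensiveness_score")[["comment", "offensiveness_score"]]


if __name__ == "__main__":
    comments = load_ruddit_style("ruddit.csv")  # hypothetical file path
    print(most_offensive(comments, k=5))
```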

Cited by 16 publications (9 citation statements)
References 47 publications
“…Identifying Toxicity - Most works on identifying toxic language looked at isolated social media posts or comments while ignoring the context (Davidson et al., 2017; Xu et al., 2012; Zampieri et al., 2019; Rosenthal et al., 2020; Kumar et al., 2018; Garibo i Orts, 2019; Ousidhoum et al., 2019; Breitfeller et al., 2019; Hada et al., 2021; Barikeri et al., 2021). Some works train chatbots to avoid sensitive discussions by changing the topic of the conversation. In contrast, we tackle contextual offensive language by fine-tuning models to generate neutral and safe responses in offensive contexts.…”
Section: Related Work (mentioning)
confidence: 99%
“…There is an abundance of datasets for moderating user-generated content, mostly drawn from online social networking sites. Examples include Jigsaw (Jigsaw, 2017), Twitter (Zampieri et al., 2019; Basile et al., 2019), Stormfront (de Gibert et al., 2018), Reddit (Hada et al., 2021), and Hateful Memes (Kiela et al., 2021). However, the task of guarding LLM-generated content differs from human-generated content moderation because 1) the style and length of text produced by humans differ from those of LLMs, 2) the potential harms in human-generated content are typically limited to hate speech, while LLM moderation requires dealing with a broader range of potential harms, and 3) guarding LLM-generated content involves dealing with prompt-response pairs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Since Reddit involved additional data collection (a time-consuming process), we chose a popular dataset that contains fewer than 10,000 datapoints. Annotated hate speech data: We use the following English hate speech datasets for our experiments (see Table 1 for more information on dataset statistics): (i) the HateXplain-GAB dataset (Mathew et al., 2021) (contains data from GAB), (ii) the LTI-GAB dataset (Qian et al., 2019) (contains data from GAB), and (iii) Ruddit (Hada et al., 2021) (contains data from Reddit).…”
Section: Datasets (mentioning)
confidence: 99%