Detecting Harmful Content on Online Platforms: What Platforms Need vs. Where Research Efforts Go

Arora, Arnav; Nakov, Preslav; Hardalov, Momchil; Sarwar, Sheikh Muhammad; Nayak, Vibha; Dinkov, Yoan; Zlatkova, Dimitrina; Dent, Kyle; Bhatawdekar, Ameya; Bouchard, Guillaume; Augenstein, Isabelle

doi:10.1145/3603399

Cited by 9 publications

(4 citation statements)

References 78 publications

“…Strategies and resources have also been put forward for identifying threatening content in low-resource languages [63,64]. Additionally, comprehensive surveys on threat detection techniques and moderation policies on tackling such content by online platforms have been conducted [65,66]. Many languages still lack sufficient linguistic resources for NLP-related tasks [67].…”

Section: Downstream Tasks In Hausa Language (mentioning)

confidence: 99%

Detection and Analysis of Offensive Online Content in Hausa Language

Adam, Zandam, Inuwa-Dutse (2024). Preprint

Hausa, a major Chadic language spoken by over 100 million people in Africa, faces a challenge in the digital age. While widely used, it is considered a low-resource language from a computational linguistic perspective. This means there are limited resources and tools to analyse Hausa text, making it difficult to detect offensive and threatening language online. Our study aimed to bridge this gap. We conducted two user studies (n = 180) to understand cyberbullying in Hausa. We then created the first-ever dataset of offensive and threatening Hausa phrases to train detection systems. We developed a system to flag such content and compared it to Google translation’s ability to detect these terms. Our findings revealed a concerning trend: offensive and threatening language is prevalent online, especially in discussions about religion and politics. Our detection system was able to detect more than 70% of offensive and threatening content, although many of these were mistranslated by Google’s translation engine. We attribute this to the subtle relationship between offensive and threatening content and idiomatic expressions in the Hausa language. This highlights the importance of considering cultural nuances and idiomatic expressions in Hausa. To create a safer online environment for Hausa speakers, we recommend involving diverse stakeholders who understand local contexts and demographics. This will allow for the development of more accurate detection systems and targeted moderation strategies. Trigger Warning: Readers may find some of the terms in this study distressing or disturbing; all examples are for illustration only.

“…7 Specifically, on May 23rd we queried Twitter for all the accounts that shared a tweet in UK-RU and FR-22, obtaining almost 2M users that were suspended by the platform for violating their rules. Twitter might suspend an account in a variety of circumstances that range from promoting violence and glorifying crime to hate speech, spam, and impersonation; similarly to other Big Tech platforms, these guidelines are considered among the most stringent [61]. More details about reasons for suspension are available in the Twitter documentation.…”

Section: Data Collection (mentioning)

confidence: 99%

How does Twitter account moderation work? Dynamics of account creation and suspension on Twitter during major geopolitical events

Pierri, Luceri, Chen et al. (2023). EPJ Data Sci.

Social media moderation policies are often at the center of public debate, and their implementation and enactment are sometimes surrounded by a veil of mystery. Unsurprisingly, due to limited platform transparency and data access, relatively little research has been devoted to characterizing moderation dynamics, especially in the context of controversial events and the platform activity associated with them. Here, we study the dynamics of account creation and suspension on Twitter during two global political events: Russia’s invasion of Ukraine and the 2022 French Presidential election. Leveraging a large-scale dataset of 270M tweets shared by 16M users in multiple languages over several months, we identify peaks of suspicious account creation and suspension, and we characterize behaviors that more frequently lead to account suspension. We show how large numbers of accounts get suspended within days of their creation. Suspended accounts tend to mostly interact with legitimate users, as opposed to other suspicious accounts, making unwarranted and excessive use of reply and mention features, and sharing large amounts of spam and harmful content. While we are only able to speculate about the specific causes leading to a given account suspension, our findings contribute to shedding light on patterns of platform abuse and subsequent moderation during major events.

“…Research on this topic was motivated by the pressing need to create safer environments in social media platforms through strategies such as automatic content moderation (Weerasooriya et al 2023). With the goal of aiding content moderation, systems are trained to recognize a variety of related phenomena such as aggression, cyberbullying, hate speech, and toxicity (Arora et al 2023).…”

Section: Introduction (mentioning)

confidence: 99%

OffensEval 2023: Offensive language identification in the age of Large Language Models

Zampieri, Rosenthal, Nakov et al. (2023). Nat. Lang. Eng.

Self Cite

The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LLMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.
