2021
DOI: 10.1007/s43681-021-00081-0

Bias and comparison framework for abusive language datasets

Abstract: Recently, numerous datasets have been produced as research activities in the field of automatic detection of abusive language or hate speech have increased. A problem with this diversity is that they often differ, among other things, in context, platform, sampling process, collection strategy, and labeling schema. There have been surveys on these datasets, but they compare the datasets only superficially. Therefore, we developed a bias and comparison framework for abusive language datasets for their in-depth a…

Cited by 5 publications (2 citation statements; citing works published 2022–2024). References 37 publications (53 reference statements).

Citation statements:
“…pose a significant concern as they diminish the generalizability of models and may increase the risk of developing discriminatory models against certain social categories. Our awareness is limited to a single study (Wich et al., 2022) that reviewed biases in Arabic toxic language datasets. Specifically, the study covered six Arabic datasets and five in English.…”
Section: Findings and Discussion (mentioning; confidence: 99%)
“… Al Kuwatly, Wich, and Groh (2020) identified annotator bias based on several demographic characteristics, such as age, first language, and education level, that leads to biased abusive language and hate speech detectors. Lastly, Wich, Bauer, and Groh (2020) found a negative effect of political bias in hate speech detection models and later developed a framework to analyse and uncover inherent biases in abusive language datasets (Wich, Eder, Al Kuwatly, & Groh, 2022). In this paper, we address the ethical principles of fairness and prevention of harm (High-Level Expert Group on AI, 2019).…”
Section: Related Work (mentioning; confidence: 99%)