2021
DOI: 10.48550/arxiv.2106.04511
Preprint

Designing Toxic Content Classification for a Diversity of Perspectives

Abstract: In this work, we demonstrate how existing classifiers for identifying toxic comments online fail to generalize to the diverse concerns of Internet users. We survey 17,280 participants to understand how user expectations for what constitutes toxic content differ across demographics, beliefs, and personal experiences. We find that groups historically at risk of harassment, such as people who identify as LGBTQ+ or young adults, are more likely to flag a random comment drawn from Reddit, Twitter, or 4chan as toxic…

Cited by 3 publications (7 citation statements)
References 19 publications (25 reference statements)
“…For the jury composition task, we displayed one of 5 possible comment sets (generated by random samples from our comment toxicity dataset [49] stratified by toxicity severity and labeler disagreement) to exemplify the type of content they would need to moderate on YourPlatform. Participants were then shown a simplified jury composition input form that allowed them to allocate 12-person jury slots using three demographic attributes: (1) gender (Female, Male, Non-binary, Other), (2) race (Black or African American, White, Asian, Hispanic, American Indian or Alaska Native, Native Hawaiian or Pacific Islander, Other) and (3) political affiliation (Conservative, Liberal, Independent, Other).…”
Section: Methods
confidence: 99%
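The jury composition form described above can be sketched in a few lines. This is a minimal illustration, not the cited study's implementation: the attribute options mirror those listed in the quote, but `validate_allocation`, the dictionary layout, and the example allocation are all hypothetical.

```python
# Minimal sketch (illustrative only) of checking a 12-person jury allocation
# against the three demographic attributes described in the quoted passage.

JURY_SIZE = 12
ATTRIBUTES = {
    "gender": ["Female", "Male", "Non-binary", "Other"],
    "race": ["Black or African American", "White", "Asian", "Hispanic",
             "American Indian or Alaska Native",
             "Native Hawaiian or Pacific Islander", "Other"],
    "political_affiliation": ["Conservative", "Liberal", "Independent", "Other"],
}

def validate_allocation(allocation: dict) -> bool:
    """Return True if every attribute's slot counts use valid options
    and sum to the full jury size."""
    for attr, options in ATTRIBUTES.items():
        counts = allocation.get(attr, {})
        if any(option not in options for option in counts):
            return False
        if sum(counts.values()) != JURY_SIZE:
            return False
    return True

# Hypothetical allocation a moderator might author:
example = {
    "gender": {"Female": 6, "Male": 4, "Non-binary": 2},
    "race": {"Black or African American": 4, "White": 4, "Asian": 4},
    "political_affiliation": {"Conservative": 4, "Liberal": 4, "Independent": 4},
}
```

Each attribute is allocated independently because a single juror carries one value per attribute, so the 12 slots must be fully assigned along every axis.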
“…Moderators of online communities (N=18) were asked to author juries for a comment toxicity classification task. We find that the resulting juries contain 2.9 times the representation of non-White jurors and 31.5 times the representation of non-binary jurors compared to those created implicitly by a large toxicity dataset [49]. This increased diversity in the jury composition changed the algorithm's classifications on 14% of items, reflecting the fact that jury learning captured those individual jurors' views far better than a baseline, state of the art aggregated model (with an MAE of 0.62 versus 1.05).…”
Section: Introduction
confidence: 92%
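The quoted comparison (MAE of 0.62 versus 1.05) measures how far each model's predicted toxicity ratings fall from individual jurors' actual ratings. A minimal sketch of that metric, with made-up ratings on an assumed 0–4 toxicity scale, illustrates why a per-juror model can beat a single aggregated score:

```python
# Illustrative mean-absolute-error comparison. The ratings below are
# invented; only the metric itself mirrors the quoted passage.

def mean_absolute_error(predicted, actual):
    """Average absolute gap between predicted and actual ratings."""
    assert len(predicted) == len(actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical per-juror toxicity ratings for one comment (0-4 scale):
juror_ratings = [0, 1, 3, 4]

# A per-juror model predicts each juror's own rating...
per_juror_predictions = [0, 1, 2, 4]

# ...while an aggregated model predicts one pooled score for everyone.
pooled = sum(juror_ratings) / len(juror_ratings)  # 2.0
aggregated_predictions = [pooled] * len(juror_ratings)

per_juror_mae = mean_absolute_error(per_juror_predictions, juror_ratings)    # 0.25
aggregated_mae = mean_absolute_error(aggregated_predictions, juror_ratings)  # 1.5
```

When jurors genuinely disagree, the pooled prediction is far from every individual rating, so the aggregated MAE stays high even though the pooled score is "correct on average".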