2018
DOI: 10.48550/arxiv.1809.10610
Preprint

Counterfactual Fairness in Text Classification through Robustness

Cited by 6 publications (17 citation statements) · References 0 publications
“…Recent research indicates that state-of-the-art benchmarks for toxicity disproportionately misclassify utterances from marginalised social groups as toxic (Welbl et al., 2021), a concern that is particularly pronounced for African American English (Dixon et al., 2018; Ghaffary, 2019; Hanu et al., 2021; Sap et al., 2019). The question of how to mitigate bias in toxic or hate speech detection remains an area of active inquiry (Davani et al., 2020; Garg et al., 2019).…”
Section: Racist Bias in Toxicity Detection (mentioning, confidence: 99%)
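The indexed paper (Garg et al., 2019) frames this mitigation question as robustness: a classifier's prediction should not change when identity terms in the input are substituted for one another. The sketch below is a minimal, hypothetical illustration of that counterfactual check; the `toxicity_model` callable and the term list are assumptions, not the paper's exact setup.

```python
# Illustrative counterfactual-token-fairness check: swap identity terms in a
# sentence and measure how far the toxicity score moves. A fair classifier
# should produce a small gap.

IDENTITY_TERMS = ["gay", "straight", "muslim", "christian"]  # illustrative subset

def counterfactual_gap(toxicity_model, sentence: str) -> float:
    """Largest change in toxicity score across identity-term substitutions."""
    tokens = sentence.lower().split()
    scores = []
    for term in IDENTITY_TERMS:
        if term in tokens:
            for substitute in IDENTITY_TERMS:
                perturbed = " ".join(substitute if t == term else t for t in tokens)
                scores.append(toxicity_model(perturbed))
    # No identity term present: the sentence has no counterfactuals to compare.
    return max(scores) - min(scores) if len(scores) > 1 else 0.0
```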
“…To assess accuracy, following Singh & Joachims (2019), we report the average stochastic test NDCG by sampling 25 rankings for each query from the learned ranking policy. To assess individual fairness, we use ranking stability with respect to demographic perturbations, which is the natural analogue of an evaluation metric for individual fairness in classification (Yurochkin & Sun, 2021; Garg et al., 2018). In particular, for each query, we create a hypothetical query by flipping the (binary) gender of each individual in the query, and deterministically rank by sorting the items by their scores.…”
Section: German Credit Data Set (mentioning, confidence: 99%)
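A minimal sketch of the perturbation step this excerpt describes, assuming each query arrives as a NumPy feature matrix with a binary gender column and a `score_model` callable; the position-agreement measure here is an illustrative stand-in for the authors' stability metric, not their exact code.

```python
import numpy as np

def gender_flipped(X: np.ndarray, gender_col: int) -> np.ndarray:
    """Copy of a query's item-feature matrix with the binary gender bit flipped."""
    X_cf = X.copy()
    X_cf[:, gender_col] = 1.0 - X_cf[:, gender_col]
    return X_cf

def rank_agreement(score_model, X: np.ndarray, gender_col: int) -> float:
    """Fraction of ranking positions unchanged by the gender flip (1.0 = stable)."""
    order = np.argsort(-score_model(X))                  # original ranking
    order_cf = np.argsort(-score_model(gender_flipped(X, gender_col)))
    return float(np.mean(order == order_cf))
```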
“…The choice of d_Y is often determined by the form of the output. For example, if the ML model outputs a vector of logits, then we may pick the Euclidean norm as d_Y [17, 10]. The metric d_X is the crux of (2.1) because it encodes our intuition of which inputs are similar for the ML task at hand.…”
Section: A Transport-Based Definition of Individual Fairness (mentioning, confidence: 99%)
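Read as a Lipschitz condition, the definition in this excerpt asks that d_Y(h(x), h(x′)) ≤ L · d_X(x, x′) for all comparable input pairs. A minimal sketch, assuming a model `h` that returns a logit vector, a task-specific input metric `d_X`, and an illustrative Lipschitz constant `L`; all three names are assumptions for this example.

```python
import numpy as np

def fairness_violation(h, d_X, x: np.ndarray, x_prime: np.ndarray, L: float = 1.0) -> float:
    """Positive return value: the pair (x, x_prime) witnesses an unfairness violation."""
    d_Y = np.linalg.norm(h(x) - h(x_prime))  # Euclidean distance between logit vectors
    return d_Y - L * d_X(x, x_prime)
```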