Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection

Ravfogel, Shauli; Elazar, Yanai; Gonen, Hila; Twiton, Michael; Goldberg, Yoav

doi:10.48550/arxiv.2004.07667

Cited by 23 publications

(49 citation statements)

References 0 publications

“…compute the counterfactual representation by pre-training an additional instance of the language representation model employed by the classifier, with an adversarial component designed to "forget" the concept of choice, while controlling for confounding concepts. Ravfogel et al (2020) offered a method for removing information from neural representations by iteratively training linear classifiers and projecting the representations on their null-spaces.…”

Section: Causal Model Interpretations

mentioning

confidence: 99%

Causal Inference in Natural Language Processing: Estimation, Prediction, Interpretation and Beyond

Feder, Keith, Manzoor et al. 2021

Preprint
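
The procedure summarized in the citation statement above (repeatedly train a linear classifier for the protected attribute, then project the representations onto that classifier's nullspace) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the least-squares probe, the function names (`fit_linear_probe`, `nullspace_projection`, `inlp`), the iteration count, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np


def fit_linear_probe(X, z):
    """Fit a least-squares linear classifier for binary labels z in {0, 1}.

    Assumption: least squares is a stand-in for whatever linear
    classifier one would actually train; it keeps the sketch NumPy-only.
    """
    mu = X.mean(axis=0)
    t = 2.0 * z - 1.0  # map labels to {-1, +1}
    # Minimum-norm solution, so w stays inside the span of the data rows.
    w, *_ = np.linalg.lstsq(X - mu, t - t.mean(), rcond=None)
    b = t.mean() - mu @ w
    return w, b


def accuracy(X, z, w, b):
    """Fraction of examples whose predicted label matches z."""
    return float(np.mean((X @ w + b > 0) == (z == 1)))


def nullspace_projection(w):
    """Matrix that projects vectors onto the nullspace of direction w."""
    u = w / np.linalg.norm(w)
    return np.eye(u.shape[0]) - np.outer(u, u)


def inlp(X, z, n_iters=5, tol=1e-8):
    """Compose nullspace projections of successively trained probes."""
    P = np.eye(X.shape[1])
    for _ in range(n_iters):
        w, _ = fit_linear_probe(X @ P, z)
        if np.linalg.norm(w) < tol:  # nothing linear left to remove
            break
        P = nullspace_projection(w) @ P
    return P


# Synthetic data (hypothetical): the attribute z leaks into two dimensions.
rng = np.random.default_rng(0)
n, d = 2000, 10
z = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * (2 * z - 1)
X[:, 1] += 1.0 * (2 * z - 1)

w0, b0 = fit_linear_probe(X, z)
acc_before = accuracy(X, z, w0, b0)

P = inlp(X, z)
X_clean = X @ P
w1, b1 = fit_linear_probe(X_clean, z)
acc_after = accuracy(X_clean, z, w1, b1)
```

In this toy setting a fresh linear probe should recover `z` well from `X` and only at around chance level from `X_clean`. Each iteration removes one direction, so `P` trades away rank for protection; how many iterations to run is a judgment call that depends on how much of the representation the downstream task still needs.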

“…We have selected a subset of the Blogs data for this experiment, where author occupation is either student or arts, and the age is either teen or adult (two-domain obfuscation). We have taken an approach similar to that of Ravfogel et al (2020), where we create 4 different levels of imbalance. In all cases, the dataset is balanced with respect to both occupation and age.…”

Section: Fairness Results

mentioning

confidence: 99%

“…A large body of prior work has attempted to address algorithmic bias by modifying different stages of the natural language processing (NLP) pipeline. For example, Ravfogel et al (2020) attempt to de-bias word embeddings used by NLP systems, while Elazar and Goldberg (2018) address the bias in learned model representations and encodings. While effective in many cases, such approaches do nothing to mitigate bias in decisions made by humans based on text.…”

Section: Introduction

mentioning

confidence: 99%

“…of prior work has attempted to address algorithmic bias by modifying different stages of the natural language processing (NLP) pipeline Blodgett et al (2021), Barikeri et al (2021), Farrand et al (2020), Mireshghallah et al (2021a). and Sheng et al (2019) propose and analyze benchmarks for evaluating fairness in different applications Ravfogel et al (2020), Kaneko and Bollegala (2019). Shin et al (2020) and Kaneko and Bollegala (2021) attempt to de-bias word embeddings used by NLP systems, while Elazar and Goldberg (2018); Barrett et al (2019); Wang et al (2021) attempt to de-bias model representations and encodings.…”

mentioning

confidence: 99%

Style Pooling: Automatic Text Style Obfuscation for Improved Classification Fairness

Mireshghallah, Berg-Kirkpatrick 2021

Preprint

Text style can reveal sensitive attributes of the author (e.g. race or age) to the reader, which can, in turn, lead to privacy violations and bias in both human and algorithmic decisions based on text. For example, the style of writing in job applications might reveal protected attributes of the candidate which could lead to bias in hiring decisions, regardless of whether hiring decisions are made algorithmically or by humans. We propose a VAE-based framework that obfuscates stylistic features of human-generated text through style transfer by automatically re-writing the text itself. Our framework operationalizes the notion of obfuscated style in a flexible way that enables two distinct notions of obfuscated style: (1) a minimal notion that effectively intersects the various styles seen in training, and (2) a maximal notion that seeks to obfuscate by adding stylistic features of all sensitive attributes to text, in effect, computing a union of styles. Our style-obfuscation framework can be used for multiple purposes, however, we demonstrate its effectiveness in improving the fairness of downstream classifiers. We also conduct a comprehensive study on style pooling's effect on fluency, semantic consistency, and attribute removal from text, in two and three domain style obfuscation.

“…Manzini et al (2019) extended this work to the multi-class setting, enabling debiasing in race and religion. Concurrent to their work, (Ravfogel et al, 2020) propose iterative null space…”

Section: Debiasing Word Embeddings

mentioning

confidence: 99%

Exploring Text Specific and Blackbox Fairness Algorithms in Multimodal Clinical NLP

Chen, Berlot-Attwell, Wang et al. 2020

Proceedings of the 3rd Clinical Natural Language Processing Workshop

Clinical machine learning is increasingly multimodal, collected in both structured tabular formats and unstructured forms such as free text. We propose a novel task of exploring fairness on a multimodal clinical dataset, adopting equalized odds for the downstream medical prediction tasks. To this end, we investigate a modality-agnostic fairness algorithm (equalized odds post processing) and compare it to a text-specific fairness algorithm: debiased clinical word embeddings. Despite the fact that debiased word embeddings do not explicitly address equalized odds of protected groups, we show that a text-specific approach to fairness may simultaneously achieve a good balance of performance and classical notions of fairness. We hope that our paper inspires future contributions at the critical intersection of clinical NLP and fairness. The full source code is available here: https://github.com/johntiger1/multimodal_fairness
