Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/D19-1578

Perturbation Sensitivity Analysis to Detect Unintended Model Biases

arXiv preprint: arXiv:1910.04210v1 [cs.CL]

Cited by 57 publications (36 citation statements)
References 18 publications
“…basic negation, agent/object distinction, etc). Even though some of these failures have been observed by others, such as typos (Belinkov and Bisk, 2018;Rychalska et al, 2019) and sensitivity to name changes (Prabhakaran et al, 2019), we believe the majority are not known to the community, and that comprehensive and structured testing will lead to avenues of improvement in these and other tasks.…”
Section: Discussion
confidence: 86%
“…There are existing perturbation techniques meant to evaluate specific behavioral capabilities of NLP models such as logical consistency and robustness to noise (Belinkov and Bisk, 2018), name changes (Prabhakaran et al, 2019), or adversaries (Ribeiro et al, 2018). CheckList provides a framework for such techniques to systematically evaluate these alongside a variety of other capabilities.…”
Section: Related Work
confidence: 99%
“…Following (Garg et al, 2019;Prabhakaran et al, 2019), we use the notion of perturbation, whereby the phrases for referring to people with disabilities, described above, are all inserted into the same slots in sentence templates. We start by first retrieving a set of naturally-occurring sentences that contain the pronouns he or she.…”
Section: Biases In Text Classification Models
confidence: 99%
“…A recent thread of work aims to study how language models recall and leverage information about names and entities. Prabhakaran et al (2019) shows that names can have a measurable effect on the prediction of sentiment analysis systems. Shwartz et al (2020) demonstrates that pre-trained language models implicitly resolve entity ambiguity by grounding names to entities based on the pretraining corpus.…”
Section: Related Work
confidence: 99%
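The excerpts above all describe the same mechanism: fill a name slot in a sentence template with different names and measure how much the model's prediction moves. The sketch below is a minimal illustration of that idea, not the authors' implementation: `toy_score` is an artificial, deliberately name-sensitive stand-in for a real sentiment classifier (it scores purely on the length of the first token), and the sensitivity metric is simply the score range across name substitutions.

```python
def toy_score(sentence: str) -> float:
    # Artificial stand-in for a sentiment classifier. It is deliberately
    # name-sensitive (a longer first token scores lower) purely so the
    # sensitivity metric below has something to detect; no real model or
    # real-world bias is being claimed here.
    first_token = sentence.split()[0]
    return 1.0 - 0.1 * len(first_token)


def perturbation_sensitivity(template: str, names: list[str], score=toy_score) -> float:
    # Fill the {name} slot with each name and report the spread of scores.
    # A non-zero spread means the model's output depends on the name alone,
    # which is the signal the perturbation analysis looks for.
    scores = [score(template.format(name=n)) for n in names]
    return max(scores) - min(scores)


# Identical names give zero sensitivity; names of different lengths expose
# the toy model's artificial name dependence.
sens = perturbation_sensitivity("{name} is my friend.", ["Ann", "Christopher"])
```

A real analysis would replace `toy_score` with an actual classifier's output probability and use a large set of naturally occurring template sentences, as the citing papers describe.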