Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1483

Detecting and Reducing Bias in a High Stakes Domain

Abstract: Gang-involved youth in cities such as Chicago sometimes post on social media to express their aggression towards rival gangs and previous research has demonstrated that a deep learning approach can predict aggression and loss in posts. To address the possibility of bias in this sensitive application, we developed an approach to systematically interpret the state of the art model. We found, surprisingly, that it frequently bases its predictions on stop words such as "a" or "on", an approach that could harm soci…

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1

Citation Types

0
1
0

Year Published

2020
2020
2021
2021

Publication Types

Select...
2

Relationship

0
2

Authors

Journals

Cited by 2 publications (1 citation statement)
References 29 publications (23 reference statements)
“…Note that, since nPMI is a frequency metric, code-switching with nPMI yields a set of words that includes not only frame-indicative words but also many stop words and common words such as "a", "the", "he" or "are". An alternative method, which we call "omitted words", determines important words by omitting a word from the headline and reapplying the trained classifier to the headline with the missing word (similar to Zhong et al. (2019); Ribeiro et al. (2016)). We then compute the drop in probability as an importance measure for word x_j, Importance(x_j) = p(y | x_1, ….”
Section: Code-switching Analysis
Citation type: mentioning (confidence: 99%)
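
To make the "omitted words" procedure quoted above concrete, the Python sketch below scores each token by the drop in the classifier's predicted probability when that token is removed from the input. It is a minimal illustration only: the predict_proba callable and the toy classifier are assumptions introduced here for demonstration, not the interface or models used by Zhong et al. (2019), Ribeiro et al. (2016), or the citing work.

from typing import Callable, List, Tuple


def omitted_word_importance(
    predict_proba: Callable[[str], float],
    tokens: List[str],
) -> List[Tuple[str, float]]:
    """Score each token by the drop in the predicted class probability
    when that token is omitted from the input."""
    # Probability for the full, unmodified input.
    base = predict_proba(" ".join(tokens))
    scores = []
    for j in range(len(tokens)):
        # Rebuild the input with token j removed and re-apply the classifier.
        reduced = tokens[:j] + tokens[j + 1:]
        drop = base - predict_proba(" ".join(reduced))
        scores.append((tokens[j], drop))
    return scores


if __name__ == "__main__":
    # Toy stand-in classifier (assumption): probability rises with the
    # count of one cue word, just to show the mechanics of the measure.
    def toy_predict_proba(text: str) -> float:
        return min(1.0, 0.2 + 0.4 * text.split().count("rival"))

    headline = "post about rival crew".split()
    for word, score in omitted_word_importance(toy_predict_proba, headline):
        print(f"{word}: {score:+.2f}")

Running the example ranks "rival" highest, since removing it produces the largest probability drop, while function words leave the score unchanged under this toy model.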