Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2021
DOI: 10.1145/3468264.3468537

Bias in machine learning software: why? how? what to do?

Abstract: Increasingly, software is making autonomous decisions in areas such as criminal sentencing, approving credit cards, hiring employees, and so on. Some of these decisions show bias and adversely affect certain social groups (e.g. those defined by sex, race, age, marital status). Many prior works on bias mitigation take the following form: change the data or learners in multiple ways, then see if any of that improves fairness. Perhaps a better approach is to postulate root causes of bias and then apply some resoluti…

Cited by 103 publications (69 citation statements) | References 25 publications
“…ML software is developed following the data-driven programming paradigm. Therefore, data determine the decision logic of ML software to a large extent [17], and data bias is considered a main root cause of ML software bias [48]. Data testing aims to detect different types of data bias, including checking whether the labels of training data are biased (label bias) [35], whether the distribution of training data implies an unexpected correlation between the sensitive attribute and the outcome label (selection bias) [49], and whether the features of training data contain bias (feature bias) [50].…”
Section: Fairness Testing Components (mentioning; confidence: 99%)
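To make the data-testing idea above concrete, here is a minimal sketch of a selection-bias check that flags an unexpected correlation between a sensitive attribute and the outcome label. The column names ("sex", "label"), the use of Pearson correlation, and the 0.1 threshold are illustrative assumptions, not details from the cited works.

```python
# A minimal sketch of a data test for selection bias: flag an unexpected
# correlation between a sensitive attribute and the outcome label.
# Column names and the 0.1 threshold are illustrative assumptions.
import pandas as pd

def selection_bias_check(df: pd.DataFrame,
                         sensitive: str = "sex",
                         label: str = "label",
                         threshold: float = 0.1) -> bool:
    """Return True if |corr(sensitive, label)| exceeds the threshold."""
    corr = df[sensitive].corr(df[label])  # Pearson correlation
    return abs(corr) > threshold

# Toy example: the label closely tracks the sensitive attribute.
df = pd.DataFrame({"sex":   [0, 0, 0, 1, 1, 1],
                   "label": [0, 0, 1, 1, 1, 1]})
print(selection_bias_check(df))  # True -> possible selection bias
```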
“…For example, engineers cannot judge whether a system is fair to women if they are unaware of the outcomes that the system provides to men. In practice, metamorphic relations and statistical measurements are adopted to tackle the oracle problem of fairness testing [48], [55].…”
Section: Software Testing vs. Fairness Testing (mentioning; confidence: 99%)
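As one illustration of a metamorphic relation serving as a fairness-testing oracle, the sketch below flips only the sensitive attribute of each input and treats any change in the model's decision as a violation. The synthetic data, the logistic-regression model, and the position of the sensitive attribute are assumptions made for the example.

```python
# A minimal sketch of a metamorphic-relation oracle for fairness testing:
# flipping only the sensitive attribute of an input should not change the
# model's decision. Data, model, and attribute index are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.random((200, 4))
X[:, 0] = rng.integers(0, 2, 200)   # column 0: binary sensitive attribute
y = rng.integers(0, 2, 200)
model = LogisticRegression().fit(X, y)

def violates_relation(model, x, sensitive_idx=0) -> bool:
    """True if flipping the sensitive attribute flips the prediction."""
    x_flipped = x.copy()
    x_flipped[sensitive_idx] = 1 - x_flipped[sensitive_idx]
    return model.predict([x])[0] != model.predict([x_flipped])[0]

violations = sum(violates_relation(model, x) for x in X)
print(f"{violations} of {len(X)} inputs violate the relation")
```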
“…Reject option classification [24] is a post-processing strategy that translates favorable outcomes from the privileged group to the unprivileged group and unfavorable outcomes from the unprivileged group to the privileged group, based upon a certain level of confidence and uncertainty. Chakraborty et al. proposed Fair-SMOTE [25], a pre-processing and in-processing approach, which balances class and label distributions and performs situation testing (i.e., testing individual fairness through alternate "worlds").…”
Section: Related Work (mentioning; confidence: 99%)
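A minimal sketch of the reject option classification idea as described above, assuming we already have predicted probabilities of the favorable outcome, a boolean mask marking the privileged group, and an illustrative uncertainty band of width 0.1 around the decision boundary:

```python
# A minimal sketch of reject option classification: inside a low-confidence
# band around the decision boundary, favorable outcomes go to the
# unprivileged group and unfavorable ones to the privileged group.
# The band width (theta) and group encoding are illustrative assumptions.
import numpy as np

def reject_option_classify(proba_favorable: np.ndarray,
                           privileged: np.ndarray,
                           theta: float = 0.1) -> np.ndarray:
    """proba_favorable: P(favorable) per instance; privileged: boolean mask."""
    preds = (proba_favorable >= 0.5).astype(int)        # default decision
    uncertain = np.abs(proba_favorable - 0.5) <= theta  # reject-option band
    preds[uncertain & ~privileged] = 1  # favorable for unprivileged
    preds[uncertain & privileged] = 0   # unfavorable for privileged
    return preds

proba = np.array([0.45, 0.55, 0.90, 0.52])
priv = np.array([True, True, False, False])
print(reject_option_classify(proba, priv))  # [0 0 1 1]
```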