There are several bias mitigators that can reduce algorithmic bias in machine learning models, but unfortunately, the effect of mitigators on fairness is often not stable when measured across different data splits. A popular approach to training more stable models is ensemble learning. Ensembles, such as bagging, boosting, voting, or stacking, have been successful at making predictive performance more stable. One might therefore ask: can we combine the advantages of bias mitigators and ensembles? To explore this question, we first need bias mitigators and ensembles to work together. We built an open-source library enabling the modular composition of 10 mitigators, 4 ensembles, and their corresponding hyperparameters. Based on this library, we empirically explored the space of combinations on 13 datasets, including datasets commonly used in the fairness literature as well as datasets newly curated by our library. Furthermore, we distilled the results into a guidance diagram for practitioners. We hope this paper will contribute towards improving stability in bias mitigation.
INTRODUCTION

Algorithmic bias and discrimination in machine learning are a serious problem. If learned estimators make biased predictions, they might discriminate against underprivileged groups in various domains including job hiring, healthcare, loan approvals, criminal justice, higher education, and even child care. These biased predictions can reduce diversity, for instance, in the workforce of a company or in the student population of an educational institution. Such lack of diversity can cause adverse business or educational outcomes. In addition, several of the above-mentioned domains are governed by laws and regulations that prohibit biased decisions. Finally, biased decisions can severely damage the reputation of the organization that makes them. Of course, bias in machine learning is a sociotechnical problem that cannot be solved with technical solutions alone. That said, to make tangible progress, this paper focuses on bias mitigators that can reduce bias in machine learning models. We acknowledge that bias mitigators can, at most, be one part of a larger solution.

A bias mitigator either improves or replaces an existing machine learning estimator (e.g., a classifier) so that it makes less biased predictions (e.g., class labels) as measured by a fairness metric (e.g., disparate impact). Unfortunately, bias mitigation often suffers from high volatility: there is usually less training data available for underrepresented groups, so the learned estimator has fewer examples to generalize from for these groups.
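As a concrete illustration of this volatility, consider the following minimal sketch in plain Python with scikit-learn. It is not code from our library; the synthetic data and the disparate_impact helper are illustrative assumptions of ours. Disparate impact is the ratio of favorable-outcome rates between the unprivileged and privileged groups, and the sketch shows how it can fluctuate when the same classifier is retrained on different random splits.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    def disparate_impact(y_pred, group):
        # P(favorable | unprivileged) / P(favorable | privileged);
        # 1.0 is ideal, and values below ~0.8 are commonly flagged as biased.
        return y_pred[group == 0].mean() / y_pred[group == 1].mean()

    # Hypothetical synthetic data where features correlate with group
    # membership (1 = privileged, 0 = unprivileged), so a classifier
    # trained on it inherits the bias.
    rng = np.random.default_rng(0)
    n = 2000
    group = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 5)) + group[:, None]
    y = (X.sum(axis=1) + rng.normal(size=n) > 2.5).astype(int)

    # Retrain on several random splits: the fairness metric fluctuates
    # from split to split, which is the volatility discussed above.
    for seed in range(5):
        X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
            X, y, group, test_size=0.3, random_state=seed)
        model = LogisticRegression().fit(X_tr, y_tr)
        print(f"split {seed}: disparate impact = "
              f"{disparate_impact(model.predict(X_te), g_te):.2f}")

Ensembles such as bagging stabilize predictive performance by averaging a base estimator over resampled training sets; the question this paper explores is whether the same averaging can also stabilize fairness metrics like the one computed above.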