Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015
DOI: 10.1145/2783258.2783311

Certifying and Removing Disparate Impact

Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender) and an explicit description of the process. When computers are involved, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition…


Cited by 1,381 publications (1,410 citation statements)
References 16 publications
“…Others attempt to probe the social consequences directly by porting methods from the social sciences, like the audit study (Sandvig et al. 2014), although this is most suited to the limited subset of algorithmic influence that presents itself through public interfaces. Finally, there is a small but growing number of computer scientists who are attempting to develop anti-discriminatory remedies at the level of data and algorithms (Feldman et al. 2014; Hajian and Domingo-Ferrer 2012). While this work has the merit of trying to correct data science from a perspective that understands the technicalities of its operations, it is constrained by seeing data science as an external set of methods rather than as a broader social apparatus in Foucault's sense, that is, 'a thoroughly heterogeneous ensemble consisting of discourses, institutions, architectural forms, regulatory decisions, laws, administrative measures, scientific statements, philosophical, moral and philanthropic propositions' (Foucault 1988).…”
Section: The Problem With Data Science (mentioning)
confidence: 99%
“…In order to do this, we must find a way to capture information flow from one feature to another. We take a learning-theoretic perspective on this problem, which can be summarized via the principle, first enunciated in [10] in the context of certifying and removing bias in classifiers:…”
Section: B. Our Work (mentioning)
confidence: 99%
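
The principle referenced in [10] can be made concrete: disparate impact is only possible to the extent that the protected attribute can be predicted from the remaining features, so one can certify (approximate) fairness by checking that no classifier achieves a low balanced error rate on that prediction task. The sketch below is a minimal illustration of this idea, not the authors' reference implementation; the names X and s, the logistic-regression probe, and the 0.45 threshold are all assumptions chosen for the example.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def balanced_error_rate(s_true, s_pred):
    # Mean of the per-class error rates for a binary protected attribute s.
    err0 = np.mean(s_pred[s_true == 0] != 0)
    err1 = np.mean(s_pred[s_true == 1] != 1)
    return 0.5 * (err0 + err1)

def certify_low_information_flow(X, s, threshold=0.45):
    # Probe how much information about s "flows" through the other
    # features: try to predict s from X with held-out classification.
    s_pred = cross_val_predict(LogisticRegression(max_iter=1000), X, s, cv=5)
    ber = balanced_error_rate(s, s_pred)
    # A balanced error rate near 0.5 means s is nearly unpredictable
    # from X, which serves as a certificate of (approximate) fairness.
    return ber, ber >= threshold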
“…Data preprocessing methods [1,12,13,17,21,37,43] modify the historical data to remove discriminatory effects, according to some discrimination measure, before a predictive model is learned. For example, in [17] several methods for modifying data were proposed.…”
Section: Discrimination Prevention (mentioning)
confidence: 99%
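
For concreteness, one widely used discrimination measure of the kind mentioned above is the disparate impact ratio behind the U.S. "80% rule" that this paper formalizes. The sketch below is a hedged illustration; the variable names (y_pred for binary decisions, s for group membership) and the example numbers are mine, not drawn from [17].

import numpy as np

def disparate_impact_ratio(y_pred, s):
    # Positive-outcome rate of the protected group (s == 1)
    # divided by that of the unprotected group (s == 0).
    return np.mean(y_pred[s == 1]) / np.mean(y_pred[s == 0])

# Worked example: 30 of 100 protected and 60 of 100 unprotected
# applicants are selected -> 0.30 / 0.60 = 0.5, well below the
# conventional 0.8 threshold, so disparate impact would be flagged.
y_pred = np.array([1] * 30 + [0] * 70 + [1] * 60 + [0] * 40)
s = np.array([1] * 100 + [0] * 100)
print(disparate_impact_ratio(y_pred, s))  # 0.5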
“…These methods include Massaging, which changes the labels of some individuals in the dataset to remove discrimination; Reweighting, which assigns weights to individuals to balance the dataset; and Sampling, which changes the sample sizes of different subgroups to make the dataset discrimination-free. In [12], the distribution of the non-protected attributes in the dataset is modified such that the protected attribute cannot be estimated from the non-protected attributes. Proposed methods for discrimination prevention via algorithm tweaking require some modification of the predictive models themselves [7,9,14,15,18,19,35].…”
Section: Discrimination Prevention (mentioning)
confidence: 99%
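
One way to realize the repair described for [12] above, in which the non-protected attributes are modified so the protected attribute can no longer be estimated from them, is a rank-preserving quantile transformation: each group's values for a feature are mapped onto a common "median" distribution. The sketch below is an illustrative reconstruction under that reading, not the code of [12]; the function name repair_feature and the grid resolution are assumptions.

import numpy as np

def repair_feature(x, s, n_grid=100):
    # Map each group's values onto a shared target distribution while
    # preserving rank order within each group.
    grid = np.linspace(0.0, 1.0, n_grid)
    groups = np.unique(s)
    # Target: median, across groups, of the per-group quantile functions.
    target = np.median([np.quantile(x[s == g], grid) for g in groups], axis=0)
    repaired = np.empty(len(x), dtype=float)
    for g in groups:
        vals = x[s == g]
        # Empirical rank of each value within its own group, in (0, 1].
        ranks = np.searchsorted(np.sort(vals), vals, side="right") / len(vals)
        repaired[s == g] = np.interp(ranks, grid, target)
    return repaired

After this transformation each group's marginal distribution of the repaired feature is approximately identical, so a classifier can no longer recover s from it, at the cost of some information about the original values.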