Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2015
DOI: 10.1145/2783258.2783311

Certifying and Removing Disparate Impact

Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender) and an explicit description of the process. When computers are involved, determining disparate impact (and hence bias) is harder. It might not be possible to disclose the process. In addition…


Cited by 1,381 publications (1,410 citation statements)
References 16 publications
“…Others attempt to probe the social consequences directly by porting methods from the social sciences, like the audit study (Sandvig et al. 2014), although this is most suited to the limited subset of algorithmic influence that presents itself through public interfaces. Finally, there is a small but growing number of computer scientists who are attempting to develop anti-discriminatory remedies at the level of data and algorithms (Feldman et al. 2014; Hajian and Domingo-Ferrer 2012). While this work has the merit of trying to correct data science from a perspective that understands the technicalities of its operations, it is constrained by seeing data science as an external set of methods rather than as a broader social apparatus in Foucault's sense, that is, 'a thoroughly heterogeneous ensemble consisting of discourses, institutions, architectural forms, regulatory decisions, laws, administrative measures, scientific statements, philosophical, moral and philanthropic propositions' (Foucault 1988).…”
Section: The Problem With Data Science (mentioning)
confidence: 99%
“…In order to do this, we must find a way to capture information flow from one feature to another. We take a learning-theoretic perspective on this problem, which can be summarized via the principle, first enunciated in [10] in the context of certifying and removing bias in classifiers:…”
Section: B. Our Work (mentioning)
confidence: 99%
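
The principle referenced in [10] can be made concrete: disparate impact is only possible to the extent that the protected attribute can be predicted from the remaining features, so one can certify (approximate) fairness by checking that no classifier achieves a low balanced error rate on that prediction task. The sketch below is a minimal illustration of this idea, not the authors' reference implementation; the names X and s, the logistic-regression probe, and the 0.45 threshold are all assumptions chosen for the example.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def balanced_error_rate(s_true, s_pred):
    # Mean of the per-class error rates for a binary protected attribute s.
    err0 = np.mean(s_pred[s_true == 0] != 0)
    err1 = np.mean(s_pred[s_true == 1] != 1)
    return 0.5 * (err0 + err1)

def certify_low_information_flow(X, s, threshold=0.45):
    # Probe how much information about s "flows" through the other
    # features: try to predict s from X with held-out classification.
    s_pred = cross_val_predict(LogisticRegression(max_iter=1000), X, s, cv=5)
    ber = balanced_error_rate(s, s_pred)
    # A balanced error rate near 0.5 means s is nearly unpredictable
    # from X, which serves as a certificate of (approximate) fairness.
    return ber, ber >= threshold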
“…Data preprocessing methods [1,12,13,17,21,37,43] modify the historical data to remove discriminatory effects, according to some discrimination measure, before a predictive model is learned. For example, in [17] several methods for modifying data were proposed.…”
Section: Discrimination Prevention (mentioning)
confidence: 99%
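
For concreteness, one widely used discrimination measure of the kind mentioned above is the disparate impact ratio behind the U.S. "80% rule" that this paper formalizes. The sketch below is a hedged illustration; the variable names (y_pred for binary decisions, s for group membership) and the example numbers are mine, not drawn from [17].

import numpy as np

def disparate_impact_ratio(y_pred, s):
    # Positive-outcome rate of the protected group (s == 1)
    # divided by that of the unprotected group (s == 0).
    return np.mean(y_pred[s == 1]) / np.mean(y_pred[s == 0])

# Worked example: 30 of 100 protected and 60 of 100 unprotected
# applicants are selected -> 0.30 / 0.60 = 0.5, well below the
# conventional 0.8 threshold, so disparate impact would be flagged.
y_pred = np.array([1] * 30 + [0] * 70 + [1] * 60 + [0] * 40)
s = np.array([1] * 100 + [0] * 100)
print(disparate_impact_ratio(y_pred, s))  # 0.5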
“…These methods include Massaging, which changes the labels of some individuals in the dataset to remove discrimination; Reweighting, which assigns weights to individuals to balance the dataset; and Sampling, which changes the sample sizes of different subgroups to make the dataset discrimination-free. In [12], the distribution of the non-protected attributes in the dataset is modified such that the protected attribute cannot be estimated from the non-protected attributes. Proposed methods for discrimination prevention via algorithm tweaking require some modification of the predictive models themselves [7,9,14,15,18,19,35].…”
Section: Discrimination Prevention (mentioning)
confidence: 99%
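
One way to realize the repair described for [12] above, in which the non-protected attributes are modified so the protected attribute can no longer be estimated from them, is a rank-preserving quantile transformation: each group's values for a feature are mapped onto a common "median" distribution. The sketch below is an illustrative reconstruction under that reading, not the code of [12]; the function name repair_feature and the grid resolution are assumptions.

import numpy as np

def repair_feature(x, s, n_grid=100):
    # Map each group's values onto a shared target distribution while
    # preserving rank order within each group.
    grid = np.linspace(0.0, 1.0, n_grid)
    groups = np.unique(s)
    # Target: median, across groups, of the per-group quantile functions.
    target = np.median([np.quantile(x[s == g], grid) for g in groups], axis=0)
    repaired = np.empty(len(x), dtype=float)
    for g in groups:
        vals = x[s == g]
        # Empirical rank of each value within its own group, in (0, 1].
        ranks = np.searchsorted(np.sort(vals), vals, side="right") / len(vals)
        repaired[s == g] = np.interp(ranks, grid, target)
    return repaired

After this transformation each group's marginal distribution of the repaired feature is approximately identical, so a classifier can no longer recover s from it, at the cost of some information about the original values.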