Proceedings of the Conference on Fairness, Accountability, and Transparency 2019
DOI: 10.1145/3287560.3287575

Racial categories in machine learning

Abstract: Controversies around race and machine learning have sparked debate among computer scientists over how to design machine learning systems that guarantee fairness. These debates rarely engage with how racial identity is embedded in our social experience, making for sociological and psychological complexity. This complexity challenges the paradigm of considering fairness to be a formal property of supervised learning with respect to protected personal attributes. Racial identity is not simply a personal subjectiv…

Cited by 76 publications (87 citation statements)
References 40 publications
“…Although such a design is derived from the naturally occurring labels of the crawled and referenced datasets, the selected groupings have inherent limitations. Unlike the Pilot Parliaments Benchmark from the Gender Shades study [6] where the intersectional groups are defined with respect to skin type, "ethnicity" is an attribute that is highly correlated but not deterministically linked to racial categories, which are themselves nebulous social constructs, encompassing individuals with a wide range of phenotypic features [3]. Similarly, binary gender labels are compatible with the format of commercial product outputs, but exclusionary of those not presenting in the stereotypical representations of each selected gender identity [29].…”
Section: Tension
confidence: 99%
“…40 In machine learning, fairness encompasses concerns about how data-driven approaches can reflect and perpetuate biases rooted in social inequality and discrimination. 41,42 A model's predictions can vary systematically across demographic groups if, for example, the data being sampled reflects societal inequalities (i.e. historical bias) or if the sampling methods result in the underrepresentation of certain groups (i.e.…”
Section: Machine Learning Models: Performance Versus Interpretability
confidence: 99%
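The historical- and representation-bias mechanisms described in the statement above can be illustrated with a minimal synthetic sketch (all group labels, sample sizes, and distributions here are hypothetical, not drawn from the cited studies): a classifier fit on data where one group is underrepresented learns a decision threshold tuned to the majority group, and its accuracy then varies systematically across groups at test time.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_group(n, shift):
    # Binary labels; the positive class sits one unit above the
    # negative class, offset by a group-specific feature shift.
    y = rng.integers(0, 2, n)
    x = rng.normal(loc=shift + y, scale=0.5, size=n)
    return x, y

# Group A dominates the training sample (representation bias);
# group B is underrepresented and its features are shifted.
xa, ya = make_group(950, shift=0.0)
xb, yb = make_group(50, shift=1.0)
x_train = np.concatenate([xa, xb])
y_train = np.concatenate([ya, yb])

# Learn one global threshold that minimizes overall training error;
# it ends up calibrated to the majority group.
candidates = np.sort(x_train)
errors = [np.mean((x_train >= t).astype(int) != y_train) for t in candidates]
threshold = candidates[int(np.argmin(errors))]

# Evaluate on equally sized held-out samples from each group.
xa_t, ya_t = make_group(1000, shift=0.0)
xb_t, yb_t = make_group(1000, shift=1.0)
acc_a = np.mean((xa_t >= threshold).astype(int) == ya_t)
acc_b = np.mean((xb_t >= threshold).astype(int) == yb_t)
print(f"accuracy, well-represented group A:  {acc_a:.2f}")
print(f"accuracy, underrepresented group B: {acc_b:.2f}")
```

Because group B contributes few training points, the error-minimizing threshold sits between B's two class distributions only by accident, so B's held-out accuracy lags A's even though both groups are equally separable in isolation.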
“…Their study showed that there are gaps when using three different semantic representations such as bag-of-words, Deep Recurrent Neural Networks (DRNN), and word embedding [38]. Benthall and Haynes have investigated supervised learning algorithms and revealed that they are exposed to racial bias because of the differentiation that is embedded in systematic patterns [39].…”
Section: B. Population Bias
confidence: 99%