“…Wang et al (2021) propose to remove dimensions in the CLIP embedding that are highly correlated with gender attributes. Berg et al (2022) debias the CLIP models with prompt learning via an adversarial approach. Recently, Zhang & Ré (2022) address the group robustness of vision-language models with contrastive learning.…”
Section: Biases In Vision Models (mentioning)
confidence: 99%
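The dimension-removal idea attributed to Wang et al (2021) in the snippet above can be illustrated with a short sketch: score each CLIP embedding dimension by its correlation with a binary gender annotation and zero out the most correlated ones. This is a minimal illustration of the general idea rather than the cited implementation; the function name, the correlation measure, and the number of pruned dimensions k are assumptions.

```python
import numpy as np

def prune_correlated_dims(embeddings, gender_labels, k=100):
    """Zero out the k embedding dimensions most correlated with a binary
    attribute (illustrative sketch of the dimension-removal idea, not the
    cited implementation).

    embeddings: (n, d) array of CLIP embeddings
    gender_labels: (n,) array with values in {0, 1}
    """
    # Pearson correlation of every dimension with the attribute.
    centered = embeddings - embeddings.mean(axis=0)
    labels = gender_labels - gender_labels.mean()
    corr = (centered * labels[:, None]).sum(axis=0)
    corr /= np.linalg.norm(centered, axis=0) * np.linalg.norm(labels) + 1e-8

    # Drop the dimensions that carry the most gender information.
    drop = np.argsort(-np.abs(corr))[:k]
    pruned = embeddings.copy()
    pruned[:, drop] = 0.0
    return pruned, drop
```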
“…Prompts for text-image retrieval on FairFace Dataset. We adopt the 10 training concepts from (Berg et al, 2022) to construct the prompts for FairFace. These concepts are irrelevant to gender, race, or age, which makes them suitable for evaluating the model biases.…”
Section: B2 Lemma 42 (mentioning)
confidence: 99%
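Constructing the FairFace retrieval prompts from attribute-neutral concepts follows a simple template-filling pattern, sketched below. The concept words and templates are placeholders for illustration, not the exact ten training concepts or prompt templates from Berg et al (2022).

```python
# Placeholder concepts and templates; the exact ten training concepts and
# prompt templates from Berg et al (2022) are not reproduced here.
concepts = ["smart", "friendly", "hardworking", "creative", "honest"]
templates = ["a photo of a {} person", "a portrait of a {} person"]

prompts = [t.format(c) for t in templates for c in concepts]
# Each prompt is encoded with the CLIP text encoder and used to rank
# FairFace images; bias is then read from the demographic composition
# of the top-ranked retrievals.
```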
“…Biases also exist in generative models, where generated images may exhibit bias towards certain genders and races (Cho et al, 2022; Mishkin et al, 2022). Substantial progress has been made recently toward mitigating biases in vision-language models (Parraga et al, 2022; Berg et al, 2022; Zhang & Ré, 2022). However, many current approaches for addressing bias in models require training or fine-tuning the models using resampled datasets or modified objectives, which can be computationally intensive for foundation models.…”
Machine learning models have been shown to inherit biases from their training datasets, which can be particularly problematic for vision-language foundation models trained on uncurated datasets scraped from the internet. The biases can be amplified and propagated to downstream applications like zero-shot classifiers and text-to-image generative models. In this study, we propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding. In particular, we show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models. The closed-form solution enables easy integration into large-scale pipelines, and empirical results demonstrate that our approach effectively reduces social bias and spurious correlation in both discriminative and generative vision-language models without the need for additional data or training. The code is available at https://github.com/chingyaoc/debias_vl.
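The projection step described in this abstract can be sketched as follows: bias directions are estimated from pairs of prompts that differ only in the protected attribute, and text embeddings are multiplied by a matrix that projects onto the orthogonal complement of those directions. The sketch below uses a plain orthogonal projection for illustration; the paper's calibrated, closed-form projection matrix and its prompt pairs are not reproduced here, and the function name is an assumption.

```python
import numpy as np

def debias_projection(pair_diffs):
    """Build a matrix that projects text embeddings onto the orthogonal
    complement of estimated bias directions (illustrative sketch).

    pair_diffs: (m, d) array of differences between embeddings of prompt
    pairs that differ only in the protected attribute, e.g.
    embed("a photo of a male doctor") - embed("a photo of a female doctor").
    """
    # Orthonormal basis for the biased subspace spanned by the differences.
    u, s, vt = np.linalg.svd(pair_diffs, full_matrices=False)
    basis = vt[s > 1e-6]                      # (r, d)

    # P = I - V^T V removes every component lying in the biased subspace.
    d = pair_diffs.shape[1]
    return np.eye(d) - basis.T @ basis

# Usage: apply P to every class-prompt embedding before computing
# similarity with image embeddings, then re-normalize for cosine scores:
#   z_debiased = z_text @ P
```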
“…Several methodologies to measure and mitigate bias cannot be applied in our setting given the lack of public access to GPT-3's model architecture or training dataset, and the enormous resources needed to retrain the model from scratch. In particular, this includes training data augmentation (Sen et al, 2021), adjusting model behaviour via adversarial learning (Zhang et al, 2018; Berg et al, 2022), and amending model embeddings (Dev and Phillips, 2019).…”
The growing capability and availability of generative language models have enabled a wide range of new downstream tasks. Academic research has identified, quantified and mitigated biases present in language models but is rarely tailored to downstream tasks where wider impact on individuals and society can be felt. In this work, we leverage one popular generative language model, GPT-3, with the goal of writing unbiased and realistic job advertisements. We first assess the bias and realism of zero-shot generated advertisements and compare them to real-world advertisements. We then evaluate prompt-engineering and fine-tuning as debiasing methods. We find that prompt-engineering with diversity-encouraging prompts gives no significant improvement in either bias or realism. Conversely, fine-tuning, especially on unbiased real advertisements, can improve realism and reduce bias.
“…Several methodologies to measure and mitigate bias cannot be applied in our setting given the lack of public access to GPT-3's model architecture or training dataset, and the enormous resources needed to retrain the model from scratch. In particular, this includes training data augmentation (Sen et al, 2021), adjusting model behaviour via adversarial learning (Zhang et al, 2018; Berg et al, 2022), and amending model embeddings (Dev and Phillips, 2019). Our analysis instead focuses on the text-level bias of model-generated outputs, which we measure via a composite score based on the prevalence of certain gender-laden terms, and debiasing methods which require no access to the model architecture, nor original training data.…”
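A text-level score of the kind described above can be sketched by counting gender-laden terms in the generated output and combining the counts into a single normalized value. The word lists and the normalization below are illustrative stand-ins, not the composite score used in the cited work.

```python
import re

# Illustrative word lists; the cited work uses its own curated sets of
# gender-laden terms, not these.
MASCULINE = {"he", "him", "his", "competitive", "dominant", "leader"}
FEMININE = {"she", "her", "hers", "supportive", "nurturing", "collaborative"}

def gender_term_score(text):
    """Toy composite score: (masculine - feminine) term counts, normalized
    by the total number of gendered terms. 0 means balanced; +1 or -1 means
    the text is entirely one-sided."""
    tokens = re.findall(r"[a-z']+", text.lower())
    m = sum(t in MASCULINE for t in tokens)
    f = sum(t in FEMININE for t in tokens)
    total = m + f
    return 0.0 if total == 0 else (m - f) / total
```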
Warning: This work contains strong and offensive language, sometimes uncensored. To tackle the rising phenomenon of hate speech, efforts have been made towards data curation and analysis. When it comes to analysis of bias, previous work has focused predominantly on race. In our work, we further investigate bias in hate speech datasets along racial, gender and intersectional axes. We identify strong bias against African American English (AAE), masculine and AAE+Masculine tweets, which are annotated as disproportionately more hateful and offensive than tweets from other demographics. We provide evidence that BERT-based models propagate this bias and show that balancing the training data for these protected attributes can lead to fairer models with regard to gender, but not race.
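Balancing the training data over protected attributes, as described in the abstract above, amounts to downsampling each group to the size of the smallest one. The sketch below assumes a pandas DataFrame with a column holding the protected-attribute label; the column name and data layout are assumptions, not details of the cited dataset.

```python
import pandas as pd

def balance_by_attribute(df, attr_col):
    """Downsample every protected-attribute group to the smallest group size
    so the training data is balanced on that attribute (illustrative sketch)."""
    n_min = df[attr_col].value_counts().min()
    return (df.groupby(attr_col, group_keys=False)
              .apply(lambda g: g.sample(n=n_min, random_state=0)))
```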