Machine-learned computer vision algorithms for tagging images are increasingly used by developers and researchers, having become popularized as easy-to-use "cognitive services." Yet these tools struggle with gender recognition, particularly when processing images of women, people of color and non-binary individuals. Socio-technical researchers have cited data bias as a key problem; training datasets often over-represent images of people and contexts that convey social stereotypes. The social psychology literature explains that people learn social stereotypes, in part, by observing others in particular roles and contexts, and can inadvertently learn to associate gender with scenes, occupations and activities. Thus, we study the extent to which image tagging algorithms mimic this phenomenon. We design a controlled experiment to examine the interdependence between algorithmic recognition of context and the depicted person's gender. In the spirit of auditing to understand machine behaviors, we create a highly controlled dataset of people images, imposed on gender-stereotyped backgrounds. Our methodology is reproducible and our code is publicly available. Evaluating five proprietary algorithms, we find that in three, gender inference is hindered when a background is introduced. Of the two that "see" both backgrounds and gender, it is the one whose output is most consistent with human stereotyping processes that is superior in recognizing gender. We discuss the accuracy-fairness trade-off, as well as the importance of auditing black boxes in better understanding this double-edged sword.
There are increasing expectations that algorithms should behave in a manner that is socially just. We consider the case of image tagging APIs and their interpretations of people images. Image taggers have become indispensable in our information ecosystem, facilitating new modes of visual communication and sharing. Recently, they have become widely available as Cognitive Services. But while tagging APIs offer developers an inexpensive and convenient means to add functionality to their creations, most are opaque and proprietary. Through a cross-platform comparison of six taggers, we show that behaviors differ significantly. While some offer more interpretation on images, they may exhibit less fairness toward the depicted persons, by misuse of gender-related tags and/or making judgments on a person’s physical appearance. We also discuss the difficulties of studying fairness in situations where algorithmic systems cannot be benchmarked against a ground truth.
In this work, we investigate how students in fields adjacent to algorithm development perceive fairness, accountability, transparency, and ethics in algorithmic decision-making. Participants (N = 99) were asked to rate their agreement with statements regarding six constructs that are related to facets of fairness and justice in algorithmic decision-making using scenarios, in addition to defining algorithmic fairness and providing their view on possible causes of unfairness, transparency approaches, and accountability. The findings indicate that “agreeing” with a decision does not mean that the person “deserves the outcome,” perceiving the factors used in the decision-making as “appropriate” does not make the decision of the system “fair,” and perceiving a system's decision as “not fair” affects the participants' “trust” in the system. Furthermore, fairness is most likely to be defined as the use of “objective factors,” and participants identify the use of “sensitive attributes” as the most likely cause of unfairness.
While professionals increasingly rely on algorithmic systems to make decisions, on some occasions algorithmic decisions may be perceived as biased or unjust. Prior work has looked into the perception of algorithmic decision-making from the user's point of view. In this work, we investigate how students in fields adjacent to algorithm development perceive algorithmic decision-making. Participants (N=99) were asked to rate their agreement with statements regarding six constructs that are related to facets of fairness and justice in algorithmic decision-making in three separate scenarios. Two of the three scenarios were independent of each other, while the third scenario presented three different outcomes of the same algorithmic system, demonstrating perception changes triggered by different outputs. Quantitative analysis indicates that (1) 'agreeing' with a decision does not mean the person 'deserves the outcome', (2) perceiving the factors used in the decision-making as 'appropriate' does not make the decision of the system 'fair', and (3) perceiving a system's decision as 'not fair' affects the participants' 'trust' in the system. In addition, participants found proportional distribution of benefits fairer than other approaches. Qualitative analysis provides further insights into the information that participants find essential to judge and understand an algorithmic decision-making system's fairness. Finally, the level of academic education has a role to play in the perception of fairness and justice in algorithmic decision-making.
Crowdsourcing plays a key role in developing algorithms for image recognition or captioning. Major datasets, such as MS COCO or Flickr30K, have been built by eliciting natural language descriptions of images from workers. Yet such elicitation tasks are susceptible to human biases, including stereotyping people depicted in images. Given the growing concerns surrounding discrimination in algorithms, as well as in the data used to train them, it is necessary to take a critical look at this practice. We conduct experiments at Figure Eight using a controlled set of people images. Men and women of various races are positioned in the same manner, wearing a grey t-shirt. We prompt workers for 10 descriptive labels, and consider them using the human-centric approach, which assumes reporting bias. We find that “what’s worth saying” about these uniform images often differs as a function of the gender and race of the depicted person, violating the notion of group fairness. Although this diversity in natural language people descriptions is expected and often beneficial, it could result in automated disparate impact if not managed properly.